What is Computer Vision? – Part 2: Human Vision

This is the second in a series of posts by Blippar's Strategic Planner Sam Ashken on computer vision for non-technical people. If that’s you, then read on! As a foundation, we’re going to start with a couple of posts on human vision. It’s easier to understand how computers see by comparing with how people see.


This is post 2 in a series on computer vision for non-technical people. The first couple of posts cover human vision, to give us a point of comparison for the later posts on computer vision. If you haven’t read the first post yet, it’s probably worth doing that first.

Did you guess what the image at the end of the first post was? Here it is again.

What is this?
It’s the outline of Africa rotated through 90° anti-clockwise. It’s hard to recognize this, even though the shape itself is fairly familiar, because our visual system has a bias for the up-down orientation as a way to define shapes. Biases and assumptions are going to come up a few times in this post.

Having talked about depth perception as one of the ill-defined problems which the visual system has to solve, in this post I’m going to introduce another – colour perception – before talking about the role of probabilistic thinking in solving these problems.

Colour, as with vision in general, feels so intuitive that it can be hard to grasp why it would be difficult for the visual system to correctly identify a colour.

We experience colour because an object which light bounces off absorbs some parts of the spectrum and reflects others. What we see is the part which gets reflected.

[Image: apple]

Sound simple? Afraid not. How does the visual system infer which light frequencies an object absorbed when it doesn’t know which frequencies hit the object in the first place? A particular frequency could be missing from the light reaching your eye either because the object absorbed it or because it was never in the light that hit the object.
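For the more technically curious, here is a minimal sketch of that ambiguity in Python. It assumes a deliberately simplified model in which the light reaching the eye is just the light hitting the surface multiplied by the fraction the surface reflects, in three colour bands. The function name `reflected` and all the numbers are made up for illustration.

```python
# Toy model: light reaching the eye = light hitting the surface * fraction reflected.
# Each tuple holds three bands (red, green, blue). All values are invented.

def reflected(illuminant, reflectance):
    """Multiply the incoming light by the surface reflectance, band by band."""
    return tuple(round(i * r, 3) for i, r in zip(illuminant, reflectance))

# Scene A: a white surface lit by reddish light.
white_surface = (1.0, 1.0, 1.0)
reddish_light = (1.0, 0.5, 0.5)

# Scene B: a reddish surface lit by white light.
reddish_surface = (1.0, 0.5, 0.5)
white_light = (1.0, 1.0, 1.0)

signal_a = reflected(reddish_light, white_surface)
signal_b = reflected(white_light, reddish_surface)

# The eye receives exactly the same signal in both scenes.
print(signal_a, signal_b, signal_a == signal_b)
```

Two different scenes produce the same light at the eye, so the signal alone cannot say which surface is really there.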

The visual system solves this problem for the most part very successfully and it does so by making assumptions and using rules of thumb. One such rule of thumb is to ignore gradual changes in light level so that the colour of a surface can be understood without being misled by shadows. All sounds fair enough. Now look at this.

[Image: shadow illusion with squares A and B]

Which square is darker, A or B? Obviously A, of course. Afraid not. They’re the same colour. No, they really are.

[Image: the same shadow illusion, second view]

Wikipedia has an excellent explanation of why this illusion is so powerful, including factors like contrast and over-compensation for shadow. But really what this illusion shows is how good the visual system is at solving the colour problem: it discounts the shadow in order to tell you about the surface itself rather than the raw light reaching your eye.
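For the more technically curious, the rule of thumb mentioned earlier (ignore gradual changes in light level, keep the sharp ones) can be sketched in a few lines of Python. This is a toy, not a description of how the visual system or any particular computer vision library actually works. The function name `recover_surface`, the threshold and all the numbers are assumptions made up for illustration.

```python
import math

def recover_surface(luminance, threshold=0.1):
    """Estimate relative surface lightness from a row of luminance samples.

    Looks at the log-luminance step between neighbouring samples. Small steps
    are treated as gradual lighting changes and discarded; large steps are
    kept as real edges between surfaces.
    """
    logs = [math.log(v) for v in luminance]
    estimate = [0.0]
    for prev, cur in zip(logs, logs[1:]):
        step = cur - prev
        if abs(step) < threshold:
            step = 0.0  # gradual change: assume it is lighting and ignore it
        estimate.append(estimate[-1] + step)
    # Convert back from logs; values are relative to the first sample.
    return [round(math.exp(e), 2) for e in estimate]

# Hypothetical scene: a light patch (0.8) next to a dark patch (0.2),
# under light that fades gradually by 5% per sample.
surface = [0.8, 0.8, 0.8, 0.2, 0.2, 0.2]
lighting = [0.95 ** i for i in range(6)]
luminance = [s * l for s, l in zip(surface, lighting)]

print(recover_surface(luminance))
```

Although every raw luminance sample is different, the estimate comes out flat across the light patch and flat across the dark patch, with one sharp drop between them. The gradual fade has been thrown away, which is exactly the behaviour that the shadow illusion exploits.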

I’ve talked a couple of times over the two posts about assumptions and rules of thumb used by the visual system. Another way of saying this is that the visual system “hates coincidences”. The Eiffel Tower photo in the previous post works, albeit briefly, because the visual system hates the coincidence that a grasping hand in the foreground would just happen to line up perfectly with the real-world Eiffel Tower in the background. So the visual system briefly assumes it’s looking at a small Eiffel Tower model being clasped.

[Image: Eiffel Tower visual trick]

You’ve probably seen photos like the Eiffel Tower one before, but maybe not the work of artist and psychologist Adelbert Ames, Jr. Check it out below.

[Image: room]

This is an image of a child and a blown-up giant version of the same child in a cube-shaped room, right? Afraid not.

It’s actually a room with a low ceiling whose back wall slopes away into the distance. The child on the right hasn’t been blown up; the child on the left is in fact further away. The room isn’t cube-shaped at all; a trick of perspective just makes it look that way. See this video for a great explanation.

As with the Eiffel Tower image, the visual system makes a probability-based assumption which turns out to be false. In the video explanation I mentioned above, you will have seen a ball seem to shrink as it gets thrown from one person to the other. However improbable a shrinking ball is, your visual system has judged it to be more likely than the alternative: that a room which looks cube-shaped is not cube-shaped!
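For the more technically curious, this kind of weighing up can be sketched with Bayes’ rule in Python. Every number below is invented purely to illustrate the shape of the argument; nothing here is a measurement of what the visual system actually does, and the hypothesis names and the `posterior` function are assumptions of this sketch.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' rule to a set of named hypotheses."""
    unnormalised = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalised.values())
    return {h: p / total for h, p in unnormalised.items()}

# Invented prior beliefs, built from two independent assumptions:
#   rooms are almost always box-shaped, and objects rarely change size.
p_box_room, p_odd_room = 0.9999, 0.0001
p_ball_shrinks, p_ball_constant = 0.01, 0.99

priors = {
    "box-shaped room, ball shrinks": p_box_room * p_ball_shrinks,
    "odd-shaped room, ball stays the same size": p_odd_room * p_ball_constant,
}

# From the camera's viewpoint, both stories explain the image equally well.
likelihoods = {
    "box-shaped room, ball shrinks": 1.0,
    "odd-shaped room, ball stays the same size": 1.0,
}

beliefs = posterior(priors, likelihoods)
for hypothesis, probability in beliefs.items():
    print(f"{hypothesis}: {probability:.3f}")
```

With these made-up numbers, the belief in box-shaped rooms is so strong that the shrinking ball wins comfortably, even though a shrinking ball is itself given only a 1% chance.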

Hopefully these two posts have given you a feel for a couple of the ill-defined problems the visual system solves and the role of probabilistic thinking in solving them. In the next post we will introduce computer vision by comparing and contrasting it with human vision.

References: the idea that the visual system “hates coincidences”, and many of the other ideas in this post, are from Chapter 4, “The Mind’s Eye”, of Steven Pinker’s How the Mind Works.

Click here for part 3 of this series around Computer Vision.
