Chapter 9 Visual search

From time to time, we all need to find things. Rummaging through our closet for a particular shirt, or wandering about the house trying to find our keys, or for some of us, groping about for our spectacles that we know we put down somewhere around here.

Our search performance can reveal bottlenecks in mental processing: slow search may indicate that a bottleneck is limiting processing. Sometimes, however, search is slow simply because the basic sensory signals are poor. For example, when I lose my spectacles, my vision is so bad that I have to bring my face close to each location in the room to check whether my glasses are there. Similarly, when wandering about the house looking for one’s keys, evaluating every room for the presence of the keys requires visiting each room in turn.

Sometimes, even though something is right in front of our face and we have our glasses or contact lenses on, the sensory signals still aren’t good enough for us to know that the object is there. For example, try searching for the word “wilt” in the image below.


Figure 9.1: The first two pages of Romeo and Juliet.

Did you find “wilt” yet? To find it, your eyes have to move back and forth; the task is impossible without eye movements (“wilt” is about 3/4 of the way down the left page). The main reason you have to move your eyes is that the sensory signals provided by your retinas are only good enough to read small words near the center of your vision.

To experience this sad fact about your vision more directly, try the following. Stare directly at the black cross and, while keeping your eyes fixed on the cross, try reading any of the words on the bottom of the page. You can’t do it. Not because of any bottleneck, or problem with selection, but simply because your photoreceptors are too widely spaced in the periphery. That is, outside of a central region, the spatial resolution of your vision is too low to see many details. The sensory signals from the periphery are coarse.

Thus, not only are overt attentional shifts (eye movements) often made when you want to attentionally select a region of space, but also sometimes covert attention isn’t sufficient - you have to move your eyes to have any chance of seeing certain things well enough to know what they are.

9.1 Information overload

A good way to assess whether there is a bottleneck in a system is to give it more and more things to process and see whether this degrades performance or whether the system can process them all just as quickly as when it is given only one. Psychologists did this for visual processing by giving people many stimuli to process, by adding more and more to a display. In doing this, however, they had to be careful to make sure that the brain had a chance, by making sure that a person could see each individual stimulus even when it wasn’t in the center of their vision (unlike in the Romeo & Juliet demonstration above). If the person couldn’t even see the stimuli well, then of course the brain wouldn’t process them sufficiently even if it had no bottleneck.

One of the tasks psychologists have used for this is called “visual search”. In a visual search experiment, people are shown a display with a particular number of stimuli and asked to find a target. Visual search is discussed in first-year psychology; the next section is, in part, a review of that.

9.3 Processing one thing at a time

For combinations of individual features, search is in most cases not parallel; instead, there is a bottleneck. To put that in context, let’s first remind ourselves of aspects of parallel versus serial processing.

Imagine you were in an art installation where the artist had hung many speakers from the ceiling, and each speaker played a different person’s voice, each telling a different story. That’s pretty weird, but is precisely the situation I was in one day when I visited a museum in Havana, Cuba. What I heard sounded like an incoherent jumble. I couldn’t follow any of the actual stories being told by the voices until I moved my ear up against an individual speaker. In other words, I could only process a single auditory stimulus at a time, and to do so, I had to select it using overt attention.

A forest of speakers is not a situation you are likely to encounter! It does illustrate, however, one possibility for sensory processing - for certain things, you may be unable to process multiple signals at once. In that case, you need to select one stimulus and concentrate on it.

An art installation in Havana, Cuba

Fortunately, our visual brain can process certain aspects of the visual scene in parallel. But for combinations of features, you are in much the same boat as I was that day in Havana, having to select individual locations to evaluate an aspect of what is present - specifically, the combination of features there.

9.6 Visual search and blank-screen sandwiches

Remember the blank-screen sandwich change detection animations of Chapter 6? In a typical blank-screen sandwich experiment, people are timed for how long they take to find the change happening in a photo of a natural scene. To better assess what is happening with attention, one can use a carefully crafted visual search display instead of a natural scene (Rensink 2000).

As schematized above, the participant was shown blank-screen sandwiches with one object changing, and the time they took to indicate the location of the changing object was recorded. The displays were shown for 800 ms and the blank screen for 120 ms.

Recall from Chapter 6 that with the blank screen in the animation, the flicker/motion detectors provide no clue as to the location of the change. Do you think viewers could use their feature selection ability to help them find the change?

Because the changing object could be any of the objects in the scene, there is no particular feature people can use to find it. All that people are left with seems to be a more cognitive process: noting that an object one is attending to is no longer what it was - a change has occurred.

Rensink’s hypothesis was that this kind of evaluation process exists after the bottleneck, in the realm of very limited-capacity cognition. So he expected that evaluating whether a change is present can only be done for one or a few items at a time, by attentionally selecting that location so that its features would make their way to cognition where a comparison process could be done over time.

In other words, one has to select the objects in a region with one’s attention to store their appearance before and after the blank screen and then compare them to detect whether they changed. In that case, what do you predict should be the effect of the number of items in the display on the time to find the object?

Because only a limited number of objects can be simultaneously evaluated for change, the more objects that there are, the longer the task of finding the lone changing object should take. This was borne out by the results:


Figure 9.7: Results when searching for a lone changing object in a blank-screen sandwich, from Rensink (2000).

On some trials, the target was absent (the unfilled triangles). In that situation, the participants should not respond until they had evaluated every object in the display so they could be sure nothing was changing, hence the longer response times (on some occasions, some participants probably got tired of searching and responded prematurely).
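This target-present versus target-absent pattern follows from a serial, self-terminating search model: when the target is present, search stops as soon as it is found, so on average only half the items are checked, whereas a target-absent response requires checking every item. A minimal simulation of that logic (all timing parameters here are hypothetical, chosen only for illustration, not taken from Rensink’s data):

```python
import random

def search_time_ms(n_items, target_present, ms_per_item=50, base_ms=400):
    """Simulated response time for one serial, self-terminating search trial.
    ms_per_item and base_ms are hypothetical illustration values."""
    if target_present:
        # The target is equally likely to be at any position in the search
        # order, so the number of items checked is uniform over 1..n_items.
        checks = random.randint(1, n_items)
    else:
        # Target absent: every item must be checked to be sure.
        checks = n_items
    return base_ms + checks * ms_per_item

def mean_rt(n_items, target_present, trials=20000):
    return sum(search_time_ms(n_items, target_present)
               for _ in range(trials)) / trials

random.seed(1)
present_slope = (mean_rt(12, True) - mean_rt(4, True)) / 8
absent_slope = (mean_rt(12, False) - mean_rt(4, False)) / 8
# present slope comes out near 25 ms/item; absent slope is exactly
# 50 ms/item - the classic 2:1 absent-to-present slope ratio.
```

The 2:1 slope ratio falls directly out of the stopping rule: present trials check (n + 1)/2 items on average, absent trials check all n.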

9.7 Estimating the processing capacity of cognition’s change detector

A stunning difference between Figure 9.7 and the earlier search-results graphs (like Figure 9.6) is how much longer the response times are. The response times are all well over a second! Why is that?

Consider what it takes to detect a change. The second picture is not presented until almost a second (920 ms) after the trial begins, so there is no change until then and no way to know what changed until then. The response time of about 1.3 seconds (1300 ms) with only two objects in the display is about as fast as you could expect people to respond - maybe one second before they can detect the change and three-tenths of a second (300 ms) to press the button to indicate they had found the changing stimulus.

Of most interest, then, is not the overall amount of time before a response, but rather the elevation in response time caused by adding more distractors (non-changing objects). If people were able to evaluate all the objects in a display at once (no capacity limit on change detection), then response times should be the same when there are 10 objects in the display as when there are just 2. Instead, the filled triangles show a steep increase in search time with the number of objects in the display.

It’s impossible to evaluate selected objects for change until they change (every 920 milliseconds in this blank screen sandwich). The data plot indicates that on average, 10 objects have to be added to the display to elevate the response time by 920 milliseconds. Because you can find the target on average by searching half the distractors (as mentioned above) this indicates that people could evaluate about 5 objects at a time for whether any one of them was changing. In other words, if on each cycle of the blank screen sandwich they attentionally selected 5 objects and were able to detect whether one of them was changing, this predicts the increase in response time observed in the results.
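The arithmetic behind this capacity estimate can be laid out explicitly. This is just the calculation described above, with the slope value read off Figure 9.7; it is a sketch of the reasoning, not Rensink’s analysis code:

```python
cycle_ms = 800 + 120             # one display + blank cycle: 920 ms
extra_items_per_cycle = 10       # items added per 920 ms rise in RT (Figure 9.7)
slope_ms_per_item = cycle_ms / extra_items_per_cycle   # 92 ms per display item

# Self-terminating search inspects, on average, only half the items
# before finding the target, so the cost per item actually inspected
# is twice the slope per display item.
ms_per_inspected_item = 2 * slope_ms_per_item          # 184 ms

# Items that can be evaluated within each 920 ms change cycle:
capacity = cycle_ms / ms_per_inspected_item
print(capacity)  # 5.0
```

In other words, the 92 ms/item slope, combined with the half-the-distractors correction, implies about five objects evaluated per change cycle.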

Let’s sum things up. In Chapter 6 you were reminded of how long it can take people to find a changing object when the flicker/motion does not signal the changing object’s location. The ingenious blank-screen sandwich experiment of Rensink (2000) indicates that the reason it takes so long is that people only have the capacity to evaluate about five objects at a time for change. This reflects the very limited processing capacity of the cognitive processing we have to rely on when flicker/motion and feature selection don’t help us. Note that Rensink (2000) used very simple objects - oriented lines. The capacity limit may be even worse for more complicated objects.

9.8 Exercises

  • Why do people need to move their eyes for many searches?
  • What factors can make visual search slow?
  • Describe how the kinds of selection connect to visual search performance for different types of display - learning outcome #5 (2).
  • How does the finding for visual search performance for feature conjunctions relate to the rate limit found for pairing simultaneous features in the previous chapter?

References

Rensink, Ronald. 2000. “Visual Search for Change: A Probe into the Nature of Attentional Processing.” Visual Cognition 7 (1): 345–76. https://doi.org/10.1080/135062800394847.