Chapter 5 Bottom-up and top-down attention
In the previous chapter we learned that at any one time the sensory signals from only a few objects are being fully processed through the bottleneck(s) and thus are likely to enter memory.
Given the existence of the bottlenecks, we really need ways to prioritise what gets selected for high-level processing. How does your brain decide which objects to attentionally select?
Well, where and when attentional selection happens reflects a combination of factors. In the case of this text, you must have decided to read it. That is, your attentional selection of this text occurred because you gave yourself a task of reading it. While your brain is only able to read one, or at most two, words at a time, your eyes and attention hop along to select and fully process the successive words in this line of text.
We call this kind of selection top-down attention.
Top-down attention is typically voluntary, and thus guided by your expectations and desires, as represented by this inspector intentionally scrutinising individual bits of a crime scene.
Bottom-up attention is quite different - it’s when something in the world grabs your attention. This can sometimes happen even against your will when you are trying to concentrate on something else. The reason for bottom-up attention is a bit like why you have CTRL-C, ESC, or “Force quit” on your computer. After you give your computer a task, sometimes you need to interrupt it. Indeed, every responsive system needs interrupt signals to take it off task when something that might be even more important crops up. That is, no matter how strongly a person is concentrating on a task, there should always be a possibility for unexpected information to trigger attention so that the person remains responsive to unexpected dangers.
If you hear a sudden loud sound, your attention is likely to be taken off, at least momentarily, the task you are performing. This was useful throughout evolutionary history to ensure that our ancestors evaluated sudden movements or sounds that might mark the arrival of another animal such as a predator. Similarly, if someone taps on your shoulder, or another body part, that’s pretty likely to get your attention. We have evolved to be quite vigilant regarding possible threats to our body.
The art of concentration, and of studying well, lies partly in knowing what distracts your attention and in placing yourself in situations where it won’t be distracted.
Unique visual objects in a scene also elicit bottom-up attention. For example, look at the image below - does something in it attract your attention?
If you aren’t colorblind, the object with the unique color should have attracted your attention. This is an example of bottom-up attention. When choosing a car to buy, some people deliberately pick an unusual color because they know that when they go shopping, if they forget where they parked their car, they will have little trouble finding it. A pink car will “stick out” conspicuously even in a sea of other cars, if those cars have the more typical paint jobs of black, white, grey, and dark colors.
An object with unique motion direction can also summon attention, as you can see here.
However, not all unique objects in a scene will attract attention. In the below image, the animal whose back you see in the foreground is an elk. Can you find the mountain lion that is stalking it?
It’s extremely difficult for our limited brains to find and see. Click here to see the lion circled - you might still need to zoom in to see it! Mountain lions and other animals have evolved to have an appearance, and engage in behaviors, that won’t attract the attention of other animals. The next few chapters will be, in part, about what does and doesn’t attract attention.
5.1 Bottom-up attention and top-down attention, together forever
The signals of bottom-up and top-down attention must be somehow combined to determine where your attention ends up going.
As described in the previous section, top-down attention reflects one’s current goals and task. Bottom-up attention reflects things in a scene that might grab our attention during almost any task, like a unique color. We don’t fully understand how these work together. Sometimes top-down and bottom-up factors compete with each other. This can be seen in the results of an experiment described by Theeuwes (2010).
In the experiment, participants searched for a green diamond presented among a variable number of circles and had to respond to the orientation (horizontal or vertical) of the line segment presented within the diamond shape. So, their task was to find the diamond and pay attention to only it.
In some of the displays, Theeuwes (2010) included a circle that was different in color from all of the rest of the items on the display. The uniqueness of this color tended to attract attention. Because that meant attention was attracted away from the diamond, the result was an elevation in response time for reporting the orientation of the line segment in the diamond.

Figure 5.1: Mean correct response time for detecting a change. The black line represents children with ASD and the dashed gray line typically developing children. Error bars are one standard error.
The graph of the results above shows the average time to indicate what orientation was in the diamond. The horizontal axis shows that the more stimuli that are presented, the longer it takes people to respond. This suggests that the more objects there were, the longer it took to find the diamond. Also notice that the red line is above the black line. This was the most important result - trials with a uniquely-colored distractor slowed response time.
- By how much, approximately, were responses slowed?
This slowing is sometimes called attentional capture. Objects or signals that grab, or capture, bottom-up attention are sometimes called salient distractors or exogenous cues. Due to the changing influences of bottom-up and top-down attention, attention may rapidly shift among different stimuli, depending on a combination of task factors and salience of the items. What we attend to, then, reflects a combination of what is important for our task and extraneous attention-capturing signals.
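To make the idea of a slowing concrete, here is a minimal sketch of how you might compute the size of attentional capture from response times. The numbers are invented for illustration and are not the results of Theeuwes (2010); the names `rt_no_distractor`, `rt_with_distractor`, and `capture_cost` are hypothetical.

```python
from statistics import mean

# Hypothetical mean response times in milliseconds, invented for illustration.
# Keys are the number of items in the display.
rt_no_distractor = {5: 600, 7: 620, 9: 640}
rt_with_distractor = {5: 650, 7: 675, 9: 700}


def capture_cost(rt_absent, rt_present):
    """Return the slowing (ms) caused by the distractor for each display size."""
    return {n: rt_present[n] - rt_absent[n] for n in rt_absent}


costs = capture_cost(rt_no_distractor, rt_with_distractor)
mean_cost = mean(costs.values())

print(costs)      # slowing at each display size
print(mean_cost)  # average slowing across display sizes
```

The subtraction is the whole idea: the response time without the uniquely-colored distractor serves as a baseline, and whatever extra time appears when the distractor is present is attributed to capture.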
Some researchers think that top-down and bottom-up attention combine at a “priority map” mediated by a distributed network involving frontal, parietal, and temporal areas. Top-down signals largely reflect frontal and parietal areas, while bottom-up attention reflects sensory brain areas. These brain areas’ signals feed into the priority map (possibly within the FEF), which ultimately determines selection.
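One way to picture a priority map is as a sum of bottom-up and top-down signals at each location, with the highest-priority location winning selection. The sketch below assumes a simple weighted sum, which is an illustrative simplification rather than a claim about how the brain actually combines the signals. All of the numbers, the weight `w_top_down`, and the function names are hypothetical.

```python
# Hypothetical display with four locations; all values are invented.
locations = ["diamond", "circle_1", "odd_color_circle", "circle_2"]
bottom_up = [0.2, 0.1, 0.9, 0.1]  # salience: the unique color stands out
top_down = [0.8, 0.1, 0.1, 0.1]   # task relevance: we are looking for the diamond


def priority_map(bottom_up, top_down, w_top_down=0.5):
    """Combine the two signals at each location with an assumed weighted sum."""
    w_bottom_up = 1.0 - w_top_down
    return [w_bottom_up * b + w_top_down * t for b, t in zip(bottom_up, top_down)]


def selected(locations, priorities):
    """Return the location with the highest priority."""
    return locations[priorities.index(max(priorities))]


# Weak top-down weighting: the salient distractor wins (attentional capture).
weak = selected(locations, priority_map(bottom_up, top_down, w_top_down=0.3))
# Strong top-down weighting: the task-relevant diamond wins.
strong = selected(locations, priority_map(bottom_up, top_down, w_top_down=0.7))
print(weak, strong)
```

In this toy version, changing a single weight is enough to flip which object gets selected, which is one way to think about the competition between top-down and bottom-up factors.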

A schematic, created by Theeuwes and Failing (2020), indicating brain areas that mediate bottom-up attention, top-down attention, and a priority map.
5.2 Bottom-up and top-down attention when reading
Web designers, game designers, and graphic designers all need to have some understanding of what attracts the attention of people when they look at a display. One of the principles that they learn is that unique colors attract attention. But there’s more to it than that.
The above image illustrates some of the factors that affect the attention of people when they look at text. The content of the image claims to successfully predict the order in which you will read the different printed phrases. I don’t know that there’s been any scientific test of this, but the comments on Reddit provide some anecdata suggesting that it did work for many people:
How was the designer of the image able to fairly accurately predict reading order? Some of that comes from the principles of bottom-up attention. The “First, you read this” text is an odd color, which you know will attract attention. Two other factors that help are that it has a large font and it occupies a central position. For various sorts of objects, all other things being equal, people will tend to look at the center image. This may be particularly true of designed images with a frame like this one, because we come to learn that designers often position the most important information in the center of a frame.
“Then you will read this” is actually slightly larger than the “First, you read this” text, but because it is at the bottom of the image and it is not in an odd color, the designer could safely bet that most people wouldn’t look at it first. Why is it (often) the next text looked at? Its size certainly helps, but another factor is that it is below the text that most people read first. Once you put a person into reading mode, they tend to shift their attention according to reading order: in English, from left to right and from top to bottom.
You’d expect that behavior when people open the pages of a book, or land on a webpage with a lot of paragraphs laid out like prose. To make sense of a paragraph, we of course will read the words from left to right and the lines from top to bottom. I was interested in whether participants would do this even when they were looking at just two individual letters, rather than a bunch of text.
In experiments in my laboratory, we investigated this by testing several hundred students who were taking PSYC1002 at the University of Sydney :) We flashed two widely-spaced letters at people and asked them to identify both of them. If participants attended to one location before another, we reasoned, their performance would be higher on the letter that occupied the position attended to first.
One configuration we tested was much like the below, with one letter placed above another, like this:
People are extremely good at reading letters, so we knew that if participants were to ever report a letter incorrectly, we would have to flash the two letters extremely briefly. But even when the letters were flashed for only about 2 hundredths of a second, we found that participants still performed above 90% correct! To reduce performance enough, down to around 75% correct, we had to present the two letters very briefly, with low luminance contrast, and immediately afterward flash a bright “mask” that helps limit the amount of time one’s brain has to process the letters. In slow motion, then, a trial looked like this movie.
Prior to the presentation of the two letters, the participants were looking directly at the center of the screen. The presentation was so brief that they did not have time to move their eyes to either location, so any difference in performance for the upper versus lower location would be a result of covert attention (assuming that both parts of the visual field are equally good, which had been confirmed).
To examine whether there was any difference between the top and bottom positions, we subtracted each participant’s performance on the bottom letter from that of the upper letter. Each dot in the below plot represents the difference score for one of the one hundred and thirty first-year students.
The average difference score of the 130 participants is 0.13, meaning that participants were thirteen percentage points more accurate at reporting the top letter compared to the bottom letter. Notice that this upper-letter bias was not true of everyone - eight people actually identified the bottom letter correctly more often than the top letter. It appears, however, that the overwhelming majority of participants did have an upper bias with this display.
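The difference score described above is simple arithmetic, and a short sketch may help. The five participants and their accuracies below are invented for illustration and are not the data from the experiment; the variable names are hypothetical too.

```python
from statistics import mean

# Hypothetical proportion correct for five participants (invented values).
top_correct = [0.85, 0.80, 0.75, 0.90, 0.70]
bottom_correct = [0.70, 0.65, 0.80, 0.75, 0.60]

# Difference score: top-letter accuracy minus bottom-letter accuracy.
# A positive score means an upper bias, a negative score a bottom bias.
differences = [round(t - b, 2) for t, b in zip(top_correct, bottom_correct)]

mean_difference = round(mean(differences), 2)
n_bottom_bias = sum(d < 0 for d in differences)

print(differences)
print(mean_difference)
print(n_bottom_bias)
```

Each entry of `differences` corresponds to one dot in a plot like the one described above, `mean_difference` plays the role of the 0.13 average, and `n_bottom_bias` counts the participants who did better on the bottom letter.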
A second condition of our experiment found evidence that this upper bias reflects an appraisal by the mind of what the correct order should be for reading text. In this condition of the experiment, a different set of the participants had to identify two letters that were rotated anti-clockwise by 90 degrees, so that they faced upward:
The question was whether this would affect the upper bias documented in the basic condition. With the two letters facing upward, what you might call the implicit reading order should favor the bottom letter. That is, if you were to rotate the page of a book in this way, you would then start from the bottom and read toward the top.
The above plot shows the results for the two conditions side-by-side. You can see that the average top bias found previously is much smaller in the facing-up condition. Indeed, plenty of participants now showed a bottom bias rather than a top bias.
Remember, these letters were presented so briefly that there was no time for participants to move their eyes. It was so brief, in fact, that participants are very unlikely to have even been able to move their attention while the letters were being presented. But participants seem to have allocated more of their attention to one part of the display than to another, which led them to perform better for the letter in that location.
This finding is an example of how biases in our attention that we often aren’t even aware of contribute to what we notice and what we can report. This particular bias is not a bottom-up effect like an odd feature such as color or size. Instead, this attentional bias is one that we learn from our experience reading English text. In other experiments, we found that English/Arabic bilinguals had different biases when asked to report two briefly-presented English letters than when they were asked to report two briefly-presented Arabic letters (Ransley et al. 2018). As learned biases, these phenomena are more like top-down attention than bottom-up.
5.3 Exercises
Answer these questions and relate them to the learning outcomes ( 2 )