Visual Attention Lab

Publications | Presentations | Posters

  • Click on the titles to view and download the posters.

VSS 2025

  • "Error-prone states in visual search"

  • Jeremy Wolfe1 (jwolfe@bwh.harvard.edu), Jeunghwan Choi2; 1Brigham and Women's Hospital / Harvard Medical School, 2Graduate Program in Cognitive Science, Yonsei University
  • In visual search, why do people miss targets that are clearly visible? In a search for a T among Ls, observers will reliably miss 5-10% of targets. When “retrospectively visible” targets are missed in tasks like breast cancer screening, radiologists can be sued. For TvsL search, Li et al (2024) found errors to be largely random with respect to the specific stimulus. That is, missing the target on one trial tended not to alter the probability of missing the same target in the same display a second time. Of course, you are more likely to miss hard-to-see targets, but if some class of stimuli produces, say, 20% errors, the specific 20% seems to occur randomly. What about the state of the observer? Certainly, errors fall with learning and rise with fatigue, but within a relatively steady state, does anything modulate the probability of error? Bruno et al (2024) have proposed that there is an electrophysiologically identifiable brain state (perhaps related to ‘mind-wandering’) that is associated with being temporarily more error prone. Introspectively, it sometimes feels like errors occur in clumps. If this is the case in visual search, then the probability of another error should be elevated after an error. To assess this, we reanalyzed several visual search datasets. We computed the distribution of lags between successive errors. Random production of errors predicts a geometric distribution of lags. If one error marks entry into an error-prone state, then the likelihood of another error should rise after the error. Chi-sq tests reveal significant over-representation of errors shortly following other errors, especially for tasks without feedback. This effect appears to be smaller or non-existent for experiments where feedback may disrupt any error-prone state. Of course, in the real world, reliable error feedback is often lacking. (A minimal sketch of this lag analysis appears after this entry.)
  • Acknowledgements: Grant support from NSF 2146617 & NIH-NEI EY017001
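
The abstract above does not include analysis code; the following is a minimal, hypothetical Python sketch of the lag analysis it describes: collect the lags between successive errors, compare the observed lag counts with the geometric distribution implied by a constant error probability, and test the difference with a chi-square goodness-of-fit test. The trial layout, bin count, and simulated error rate are illustrative assumptions, not the authors' pipeline.

```python
import numpy as np
from scipy import stats

def error_lag_test(errors, max_lag=10):
    """Compare lags between successive errors to a geometric distribution.

    errors: 1-D boolean array, True where the trial was an error,
            in the order the trials were run (illustrative input layout).
    """
    errors = np.asarray(errors, dtype=bool)
    idx = np.flatnonzero(errors)
    lags = np.diff(idx)                      # trials between successive errors
    p = errors.mean()                        # overall error probability

    # Observed counts for lags 1..max_lag, plus one overflow bin for longer lags.
    observed = np.array([(lags == k).sum() for k in range(1, max_lag + 1)]
                        + [(lags > max_lag).sum()], dtype=float)

    # Geometric prediction: P(lag = k) = (1 - p)**(k - 1) * p; overflow gets the rest.
    geom_p = np.array([(1 - p) ** (k - 1) * p for k in range(1, max_lag + 1)])
    geom_p = np.append(geom_p, 1.0 - geom_p.sum())
    expected = geom_p * lags.size

    # Degrees of freedom are not adjusted for the estimated p; fine for a sketch.
    chi2, pval = stats.chisquare(observed, expected)
    return chi2, pval

# Illustrative use on a simulated session with a constant 8% error rate.
rng = np.random.default_rng(0)
sim_errors = rng.random(2000) < 0.08
print(error_lag_test(sim_errors))
```
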
  • "Analyzing the Sources of Error in Visual Search of Whole Slide Images in Pathology"

  • Veronica Thai1 (thai.53@osu.edu), Meng Ling1, Jeremy Wolfe2, Zaibo Li3, Jian Chen1; 1The Ohio State University, 2Brigham and Women's Hospital, 3The Ohio State University Wexner Medical Center
  • Errors are a problem in pathology: false negatives lead to disease not being treated, and false positives lead to unnecessary treatment and the resulting risk to the patient. We used eye tracking to investigate pathologists’ search patterns and behaviors, with the goal of understanding the nature of errors in search for cancer in whole slide images (WSIs) of lymph nodes. Ten pathologists of varying experience levels diagnosed and annotated a set of 60 lymph node WSIs (45 with metastases, 15 benign) while we recorded their gaze and mouse behaviors. Our pathologists had 100% accuracy on the benign slides; there were no false positives in this data set. The false negative error rate ranged from 17.8% to 73.3% (46.1% avg). Using eye movement scanpaths, we categorized these errors as “search”, “recognition”, and “decision” errors, following the taxonomy introduced by Kundel et al (1978). The majority (67.5%) of false negatives can be labeled search errors, defined as cases where pathologists never fixated on a tumor region. Another 11.7% were recognition errors, where the eyes landed briefly on or near the malignancy but the pathologist did not report it. Decision errors, where a pathologist scrutinized the malignancy but decided it was benign, accounted for 20.4%. For all experience groups except residents, longer viewing time was associated with higher accuracy. Curiously, for residents, the reverse was true: longer viewing time led to lower accuracy. Additionally, residents made more use of zooming than more experienced groups (avg 34.9 vs. 17.2 zooms). They also tended to view at higher magnification (avg 22.64x vs 15.42x). Compared to more experienced pathologists, residents spent more of their viewing time zooming rather than panning (residents: 22.0% zoom, 10.0% pan; non-residents: 15.2% zoom, 13.4% pan).
  • "How consistent are you? Idiosyncratic polar angle biases in visual search for differentstimuli."

  • Cailey Tennyson1 (ctennyson1@bwh.harvard.edu), Injae Hong1, Rawan Ne'meh2, Jeremy Wolfe1,3; 1Brigham and Women's Hospital, 2Hariri High School 2, Beirut, Lebanon, 3Harvard Medical School
  • In a 250-300msec fixation, you can attend to ~4-6 nearby items. If there are more candidates in the neighborhood, choices must be made. In prior work, observers moved their eyes to a fixation spot when it moved to a new location. A ring of one T and seven Ls was flashed around the spot after 300 msec. Observers made 4AFC assessments of the T’s orientation. Flash duration was adjusted to produce ~25% errors. There was an average non-uniformity in the distribution of errors, with more errors on the vertical meridian. More interestingly, there were often significant idiosyncratic deviations from that average. To find out if patterns of deviations were stimulus-specific, we repeated the TvsL condition and added a search for a hammer among seven different tool silhouettes. Observers ran 360 trials for each task in four 180-trial sessions. The TvsL condition replicated prior results with 17 of 20 observers showing significant deviations from the average result. Moreover, patterns were quite consistent with an r=0.6 correlation of session 1 with session 2. Tools showed a more dramatic average deviation with ~26% of all errors at the bottom of the ring of locations. Only 10 of 20 observers showed significant idiosyncratic deviations from this pattern. Observers were strongly consistent between sessions (r=0.73). There was also a reasonable correlation between the TvsL and Tool tasks (r=0.39). We conclude that there are idiosyncratic variations in the deployment of attention in the vicinity of the current point of fixation. These appear to be quite different from the inhomogeneities in sensitivity to visual stimuli (e.g., Himmelberg, Winawer, & Carrasco, TINS, 2023). The choice of stimuli modulates the patterns of errors for reasons that are not entirely clear. These idiosyncratic relative attentional blindspots at different locations relative to fixation could contribute to errors in visual search.
  • Acknowledgements: NEI EY017001, NSF 2146617
  • "To choose or not to choose: Voluntary task switching without cost in visual search"

  • Ava Mitra1, Jeremy Wolfe1,2, Injae Hong1; 1Brigham and Women's Hospital, 2Harvard Medical School
  • In visual search tasks in the lab, participants are typically required to perform blocks of the same type of search task repeatedly (e.g., find a T among Ls). In real-world searches, however, you look for your keys, then your jacket, then the doorknob, and so on. You rarely search for your keys 100 times in a row. Real-world searches can offer a degree of choice that is not typically a feature of laboratory tasks (i.e., what do I want to look for next?). For instance, given a worklist of cases, should a radiologist be allowed to determine the order in which they are read? The current study aimed to investigate whether search performance, especially reaction time (RT) and miss rate, would be affected by manipulations of trial ordering and participants’ choice. Fifty observers completed 100 trials of each of four search tasks: T among Ls, a shape search for bumpy targets among smoother distractors, a color × color conjunction search, and a search for any animal among other objects. Each participant was randomly assigned to one of five conditions: four fixed blocks of 100 trials, blocks whose order could be chosen, a random mixture of all four tasks, free choice of which task came next, and a ‘yoked’ condition where trials were presented in the order that someone else had chosen. Interestingly, when given the choice, participants rarely switched between tasks, choosing to run blocks of the same task. Choice made no significant difference to the RTs across conditions, though, unsurprisingly, tasks differed in difficulty. Similarly, there were only very minor differences in errors between choice conditions. While task switching imposes a cost in other situations, it does not appear to interfere with the performance of these visual search tasks.
  • Acknowledgements: NEI EY017001, NSF 2146617, NCI CA207490
  • "Pathologists’ Routine Fixations Can Be Used to Supervise Lymph Node Deep Learning Models"

  • Meng Ling1 (ling.253@osu.edu), Veronica Thai1, Shuning Jiang1, Rui Li1, Wei-Lun Chao1, Yan Hu1, Anil Parwani1, Raghu Machiraju1, Srinivasan Parthasarathy1, Zaibo Li1, Jeremy Wolfe2, Jian Chen1; 1Ohio State University, 2Harvard University
  • Locating cancerous tissues in large, high-spatial-resolution whole-slide images (WSIs) is hindered by a lack of training data to supervise deep convolutional neural network (DCNN) algorithms. Patch-based human annotation is time- and labor-intensive. Additionally, training DCNNs would be improved by seeing data from diverse stimuli from routine clinical settings. Expert pathologists are trained professionals who know where to look to locate cancerous tissue in giga-pixel WSIs. Thus, we could acquire theoretically unlimited training samples by harvesting pathologists’ routine examinations of WSIs. To validate the reliability of this idea, we collected eye-tracking data from 10 pathologists, each viewing 60 slides from the CAMELYON16 dataset. These data were entered into DeepPFNet: our automated human-intelligence-based data preparation pipeline used to supervise AI to identify tumors. Specifically, we computed a pathologist’s fixation map (PFMap) over each WSI and trained a DCNN using tumor tiles sampled from these maps and benign tiles sampled from benign slides’ tissue area. (A schematic sketch of fixation-based tile sampling appears after this entry.) We validated DeepPFNet in experiments that examined effectiveness and scalability. Our experiments show that models trained using DeepPFNet can achieve accuracy significantly higher than random sampling (F1 = 0.84, AUC = 0.91), and that increasing the number of slides sampled leads to significant improvement (ΔF1 = 0.08, ΔAUC = 0.13). DeepPFNet models have better accuracy (F1 = 0.84, AUC = 0.93) than those using clustering (F1 = 0.70, AUC = 0.82) or viewport (F1 = 0.68, AUC = 0.88) approaches. We used the DeepPFNet model to classify tiles from the training WSIs and expanded the sampling maps, significantly improving the pipeline (ΔF1 = 0.04, ΔAUC = 0.03). Fixation-based fine-tuning of weakly supervised learning improved slide-level classification accuracy (ΔF1 = 0.01, ΔAUC = 0.01). Finally, applying the PFMap on benign slides to sample benign tiles can improve sensitivity (Δsensitivity = 0.04) but decreases accuracy (ΔF1 = -0.06, ΔAUC = -0.03).
  • Acknowledgements: OSU Translational Data Analytics Institute (TDAI) Research Pilot Award
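
The DeepPFNet pipeline itself is not specified in the abstract; the sketch below only illustrates, under stated assumptions, the general idea of harvesting a pathologist's fixation map (PFMap) to pick training-tile locations: bin fixations into tile-sized cells, smooth them into a density map, and keep tiles whose density exceeds a threshold. The tile size, smoothing scale, threshold, and fake slide dimensions are arbitrary illustrative choices, and the actual DCNN training step is omitted.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def sample_tiles_from_fixations(fix_xy, slide_shape, tile=256, sigma=2.0, thresh=0.5):
    """Return (row, col) pixel coordinates of tile-sized regions that a
    smoothed fixation-density map marks as 'looked at'.

    fix_xy      : (N, 2) array of fixation (x, y) pixel positions on the slide
    slide_shape : (height, width) of the slide image in pixels
    """
    h, w = slide_shape
    grid_h, grid_w = h // tile, w // tile

    # Accumulate fixations into a coarse grid of tile-sized bins.
    density = np.zeros((grid_h, grid_w))
    cols = np.clip(fix_xy[:, 0] // tile, 0, grid_w - 1).astype(int)
    rows = np.clip(fix_xy[:, 1] // tile, 0, grid_h - 1).astype(int)
    np.add.at(density, (rows, cols), 1)

    # Smooth to approximate a fixation map, then normalize to [0, 1].
    density = gaussian_filter(density, sigma=sigma)
    if density.max() > 0:
        density /= density.max()

    # Keep tile coordinates whose smoothed density exceeds the threshold.
    keep_r, keep_c = np.nonzero(density >= thresh)
    return [(int(r) * tile, int(c) * tile) for r, c in zip(keep_r, keep_c)]

# Illustrative use with fake fixations on a 10,000 x 12,000 pixel slide region.
rng = np.random.default_rng(1)
fake_fixations = rng.normal(loc=(6000, 4000), scale=800, size=(200, 2))
print(len(sample_tiles_from_fixations(fake_fixations, slide_shape=(10000, 12000))))
```
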
  • "Localization biases in the periphery are idiosyncratic: Evidence from over 9000 observers"

  • Anna Kosovicheva1 (a.kosovicheva@utoronto.ca), Ido Ziv Li1, Jihahm Yoo2, Jeremy M. Wolfe3,4; 1University of Toronto Mississauga, 2Korea Science Academy of KAIST, 3Brigham and Women's Hospital, 4Harvard Medical School
  • Accurately registering the location of an object is a fundamental visual process. Previous studies have emphasized commonalities across individuals in the effect of the polar angle of the target relative to fixation in visual localization. However, there is also evidence that individuals exhibit consistent, idiosyncratic patterns of directional, angular error when reporting target locations in the periphery, and such patterns of error are weakly correlated between observers. This evidence comes from small-scale laboratory studies, involving dozens of participants, which may be underpowered to detect subtle consistencies across the population. We examined the consistency of individuals’ localization errors using a large-scale dataset from an online game (over 9,400 observers and 4.5 million trials across 639,000 sessions). On each trial, participants were instructed to identify a symbol in the center of the screen and were simultaneously shown a peripheral target. The target could appear with 0-10 distractor items. The eccentricities of the peripheral targets and distractors varied randomly and independently across trials. Participants clicked on the location of the peripheral target and, if correct, were asked to identify the central symbol among 5 alternatives. We analyzed trials where participants correctly identified the symbol and localized the target. We divided trials into bins based on their polar angle. We then calculated pairwise correlations in the angular (directional) click error between participants relative to display center (clockwise vs. counterclockwise). We found that directional localization errors were, on average, uncorrelated between all possible pairs of participants. Between-subject correlations of localization errors were normally distributed and centered around 0. However, within individuals, errors were non-random. Split-half correlation yielded a reliable, positive correlation (r = 0.41, t(9411) = 185.82, p < 0.001). (A minimal sketch of the split-half analysis appears after this entry.) These results align with the findings of small-scale laboratory studies and suggest that consistent idiosyncratic localization errors in the visual periphery are uncorrelated at the population level.
  • Acknowledgements: This work was supported by an NSERC Discovery Grant to AK, and NIH EY017001 and NSF 2146617 to JMW.
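
The analysis code is not part of the abstract; here is a minimal sketch of a split-half reliability computation of the kind described: within one observer, split trials at random into two halves, compute the mean signed angular click error per polar-angle bin in each half, and correlate the two binned profiles. The data layout, bin count, and simulated bias are illustrative assumptions, not the authors' analysis.

```python
import numpy as np

def split_half_reliability(polar_angle, angular_error, n_bins=16, seed=0):
    """Correlate per-bin mean signed angular error between two random halves
    of one observer's trials.

    polar_angle   : trial-wise target polar angle in degrees [0, 360)
    angular_error : trial-wise signed click error in degrees (CCW positive)
    """
    rng = np.random.default_rng(seed)
    half = rng.random(polar_angle.size) < 0.5          # random split of trials
    edges = np.linspace(0, 360, n_bins + 1)
    bins = np.digitize(polar_angle, edges) - 1

    def profile(mask):
        # Mean error per polar-angle bin; assumes every bin has trials in each half.
        return np.array([angular_error[mask & (bins == b)].mean()
                         for b in range(n_bins)])

    return np.corrcoef(profile(half), profile(~half))[0, 1]

# Illustrative use: a simulated observer with a weak angle-dependent bias.
rng = np.random.default_rng(2)
angles = rng.uniform(0, 360, 4000)
bias = 3.0 * np.sin(np.radians(angles) * 2)            # fake idiosyncratic bias
errors = bias + rng.normal(0, 8, angles.size)
print(split_half_reliability(angles, errors))
```
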
  • "Patches half-empty: How to forage when some patches contain only distractors"

  • Injae Hong1, Jeremy M. Wolfe1,2; 1Brigham and Women's Hospital, 2Harvard Medical School
  • In two eye tracking experiments, 20 participants searched for a T among Ls on a 1/f^1.3 noise background (target prevalence 50%). The contrast of an item was defined as the difference between its grayscale value [0-255] and the average local background grayscale [75-180]. Absolute target contrast was 15, 45, 75, or 105 in Experiment 1 and 15, 45, or 75 in Experiment 2. Distractor contrast varied freely. We studied the benefits of cueing all item locations with yellow boxes drawn around each item. In both experiments, 24-item search displays were presented twice. Experiment 1 had 384 unique displays. Half of the displays were first presented cued and then uncued (Cue–NoCue); for the other half this order was reversed (NoCue–Cue). Experiment 2 had 288 unique displays. Half the displays were presented NoCue–Cue, the other half NoCue–NoCue. Both experiments found the same result for present trials: cueing reduced the error rates for the lowest contrast targets (15) without increasing RTs. In Experiment 1, cueing did not change RTs or accuracy for higher contrast targets (45, 75, 105). In Experiment 2, cueing did not improve accuracy for higher contrasts (45, 75) either, but RTs slowed down. On absent trials, neither experiment found any effect of cueing on either accuracy or RTs. Looking at eye movement data, we can characterize errors as “search” errors, where the observer never fixates the target, and “recognition” errors, where the target is fixated but the eyes move away without registering it successfully. Cueing all potential targets reduced misses by reducing “search” errors in both experiments. Specifically, low contrast targets were more likely to be visited. We conclude that cueing directs attention to items, rather than enhancing their processing once attention arrives. Encouragingly, the accuracy gain from cueing does not necessarily come with an RT cost.
  • Acknowledgements: JH and AL were supported by UKRI grant ES/X000443/1. JMW was supported by NIH-NEI: EY017001, NSF: 2146617, and NIH-NCI: CA207490
  • "It's All About Semantics: How Semantic Categories Shape Memory Partitioning in Hybrid Visual Search"

  • Nurit Gronau1, Makaela Nartker2, Sharon Yakim1,3, Igor Utochkin4, Jeremy M. Wolfe5,6; 1The Open University of Israel, 2University of Texas at Austin, 3The Hebrew University, 4University of Chicago, 5Brigham and Women's Hospital, 6Harvard Medical School
  • In many everyday situations, we search our visual surroundings for one of many types of possible targets held in memory, a process known as hybrid search (e.g., searching for items on a shopping list). In some cases, only a portion of the memorized list is relevant to a specific visual context, so restricting the memory search to the relevant subset would be beneficial (e.g., there’s no need to search for carrots in the dairy section). Previous research has shown that participants often fail to "partition" memory into distinct, useful subsets on a trial-by-trial basis. However, given the known role of semantic content in long-term memory organization, we hypothesized that clearly semantically-defined subsets could facilitate flexible memory partitioning in dynamic hybrid search situations. Experiment 1 revealed that, indeed, semantic characteristics (i.e., object category), but not perceptual features (e.g., arbitrary color), can serve as a strong basis for flexible memory partitioning. Experiments 2 and 3 further demonstrated that this memory partitioning is cost-free and independent of the nature of surrounding visual distractors (i.e., whether the distractors are categorically homogeneous or heterogeneous). These findings demonstrate that confining memory search to a relevant subset of items can be highly effective when the subsets are defined by clear semantic categories. The results underscore the importance of conceptual information in the organization of activated long-term memory (aLTM), the portion of LTM relevant to the current task, and its role in enabling flexible, trial-by-trial memory selection. Additionally, our findings highlight the relationship between visual search and memory search: despite the activation of multiple sets or categories that could interfere with each other at the attentional (visual search) level, category-based memory partitioning seems to remain relatively immune to interference from other categories during this type of memory search.
  • Acknowledgements: Israeli Science Foundation (ISF) grant 1622/15 (to NG) and by the National Institutes of Health (NIH) grant EY017001 (to JMW).
  • "A salient, expected target in an unexpected setting can produce inattentional blindness"

  • Daniel Ernst1 (daniel.ernst@uni-bielefeld.de), Gernot Horstmann1, Johan Hulleman2, Jeremy M. Wolfe3; 1Bielefeld University, 2The University of Manchester, 3Brigham and Women's Hospital/Harvard Medical School
  • In visual search experiments, search trials are often blocked by difficulty. This enables observers (Os) to adjust their strategies to an expected level of difficulty. However, in natural search settings, observers may not be able to anticipate how difficult the next search will be, even when the search target remains constant. For instance, search for a cancer on one CT may be easier or harder than previous searches for that same cancer. Search strategies geared towards harder search may be suboptimal when the target is now more salient, and this might lead to an increased number of miss errors. To test this, 25 Os searched for a circle among Landolt Cs (gap size of 0.09 degrees) in an initial block of 32 difficult search trials. Here, Os produced 37% miss errors. On trial 33, Os still searched for the same circle, but the gap size of the Landolt Cs was now 0.45 degrees. Normally, this new task would have been easy. However, when surprised by this easy search, Os failed to take advantage of the target’s higher salience and still produced 36% misses. For subsequent presentations of these easy search displays after the surprise trial, miss error rates dropped to 5%. The 36% of Os who responded “target absent”, even though a salient target was present, can be said to have experienced a form of inattentional blindness (IB). Notice that, in contrast to traditional IB experiments, the missed IB stimulus here was no gorilla. It was the absolutely task-relevant, unaltered, and expected target of the search. The same target had already been searched for 32 times. Nevertheless, on the first trial when the distractors were unexpectedly changed to make the task easier, observers failed to adapt. We will discuss the relationship of this form of IB to more traditional versions.
  • Acknowledgements: Daniel Ernst was supported by grant ER 962/2-1; Johan Hulleman was supported by UKRI grant ES/X000443/1; Jeremy Wolfe was supported by grants NEI EY017001, NSF 2146617, NCI CA207490

VSS 2024

  • Advantages and Disadvantages of Sequential vs. Simultaneous Search in Simulated Breast Cancer Screening

  • Injae Hong1, Chirag Maheshwari2, Jeremy M. Wolfe1,3; 1Brigham and Women's Hospital, 2Cypress Ridge High School, 3Harvard Medical School
  • Radiologists screening mammograms for breast cancer are required to search for different signs of cancer such as masses, calcifications, and structural distortions. This search should be fast, accurate, and complete. Search for more than one type of target is known as “hybrid search”. Hybrid search can impose a cost on performance that increases with the number of different types of possible targets. Accuracy might be improved by splitting a hybrid search into multiple simple searches for a single target type. This study investigated whether splitting search might be a useful intervention to improve target detection in breast cancer screening. Non-experts searched for either masses or calcifications or both in simulated 2D mammograms (Experiment 1) or while scrolling through 3D volumes of simulated digital breast tomosynthesis images (Experiment 2). Masses and calcifications were independently present with 60% prevalence. If a target was present, participants clicked on the target and labeled the item as a mass or a calcification using a key press. They received feedback after pressing the space bar to complete the trial. There were four types of task: search for calcifications alone, search for masses alone, search for both types of targets simultaneously, or search for both types, one after the other, sequentially. The results showed that sequential search was advantageous compared to simultaneous search. In particular, there was a reduced level of “satisfaction of search” (SoS) errors in rare cases when more than one target was present. SoS errors declined from 60% to 22% (Experiment 1) and from 67% to 32% (Experiment 2). However, the reduction of SoS errors on a few trials comes at a substantial cost: 20% and 28% increases in time across all trials. The standard simultaneous method of searching might be adequate, but splitting hybrid search into several simple searches may reduce errors in some important cases.
  • Acknowledgements: CA207490
  • Stepping into the Same River Twice: Are Miss Errors in Visual Search Deterministic or Stochastic?

  • Aoqi Li1 (aoqi.li@manchester.ac.uk), Johan Hulleman1, Wentao (Taylor) Si2, Jeremy Wolfe3,4; 1University of Manchester, 2Bates College, 3Brigham and Women's Hospital, 4Harvard Medical School
  • Observers make errors in visual search, whether in a lab experiment or a real-life task. Those errors can be categorized as “deterministic” or “stochastic”. If errors are deterministic, an error committed once will definitely be repeated. Alternatively, errors can be “stochastic”: occurring randomly with some probability. An error would lie in between these extremes if it is likely, but not guaranteed, to occur a second time. To identify the nature of miss errors in a simple T-vs-L visual search task, we presented each search display twice in a random sequence. The miss rate, P1, for the first copy of the display and the miss rate, P2, for the second copy were calculated, as was the proportion of cases where both copies were missed, P12. Purely stochastic errors would predict that P12 = P1*P2. Purely deterministic errors would lead to P12 = min(P1, P2). If errors are a mix of stochastic and deterministic, P12 will fall between these two predictions. (A minimal sketch of these benchmark predictions appears after this entry.) In Experiment 1, where the letters were clearly visible, the errors were almost completely stochastic. An error made on the first appearance of a display did not predict that an error would be repeated on the second appearance. In Experiments 2a and 2b, where the visibility of the letters was manipulated, the errors became a mix of stochastic and deterministic. Lower contrast targets produced more deterministic errors. In Experiments 3a, 3b and 3c, we tested several interventions with the goal of finding a 'mindless' intervention that could effectively reduce errors without needing to know the answer in advance. An almost mindless intervention that knew the location but not the identity of items (Exp 3c) succeeded in reducing deterministic errors. This gives some insights into possible methods for reducing errors in important real-life visual search tasks, where search items may not be clearly defined and visible.
  • Acknowledgements: JMW was supported by NIH-NEI: EY017001, NSF: 2146617, and NIH-NCI: CA207490. AL and JH were supported by UKRI grant ES/X000443/1.
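
As a worked illustration of the two benchmarks named in the abstract (not the authors' code), the sketch below computes the observed repeat-miss rate P12 for twice-shown displays and compares it with the purely stochastic prediction P1*P2 and the purely deterministic prediction min(P1, P2). The mixture index and the fake data are illustrative assumptions.

```python
import numpy as np

def repeat_error_analysis(miss_first, miss_second):
    """Compare observed repeat-miss rate with stochastic and deterministic benchmarks.

    miss_first, miss_second : boolean arrays, one entry per twice-shown display,
    True if that copy of the display produced a miss (illustrative data layout).
    """
    p1 = miss_first.mean()                 # miss rate on first presentation
    p2 = miss_second.mean()                # miss rate on second presentation
    p12_observed = (miss_first & miss_second).mean()

    p12_stochastic = p1 * p2               # independent errors
    p12_deterministic = min(p1, p2)        # the same displays are always missed

    # Where the observed value sits between the two benchmarks (0 = stochastic,
    # 1 = deterministic); only meaningful when the benchmarks differ.
    span = p12_deterministic - p12_stochastic
    mix = (p12_observed - p12_stochastic) / span if span > 0 else np.nan
    return p12_observed, p12_stochastic, p12_deterministic, mix

# Illustrative use with fake data for 400 twice-shown displays.
rng = np.random.default_rng(3)
first, second = rng.random(400) < 0.10, rng.random(400) < 0.10
print(repeat_error_analysis(first, second))
```
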
  • Don’t talk to me! Relevant sound disrupts visual search, irrelevant sound does not

  • Jan Philipp Röer1 (jan.roeer@uni-wh.de), Ian M. Thornton2, Ava Mitra3, Nathan Trinkl3, Jeremy M. Wolfe3,4; 1Witten/Herdecke University, 2University of Malta, 3Brigham and Women's Hospital, 4Harvard Medical School
  • Visual search experiments are usually conducted in quiet environments to ensure that participants can fully concentrate on the task. The real world, however, is rarely as quiet as the laboratory. We are more or less constantly exposed to auditory information, some of which we choose to attend to, some of which we try our best to ignore. In three experiments, we examined the effects of background sound on visual search. We used the Multi-Item LOcalization (MILO) task, in which participants clicked through items labeled 1-8 in numerical order as quickly as possible while hearing auditory information through headphones. In Experiment 1, participants needed to engage with the auditory information. In the “listening” condition, participants listened to a news report while performing the MILO task. They were subsequently quizzed about the news. In the “counting” condition, participants counted how many times a specific number was mentioned during a sports commentary. Both conditions significantly disrupted visual search performance compared to a quiet control condition. In Experiment 2, auditory distractors were meaningless sequences of random words that had previously been shown to disrupt visual-verbal working memory. Participants were informed that any background sound was irrelevant and asked to ignore it. It appears that they were able to do so, because there was no effect of auditory distraction on visual search performance. In Experiment 3, we increased the difficulty of the search task by using a “shuffle” manipulation in which the subsequent items in a sequence were randomly repositioned after each localizing response. Even so, search performance again proved to be robust against irrelevant sound. The overall pattern of results suggests that visual search performance can be effectively shielded from auditory distraction, but only if we can choose to ignore the sound and not if we actively listen to it.
  • Acknowledgements: NEI EY017001
  • Idiosyncratic Search: Biases in the deployment of covert attention.

  • Nathan Trinkl1 (ntrinkl@bwh.harvard.edu), Ava Mitra1, Jeremy Wolfe1,2; 1Brigham and Women's Hospital, 2Harvard Medical School
  • Eye tracking of visual search tasks shows that the probability that the eyes will move from the current fixation to a nearby target on the next saccade is only ~50%. How can observers fail to find clearly identifiable targets close to fixation (even if they find them later)? One possibility is that processing within the Functional Visual Field (FVF) around fixation is not homogeneous. If so, is that inhomogeneity random or systematic? To answer this question, we asked observers to move their eyes to a cue. 300 msec after cue onset, a ring of 7 black Ls and one black T was briefly flashed. Observers made 4AFC decisions about the orientation of the T. After each response, a new fixation location appeared, and this process repeated for two blocks of 360 trials. We found reliably idiosyncratic patterns of accuracy as a function of radial angle (10 of 16 observers were significantly different from the normalized group average accuracy, assessed by Chi-sq, p<0.001; four more at p<0.05). Is idiosyncratic accuracy a function of idiosyncratic deployment of attention or of retinotopic variation in basic visual processing? To test this, we made the T red, allowing it to summon attention without the need for search. Duration was staircased to produce ~25% errors. This eliminated systematic idiosyncrasies in accuracy (only 1 of 20 observers with p<0.05). Did the original idiosyncrasies depend on making successive saccades to new fixation points? We repeated the experiment with fixation held at a single location. Idiosyncratic patterns were seen, though they seemed weaker than with a moving fixation (7 of 20 observers with p<0.001; three more with p<0.05). We do not yet know if the idiosyncratic patterns for one observer would be the same with saccades and with steady fixation. These results suggest that attentional deployment is systematically inhomogeneous in the immediate vicinity of fixation.
  • Acknowledgements: National Science Foundation (NSF), Grant #2146617 & Economic and Social Research Council (ESRC) — UK Research and Innovation (UKRI), Grant: ES/X000443/1
  • Inattentional blindness for a salient target in visual search: Finding a surprisingly easy target can be surprisingly hard

  • Daniel Ernst1 (daniel.ernst@uni-bielefeld.de), Gernot Horstmann1, Johan Hulleman2, Jeremy Wolfe3; 1Bielefeld University, 2The University of Manchester, 3Brigham and Women’s Hospital/Harvard Medical School
  • In natural search settings, observers often do not know how difficult it will be to find the next target. The next security threat or the next potential cancer in a lung CT might be quite salient or very subtle. In the lab, visual search experiments typically involve multiple successive trials of constant difficulty. This allows participants to anticipate the salience of the next target. How will observers respond if the current target salience suddenly deviates from the target salience of all preceding trials? It would be unremarkable to find that it is harder to detect a low salience target when the observer expects a high salience one. More interestingly, here we report that observers are impaired when they are surprised with a high salience target after a series of difficult searches. In Experiment 1, observers searched for a hard-to-detect O target (always present, compound search) among C distractors with small gaps for 32 trials. On the 33rd trial, the gaps of the C distractors were large, making the O target much more salient. Yet, search was considerably less efficient on this surprise trial (147 msec/item) than on subsequent trials with the same high target-distractor dissimilarity (66 msec/item). In Experiment 2, observers reported the presence or absence of an O in a display with a short presentation duration. After multiple hard trials, observers frequently showed inattentional blindness towards an unexpectedly salient target that was reported almost perfectly when the identical target was presented on repeated trials. Gaze data suggest that observers adopted a tightly focused attentional window during the initial, hard search. This strategy made them surprisingly ‘blind’ to targets that were unexpectedly highly detectable. The wrong attentional set may be one explanation for situations where we “look but fail to see” obvious stimuli.
  • Acknowledgements: Daniel Ernst was supported by grant ER 962/2-1; Johan Hulleman was supported by UKRI grant ES/X000443/1; Jeremy Wolfe was supported by grants NEI EY017001, NSF 2146617, NCI CA207490
  • VowelWorld 2.0: Using artificial scenes to study semantic and syntactic scene guidance

  • Yuri Markov1 (yuamarkov@gmail.com), Melissa Le-Hoa Vo1, Jeremy M Wolfe2; 1Goethe University Frankfurt, Scene Grammar Lab, Germany, 2Brigham and Women’s Hospital, Harvard Medical School
  • Scene guidance is difficult to investigate in realistic scenes because it is hard to systematically control complex, realistic images. Parameters like set size are often ambiguous in real or even VR scenes. We created VowelWorld 2.0, a new version of VowelWorld (Vo & Wolfe, 2013), in which we control various parameters of a highly artificial “scene”. Scenes are 20x20 grids of colored cells with 120 cells containing letters. Participants search for a vowel, present on 67% of trials. Each scene contains three big disks (2x2 cells) with consonants on them. These serve as “anchor objects”, which are known to predict target locations in real-world searches (Vo, 2021). An additional 96 cells feature rings which are grouped into larger analogs of surfaces. A vowel’s placement could follow three rules. Color rule (semantic): certain targets were associated with one background color “gist” (e.g., A’s appear in red scenes). Structure rule (syntactic): vowels were placed near or inside the small rings. Anchor rule (syntactic): vowels were close to a big circle containing a neighboring consonant (e.g., “B” implies “A”). Two vowels followed all three rules, two vowels followed the color and structure rules, and one vowel was placed randomly. On half of the trials, participants were precued with a specific vowel. Otherwise, participants searched for any vowel. For the first three blocks, participants attempted to learn the rules from experience. Then, we explained the rules. Participants failed to fully learn the rules but did benefit from the learned anchor rule (shorter RTs). Knowing the rules markedly speeded performance for vowels that followed only the color and structure rules. Anchor rule vowels showed less improvement over initial learning. Knowing the rules had a major impact on ending absent trials. Future work will systematically vary the predictability of different rules to test under which circumstances rule learning becomes more or less optimal.
  • Acknowledgements: This work was supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), project number 222641018 SFB/TRR 135 TP C7 granted to MLHV and the Hessisches Ministerium für Wissenschaft und Kunst (HMWK; project ‘The Adaptive Mind’).
  • What, where, when did I find this? Associative learning in hybrid search.

  • Iris Wiegand1 (iris.wiegand@ru.nl), Jeremy M. Wolfe2, Joseph R. Maes1, Roy P.C. Kessels1; 1Donders Institute for Brain, Cognition and Behaviour, 2Brigham and Women's Hospital, Harvard University
  • In “hybrid” visual and memory search, observers look for multiple, previously memorized target objects among distractors. Hybrid search is akin to many real-world searches, such as looking for items on your mental shopping list in the grocery store. Like their real-world counterparts, hybrid searches occur in spatial and temporal contexts that we encounter repeatedly. In several experiments, we investigated whether observers would incidentally learn and utilize spatial and temporal associations in hybrid search. Specifically, we examined learning of four different types of regularities: 1) target item sequences (e.g., the banana always follows the yoghurt), 2) target location sequences (e.g., a target in the lower left corner always follows a target in the upper right corner), 3) target item-location associations (the banana is always in the upper right corner), and 4) target item-location sequences (the banana in the upper right corner always follows the yoghurt in the lower left corner). Learning would be reflected in a decrease in search times. Our results show only weak incidental learning for the temporal sequences of target items or target locations alone, even after many repetitions of the sequence. By contrast, learning of target item-location associations was fast and effectively reduced search times. Furthermore, the experiments show a reliable effect of temporal sequence learning for target item-location associations. These findings suggest that spatiotemporal learning in hybrid search is hierarchical and conditional: only if spatial and non-spatial target features are bound do temporal associations bias attention, pointing the observer to the task-relevant features expected to occur next.

VSS 2023

    • "The FORAGEKID Game: Using Hybrid Foraging to Study Executive Functions and Search Strategies During Development"

    • Beatriz Gil-Gómez de Liaño1 (bgil.gomezdelianno@uam.es), Jeremy M. Wolfe2; 1Universidad Autónoma de Madrid, 2BWH-Harvard Medical School
    • Searching for friends in the park, finding specific Lego blocks for a building project, or looking for recipe ingredients in the fridge: each of these is a "hybrid search" typical of everyday life. Hybrid search is searching for instances of multiple targets held in memory. Hybrid Foraging (HF) is a continuous version in which observers search for multiple exemplars of those multiple target types. HF draws on a wide array of cognitive functions beyond those studied in classic search and can be used as a "one-stop shop" to study those functions within a single task as they develop and interact over the lifespan. We study cognitive development using our FORAGEKID-HF video game. Observers search through diverse moving real-world toys or simpler colored squares and circles. They are asked to collect targets from a memorized set as quickly as possible while not clicking on distractors. We have tested large samples of children, adolescents, and young adults (4-25 years old) running different versions of FORAGEKID. Foraging rate data can be used to assess the development of selective attention under different memory target-load conditions (here, 2 versus 7 targets). Cognitive flexibility and search strategies can be measured by analyzing switch costs when observers change from collecting one target type to collecting another. The organization of search can be studied by examining target-search paths using different measures, e.g., best-r, inter-target distances, etc. Finally, decision-making processes are illustrated by quitting rules: when do observers choose to move from one screen to a fresh screen? Changes in "travel costs" (time to move from one screen to the next) impact quitting rules differentially across the lifespan. Here, we show data supporting FORAGEKID as a serious but enjoyable game that can effectively assess and potentially train a range of attentional and executive functions over the lifespan.
    • Acknowledgements: European Union’s Horizon 2020, Marie Sklodowska-Curie Action FORAGEKID 793268 & Ministerio de Ciencia e Innovación de España: PID2021-122621OB-I00.
    • "Research on re-search: Foraging in the same patch twice"

    • Injae Hong1 (ihong1@bwh.harvard.edu), Jeremy M. Wolfe1,2; 1Brigham and Women's Hospital, 2Harvard Medical School
    • When humans forage for multiple targets in a succession of ‘patches’, the optimal strategy is to leave a patch when the instantaneous rate of return falls below the average rate of return (Marginal Value Theorem: Charnov, 1976). Human behavior has been shown to be, on average, near optimal in basic foraging tasks. (A minimal sketch of this leaving rule appears after this entry.) Suppose, however, that foragers are allowed to return to previously foraged patches. What strategy would foragers adopt when they revisit patches that they left previously, either compulsorily or voluntarily? Our computer-screen patches contained “ripe” and “unripe” berries, each defined by overlapping color distributions (d’ = 2.5). Observers attempted to collect ripe berries as fast as possible. One group of observers was forced to leave each patch after 10 seconds and was then brought back to forage those patches for an additional 5 minutes. A second group foraged and moved to new patches whenever they wished, before being brought back to pick the “leftovers” for 5 minutes. A control group foraged at will with no revisiting. The observers who were forced to leave the patches behaved like control observers, continuing where they left off when brought back to a patch and ending at about the same rate. Observers who had already voluntarily left a patch did continue to pick when brought back to that patch. However, the patches having been depleted, that picking was less productive. There appeared to be a small jump in the rate of foraging when these observers returned to their patches. It would be interesting to see if those observers would have bothered to pick on the second visit if they were not required to do so.
    • Acknowledgements: NSF 2146617
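
As a toy illustration of the Marginal Value Theorem leaving rule mentioned in the abstract (not the authors' model), the sketch below assumes a patch whose instantaneous rate of return decays exponentially and has the forager leave as soon as that rate falls below an assumed long-run average rate. The decay constant and the rates are arbitrary illustrative values.

```python
import math

def mvt_leave_time(initial_rate=2.0, decay=0.15, avg_rate=0.6, dt=0.1, max_time=60.0):
    """Time at which a Marginal Value Theorem forager leaves a depleting patch.

    The instantaneous rate decays exponentially as the patch is depleted
    (an arbitrary illustrative depletion model); the forager leaves once that
    rate falls below the assumed long-run average rate of return.
    """
    t = 0.0
    while t < max_time:
        if initial_rate * math.exp(-decay * t) < avg_rate:
            return t            # patch no longer beats the environment-wide average
        t += dt
    return max_time             # never dropped below the average within max_time

# Illustrative use: richer patches (higher initial rate) are worth staying in longer.
for rate in (1.0, 2.0, 4.0):
    print(rate, round(mvt_leave_time(initial_rate=rate), 1))
```
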
    • "How Blind is Inattentional Blindness in Mixed Hybrid search?"

    • Ava Mitra1 (amitra@bwh.harvard.edu), Jeremy M. Wolfe1,2; 1Brigham and Women's Hospital, 2Harvard Medical School
    • In day-to-day visual search tasks, we may search for instances of multiple types of targets (e.g., searching for specific road signs while also scanning for pedestrians, animals, and traffic cones). In the lab, the “mixed hybrid search task” is a model system developed to study such tasks, in which you look for general categories of items (e.g., things you don’t want to hit with your car) alongside specific items (e.g., the sign for your exit). Previous hybrid visual search studies have shown that observers are much more likely to miss the more general “categorical” targets than the specific targets, even though it is quite clear that categorical and specific items are equally likely to be attended in this paradigm. If an item is attended but missed, do observers have any access to the information that may have been accumulating about that target? Twelve participants searched arrays for two specific items (e.g., this shoe and this table) while also searching for unambiguous instances of two categorical target types (e.g., ANY animal and ANY car). In order to look for the existence of sub-threshold information about missed targets, we borrowed methods from the inattentional blindness literature. We asked two 2AFC questions after every miss error and after 5% of target-absent trials. Question 1: Do you think you missed an item? Question 2: If you did miss something, which of these two items was it? On trials where participants asserted that they had NOT missed an item (“No” to Question 1), participants correctly selected the right item ~63% of the time against a 50% chance level (p<0.018). Interestingly, this ability to identify the missed target was only seen following missed categorical targets, not missed specific targets. Knowledge about the target’s identity can linger even after that target is missed.
    • Acknowledgements: NSF grant 2146617, NIH-NEI grant EY017001, NIH-NCI grant CA207490
    • "Image memorability modulates image recognition, but not image localization in space and time"

    • Nathan Trinkl1 (ntrinkl@bwh.harvard.edu), Jeremy M. Wolfe1,2; 1Brigham and Women's Hospital, 2Harvard Medical School
    • Acknowledgements: National Science Foundation (NSF), Grant #1848783
    • "Modestly related memories for when and where an object was seen in a Massive Memory paradigm."

    • Jeremy Wolfe1,2 (jwolfe@bwh.harvard.edu), Claire Wang3, Nathan Trinkl1, Wanyi Lyu4; 1Brigham and Women's Hospital, 2Harvard Medical School, 3Phillips Academy, Andover, MA, 4York University, Toronto
    • We know that observers can typically discriminate old images from new ones with over 80% accuracy even after seeing hundreds of objects for just 2-3 seconds each (“Massive Memory”). What do they know about WHERE and WHEN they saw each object? From previous work, we know that observers can remember the locations of 50-100 out of 300 items (Spatial Massive Memory – SMM). In a different study, observers could mark temporal locations within 10% of the actual time of the item's original appearance (Temporal Massive Memory – TMM). Are SMM and TMM related? In new experiments, 64 observers saw 50 items, each sequentially presented in a random location in a 7x7 grid. They subsequently saw 100 items (50 old). Four sets of instructions were used: (1) the Mere Identity instruction asked 16 observers just to remember the items; (2) the Spatial instruction asked 16 observers to also remember item locations; (3) the Temporal instruction asked 14 observers to remember when items appeared; (4) the Full instruction (13 observers) combined the Spatial and Temporal instructions. At test, observers in all conditions were told to click on the original location of old items and to indicate when they saw them on a time bar. ~12% of observers appeared to guess on the spatial task and ~50%(!) guessed on the timing task. Interestingly, just 6% guessed on both, exactly as would be predicted if the choice to guess was independent for space and time. Overall, space and time scores were strongly correlated for Full instructions (r-sq=.64, p=0.001) and Temporal instructions (r-sq=.31, p=0.04), and marginally correlated for Spatial instructions (r-sq=.20, p=0.08). The Mere Identity correlation was not significant (r-sq=.03, p=0.40). Effects of instruction on performance were generally not significant. Observers can have quite good memory for when and where they saw an object. Those memories seem to be modestly correlated with each other.
    • Acknowledgements: This work was supported by NSF grant 1848783 to JMW