What draws people's attention and gaze when they view visual displays? How does this influence the choices they make?
On July 6, 2010, I started a new job as a Research Scientist at Yahoo! Research (Santa Clara). I am excited about the new job and the possibility of exploring how neuro-cognitive models of attention and choice, developed in controlled settings, might apply to the large-scale context of the web. Stay tuned for interesting updates!
My research aims to develop a behavioral and neuro-computational understanding of how people's attention, gaze, and choices are influenced by factors such as task demands, visual salience, and reward incentives or value preferences. I explore these questions using a mix of applied theory (Bayesian statistics, signal detection theory, statistical decision theory, models of visual saliency) and experiments (eye tracking, psychophysics).
Christof Koch (advisor, Caltech), Pietro Perona (advisor, Caltech), Antonio Rangel (Caltech), Wei Ji Ma (Baylor), Jeff Beck (Gatsby, London), Alex Pouget (U. Rochester), Riccardo Pedersini (Harvard), Todd Horowitz (Harvard), Jeremy Wolfe (Harvard), Mili Milosavljevic (Caltech), Laurent Itti (PhD advisor, USC)
V. Navalpakkam, C. Koch, A. Rangel, P. Perona, Optimal reward harvesting in complex perceptual environments, In press: PNAS.
Abstract: The ability to choose rapidly among multiple targets embedded in a complex perceptual environment is key to survival. Targets may differ both in their reward value and in their low-level perceptual properties (e.g., visual saliency). Previous studies investigated the impact of value and saliency on choice separately; thus, it is not known how the brain combines these two variables during decision-making. We addressed this question with three experiments in which human subjects attempted to maximize their monetary earnings by rapidly choosing items from a brief display. Each display contained several worthless items (distractors) as well as two targets, whose value and saliency were varied systematically. We compared the behavioral data to the predictions of three computational models which assume that: (1) subjects seek the most valuable item in the display; (2) subjects seek the most easily detectable item; (3) subjects behave as an ideal Bayesian observer who combines both factors to maximize expected reward within each trial. We find that, regardless of the type of motor response used to express the choices, decisions are influenced by both value and feature-contrast in a way that is consistent with the ideal Bayesian observer, even when the targets' feature-contrast is varied unpredictably between trials. This suggests that individuals are able to harvest rewards optimally and dynamically under time pressure while seeking multiple targets embedded in perceptual clutter.
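As a rough illustration of the kind of value-by-saliency computation the ideal observer performs, here is a minimal Python sketch. It assumes Gaussian feature noise and a crude per-location independence prior; all function and parameter names are mine, not the paper's.

```python
import numpy as np

def choose_location(obs, mu_t1, mu_t2, mu_d, sigma, v1, v2, p_target=0.1):
    """Per-location posterior over identities (target 1, target 2, distractor)
    from one noisy feature observation per item, then pick the location with
    the highest expected reward. Distractors are worth nothing."""
    obs = np.asarray(obs, dtype=float)
    lik = lambda mu: np.exp(-(obs - mu) ** 2 / (2 * sigma ** 2))
    # Crude independence prior over identities (the paper's observer is exact)
    prior = np.array([p_target, p_target, 1 - 2 * p_target])
    post = prior[:, None] * np.stack([lik(mu_t1), lik(mu_t2), lik(mu_d)])
    post /= post.sum(axis=0)                       # P(identity | observation)
    expected_reward = v1 * post[0] + v2 * post[1]  # value weighted by detection evidence
    return int(np.argmax(expected_reward)), expected_reward

# A subtle high-value target (feature ~1.0) vs. a salient low-value one (~3.0)
# among distractors near 0.0; the observed features are noisy:
obs = np.array([0.9, 3.1, 0.1, -0.2, 0.2])
print(choose_location(obs, mu_t1=1.0, mu_t2=3.0, mu_d=0.0, sigma=1.0, v1=10.0, v2=2.0))
```

With these made-up numbers, the salient low-value target can yield higher expected reward than the subtle high-value one, which is exactly the trade-off the experiments manipulate.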
V. Navalpakkam, C. Koch, P. Perona, Homo Economicus in Visual Search, In: Journal of Vision, 9(1):31, 1-16, 2009.
Abstract: How do reward outcomes affect early visual performance? Previous studies found a suboptimal influence, but they ignored the non-linearity in how subjects perceived the reward outcomes. In contrast, we find that when the non-linearity is accounted for, humans behave optimally and maximize expected reward. Our subjects were asked to detect the presence of a familiar target object in a cluttered scene and were rewarded according to their performance. We systematically varied the target frequency and the reward/penalty policy for detecting/missing the targets. We find that (1) decreasing the target frequency decreases detection rates, in accordance with the literature; (2) contrary to previous studies, increasing the reward for detecting targets compensates for target rarity and restores detection performance; (3) a quantitative model based on reward maximization accurately predicts human detection behavior in all target frequency and reward conditions, so reward schemes can be designed to obtain desired detection rates for rare targets; and (4) subjects quickly learn the optimal decision strategy, and we propose a neurally plausible model that exhibits the same properties. Potential applications include designing reward schemes to improve detection of life-critical, rare targets (e.g., cancers in medical images).
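The reward-maximizing strategy here can be sketched with textbook equal-variance signal detection theory. This is my illustrative framing, not the paper's exact model, and the payoff values below are made up.

```python
import numpy as np
from scipy.stats import norm

def optimal_criterion(d_prime, p_target, v_hit, v_miss, v_cr, v_fa):
    """Equal-variance Gaussian signal detection: the likelihood-ratio
    threshold beta that maximizes expected reward, and the hit / false-alarm
    rates it produces. Payoffs: v_hit/v_miss on target-present trials,
    v_cr (correct rejection) / v_fa (false alarm) on target-absent trials."""
    beta = ((1 - p_target) * (v_cr - v_fa)) / (p_target * (v_hit - v_miss))
    c = np.log(beta) / d_prime + d_prime / 2   # criterion on the decision axis
    return 1 - norm.cdf(c - d_prime), 1 - norm.cdf(c)  # hit rate, FA rate

print(optimal_criterion(2.0, 0.50, v_hit=1, v_miss=-1, v_cr=1, v_fa=-1))
print(optimal_criterion(2.0, 0.01, v_hit=1, v_miss=-1, v_cr=1, v_fa=-1))
print(optimal_criterion(2.0, 0.01, v_hit=99, v_miss=-1, v_cr=1, v_fa=-1))
```

Running this shows the optimal hit rate collapsing when targets become rare (p_target = 0.01) and recovering when the detection reward is raised, mirroring findings (1) and (2) above.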
V. Navalpakkam, L. Itti, Search goal tunes visual features optimally, In: Neuron, Vol. 53, No. 4, pp. 605-617, Feb 2007.
Abstract: How does a visual search goal modulate the activity of neurons encoding different visual features (e.g., color, direction of motion)? Previous research suggests that goal-driven attention enhances the gain of neurons representing the target's visual features. Here, we present mathematical and behavioral evidence that this strategy is suboptimal and that humans do not deploy it. We formally derive the optimal feature gain modulation theory, which combines information from both the target and distracting clutter to maximize the relative salience of the target. We qualitatively validate the theory against existing electrophysiological and psychophysical literature. A surprising prediction is that it is sometimes optimal to enhance nontarget features. We provide experimental evidence for this prediction through psychophysics experiments on human subjects, suggesting that humans deploy the optimal gain modulation strategy.
See the spotlight "Paying Attention to Neurons with Discriminating Taste" by A. Pouget and D. Bavelier, In: Neuron, Vol. 53, No. 4, pp. 473-475, Feb 2007.
See the Faculty of 1000 Biology evaluation.
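A toy numerical illustration of the Neuron paper's surprising prediction, using Gaussian tuning curves and a simple discriminability ratio in place of the paper's full derivation (the noise-floor constant is an assumption of mine):

```python
import numpy as np

# Population of feature-tuned units with Gaussian tuning curves
prefs = np.linspace(0, 100, 101)                 # preferred feature values
tuning = lambda f: np.exp(-(prefs - f) ** 2 / (2 * 15.0 ** 2))

target, distractor = 55.0, 50.0                  # similar target and distractor
r_target, r_distractor = tuning(target), tuning(distractor)

# Naive strategy: boost each unit in proportion to its response to the target
g_naive = r_target
# Discriminability heuristic in the spirit of the theory: boost units by how
# well they separate target from distractor (0.1 = assumed noise floor)
g_ratio = r_target / (r_distractor + 0.1)

print("naive gain peaks at feature", prefs[np.argmax(g_naive)])   # 55, the target itself
print("ratio gain peaks at feature", prefs[np.argmax(g_ratio)])   # past 55: a nontarget feature
```

The point of the toy example is only that maximizing target-distractor discriminability, rather than target response alone, shifts the boosted feature away from the target and past it, i.e., onto a nontarget feature.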
V. Navalpakkam, L. Itti, An Integrated Model of Top-down and Bottom-up Attention for Optimal Object Detection, In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2006.
Abstract: Integration of goal-driven, top-down attention and image-driven, bottom-up attention is crucial for visual search. For instance, in robot navigation, it is important to detect goal-relevant targets like road signs and landmarks, and to simultaneously notice unexpected visual events like sudden obstacles and accidents. Yet, previous research has mostly focused on models that are purely top-down or purely bottom-up. Here, we propose a new model that combines both. The bottom-up component computes the visual salience of scene locations in different feature maps extracted at multiple spatial scales. The top-down component uses accumulated statistical knowledge of the visual features of the desired search target and background clutter to optimally tune the bottom-up maps so as to maximize target detection speed. The results of testing on 600 artificial search arrays and 300 natural scenes show that the model's predictions are consistent with a large body of available literature on human psychophysics of visual search. These promising results suggest that our model may provide a good approximation to how humans combine bottom-up and top-down cues so as to optimize visual search behavior.
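A schematic of the combination step, as I would sketch it in Python; the gain rule below is illustrative and stands in for the statistically learned gains described in the paper:

```python
import numpy as np

def combined_saliency(feature_maps, gains=None):
    """Weighted sum of feature maps into a single saliency map.
    gains=None yields the pure bottom-up map; a top-down search task
    supplies gains learned from target/background feature statistics."""
    maps = np.stack(feature_maps)                     # (n_features, H, W)
    g = np.ones(len(maps)) if gains is None else np.asarray(gains, float)
    return np.tensordot(g / g.sum(), maps, axes=1)    # (H, W) saliency map

def topdown_gains(target_response, background_response, eps=1e-6):
    """Illustrative tuning rule: weight each feature map by how much more
    it responds to the target than to background clutter."""
    return np.asarray(target_response) / (np.asarray(background_response) + eps)

# e.g., three feature maps (color, intensity, orientation) over a 4x4 scene
maps = [np.random.rand(4, 4) for _ in range(3)]
s_bottom_up = combined_saliency(maps)                 # free-viewing mode
s_top_down = combined_saliency(maps, topdown_gains([0.9, 0.2, 0.5], [0.1, 0.3, 0.5]))
```

The same machinery thus serves both modes: with uniform gains it reduces to the bottom-up saliency map, and with learned gains it biases the map toward the search target.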
V. Navalpakkam, L. Itti, Top-down Attention Selection is Fine-grained, In: Journal of Vision, Vol. 6, No. 11, pp. 1180-1193, Oct 2006.
Abstract: Although much is known about the sources and modulatory effects of top-down attentional signals, less is known about their information capacity. Here, we investigate the granularity of top-down attentional signals. Previous theories in psychophysics have provided conflicting evidence on whether top-down guidance is coarse grained (i.e., one gain control term per feature dimension) or fine grained (i.e., multiple gain control terms per dimension). We resolve the conflict by designing new experiments that disentangle top-down from bottom-up contributions, thereby avoiding confounds present in previous studies. The results of our eye-tracking experiments show that subjects can selectively saccade to items belonging to the relevant feature interval rather than to irrelevant intervals within a dimension. This suggests that top-down signals can specify not only the relevant feature dimension but also the relevant feature interval within that dimension. We conclude that top-down signals are fine grained and can specify multiple gain control terms per dimension.
V. Navalpakkam, L. Itti, Optimal cue selection strategy, In: Neural Information Processing Systems (NIPS), 2005.
V. Navalpakkam, M. A. Arbib, L. Itti, Attention and Scene Understanding, In: Neurobiology of Attention (L. Itti, G. Rees, J. K. Tsotsos, Eds.), pp. 197-203, San Diego, CA: Elsevier, 2005.
V. Navalpakkam, L. Itti, Modeling the influence of task on attention, In: Vision Research, Vol. 45, No. 2, pp. 205-231, 2005. (Top-cited paper in Vision Research since 2005)
Abstract: We propose a computational model for the task-specific guidance of visual attention in real-world scenes. Our model emphasizes four aspects that are important in biological vision: determining the task-relevance of an entity, biasing attention for the low-level visual features of desired targets, recognizing these targets using the same low-level features, and incrementally building a visual map of task-relevance at every scene location. Given a task definition in the form of keywords, the model first determines and stores the task-relevant entities in working memory, using prior knowledge stored in long-term memory. It attempts to detect the most relevant entity by biasing its visual attention system with the entity's learned low-level features. It attends to the most salient location in the scene and attempts to recognize the attended object through hierarchical matching against object representations stored in long-term memory. It updates its working memory with the task-relevance of the recognized entity and updates a topographic task-relevance map with the location and relevance of the recognized entity. The model is tested on three types of tasks: single-target detection in 343 natural and synthetic images, where biasing for the target accelerates target detection more than twofold on average; sequential multiple-target detection in 28 natural images, where biasing, recognition, working memory and long-term memory contribute to rapidly finding all targets; and learning a map of likely locations of cars from a video clip filmed while driving on a highway. The model's performance on search for single features and feature conjunctions is consistent with existing psychophysical data. These results suggest that our biologically motivated architecture may provide a reasonable approximation to many brain processes involved in complex task-driven visual behaviors.
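For readers who want the gist of the attend-recognize-update loop in code, here is a schematic sketch; `recognize` and `relevance_of` are placeholder callables standing in for the paper's long-term-memory matching and relevance lookup:

```python
import numpy as np

def task_driven_scan(saliency_map, recognize, relevance_of, n_fixations=5):
    """Schematic loop: fixate the most salient location, recognize the object
    there, store its task-relevance in a topographic task-relevance map, and
    inhibit the visited region so attention moves on (inhibition of return)."""
    s = saliency_map.astype(float).copy()
    relevance_map = np.zeros_like(s)
    for _ in range(n_fixations):
        y, x = np.unravel_index(np.argmax(s), s.shape)     # attend
        relevance_map[y, x] = relevance_of(recognize(y, x))  # recognize + update
        s[max(0, y - 1):y + 2, max(0, x - 1):x + 2] = -np.inf  # inhibit
    return relevance_map

# Toy usage with stub recognition (the real model matches against long-term memory)
rel = task_driven_scan(np.random.rand(8, 8),
                       recognize=lambda y, x: "car",
                       relevance_of=lambda label: 1.0 if label == "car" else 0.0)
```

In the full model, the saliency map itself is already biased by the target's learned features before this loop runs, so relevant entities tend to be fixated early.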
V. Navalpakkam, L. Itti, A Goal Oriented Attention Guidance Model, In: Lecture Notes in Computer Science, Vol. 2525, pp. 453-461, Nov 2002.
V. Navalpakkam, P. Perona, "Modeling speed and accuracy in visual search".
W. J. Ma*, V. Navalpakkam*, J. Beck*, R. v. d. Berg & A. Pouget, "Optimal visual search under uncertainty with probabilistic population codes" (*equal contribution).
R. Pedersini*, V. Navalpakkam*, T. Horowitz & J. Wolfe, "Value maximization explains and cures the prevalence effect in visual search" (*equal contribution).
M. Milosavljevic, V. Navalpakkam, C. Koch, A. Rangel, "Why Candy Wrappers Should Be Red and Bright: The Effects of Irrelevant Perceptual Features On Choice".