Humans excel at finding objects in complex natural scenes, but the features that guide this behaviour have proved elusive. We used computational modeling to measure the contributions of target, nontarget, and coarse scene features towards object detection in humans. In separate experiments, participants detected cars or people in a large set of natural scenes. For each scene, we extracted target-associated features, annotated the presence of nontarget objects (e.g., parking meter, traffic light), and extracted coarse scene structure from the blurred image. These scene-specific values were then used to model human reaction times for each novel scene. As expected, target features were the strongest predictor of detection times in both tasks. Interestingly, target detection time was additionally facilitated by coarse scene features but not by nontarget objects. In contrast, nontarget objects predicted target-absent responses in both person and car tasks, with contributions from target features in the person task. In most cases, features that speeded up detection tended to slow down rejection. Taken together, these findings demonstrate that humans show systematic variations in object detection that can be understood using computational modeling.
Keywords: Categorization; Object Recognition; Scene Perception.