A collaboration between the laboratories of Profs. Tomaso Poggio, Jim DiCarlo and Earl Miller at MIT, David Ferster at Northwestern University, Christof Koch at Caltech, and Maximilian Riesenhuber at Georgetown University is exploring and evaluating two hypotheses: first, that the cortical organization and neural mechanisms of visual recognition can be explained by a coherent theoretical framework built on two existing computational models of recognition and attention; and second, that a combination of physiological work on monkeys and cats, together with visual psychophysics, can be used to test and refine the theory.
The research is organized into three main projects. The work at MIT and Georgetown is guided by a quantitative hierarchical model of recognition and probes the relations between identification and categorization, as well as the selectivity and invariance properties of the neural mechanisms in IT cortex. The work at Northwestern University is testing a key prediction of the model about the nature of the pooling operation (a max operation vs. a linear sum) performed by complex cells in V1. These experiments use intracellular recordings in the anesthetized cat, which allow the underlying circuit and biophysical mechanisms to be characterized. Finally, work at Caltech is extending the basic model of recognition by integrating it with a saliency-based attentional model. The computational component of this work, centered on the development of a quantitative model of visual recognition, is the primary tool for fostering interaction among the investigators: the model suggests experiments and guides the planning and interpretation of new ones.
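The difference between the two candidate pooling operations named above (a max operation vs. a linear sum) can be illustrated with a small sketch. It is not the project's model: the function names and the afferent response values are hypothetical, chosen only to show how the two rules diverge when two stimuli are presented together, and the sketch assumes the afferents do not interact.

```python
def pool_linear(afferents):
    """Linear pooling: output is the sum of the afferent responses."""
    return sum(afferents)


def pool_max(afferents):
    """Max pooling: output equals the strongest afferent response."""
    return max(afferents)


# Hypothetical afferent responses (arbitrary units), illustrative only.
# Each entry is one afferent feeding the pooling cell.
bar_a_alone = [0.75, 0.0]   # stimulus A drives only the first afferent
bar_b_alone = [0.0, 0.5]    # stimulus B drives only the second afferent
both_bars = [0.75, 0.5]     # assumption: afferents respond independently

for label, stimulus in [("A alone", bar_a_alone),
                        ("B alone", bar_b_alone),
                        ("A and B", both_bars)]:
    print(label, "linear:", pool_linear(stimulus), "max:", pool_max(stimulus))
```

With a single stimulus the two rules give the same output, so they cannot be told apart. With both stimuli present, linear pooling gives the sum of the individual responses, while max pooling gives only the larger of the two. This is the kind of contrast a paired-stimulus experiment could in principle expose.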