Visual Thinking with Graph Network

Google Tech Talks
March 13, 2008


Many visual perception tasks are fundamentally NP-hard computational problems. Solving these problems robustly requires thinking through combinatorially many hypotheses. Despite this, our human visual system performs these tasks effortlessly. How is this done? I would like to make two points on this topic. First, formulating visual thinking as NP-hard computational tasks has an important advantage: visual routines can be analyzed precisely to identify their behaviors independently of their implementations. Second, I will show that there is a class of graph optimization problems that can be implemented using a distributed network system with a physical (and biologically plausible) interpretation.

I will demonstrate this graph-based approach for: 1) image segmentation using Normalized Cuts, with explanations for illusory contours, visual pop-out, and attention; 2) salient contour grouping.
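
For readers unfamiliar with Normalized Cuts, the following is a minimal sketch of a two-way split, assuming the commonly used spectral relaxation (thresholding the second-smallest generalized eigenvector of the graph Laplacian). It is not taken from the talk: the toy intensities, the Gaussian affinity with `sigma = 0.3`, and the function names `ncut_bipartition` and `ncut_value` are all illustrative assumptions.

```python
import numpy as np

def ncut_value(W, mask):
    """Normalized cut cost of the partition (mask, ~mask) under affinity W."""
    cut = W[mask][:, ~mask].sum()   # total weight crossing the partition
    assoc_a = W[mask].sum()         # total weight from side A to all nodes
    assoc_b = W[~mask].sum()        # total weight from side B to all nodes
    return cut / assoc_a + cut / assoc_b

def ncut_bipartition(W):
    """Approximate two-way normalized cut via the spectral relaxation.

    Solves (D - W) y = lambda * D y through the symmetric normalized
    Laplacian and thresholds the second-smallest eigenvector at zero.
    """
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_sym)   # eigenvalues in ascending order
    y = d_inv_sqrt * eigvecs[:, 1]       # map back to the generalized problem
    return y > 0

# Toy "image": six pixels with two clearly different intensity levels (assumed data).
intensities = np.array([0.0, 0.1, 0.2, 1.0, 1.1, 1.2])
sigma = 0.3  # assumed affinity bandwidth
diff = intensities[:, None] - intensities[None, :]
W = np.exp(-diff**2 / (2 * sigma**2))
np.fill_diagonal(W, 0.0)  # no self-affinity

mask = ncut_bipartition(W)
print(mask, ncut_value(W, mask))
```

On this toy input the dark pixels and the bright pixels end up on opposite sides of the cut. Which side is labeled `True` depends on the arbitrary sign of the eigenvector.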

Speaker: Jianbo Shi
Jianbo Shi was born in Shanghai, China. Since then he has been moving. He studied Computer Science and Mathematics as an undergraduate at Cornell University, where he received his B.A. in 1994. He received his Ph.D. degree in Computer Science from the University of California at Berkeley in 1998, for his thesis on the Normalized Cuts image segmentation algorithm. He joined The Robotics Institute at Carnegie Mellon University in 1999 as a research faculty member, where he led the Human Identification at Distance (HumanID) project, developing vision techniques for human identification and activity inference. In January 2003, he joined the Department of Computer & Information Science at the University of Pennsylvania as an Assistant Professor.

His current research focuses on human behavior analysis and image recognition-segmentation. His other research interests include image/video retrieval and vision-based desktop computing. His long-term interests center around the broader area of machine intelligence: he wishes to develop a “visual thinking” module that allows computers not only to understand the environment around us, but also to achieve higher-level cognitive abilities such as machine memory and learning.