Yuri A. Ivanov

Mitsubishi Electric Research Labs,
201 Broadway, Cambridge, MA 02139
ph: (617) 621-7592,
fax: (617) 621-7550,
e-mail: yivanov@merl.com
_____________________________________________________


Research Interests: Statistical models of visual and auditory perception, visual motion and speech processing,
machine learning, reinforcement learning, computer vision, on-line algorithms.

Education: 
PhD, MIT, 2001
MS in Media Arts and Sciences, MIT, 1998
MS, BA in EECS, State Academy of Air and Space, Leningrad (St. Petersburg), Russia, 1992

Professional Experience:

2005 - present: Principal Member of Technical Staff     Mitsubishi Electric Research Labs, Cambridge, MA
- Medical Image Registration for particle beam cancer treatment equipment;
- Visualization of medical data;
- HCI for medical data manipulation;
- Devising algorithms for activity analysis in large scale sensor networks and building monitoring systems;
- Development of novel monitoring systems using heterogeneous sensor networks;
- Researching methods for action and event recognition in video;
- Developing algorithms for event data mining in perceptual networks with ultra-low resolution sensors;
- Heading the "Human-in-the-loop" initiative for surveillance and security applications;
- Developing algorithms for low-level event detection;
- Running an Event Detection and Recognition interest group.

2004 - present: Member of the Board                     Immersion Music, Cambridge, MA
- Helping run a music technology non-profit organization;
- Planning, organizing and running performances, public exhibitions and music technology projects;
- Logistics and organization of the data collection from the conductor, musicians and the audience
  during a live performance of the Boston Symphony Orchestra, conducted by Keith Lockhart;
- Designing and developing software for "UBS Virtual Orchestra" - a musical conducting system. The
  system is on a world tour for the 2008 concert season. Venues include Boston Symphony Hall,
  Philadelphia Orchestra Kimmel Center, Seattle Symphony, The Cleveland Orchestra, Ravinia Festival,
  Montreux Jazz Festival.
   

2002 - 2005: Sr. Research Scientist                     Honda Research Institute US, Boston, MA
- Developed multi-view gesture and motion models for an action recognition system;
- Designed and implemented a distributed multi-modal human identification system (MMHID) to run on
  ASIMO, the Honda Humanoid Robot;
- Developed a technique for approximate Bayesian classifier combination;
- Implemented a variety of classifiers for individual modalities for MMHID including human speech,
  as well as numerous video features;
- Researched classifier combination techniques applied to face recognition;
- Developed an autonomous surveillance and event logging system for Multimodal Human ID project;
- Created a multi-modal media database maintenance toolkit that facilitates Lab-wide access to the
  multimodal data collected by the event logging system;
- Developed an algorithm for vision-based automatic rear-view mirror stabilization;
- Supervised interns on a number of research projects;
- Participated in HRI-US strategic development meetings, designing Vision and Learning aspects of
  future Honda Humanoid Robots.

2002 - 2007: Visiting Scientist, Lecturer               MIT, Dept. of Brain and Cognitive Sciences, CBCL
- Advised graduate students with Prof. Poggio, serving as their thesis supervisor;
- Worked on multi-modal data mining problems on a Human ID database;
- For three years co-organized and taught a semester-long graduate level class on "Pattern
  Recognition for Computer Vision". The class received a 6.5/7.0 student rating. Class materials are
  posted on the MIT OpenCourseWare web site.

1996 - 2002: Research Assistant, PhD Student            MIT, Media Lab, VisMod, Synthetic Characters
- Studied computer vision and machine learning;
- Co-developed a fast disparity-based visual touch detection algorithm, which is generalized to
  an arbitrary 3D manifold. The advantage of the algorithm is that it does not require a physical
  surface to be present - it can be used to analytically specify an arbitrary surface in user
  coordinates, where occupancy is detected in real time;
- Developed on-line learning algorithms for learning perceptual organization in the action-perception
  cycle for autonomous agents with audio and video interfaces. These algorithms extend to Partially
  Observable Markov Decision Processes and explore methods of reward-based statistical estimation.  
  They achieve their fundamental advantage by balancing unsupervised and reward-based costs in order
  to achieve faster convergence;
- Developed an on-line command utterance learning algorithm used in several autonomous learning systems;
- Co-developed a system for speech and discourse analysis for the "Facilitator Room" project. The system
  is built for identification of the topic of a conversation in a group of people;
- Conducted research in the area of Stochastic Context-Free Grammars (SCFG) and languages for action
  recognition and motion analysis. SCFGs are well suited for modeling complex dependencies in the input
  stream. The work focused on possibilities of their use in machine vision for identification of high-level
  structured activities, such as musical conducting, drawing, and surveillance, where the expert knowledge
  can be encoded in the statistical model by formulating a set of grammar rules;
- Built an automatic video surveillance system based on the Stochastic Context-Free Grammar parser
  for DARPA VSAM feasibility demo. The system used the SCFG as a model of activities and multi-object
  interactions in a parking lot.  A novel parsing algorithm was developed to model multi-object events;
- Developed a fast lighting-independent background subtraction algorithm. The geometry-based approach
  efficiently segments foreground objects that violate the pre-computed stereo model of the background
  and delivers real-time performance in situations with changing lighting conditions;
- Participated in the development of the KidsRoom project, which was exhibited at the Millennium Dome, London.

1995 - 1996: Senior Consultant                          Albion Intl., Sprint Intl., Washington DC
Responsibilities included design of a large-scale router management system, developer training, and
implementation of a script compiler for large-scale router testing.

1994 - 1995: Research Engineer                          Magnet Interactive Studios, Washington DC
Conducted research in wavelet-based image compression and multi-resolution mesh optimization for 3D
modeling; explored algorithms for distributed graphics rendering; designed and developed a cross-platform
game and media engine for PC, Mac, SGI, Sony PlayStation and 3DO game platforms.

1992 - 1994: Sr. Software Engineer                      ECTA Corporation, Ambler, PA
Managed data acquisition and GUI teams of a financial services database project; designed the record
caching mechanism and a memory management scheme; developed embedded data validation mechanisms.

1984 - 1992: Software Engineer, Intern                  Russian Academy of Sciences, Leningrad
Participated in the research of algorithms for behavior modeling; developed a data visualization system
for a custom sensor array; took part in developing a LOGO interpreter and LOGAME - an educational system
for a high school course on programming languages; implemented an OS shell and disk management tools for a
new microcomputer family; interned with a team developing software for the Space Flight Control Center.

Teaching Experience:
- Fall 2002-2006, MIT: "Pattern Recognition for Computer Vision", Lecturer. Developed and taught a
  semester-long graduate level class with Dr. Heisele and Prof. Poggio. We have been asked to teach
  it as a permanent course;
- Feb 2004, 2005, Harvard Extension School: "Robotics, Learning and Making Decisions", CS invited lecture;
- 1998, 1999, MIT: "High-Level Machine Vision", Teaching Assistant.


Recent Awards and Invited Talks:
1. Looking at People: A year in the life of a Research Lab, Keynote Address, ICMI 2007, Nagoya, Japan
2. Best Paper: Y. Ivanov, C. Wren, A. Sorokin, I. Kaur, Visualizing the History of
   Living Spaces, InfoVis 2007
3. Computer Vision and Challenges of HCI, Invited Talk, Japanese-American Symposium on
   Frontiers of Engineering, 2007


Selected publications:
Journals:
1. F. Porikli, Y. Ivanov, T. Haga, Robust Abandoned Object Detection using Dual Foregrounds,
   EURASIP, 2007
2. Y. Ivanov, R. Hamid, Weighted Ensemble Boosting for Robust Activity Recognition in Video,
   Intl Journal on Machine Graphics and Vision, 2006 Vol 15, No. 3/4, Warsaw, Poland
3. Y. A. Ivanov, A. F. Bobick, Recognition of Visual Activities and Interactions by Stochastic Parsing,
   Transactions on Pattern Analysis and Machine Intelligence 22(8), 852-872 (2000).
4. Y. Ivanov, A. Bobick, J. Liu, Fast Lighting Independent Background Subtraction, International Journal
   of Computer Vision 37(2), 199-207 (2000).
5. A. F. Bobick, S. S. Intille, J. W. Davis, F. Baird, C. S. Pinhanez, L. W. Campbell, Y. A. Ivanov,
   A. Schutte and A. Wilson, Perceptual User Interfaces: The KidsRoom, Communications of the ACM 43(3),
   60-61 (2000).
6. A. F. Bobick, S. S. Intille, J. W. Davis, F. Baird, L. W. Campbell, Y. Ivanov, C. S. Pinhanez,
   A. Schutte and A. Wilson, The KidsRoom: A perceptually-based interactive and immersive story
   environment, PRESENCE: Teleoperators and Virtual Environments 8(4), 367-391 (1999).

Conferences:
1.  Daniel Wigdor, Yuri A. Ivanov, and Christopher R. Wren, Soda Pop Zombies: Soft Drink Consumption
    and Motion, 2007 ICMI Workshop on Massive Datasets, November 15, 2007. Nagoya, Japan
2.  Christopher R. Wren, Yuri A. Ivanov, Darren Leigh, Jonathan Westhues, The MERL Motion Detector
    Dataset: 2007 Workshop on Massive Datasets, Workshop on Massive Datasets, November 15, 2007.
    Nagoya, Japan
3.  Y. Ivanov, Looking at People: A year in the life of a Research Lab, Keynote Address, ICMI 2007,
    Nagoya, Japan
4.  Y. Ivanov, C. Wren, A. Sorokin, I. Kaur, Visualizing the History of Living Spaces,
    InfoVis 2007, Best Paper
5.  Y. Ivanov, Computer Vision and Challenges of HCI, Invited Talk, Japanese-American Symposium on
    Frontiers of Engineering, 2007
6.  C. Wren, Y. Ivanov, I. Kaur, A. Sorokin, SocialMotion: Measuring the Hidden Social Life of a
    Building, LoCA, Sep 2007
7.  C. Wren, M. Buchin, Y. Ivanov, D. Leigh, T.-P. Tian, P. Turaga, J. Westhues, Buzz: Measuring
    and Visualizing Conference Crowds, SIGGRAPH 2007 ETech contribution
8.  Y. Ivanov, C. Wren, Towards Spatial Queries for Spatial Surveillance Tasks, Pervasive/PTA 2006
9.  Y. Ivanov, A. Sorokin, C. Wren, I. Kaur, Tracking people in Mixed Modality Systems, VCIP 2007
10. Y. Ivanov, C. Wren, Markov Stump Boosting, Snowbird Learning Workshop, 2006
11. Y. Ivanov, Multi-modal Human Identification System, WACV, Breckenridge (2005)
12. Y. Ivanov, T. Serre, J. Bouvrie, Error-Weighted Classifier Combination for Multi-modal Human
    Identification, In Submission (2004)
13. Y. Ivanov, B. Heisele, T. Serre, Using Component Features for Face Classification, International
    Conference on Automatic Face and Gesture Recognition, Seoul, Korea (2004).
14. A. Kapoor, R. Picard, Y. Ivanov, Probabilistic Combination of Multiple Modalities to Detect
    Interest, Intl. Conference on Pattern Recognition, Cambridge, UK (2004)
15. B. Kim, Y. Ivanov, T. Poggio, Temporal Integration for Multi-modal Human Identification, Snowbird
    Learning Workshop, Snowbird, UT (2003).
16. Y. Ivanov, B. Blumberg, Learning Perception in Autonomous Agents, NASA JPL Workshop on Radical Agent
    Concepts, McLean, VA (2002).
17. Y. Ivanov, B. Blumberg, Developmental Learning of Memory-based Perceptual Models, Intl. Conference
    on Developmental Learning, Cambridge, MA (2002).
18. C. Wren, Y. Ivanov, Volumetric Operations with Surface Margins, IEEE Computer Vision and Pattern
    Recognition Technical Sketches, Kauai, Hawaii (2001).
19. Y. Ivanov, On-line Learning with Reduction Rules, Learning Workshop, Snowbird, UT (2002).
20. Y. Ivanov, Cluster Analysis for Weakly Labeled Data, Accepted to Recent Developments in Mixture
    Modeling, Mixtures 2001, Hamburg, Germany (2001).
21. Y. Ivanov, B. Blumberg, A. Pentland, Expectation-Maximization for Weakly Labeled Data, 18th
    International Conference on Machine Learning 2001, Williamstown, MA (2001).
22. R. Burke, D. Isla, M. Downie, Y. Ivanov, B. Blumberg. CreatureSmarts: The Art and Architecture
    of a Virtual Brain, Game Developers Conference 01, pp. 147-166, San Jose, CA (2001)
23. Y. Ivanov, B. Blumberg, A. Pentland, EM For Perceptual Coding and Reinforcement Learning Tasks,
    8th International Symposium on Intelligent Robotic Systems, Reading, UK (2000).
24. T. Jebara, Y. Ivanov, A. Rahimi, A. Pentland, Tracking Conversational Context for Machine Mediation
    of Human Discourse, Socially Intelligent Agents 2000 - The Human in the Loop, Falmouth, MA (AAAI, 2000).
25. Y. Ivanov, A. Bobick, Recognition of Multi-Agent Interaction in Video Surveillance, ICCV 99,
    Corfu, Greece (1999).
26. Y. Ivanov, C. Stauffer, A. Bobick, W. E. L. Grimson, Video Surveillance of Interactions, CVPR 99,
    Workshop on Video Surveillance, Ft. Collins, CO (1999).
27. A. F. Bobick, Y. A. Ivanov, Action Recognition using Probabilistic Parsing, CVPR 98,
    Santa Barbara, CA (1998).
28. Y. A. Ivanov, A. F. Bobick, J. Liu, Fast Lighting Independent Background Subtraction,
    ICCV 98, Workshop on Video Surveillance, Bombay, India (1998).

Patent Applications:
1. System and Method for Modeling Movement of Objects Using Probabilistic Graphs Obtained From
   Surveillance Data
2. Method for Detecting Objects Left-Behind in a Scene
3. System and Method for Determining Geometries of Scenes
4. Tactile Output Device
5. Method for Processing Queries for Surveillance Tasks
6. Surveillance System and Method for Tracking and Identifying Objects in Environments
7. Probabilistic Graphs for Modeling and Predicting Ambiguous Human Behavior
8. Weighted Ensemble Boosting Method for Classifier Combination and Feature Selection
9. Vision-Based Touch System
10. Integrated Learning for Interactive Synthetic Characters
11. Using Component Features for Face Recognition
12. Error-Weighted Classifier Combination for Multi-Modal Human Identification
13. Automatic Rear-View Mirror Stabilization
14. Computer Vision Depth Segmentation Using Virtual Surface
15. Adaptive Tracking for Gesture Interfaces

Press:
Ambient Intelligence:
1. New Scientist, April 2008, 'Big brother' buildings offer less invasive security
2. Slashdot, Movement Sensors a Less Invasive Alternative To CCTV


Virtual Orchestra:
1. Minnesota Public Radio: "Orchestral fans take a swing at conducting a virtual orchestra"
2. Boston Channel TV: "Local Students Lead Virtual Orchestra"
3. Boston Magazine insert
4. USA Today 
5. Minnesota Orchestra announcement 
6. Associated Press: "Touring classical-music video game makes everyone an orchestral conductor"

Boston Symphony:
1. Boston Globe 
2. CBS4 Boston Top News 
3. Montreal Gazette