Object Recognition

In Zurich I have mostly been working on object detection and on learning how to detect objects.

In Aachen I worked on a variety of methods for object recognition and detection in cluttered scenes. A summary and comparison of these approaches is given in my dissertation in chapter 4/section 11.

The individual methods were published in the following papers:

Applications of several of these techniques:

Video Retargeting

The increase in variety of commonly used display devices requires adapting visual media to different resolutions, aspect ratios, colors and frame-rates - a process called “retargeting”.

In CVPR08 we proposed a method that uses machine learning techniques to determine the most important image regions and then optimizes a smooth sequence of subimages to display to the user.

A demo video is available here as QuickTime and here as streamed Flash video.
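The idea of picking a smooth sequence of subimages can be illustrated with a small sketch. This is not the CVPR08 method itself, only a generic dynamic-programming illustration under simplifying assumptions: the per-frame importance map is taken as given (in the paper it comes from a learned model), the crop window only moves horizontally, and smoothness is modelled as a linear penalty on window movement between frames. The function name `smooth_crop_path` and the parameters `crop_w` and `smooth` are hypothetical.

```python
import numpy as np

def smooth_crop_path(importance, crop_w, smooth=1.0):
    """Pick one crop-window offset per frame (hypothetical helper).

    importance: (T, W) array, importance of each image column per frame
                (assumed to be given, e.g. by some learned model).
    crop_w:     width of the crop window in columns.
    smooth:     penalty per column of window movement between frames.
    Returns a list of T left offsets.
    """
    importance = np.asarray(importance, dtype=float)
    T, W = importance.shape
    n_pos = W - crop_w + 1  # number of possible left offsets

    # gain[t, x] = total importance inside the window starting at column x
    csum = np.concatenate([np.zeros((T, 1)), np.cumsum(importance, axis=1)], axis=1)
    gain = csum[:, crop_w:] - csum[:, :n_pos]

    positions = np.arange(n_pos)
    # move_cost[prev, cur] = penalty for moving the window from prev to cur
    move_cost = smooth * np.abs(positions[:, None] - positions[None, :])

    # Viterbi-style forward pass over frames
    score = gain[0].copy()
    back = np.zeros((T, n_pos), dtype=int)
    for t in range(1, T):
        cand = score[:, None] - move_cost       # cand[prev, cur]
        back[t] = np.argmax(cand, axis=0)       # best predecessor per cur
        score = cand[back[t], positions] + gain[t]

    # Backtrack from the best final position
    path = [int(np.argmax(score))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

With `smooth=0` the window simply jumps to the most important region in every frame; larger values trade captured importance for a steadier window.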

Image Retrieval

During my diploma thesis I started developing the content-based image retrieval system FIRE: Flexible Image Retrieval Engine.

A summary of my efforts regarding image retrieval is given in my dissertation in chapter 3.

Regarding image retrieval I investigated

Discriminative Modelling

Discriminative models are a popular approach to classification.

We investigated incorporating additional invariances for speech and image recognition:

Natural Language Processing and Speech Recognition

Doing my PhD in a natural language processing lab led to several collaborations and sparked some new projects of my own.

  • Deep belief networks for translation: WMT09
  • Speech recognition with nearest neighbors: ICSLP07
  • Discriminative models for speech recognition: ICML08 ICASSP08

Image Deformation Modelling

We investigated image deformation modeling for

  • optical character recognition: DAGM04k PAMI07
  • medical image classification: PRL08b
  • deformation-aware discriminative models: DAGM09

Evaluation of Image Retrieval

I was involved in the organization of ImageCLEF from 2005 to 2008. ImageCLEF is the Cross Language Image Retrieval Track of CLEF.

Publications resulting from this:

Sign Language and Gesture Recognition and Tracking

Bringing together ideas from computer vision and speech recognition led me to work on sign language recognition in Aachen.

Jointly with Philippe Dreuw and some others I worked on