Object Recognition
In Zurich I have mostly worked on object detection and on learning how to detect objects:
- Weakly supervised learning: ECCV10, also related to: CVPR10 ICML10
- Unsupervised segmentation: ECCV10b
- Generic object detection: CVPR10
- Global self-similarity descriptors: CVPR10b
In Aachen I worked on a variety of methods for object recognition and detection in cluttered scenes. A summary and comparison of these approaches is given in my dissertation in chapter 4/section 11.
The individual methods were published in the following papers:
- Nearest neighbor based approaches: DAGM04
- Bag-of-Words approaches: CVPR05 DAGM05 DAGM06
- Geometric matching: ELCVIA07
- Gaussian mixtures: BMVC06 PR10
- Log-linear mixtures: BMVC09
- Real-time recognition using random forests: CVPR07
Several of these techniques were applied to:
- Medical image annotation: PRL08b
- Porn classification: ICPR08a
  In December 2004, the popular science TV show “Planetopia” contacted us about the technical possibilities of automatically detecting porn. A video of Daniel and me explaining it is available here.
- Face recognition for robot vision: ICPR08b
Video Retargeting
The increase in variety of commonly used display devices requires adapting visual media to different resolutions, aspect ratios, colors and frame-rates - a process called “retargeting”.
In CVPR08 we proposed a method that uses machine learning techniques to determine the most important image regions and then optimizes a smooth sequence of subimages to display to the user.
A demo video is available here as QuickTime and here as streamed Flash video.
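The two-stage idea described above (score importance, then pick a smooth sequence of subimages) can be illustrated with a small sketch. This is not the CVPR08 method: it assumes importance has already been reduced to one score per image column, and it swaps in a plain dynamic program over horizontal crop positions. The function name `smooth_crop_path` and the parameters `crop_w`, `max_shift` and `jump_cost` are hypothetical.

```python
def smooth_crop_path(saliency, crop_w, max_shift, jump_cost=1.0):
    """Pick one crop-window left edge per frame.

    saliency: list of frames, each a list of per-column importance scores
              (assumed given; how they are learned is out of scope here).
    The path maximizes the importance covered by the window minus
    jump_cost times the horizontal movement between consecutive frames,
    with movement limited to max_shift columns per frame.
    """
    n_pos = len(saliency[0]) - crop_w + 1  # number of valid left edges

    def window_scores(cols):
        # Sliding-window sums: importance covered at each left edge.
        s = sum(cols[:crop_w])
        out = [s]
        for x in range(1, n_pos):
            s += cols[x + crop_w - 1] - cols[x - 1]
            out.append(s)
        return out

    scores = [window_scores(frame) for frame in saliency]
    best = scores[0][:]   # best total score of a path ending at each position
    back = []             # back-pointers, one list per frame after the first

    for t in range(1, len(scores)):
        cur, ptr = [], []
        for x in range(n_pos):
            lo, hi = max(0, x - max_shift), min(n_pos - 1, x + max_shift)
            # Best predecessor within the allowed shift, after movement penalty.
            p = max(range(lo, hi + 1),
                    key=lambda q: best[q] - jump_cost * abs(q - x))
            cur.append(best[p] - jump_cost * abs(p - x) + scores[t][x])
            ptr.append(p)
        best = cur
        back.append(ptr)

    # Trace the best path back from the last frame to the first.
    x = max(range(n_pos), key=lambda q: best[q])
    path = [x]
    for ptr in reversed(back):
        x = ptr[x]
        path.append(x)
    path.reverse()
    return path


# Toy input: an important region that drifts one column to the right per frame.
frames = [
    [0, 5, 5, 0, 0, 0],
    [0, 0, 5, 5, 0, 0],
    [0, 0, 0, 5, 5, 0],
]
path = smooth_crop_path(frames, crop_w=2, max_shift=1, jump_cost=0.1)
print(path)
```

Raising `jump_cost` favours a steadier window at the price of covering less of the important region; a real system would also have to handle vertical position, scale and temporal smoothness beyond a per-frame shift limit.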
Image Retrieval
During my diploma thesis I started developing the content-based image retrieval system FIRE: Flexible Image Retrieval Engine.
A summary of my efforts regarding image retrieval is given in my dissertation in chapter 3.
Regarding image retrieval, I investigated:
- Visual descriptors: IR08 DAGM04 ICPR04
- Perceptual properties of visual descriptors: JASIST08
- Combination of visual and textual retrieval: CLEF05
- User interaction: ICPR08d MLMI08
Discriminative Modelling
Discriminative models are a popular approach to classification.
We investigated incorporating additional invariances for speech and image recognition:
- object recognition: BMVC06 BMVC09 PR10
- speech recognition: ICML08 ICASSP08
- deformation-aware discriminative models: DAGM09
Natural Language Processing and Speech Recognition
Doing my PhD in a natural language processing lab led to several collaborations and sparked some projects of my own:
- Deep belief networks for translation: WMT09
- Speech recognition with nearest neighbors: ICSLP07
- Discriminative models for speech recognition: ICML08 ICASSP08
Image Deformation Modelling
We investigated image deformation modelling for:
- optical character recognition: DAGM04k PAMI07
- medical image classification: PRL08b
- deformation-aware discriminative models: DAGM09
Evaluation of Image Retrieval
I was involved in the organization of ImageCLEF from 2005 to 2008. ImageCLEF is the Cross Language Image Retrieval Track of CLEF.
Publications resulting from this:
- Medical image annotation and retrieval: IJCV07 PRL08a CLEF08m CLEF07m CLEF06m CLEF05 SPIE06 ACMM05
- Object recognition and retrieval: CLEF08o CLEF07o CLEF06o CLEF05
- Photo retrieval: CLEF06o LREC06 CLEF05
Sign Language and Gesture Recognition and Tracking
Bringing together ideas from computer vision and speech recognition led me to work on sign language recognition in Aachen.
Jointly with Philippe Dreuw and some others I worked on