Today, automatic object detection in image data is usually performed using machine-learning approaches that rely on a holistic object model and the sliding-window principle. A major concern with holistic object detection is its insufficient tolerance to deformation, partial occlusion, and rotation. Part-based object detection can potentially overcome these limitations. However, creating part-based object models currently requires a human designer to specify the number, locations, and extents of the object parts.
This thesis introduces a novel method that derives part-based object models solely from training data. It automatically establishes the number of object parts as well as their locations and extents. This is achieved by applying a semi-supervised machine-learning technique to local image features in order to detect clusters of feature locations, which are subsequently used as the parts of the object model.
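The idea of turning clusters of feature locations into object parts can be illustrated with a minimal sketch. It is not the method of this thesis: it substitutes a plain, unsupervised mean-shift clustering with a flat kernel for the semi-supervised technique, and it runs on synthetic 2D locations rather than on real image features. The function names (`mean_shift`, `group_modes`, `derive_parts`), the `bandwidth` parameter, and the sample data are all assumptions made for illustration. The sketch only shows how the number of parts can emerge from the data and how a part's location and extent can be read off a cluster.

```python
import numpy as np


def mean_shift(points, bandwidth, n_iter=50):
    """Move each point to the mean of all points within `bandwidth` (flat kernel)."""
    modes = points.astype(float).copy()
    for _ in range(n_iter):
        # Distance from every current mode to every original point.
        dist = np.linalg.norm(modes[:, None, :] - points[None, :, :], axis=2)
        within = dist <= bandwidth
        counts = np.maximum(within.sum(axis=1, keepdims=True), 1)
        shifted = (within[:, :, None] * points[None, :, :]).sum(axis=1) / counts
        if np.allclose(shifted, modes):
            break
        modes = shifted
    return modes


def group_modes(modes, tol):
    """Greedily merge modes closer than `tol`; return cluster centers and labels."""
    centers = []
    labels = np.empty(len(modes), dtype=int)
    for i, mode in enumerate(modes):
        for k, center in enumerate(centers):
            if np.linalg.norm(mode - center) <= tol:
                labels[i] = k
                break
        else:
            centers.append(mode)
            labels[i] = len(centers) - 1
    return np.array(centers), labels


def derive_parts(locations, bandwidth):
    """Return one part (center, bounding box, support) per detected cluster."""
    modes = mean_shift(locations, bandwidth)
    centers, labels = group_modes(modes, tol=bandwidth / 2)
    parts = []
    for k in range(len(centers)):
        members = locations[labels == k]
        parts.append({
            "center": members.mean(axis=0),  # part location
            "min": members.min(axis=0),      # part extent: lower corner
            "max": members.max(axis=0),      # part extent: upper corner
            "support": len(members),         # number of feature locations
        })
    return parts


# Synthetic feature locations (an assumption): a 3x3 grid of points around
# each of three hypothetical part centers in an aligned training image.
offsets = np.array([[dx, dy] for dx in (-2, 0, 2) for dy in (-2, 0, 2)])
true_centers = np.array([[20, 30], [60, 30], [40, 70]])
locations = np.vstack([c + offsets for c in true_centers]).astype(float)

parts = derive_parts(locations, bandwidth=10.0)
for part in parts:
    print(part["center"], part["min"], part["max"], part["support"])
```

In this sketch the number of parts is never passed in; it follows from how many distinct modes the clustering finds. The `bandwidth` value still has to be chosen, and it controls how finely the feature locations are split into parts.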
As an example, the modeling process is implemented for human faces. An evaluation on three well-known datasets shows that the automatically generated object models achieve recall and precision rates comparable to those of state-of-the-art, manually defined part-based models.