Speaker: Sébastien Poullot (NII & JFLI)
Date: 14th December 2012
Place: room 214, Faculty of Science Bldg. 7, Hongo Campus, The University of Tokyo
My latest work focuses on the representation and separation of videos into items (such as people, objects, and background). I first worked on separating animated objects from their environment. Based on the assumption that in videos the objects of attention (foreground objects) move while "the world" (the background) is still, we propose to estimate the camera work on pairs of successive frames, using local feature matching and RANSAC. The frames can then be aligned and background subtraction performed so as to isolate the areas of activity. Each area potentially contains one or more foreground objects. Therefore, after morphological and probabilistic filtering, the GrabCut algorithm is applied to the remaining locations in order to segment the objects. We first show that this intuitive but naive approach achieves surprisingly competitive performance compared to state-of-the-art methods. However, we also observe that the method fails when camera work estimation does not work well, because camera work estimation and background region estimation are inextricably linked processes. To circumvent this, we propose a method that iterates between local feature matching and camera work estimation. The method is fully unsupervised and does not need any prior knowledge. It can handle professional as well as home-made videos. Our simple method is fast, which gives us good hope of achieving real-time unsupervised foreground object segmentation of streaming videos with parallel cores or GPUs.
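
The first two stages described above (camera work estimation with RANSAC, then alignment and background subtraction) can be illustrated with a minimal sketch in pure Python. It is not the speaker's implementation. It assumes, for simplicity, that the camera motion between two frames is a pure 2D translation; the abstract does not state which motion model is used. The function names `ransac_translation` and `activity_mask`, the thresholds, and the grayscale list-of-lists frame format are all hypothetical. Feature extraction, the morphological and probabilistic filtering, and GrabCut are left out.

```python
import random


def ransac_translation(matches, threshold=1.5, iterations=200, seed=0):
    """Estimate a global (dx, dy) shift from matches ((x1, y1), (x2, y2)).

    Assumption: background points obey p2 = p1 + (dx, dy); matches on
    moving foreground objects are treated as outliers.
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # One match is enough to hypothesise a translation.
        (x1, y1), (x2, y2) = rng.choice(matches)
        dx, dy = x2 - x1, y2 - y1
        inliers = [
            (a, b) for a, b in matches
            if abs(b[0] - a[0] - dx) <= threshold
            and abs(b[1] - a[1] - dy) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refine the estimate by averaging over the consensus set.
    n = len(best_inliers)
    dx = sum(b[0] - a[0] for a, b in best_inliers) / n
    dy = sum(b[1] - a[1] for a, b in best_inliers) / n
    return (dx, dy), best_inliers


def activity_mask(prev, curr, shift, diff_threshold=30):
    """Align prev to curr with an integer shift and threshold the difference.

    Frames are lists of rows of grayscale values. Pixels whose source
    falls outside prev are left at 0 (unknown, not flagged as activity).
    """
    dx, dy = round(shift[0]), round(shift[1])
    height, width = len(curr), len(curr[0])
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            sx, sy = x - dx, y - dy  # position of this pixel in prev
            if 0 <= sx < width and 0 <= sy < height:
                if abs(curr[y][x] - prev[sy][sx]) > diff_threshold:
                    mask[y][x] = 1
    return mask
```

In a full pipeline the nonzero regions of the mask would be the candidate areas of activity passed on to filtering and segmentation. A translation is only the simplest possible camera model; a homography or affine model would follow the same sample, score, and refine pattern with more matches per hypothesis.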