Scene image classification and segmentation with quantized local descriptors and latent aspect modeling

Pedro Manuel Da Silva Quelhas
2007
Non-EPFL thesis

Abstract

The ever-increasing number of digital images in both public and private collections creates a need for generic image content analysis systems. These systems must be capable of capturing the content of images of both scenes and objects in a compact way that allows for fast search and comparison. Modeling images with local invariant features computed at interest point locations has proven in recent years to achieve such capabilities and to provide a robust and versatile way to perform wide-baseline matching and search for both scene and object images. In this thesis we explore the use of local descriptors for image representation in the tasks of scene and object classification, ranking, and segmentation. More specifically, we investigate the combined use of text modeling methods and local invariant features. First, our work attempts to elucidate whether a text-like bag-of-visterms representation (a histogram of quantized local visual features) is suitable for scene and object classification, and whether analogies exist between discrete scene representations and text documents. We further explore the bag-of-visterms approach in a fusion framework, combining texture and color information for natural scene classification. Second, we investigate whether unsupervised latent space models can be used as feature extractors for the classification task and to discover patterns of visual co-occurrence. In this direction, we show that Probabilistic Latent Semantic Analysis (PLSA) generates a compact scene representation that is discriminative enough for accurate classification and more robust than the bag-of-visterms representation when less labeled training data is available. Furthermore, we show through aspect-based image ranking experiments that PLSA can automatically extract visually meaningful scene patterns, making such a representation useful for browsing image collections.
Finally, we explore the use of latent aspect modeling in an image segmentation task. By extending the representation resulting from latent aspect modeling, we are able to introduce contextual information for image segmentation that goes beyond the traditional regional contextual modeling found, for instance, in Markov Random Field approaches.
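
The two representations named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the interest point detector, the descriptors, and the vocabulary construction are not reproduced here, and random vectors stand in for real local descriptors. The function names (bag_of_visterms, plsa), the 8-entry codebook, and the choice of 2 aspects are all hypothetical and chosen only for illustration. The first function builds a histogram of quantized descriptors; the second fits a standard PLSA model by EM on the resulting count matrix.

```python
import numpy as np

def bag_of_visterms(descriptors, codebook):
    """Histogram of quantized local descriptors for one image."""
    # Squared Euclidean distance from every descriptor to every codebook entry.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    visterms = d2.argmin(axis=1)  # index of the nearest visterm per descriptor
    return np.bincount(visterms, minlength=len(codebook))

def plsa(counts, n_aspects, n_iter=50, seed=0):
    """Fit PLSA by EM on an (images x visterms) count matrix.

    Returns P(z|d) with shape (D, Z) and P(w|z) with shape (Z, W).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_z_d = rng.random((n_docs, n_aspects))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_aspects, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z); shape (D, Z, W).
        post = p_z_d[:, :, None] * p_w_z[None, :, :]
        post /= post.sum(axis=1, keepdims=True) + eps
        # M-step: re-estimate both distributions from expected counts.
        weighted = counts[:, None, :] * post
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps
    return p_z_d, p_w_z

# Toy data: random vectors stand in for real local descriptors (assumption).
rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))                     # hypothetical vocabulary
images = [rng.normal(size=(30, 4)) for _ in range(5)]  # 30 descriptors per image
counts = np.stack([bag_of_visterms(x, codebook) for x in images])
p_z_d, p_w_z = plsa(counts, n_aspects=2)
```

In this sketch each row of p_z_d is the low-dimensional aspect representation of one image, which is the kind of compact feature vector the abstract describes feeding to a classifier in place of the full histogram.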

Official source

https://infoscience.epfl.ch/record/146269?ln=en

Detecting Anomalies and Obstacles in Road Scenes

CLIP the Gap: A Single Domain Generalization Approach for Object Detection

Class Specific Feature Disentanglement and Text Embeddings for Multi-label Generalized Zero Shot CXR Classification
