Finding relations between image semantics and image characteristics is a long-standing problem in computer vision and related fields. Despite persistent efforts and significant advances in the field, today’s computers are still strikingly unable to achieve the complex understanding of semantic image content that humans achieve with ease. This is a problem when large sets of images have to be interpreted or otherwise processed by algorithms. The problem becomes increasingly urgent with the rapid proliferation of digital image content, driven by the widespread adoption of digital imaging devices such as smartphone cameras.

This thesis develops a statistical framework that relates image keywords to image characteristics and is based on a large database of annotated images. The design of the framework respects two equally important properties. First, the output of the framework, i.e. a relatedness measure, is compact and easy to use for subsequent applications. We achieve this by using a simple yet effective significance test. It measures a given keyword’s impact on a given image characteristic, which results in a significance value that serves as input for subsequent applications. Second, the framework is of very low complexity in order to scale well to large datasets. The test can be implemented very efficiently, so the statistical framework easily scales to millions of images and thousands of keywords.

The first application we present is semantic image enhancement. The enhancement framework takes two independent inputs: an image and a keyword, i.e. a semantic concept. The algorithm then re-renders the image to match the semantic concept. We implement this framework for four different tasks: tone-mapping, color enhancement, color transfer and depth-of-field adaptation. Unlike conventional image enhancement algorithms, our proposed approach is able to re-render a single input image for different semantic concepts, producing different output versions of the image that reflect the image context.

We evaluate the proposed semantic image enhancement with two psychophysical experiments. The first experiment comprises almost 30,000 image comparisons of the original and the enhanced images. Due to the large scale, we crowdsourced the experiment on Amazon Mechanical Turk. The majority of the enhanced images were shown to be significantly better than the original images. The second experiment contains images that were enhanced for two different keywords. We compare our proposed algorithm against histogram equalization, Photoshop auto-contrast and the original image. Our proposed method outperforms the others by a factor of at least 2.5.

The second application is color naming, which aims at relating color values to color names and vice versa. Whereas conventional color naming depends on psychophysical experiments, we are able to solve this task fully automatically using the significance values. We first demonstrate the usefulness of our approach with an example of 50 color names and then extend it to the estimation of memory colors and color values for arbitrary semantic expressions. In a second study, we use a list of over 900 English color names and translate it into nine other European and Asian languages. We estimate color values for the resulting more than 9,000 color names and analyze the results from a language and color science point of view.

Overall, we present a statistical framework that relates image keywords to image characteristics and apply it to two common imaging applications that benefit from a semantic understanding of the images. Furthermore, we outline the applicability of the framework to other applications and areas.
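The abstract does not name the significance test it uses. As a purely illustrative sketch, the following Python code shows one plausible way to turn a keyword and an image characteristic into a single significance value: a Mann-Whitney-Wilcoxon rank-sum test with a normal approximation. The choice of test, the function name `rank_sum_z`, the keyword "night" and the lightness values are all assumptions made for this example and are not taken from the thesis.

```python
import math
from typing import Sequence


def rank_sum_z(with_kw: Sequence[float], without_kw: Sequence[float]) -> float:
    """Return the z-score of the rank-sum statistic (normal approximation).

    with_kw    -- characteristic values of images annotated with the keyword
    without_kw -- characteristic values of all other images
    A positive z means the keyword goes along with larger values.
    Assumption: the variance is not corrected for ties, which is a
    simplification that is acceptable for a sketch.
    """
    n1, n2 = len(with_kw), len(without_kw)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must contain at least one value")

    # Pool both groups, remembering group membership (0 = with keyword).
    pooled = sorted([(v, 0) for v in with_kw] + [(v, 1) for v in without_kw])
    n = n1 + n2

    # Sum the ranks of the keyword group, using midranks for tied values.
    rank_sum = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of the 1-based ranks i+1 .. j
        rank_sum += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    # Expected rank sum and standard deviation under the null hypothesis.
    mean = n1 * (n + 1) / 2
    std = math.sqrt(n1 * n2 * (n + 1) / 12)
    return (rank_sum - mean) / std


# Hypothetical data: mean lightness of images with and without "night".
night = [0.10, 0.15, 0.22]
others = [0.45, 0.50, 0.38, 0.61]
z_night = rank_sum_z(night, others)  # negative: "night" images are darker
```

Because the statistic needs only a sort and one pass over the pooled values, a test of this kind is cheap to evaluate per keyword and per characteristic, which is in line with the low-complexity requirement described above.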
Tiago André Pratas Borges, Anja Fröhlich