Publication

Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers

Related concepts (35)

Graph Chatbot

Chat with Graph Search

Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.

DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.

Object co-segmentation

In computer vision, object co-segmentation is a special case of , which is defined as jointly segmenting semantically similar objects in multiple images or video frames. It is often challenging to extract segmentation masks of a target/object from a noisy collection of images or video frames, which involves object discovery coupled with . A noisy collection implies that the object/target is present sporadically in a set of images or the object/target disappears intermittently throughout the video of interest.

Visual culture

Visual culture is the aspect of culture expressed in . Many academic fields study this subject, including cultural studies, art history, critical theory, philosophy, media studies, Deaf Studies, and anthropology. The field of visual culture studies in the United States corresponds or parallels the Bildwissenschaft ("image studies") in Germany. Both fields are not entirely new, as they can be considered reformulations of issues of photography and film theory that had been raised from the 1920s and 1930s by authors like Béla Balázs, László Moholy-Nagy, Siegfried Kracauer and Walter Benjamin.

Optical character recognition

Optical character recognition or optical character reader (OCR) is the electronic or mechanical conversion of s of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene photo (for example the text on signs and billboards in a landscape photo) or from subtitle text superimposed on an image (for example: from a television broadcast).

Visual acuity

Visual acuity (VA) commonly refers to the clarity of vision, but technically rates a person's ability to recognize small details with precision. Visual acuity depends on optical and neural factors. Optical factors of the eye influence the sharpness of an image on its retina. Neural factors include the health and functioning of the retina, of the neural pathways to the brain, and of the interpretative faculty of the brain. The most commonly referred-to visual acuity is distance acuity or far acuity (e.g.

Multimodal sentiment analysis

Multimodal sentiment analysis is a technology for traditional text-based sentiment analysis, which includes modalities such as audio and visual data. It can be bimodal, which includes different combinations of two modalities, or trimodal, which incorporates three modalities. With the extensive amount of social media data available online in different forms such as videos and images, the conventional text-based sentiment analysis has evolved into more complex models of multimodal sentiment analysis, which can be applied in the development of virtual assistants, analysis of YouTube movie reviews, analysis of news videos, and emotion recognition (sometimes known as emotion detection) such as depression monitoring, among others.

Information extraction

Information extraction (IE) is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents and other electronically represented sources. In most of the cases this activity concerns processing human language texts by means of natural language processing (NLP). Recent activities in multimedia document processing like automatic annotation and content extraction out of images/audio/video/documents could be seen as information extraction Due to the difficulty of the problem, current approaches to IE (as of 2010) focus on narrowly restricted domains.

Active-pixel sensor

An active-pixel sensor (APS) is an , which was invented by Peter J.W. Noble in 1968, where each pixel sensor unit cell has a photodetector (typically a pinned photodiode) and one or more active transistors. In a metal–oxide–semiconductor (MOS) active-pixel sensor, MOS field-effect transistors (MOSFETs) are used as amplifiers. There are different types of APS, including the early NMOS APS and the now much more common complementary MOS (CMOS) APS, also known as the CMOS sensor.

Diachrony and synchrony

Synchrony and diachrony are two complementary viewpoints in linguistic analysis. A synchronic approach (from συν- "together" and χρόνος "time") considers a language at a moment in time without taking its history into account. Synchronic linguistics aims at describing a language at a specific point of time, often the present. In contrast, a diachronic (from δια- "through" and χρόνος "time") approach, as in historical linguistics, considers the development and evolution of a language through history.

Natural language processing

Natural language processing (NLP) is an interdisciplinary subfield of linguistics and computer science. It is primarily concerned with processing natural language datasets, such as text corpora or speech corpora, using either rule-based or probabilistic (i.e. statistical and, most recently, neural network-based) machine learning approaches. The goal is a computer capable of "understanding" the contents of documents, including the contextual nuances of the language within them.

Predictive analytics

Predictive analytics is a form of business analytics applying machine learning to generate a predictive model for certain business applications. As such, it encompasses a variety of statistical techniques from predictive modeling and machine learning that analyze current and historical facts to make predictions about future or otherwise unknown events. It represents a major subset of machine learning applications; in some contexts, it is synonymous with machine learning.