Lecture: Information Retrieval Indexing: Part 2

Description

This lecture covers the construction of an inverted file for information retrieval indexing, addressing granularity levels, and the use of tries in index construction. It explains the process of searching the inverted file, vocabulary search, and manipulation of occurrences. The lecture also discusses index compression, index merging, and the map-reduce programming model for constructing the inverted file. Additionally, it explores the applications of map-reduce frameworks in various tasks, such as graph processing and learning probabilistic models.
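The map-reduce construction of an inverted file mentioned above can be illustrated with a minimal, single-process sketch. It is not the lecture's own implementation: the function names (`map_phase`, `shuffle`, `reduce_phase`, `build_inverted_file`, `search`), the regex tokenizer, and the choice of word-position granularity are all assumptions made for illustration, and a real framework would distribute the map and reduce steps across machines.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def map_phase(doc_id, text):
    """Map step: emit one (term, (doc_id, position)) pair per term occurrence."""
    for position, term in enumerate(tokenize(text)):
        yield term, (doc_id, position)


def shuffle(pairs):
    """Group the emitted pairs by term, as the framework would between map and reduce."""
    groups = defaultdict(list)
    for term, occurrence in pairs:
        groups[term].append(occurrence)
    return groups


def reduce_phase(term, occurrences):
    """Reduce step: produce the sorted list of occurrences (the postings) for one term."""
    return term, sorted(occurrences)


def build_inverted_file(docs):
    """Run map, shuffle, and reduce over a {doc_id: text} collection."""
    pairs = [pair for doc_id, text in docs.items() for pair in map_phase(doc_id, text)]
    return dict(reduce_phase(term, occs) for term, occs in shuffle(pairs).items())


def search(index, term):
    """Vocabulary lookup followed by reading the occurrences: return matching doc ids."""
    return sorted({doc_id for doc_id, _ in index.get(term.lower(), [])})


docs = {1: "the cat sat", 2: "the dog sat down"}
index = build_inverted_file(docs)
print(index["sat"])          # occurrences of "sat" as (doc_id, word position)
print(search(index, "the"))  # documents containing "the"
```

Storing `(doc_id, position)` pairs corresponds to a fine, word-level granularity; keeping only document ids would give a coarser and smaller index at the cost of losing positional information.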

In course

CS-423: Distributed information systems

This course introduces the key concepts of information retrieval, data mining and knowledge bases, which constitute the foundations of today's Web-based distributed information systems.

Related concepts (102)

Graph rewriting

In computer science, graph transformation, or graph rewriting, concerns the technique of creating a new graph out of an original graph algorithmically. It has numerous applications, ranging from software engineering (software construction and also software verification) to layout algorithms and picture generation. Graph transformations can be used as a computation abstraction. The basic idea is that if the state of a computation can be represented as a graph, further steps in that computation can then be represented as transformation rules on that graph.

Graph (abstract data type)

In computer science, a graph is an abstract data type that is meant to implement the undirected graph and directed graph concepts from the field of graph theory within mathematics. A graph data structure consists of a finite (and possibly mutable) set of vertices (also called nodes or points), together with a set of unordered pairs of these vertices for an undirected graph or a set of ordered pairs for a directed graph. These pairs are known as edges (also called links or lines); in a directed graph they are sometimes also called arrows or arcs.
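As a rough illustration of this abstract data type, the sketch below stores a graph as an adjacency structure. The class name `Graph` and its methods are hypothetical and chosen for this example only; other representations (edge lists, adjacency matrices) are equally valid.

```python
from collections import defaultdict


class Graph:
    """Minimal adjacency-set graph; `directed` selects ordered vs unordered pairs."""

    def __init__(self, directed=False):
        self.directed = directed
        self._adj = defaultdict(set)  # vertex -> set of adjacent vertices

    def add_vertex(self, v):
        """Register a vertex; accessing the defaultdict creates an empty neighbor set."""
        self._adj[v]

    def add_edge(self, u, v):
        """Add the edge (u, v); for an undirected graph also record (v, u)."""
        self._adj[u].add(v)
        self.add_vertex(v)
        if not self.directed:
            self._adj[v].add(u)

    def neighbors(self, v):
        """Return a copy of the vertices reachable from v by one edge."""
        return set(self._adj.get(v, ()))

    def vertices(self):
        """Return the set of all vertices."""
        return set(self._adj)


undirected = Graph()
undirected.add_edge("a", "b")
print(undirected.neighbors("b"))  # "a" is a neighbor of "b"

directed = Graph(directed=True)
directed.add_edge("a", "b")
print(directed.neighbors("b"))    # empty: the arc only goes from "a" to "b"
```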

Graph database

A graph database (GDB) is a database that uses graph structures for semantic queries with nodes, edges, and properties to represent and store data. A key concept of the system is the graph (or edge or relationship). The graph relates the data items in the store to a collection of nodes and edges, the edges representing the relationships between the nodes. The relationships allow data in the store to be linked together directly and, in many cases, retrieved with one operation.

Graph theory

In mathematics, graph theory is the study of graphs, which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of vertices (also called nodes or points) which are connected by edges (also called links or lines). A distinction is made between undirected graphs, where edges link two vertices symmetrically, and directed graphs, where edges link two vertices asymmetrically. Graphs are one of the principal objects of study in discrete mathematics.

Information retrieval

Information retrieval (IR) in computing and information science is the process of obtaining information system resources that are relevant to an information need from a collection of those resources. Searches can be based on full-text or other content-based indexing. Information retrieval is the science of searching for information in a document, searching for documents themselves, and also searching for the metadata that describes data, and for databases of texts, images or sounds.

Related lectures (23)

Graph Neural Networks: Interconnected World

Explores learning from interconnected data with graphs, covering modern ML research goals, pioneering methods, interdisciplinary applications, and democratization of graph ML.

Learning from the Interconnected World with Graphs

Explores learning from interconnected data using graphs, covering challenges, GNN design, research landscapes, and democratization of Graph ML.

Information Retrieval: Indexing and Retrieval (CS-423: Distributed information systems)

Covers indexing techniques, distributed retrieval algorithms, and challenges in large-scale web indexing.

Belief Propagation in Random Graphs (PHYS-512: Statistical physics of computation)

Explores belief propagation in random graphs and Bethe free entropy.

Counterfactuals: SEM and D-Separation (MGT-416: Causal inference)

Explores counterfactuals in SEMs and D-Separation in graphical models.