
Lecture

Generative Models: Self-Attention and Transformers

Description

This lecture covers generative models, with a focus on self-attention and transformers. Topics include autoencoders, Boltzmann machines, masked training, attention mechanisms, and the maximum entropy principle. The slides discuss sampling methods, empirical means, and correlations in detail.
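As a concrete anchor for the attention mechanism mentioned above, here is a minimal NumPy sketch of scaled dot-product self-attention. It is an illustration under our own assumptions (shapes, names, and random inputs), not code from the lecture.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention for a single sequence.

    X          : (seq_len, d_model) input embeddings
    Wq, Wk, Wv : (d_model, d_k) projection matrices
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv               # queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])        # scaled pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # attention-weighted values

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                         # 4 tokens, 8-dim embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)          # (4, 8)
```

Each output row is a convex combination of the value vectors, with mixing weights set by query-key similarity; transformers stack many such layers, with masking used for autoregressive generation.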

Official source

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.

In course

PHYS-467: Machine learning for physicists

Machine learning and data analysis are becoming increasingly central in sciences including physics. In this course, fundamental principles and methods of machine learning will be introduced and practised.

Related concepts (87)

Simple random sample

In statistics, a simple random sample (or SRS) is a subset of individuals (a sample) chosen from a larger set (a population) in which a subset of individuals are chosen randomly, all with the same probability. It is a process of selecting a sample in a random way. In SRS, each subset of k individuals has the same probability of being chosen for the sample as any other subset of k individuals. A simple random sample is an unbiased sampling technique. Simple random sampling is a basic type of sampling and can be a component of other more complex sampling methods.
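In code, drawing a simple random sample is a single call to a uniform without-replacement sampler. A minimal sketch (the population and sample size are invented for illustration):

```python
import random

population = list(range(1, 101))    # a population of 100 labelled individuals
k = 10
srs = random.sample(population, k)  # uniform sampling without replacement:
                                    # every size-10 subset is equally likely
print(sorted(srs))
```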

Linear regression

In statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables). The case of one explanatory variable is called simple linear regression; for more than one, the process is called multiple linear regression. This term is distinct from multivariate linear regression, where multiple correlated dependent variables are predicted, rather than a single scalar variable.
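A minimal sketch of simple linear regression, fit by least squares with NumPy; the data are synthetic and the true line (slope 2, intercept 1) is our own choice for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=50)
y = 2.0 * x + 1.0 + rng.normal(scale=1.0, size=50)  # noisy samples of y = 2x + 1

# Fit y ~ a*x + b by least squares using the design matrix [x, 1]
A = np.column_stack([x, np.ones_like(x)])
(a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
print(f"slope = {a:.2f}, intercept = {b:.2f}")      # should be close to 2 and 1
```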

Sampling bias

In statistics, sampling bias is a bias in which a sample is collected in such a way that some members of the intended population have a lower or higher sampling probability than others. It results in a biased sample of a population (or of non-human factors) in which not all individuals, or instances, were equally likely to have been selected. If this is not accounted for, results can be erroneously attributed to the phenomenon under study rather than to the method of sampling.
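A small simulation makes the effect concrete. In this sketch (all numbers invented for illustration), individuals with larger values are more likely to enter the sample, so the biased design overestimates the population mean while a simple random sample does not:

```python
import numpy as np

rng = np.random.default_rng(2)

# Population: two groups with different mean heights (cm)
heights = np.concatenate([rng.normal(165, 7, 5000), rng.normal(180, 7, 5000)])

# Biased design: inclusion probability grows with height
p = heights - heights.min() + 1.0
p /= p.sum()
biased = rng.choice(heights, size=500, replace=False, p=p)

# Unbiased design: simple random sample
srs = rng.choice(heights, size=500, replace=False)

print(f"true mean {heights.mean():.1f}  "
      f"SRS {srs.mean():.1f}  biased {biased.mean():.1f}")
```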

Regression analysis

In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the 'outcome' or 'response' variable, or a 'label' in machine learning parlance) and one or more independent variables (often called 'predictors', 'covariates', 'explanatory variables' or 'features'). The most common form of regression analysis is linear regression, in which one finds the line (or a more complex linear combination) that most closely fits the data according to a specific mathematical criterion.
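For the most common criterion, ordinary least squares, the fitted coefficients minimize the sum of squared residuals. A standard closed form, stated here as textbook background rather than as material from the slides:

```latex
% Ordinary least squares: X is the n x p design matrix, y the response vector.
\hat{\beta}
  = \arg\min_{\beta} \sum_{i=1}^{n} \bigl( y_i - x_i^{\top}\beta \bigr)^2
  = (X^{\top} X)^{-1} X^{\top} y
```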

Diagonal matrix

In linear algebra, a diagonal matrix is a matrix in which the entries outside the main diagonal are all zero; the term usually refers to square matrices. Elements of the main diagonal can either be zero or nonzero. An example of a 2×2 diagonal matrix is [[3, 0], [0, 2]], while an example of a 3×3 diagonal matrix is [[6, 0, 0], [0, 5, 0], [0, 0, 4]]. An identity matrix of any size, or any multiple of it (a scalar matrix), is a diagonal matrix. A diagonal matrix is sometimes called a scaling matrix, since matrix multiplication with it results in a change of scale (size).
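A quick NumPy sketch of the scaling interpretation (the example values are arbitrary): multiplying by a diagonal matrix rescales each coordinate independently.

```python
import numpy as np

D = np.diag([2.0, 3.0])      # 2x2 diagonal matrix with diagonal entries 2 and 3
v = np.array([1.0, 1.0])

print(D @ v)                 # [2. 3.]: each coordinate is scaled separately
print(np.diag([6, 5, 4]))    # a 3x3 diagonal matrix
```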

Related lectures (529)

Document Analysis: Topic Modeling (DH-406: Machine learning for DH)

Explores document analysis, topic modeling, and generative models for data generation in machine learning.

Boltzmann Machine (PHYS-467: Machine learning for physicists)

Introduces the Boltzmann Machine, covering expectation consistency, data clustering, and probability distribution functions.

Machine Learning Fundamentals (DH-406: Machine learning for DH)

Introduces fundamental machine learning concepts, covering regression, classification, dimensionality reduction, and deep generative models.

Linear Regression: Basics and Estimation (PHYS-467: Machine learning for physicists)

Covers the basics of linear regression and how to solve estimation problems using least squares and matrix notation.

Introduction to Machine Learning: Supervised Learning (CS-233(a): Introduction to machine learning (BA3))

Introduces supervised learning, covering classification, regression, model optimization, overfitting, and kernel methods.