Publication

Gradient estimates of return

Related publications (32)

Learning Robust and Adaptive Representations: from Interactions, for Interactions

Yuejiang Liu

Interactions are ubiquitous in our world, spanning from social interactions between human individuals to physical interactions between robots and objects to mechanistic interactions among different components of an intelligent system. Despite their prevale ...

EPFL, 2023

Memory of Motion for Initializing Optimization in Robotics

Teguh Santoso Lembono

Many robotics problems are formulated as optimization problems. However, most optimization solvers in robotics are locally optimal and the performance depends a lot on the initial guess. For challenging problems, the solver will often get stuck at poor loc ...

EPFL, 2022

Feature distribution learning by passive exposure

David Pascucci, Gizay Ceylan

Humans can rapidly estimate the statistical properties of groups of stimuli, including their average and variability. But recent studies of so-called Feature Distribution Learning (FDL) have shown that observers can quickly learn even more complex aspects ...

ELSEVIER, 2022

Proximal Point Imitation Learning

Volkan Cevher, Luca Viano, Igor Krawczuk, Angeliki Kamoutsi

This work develops new algorithms with rigorous efficiency guarantees for infinite horizon imitation learning (IL) with linear function approximation without restrictive coherence assumptions. We begin with the minimax formulation of the problem and then o ...

2022

Product of experts for robot learning from demonstration

Emmanuel Pignat

Adaptability and ease of programming are key features necessary for a wider spread of robotics in factories and everyday assistance. Learning from demonstration (LfD) is an approach to address this problem. It aims to develop algorithms and interfaces such ...

EPFL, 2021

Byzantine Fault-Tolerant Distributed Machine Learning with Norm-Based Comparative Gradient Elimination

Nirupam Gupta, Shuo Liu

This paper considers the Byzantine fault-tolerance problem in the distributed stochastic gradient descent (D-SGD) method, a popular algorithm for distributed multi-agent machine learning. In this problem, each agent samples data points independently from a ce ...

IEEE COMPUTER SOC, 2021

Inference on the Angular Distribution of Extremes

Claudio Andri Semadeni

The spectral distribution plays a key role in the statistical modelling of multivariate extremes, as it defines the dependence structure of multivariate extreme-value distributions and characterizes the limiting distribution of the relative sizes of the co ...

EPFL, 2020

Multi-armed Bandits in Action

Farnood Salehi

Making decisions is part and parcel of being human. Among a set of actions, we want to choose the one that has the highest reward. But the uncertainty of the outcome prevents us from always making the right decision. Making decisions under uncertainty can ...

EPFL, 2020

A t-distribution based operator for enhancing out of distribution robustness of neural network classifiers

Philip Neil Garner

Neural Network (NN) classifiers can assign extreme probabilities to samples that have not appeared during training (out-of-distribution samples) resulting in erroneous and unreliable predictions. One of the causes for this unwanted behaviour lies in the us ...

IEEE, 2020

Multivariate extremes over a random number of observations

Simone Padoan, Stefano Rizzelli

The classical multivariate extreme-value theory concerns the modeling of extremes in a multivariate random sample, suggesting the use of max-stable distributions. In this work, the classical theory is extended to the case where aggregated data, such as max ...

WILEY, 2020