Publication

Learning search policies from humans in a partially observable context

Abstract

Decision making and planning when state information is only partially available is a problem faced by all forms of intelligent entities, be they virtual, synthetic, or biological. The standard approach to solving such a decision problem mathematically is to formulate it as a partially observable Markov decision process (POMDP) and to apply the same optimisation techniques used for Markov decision processes (MDPs). However, naively applying the methodology used to solve MDPs to POMDPs makes the problem computationally intractable. To address this, we take a programming-by-demonstration approach and provide a solution to the POMDP in continuous state and action spaces. In this work, we model the decision-making process followed by humans when searching blindly for an object on a table. We show that by representing the belief over the human's position in the environment with a particle filter (PF) and learning a mapping from this belief to their end-effector velocities with a Gaussian mixture model (GMM), we can model the human's search process and reproduce it for any agent. We further categorise the behaviours demonstrated by humans as either risk-prone or risk-averse and find that more than 70% of the human searches were risk-averse. We contrast the performance of this human-inspired search model with greedy and coastal navigation search methods, evaluating the distance travelled to reach the goal and how each method minimises uncertainty. We further analyse the control policies of the coastal navigation and GMM search models and argue that taking uncertainty into account is more efficient with respect to the distance travelled to reach the goal.
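For illustration, the pipeline described in the abstract can be sketched in a few dozen lines: a particle filter (PF) maintains the belief over the agent's position, and Gaussian mixture regression (a standard way to condition a GMM trained on belief-velocity pairs) maps that belief to an end-effector velocity command. The measurement model, noise levels, and choice of belief features below are assumptions made for this sketch, not details taken from the paper.

import numpy as np

rng = np.random.default_rng(0)

def pf_update(particles, weights, u, z, motion_noise=0.01, obs_noise=0.05):
    # One predict/correct/resample step of a particle filter.
    # particles: (N, 2) position hypotheses; weights: (N,), normalised;
    # u: (2,) commanded displacement; z: scalar sensor reading.
    particles = particles + u + rng.normal(0.0, motion_noise, particles.shape)
    # Correct: reweight with a stand-in observation likelihood
    # (distance to the table origin here; purely illustrative).
    expected = np.linalg.norm(particles, axis=1)
    weights = weights * np.exp(-0.5 * ((z - expected) / obs_noise) ** 2)
    weights = weights / weights.sum()
    # Resample when the effective sample size collapses.
    if 1.0 / np.sum(weights ** 2) < 0.5 * len(weights):
        idx = rng.choice(len(weights), size=len(weights), p=weights)
        particles = particles[idx]
        weights = np.full(len(weights), 1.0 / len(weights))
    return particles, weights

def gmr_velocity(x, priors, means, covs, in_dim):
    # Gaussian mixture regression: E[velocity | belief features x] under a
    # joint GMM over (features, velocity) with K components.
    K = len(priors)
    h = np.empty(K)
    y = np.empty((K, means.shape[1] - in_dim))
    for k in range(K):
        mx, my = means[k, :in_dim], means[k, in_dim:]
        Sxx = covs[k, :in_dim, :in_dim]
        Syx = covs[k, in_dim:, :in_dim]
        d = x - mx
        h[k] = priors[k] * np.exp(-0.5 * d @ np.linalg.solve(Sxx, d)) \
               / np.sqrt(np.linalg.det(2.0 * np.pi * Sxx))
        y[k] = my + Syx @ np.linalg.solve(Sxx, d)  # component conditional mean
    h = h / h.sum()
    return h @ y  # responsibility-weighted velocity command

At each control step, the belief features x could be, for example, the weighted particle mean and covariance; the GMM parameters (priors, means, covs) would be fit offline to the human demonstrations.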

Related concepts (41)
Markov decision process
In mathematics, a Markov decision process (MDP) is a discrete-time stochastic control process. It provides a mathematical framework for modeling decision making in situations where outcomes are partly random and partly under the control of a decision maker. MDPs are useful for studying optimization problems solved via dynamic programming. MDPs were known at least as early as the 1950s; a core body of research on Markov decision processes resulted from Ronald Howard's 1960 book, Dynamic Programming and Markov Processes.
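As a concrete illustration of the dynamic-programming solution this paragraph alludes to, the sketch below runs value iteration on a made-up two-state, two-action MDP; the transition tensor P[s, a, s'] and reward table R[s, a] are invented for the example.

import numpy as np

def value_iteration(P, R, gamma=0.95, tol=1e-8):
    # Iterate the Bellman optimality backup until the values converge,
    # then return the optimal values and the greedy policy.
    V = np.zeros(P.shape[0])
    while True:
        # Q[s, a] = R[s, a] + gamma * sum_s' P[s, a, s'] * V[s']
        Q = R + gamma * P @ V
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=1)
        V = V_new

P = np.array([[[0.9, 0.1], [0.2, 0.8]],   # transitions from state 0
              [[0.0, 1.0], [0.5, 0.5]]])  # transitions from state 1
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
V, policy = value_iteration(P, R)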
Decision-making
In psychology, decision-making (also spelled decision making and decisionmaking) is regarded as the cognitive process resulting in the selection of a belief or a course of action among several possible alternative options. It could be either rational or irrational. The decision-making process is a reasoning process based on assumptions of values, preferences and beliefs of the decision-maker. Every decision-making process produces a final choice, which may or may not prompt action.
Markov model
In probability theory, a Markov model is a stochastic model used to model pseudo-randomly changing systems. It is assumed that future states depend only on the current state, not on the events that occurred before it (that is, it assumes the Markov property). Generally, this assumption enables reasoning and computation with the model that would otherwise be intractable. For this reason, in the fields of predictive modelling and probabilistic forecasting, it is desirable for a given model to exhibit the Markov property.
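A short simulation makes the Markov property concrete: the next state is drawn from a distribution that depends only on the current state, so a single row of the transition matrix is all that is needed at each step. The two-state weather chain below is an invented example.

import numpy as np

rng = np.random.default_rng(1)
states = ["sunny", "rainy"]
T = np.array([[0.8, 0.2],    # P(next state | current = sunny)
              [0.4, 0.6]])   # P(next state | current = rainy)

s = 0                        # start in "sunny"
trajectory = [states[s]]
for _ in range(10):
    s = rng.choice(2, p=T[s])  # depends only on the current state s
    trajectory.append(states[s])
print(" -> ".join(trajectory))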
Related publications (38)

Quantifying the Unknown: Data-Driven Approaches and Applications in Energy Systems

Paul Scharnhorst

In light of the challenges posed by climate change and the goals of the Paris Agreement, electricity generation is shifting to a more renewable and decentralized pattern, while the operation of systems like buildings is increasingly electrified. This calls ...
EPFL, 2024

Multi-robot task allocation for safe planning against stochastic hazard dynamics

Maryam Kamgarpour, Orcun Karaca, Dániel Tihanyi

We address multi-robot safe mission planning in uncertain dynamic environments. This problem arises in several applications including safety-critical exploration, surveillance, and emergency rescue missions. Computation of a multi-robot optimal control pol ...
2023

Ride-hail vehicle routing (RIVER) as a congestion game

Kenan Zhang

The RIde-hail VEhicle Routing (RIVER) problem describes how drivers in a ride-hail market form a dynamic routing strategy according to the expected reward in each zone of the market. We model this decision-making problem as a Markov decision process (MDP), ...
2023
Related MOOCs (20)
Instructional Design with Orchestration Graphs
Discover a visual language for designing pedagogical scenarios that integrate individual, team, and class-wide activities.
