
# Mechanical intelligence for learning embodied sensor-object relationships

Abstract

Intelligence involves processing sensory experiences into representations useful for prediction. Understanding sensory experiences and building these contextual representations without prior knowledge of sensor models and the environment is a challenging unsupervised learning problem. Current machine learning methods process new sensory data using prior knowledge defined by either domain knowledge or datasets. When datasets are not available, data acquisition is needed, though automating exploration in support of learning is still an unsolved problem. Here we develop a method that enables agents to efficiently collect data for learning a predictive sensor model, without requiring domain knowledge, human input, or previously existing data, by using ergodicity to specify the data acquisition process. This approach is based entirely on data-driven sensor characteristics rather than predefined knowledge of the sensor model and its physical characteristics. We learn higher-quality models with lower energy expenditure during exploration for data acquisition than competing approaches, including both random sampling and information maximization. In addition to applications in autonomy, our approach provides a potential model of how animals use their motor control to develop high-quality models of their sensors (sight, sound, touch) before having knowledge of their sensor capabilities or their surrounding environment. Information-based search strategies are relevant for learning the dynamics of interacting agents but usually require predefined data. The authors propose a method for collecting data to learn a predictive sensor model without requiring domain knowledge, human input, or previously existing data.
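The ergodicity-based exploration objective can be illustrated with a minimal sketch: an ergodic metric scores how closely a trajectory's time-averaged statistics match a target spatial distribution, compared coefficient-by-coefficient in a spectral basis. The 1-D function below is a hypothetical illustration under assumed choices (cosine basis, Sobolev-type weights); it is not the authors' implementation.

```python
import numpy as np

def ergodic_metric(traj, target_pdf, n_coeffs=20, L=1.0):
    """Spectral ergodic metric on [0, L]: weighted distance between the
    time-average of a trajectory and a target density, measured through
    cosine-basis coefficients (1-D sketch)."""
    xs = np.linspace(0.0, L, 1001)
    dx = xs[1] - xs[0]
    pdf = target_pdf(xs)
    pdf = pdf / (pdf.sum() * dx)                  # normalize the target density
    traj = np.asarray(traj, dtype=float)
    metric = 0.0
    for k in range(n_coeffs):
        hk = np.sqrt(L) if k == 0 else np.sqrt(L / 2.0)   # basis normalization
        fk = np.cos(np.pi * k * xs / L) / hk
        phi_k = np.sum(pdf * fk) * dx             # target coefficient
        c_k = np.mean(np.cos(np.pi * k * traj / L) / hk)  # trajectory coefficient
        lam_k = (1.0 + k ** 2) ** -1.0            # weight favoring low frequencies
        metric += lam_k * (c_k - phi_k) ** 2
    return metric
```

For a uniform target on [0, 1], a trajectory that sweeps the interval scores much lower (more ergodic) than one parked at a single point, which is the property an ergodic data-acquisition objective exploits.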




Related concepts (25)

Motor control

In neuroscience, motor control is the ability to make dynamic postural adjustments and to direct the body and limbs with the aim of performing a determined movement.

Data

A datum is something that is known and serves as a starting point for reasoning whose object is to determine a solution to a problem related to that datum.

Knowledge

Knowledge is a notion with multiple meanings, used both in everyday language and as an object of in-depth study by cognitive scientists and contemporary philosophers.

Related publications (65)

Artificial intelligence (AI) and machine learning (ML) have become de facto tools in many real-life applications, offering a wide range of benefits for individuals and society. A classic ML model is typically trained on a large-scale static dataset in an offline manner. It therefore cannot quickly capture new knowledge in non-stationary environments, and it struggles to maintain long-term memory of knowledge learned earlier. In practice, many ML systems need to learn new knowledge (new domains, tasks, distributions, and so on) as more data and experience are collected, which is referred to as a lifelong ML paradigm in this thesis. We focus on two fundamental challenges in achieving lifelong learning. The first challenge is to quickly learn new knowledge from a small number of observations; we refer to this as data efficiency. The second challenge is to prevent an ML system from forgetting the old knowledge it has previously learned; we refer to this as knowledge retention. These two capabilities are crucial for applying ML to most practical applications. In this thesis, we study three important applications that face these two challenges: recommendation systems, task-oriented dialog systems, and image classification.
First, we propose two approaches to improve data efficiency for task-oriented dialog systems. The first approach is based on meta-learning and aims to learn a better model parameter initialization from training data, so that the model can quickly reach a good parameter region for new domains or tasks with a small number of labeled examples. The second approach uses semi-supervised self-training to iteratively train a better model on plentiful unlabeled data when only a limited number of labeled examples are available. We empirically demonstrate that both approaches effectively improve data efficiency for learning new knowledge; the self-training method even consistently improves state-of-the-art large-scale pre-trained models.
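The self-training loop described above can be sketched as follows. The nearest-centroid model, the softmax-over-negative-distance confidence score, the threshold, and all names here are illustrative stand-ins, not the thesis's actual implementation.

```python
import numpy as np

def centroids(X, y, n_classes):
    """Per-class mean of the feature vectors."""
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

def predict(cents, X):
    """Nearest-centroid prediction; a softmax over negative distances
    serves as an (illustrative) confidence estimate."""
    d = np.linalg.norm(X[:, None, :] - cents[None, :, :], axis=-1)
    probs = np.exp(-d) / np.exp(-d).sum(axis=1, keepdims=True)
    return probs.argmax(axis=1), probs.max(axis=1)

def self_train(X_lab, y_lab, X_unlab, n_classes=2, rounds=3, conf=0.9):
    """Iteratively pseudo-label confident unlabeled points and retrain."""
    X, y = X_lab.copy(), y_lab.copy()
    for _ in range(rounds):
        cents = centroids(X, y, n_classes)
        pred, p = predict(cents, X_unlab)
        keep = p >= conf                      # accept only confident pseudo-labels
        if not keep.any():
            break
        X = np.vstack([X, X_unlab[keep]])     # grow the training set
        y = np.concatenate([y, pred[keep]])
        X_unlab = X_unlab[~keep]              # remove absorbed points
    return centroids(X, y, n_classes)
```

With a handful of labeled points per class and many unlabeled points, the loop absorbs confident pseudo-labels round by round, sharpening the class centroids.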
Second, we tackle the knowledge retention challenge to mitigate the detrimental catastrophic forgetting issue when neural networks learn new knowledge sequentially. We formulate and investigate the "continual learning" setting for task-oriented dialog systems and recommendation systems. Through extensive empirical evaluation and analysis, we demonstrate the importance of (1) exemplar replay: storing representative historical data and replaying them to the model while learning new knowledge; (2) dynamic regularization: applying a dynamic regularization term to put flexible constraints on not forgetting previously learned knowledge in each model update cycle.
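Point (1), exemplar replay, can be sketched as a bounded buffer of past examples mixed into each new batch. Reservoir sampling is used here as one common way to keep the stored set representative; it is an illustrative choice, not necessarily the selection strategy used in the thesis.

```python
import random

class ExemplarReplayBuffer:
    """Bounded store of past examples, filled by reservoir sampling so the
    kept exemplars are a uniform sample of everything seen so far."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.n_seen = 0
        self.buffer = []

    def add(self, example):
        self.n_seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(example)
        else:
            # replace a random slot with probability capacity / n_seen
            j = random.randrange(self.n_seen)
            if j < self.capacity:
                self.buffer[j] = example

    def replay_batch(self, new_batch, k):
        """Mix up to k stored exemplars into the current task's batch."""
        k = min(k, len(self.buffer))
        return list(new_batch) + random.sample(self.buffer, k)
```

During sequential training, each optimization step would draw its batch from `replay_batch`, so gradients keep touching old-task data while new knowledge is learned.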
Lastly, we make several initial attempts to achieve both data efficiency and knowledge retention in a unified framework. In the recommendation scenario, we propose two approaches that use different non-parametric memory modules to retain long-term knowledge. More importantly, the non-parametric predictions computed on top of these modules help the model learn and memorize new knowledge in a data-efficient manner. Beyond the recommendation scenario, we propose a probabilistic evaluation protocol for the widely studied image classification domain. It is general and versatile enough to simulate a wide range of realistic lifelong learning scenarios that require both knowledge retention and data efficiency, for studying different techniques. Through experiments, we also demonstrate the benefit
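A non-parametric memory module of the kind mentioned here can be illustrated as an append-only store of embeddings with nearest-neighbour prediction: writing is a single append (data-efficient), and stored entries are never overwritten (retention). The class and parameter names below are hypothetical.

```python
import numpy as np

class NonParametricMemory:
    """Illustrative non-parametric memory: store (embedding, label) pairs
    and predict by majority vote over the k nearest stored neighbours."""

    def __init__(self, k=3):
        self.k = k
        self.keys = []
        self.labels = []

    def write(self, emb, label):
        self.keys.append(np.asarray(emb, dtype=float))
        self.labels.append(label)

    def predict(self, emb):
        keys = np.stack(self.keys)
        d = np.linalg.norm(keys - np.asarray(emb, dtype=float), axis=1)
        nearest = np.argsort(d)[: self.k]          # indices of closest memories
        votes = [self.labels[i] for i in nearest]
        return max(set(votes), key=votes.count)    # majority label
```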

Machine Learning is a modern and actively developing field of computer science, devoted to extracting and estimating dependencies from empirical data. It combines fields such as statistics, optimization theory and artificial intelligence. In practical tasks, the general aim of Machine Learning is to construct algorithms able to generalize and predict in previously unseen situations based on some set of examples. Given some finite information, Machine Learning provides ways to extract knowledge from data and to describe, explain and predict with it. Kernel methods are one of the most successful branches of Machine Learning. They allow linear algorithms with well-founded properties, such as generalization ability, to be applied to non-linear real-life problems. The Support Vector Machine is a well-known example of a kernel method that has found a wide range of applications in data analysis. In many practical applications, additional prior knowledge is often available. This can be knowledge about the data domain, invariant transformations, inner geometrical structures in the data, properties of the underlying process, and so on. Used smartly, this information can significantly improve any data processing algorithm. Thus, it is important to develop methods for incorporating prior knowledge into data-dependent models. The main objective of this thesis is to investigate approaches to learning with kernel methods using prior knowledge; invariant learning with kernel methods is considered in more detail. In the first part of the thesis, kernels are developed which incorporate prior knowledge about invariant transformations. They apply when the desired transformations produce an object around every example, assuming that all points in the given object share the same class. Different types of objects, including hard geometrical objects and distributions, are considered. These kernels were then applied to image classification with Support Vector Machines.
Next, algorithms which specifically include prior knowledge are considered. An algorithm which linearly classifies distributions by their domain was developed. It is constructed so that kernels can be applied to solve non-linear tasks; it thus combines the discriminative power of Support Vector Machines with the well-developed framework of generative models, and it can be applied to a number of real-life tasks in which data are represented as distributions. In the last part of the thesis, the use of unlabelled data as a source of prior knowledge is considered. The technique of modelling the unlabelled data with a graph is taken as a baseline from semi-supervised manifold learning. For classification problems, we use this approach to build graph models of invariant manifolds. For regression problems, we use unlabelled data to take into account the inner geometry of the input space. To conclude, in this thesis we developed a number of approaches for incorporating prior knowledge into kernel methods. We proposed invariant kernels for existing algorithms, developed new algorithms, and adapted a technique from semi-supervised learning for invariant learning. In all these cases, links with related state-of-the-art approaches were investigated. Several illustrative experiments were carried out on real data: optical character recognition, face image classification, brain-computer interfaces, and a number of benchmark and synthetic datasets.
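The invariant kernels discussed in the first part can be illustrated by one standard construction: averaging a base kernel over a set of transformations applied to both arguments. This equals the inner product of transformation-averaged feature maps, so it remains a valid positive semi-definite kernel. The RBF base kernel and the sign-flip transformation group below are illustrative assumptions, not the specific kernels developed in the thesis.

```python
import numpy as np

def rbf(x, y, gamma=1.0):
    """Gaussian (RBF) base kernel."""
    return np.exp(-gamma * np.sum((np.asarray(x) - np.asarray(y)) ** 2))

def invariant_kernel(x, y, transforms, gamma=1.0):
    """Average the base kernel over all pairs of transformations applied
    to the two arguments; this is the inner product of the averaged
    feature maps, hence still a PSD kernel."""
    return float(np.mean([rbf(t1(x), t2(y), gamma)
                          for t1 in transforms for t2 in transforms]))
```

For example, averaging over the group {identity, sign flip} makes the kernel exactly invariant to negating either argument, so an SVM built on it treats x and -x identically.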
