Adapting statistical learning models online to large-scale streaming data is a challenging problem. Bayesian non-parametric mixture models offer flexibility in model selection; however, their widespread use is limited by the computational overhead of existing sampling-based and variational inference techniques. This paper analyses the online inference problem in Bayesian non-parametric mixture models under small variance asymptotics for large-scale applications. Directly applying the small variance asymptotic limit with isotropic Gaussians does not encode the important coordination patterns or variance in the data. We apply the limit to discard only the redundant dimensions in a non-parametric manner and project each new datapoint onto a latent subspace by online inference in a Dirichlet process mixture of probabilistic principal component analyzers (DP-MPPCA). We demonstrate the approach by teaching a new skill to the Baxter robot online through teleoperation, where the number of clusters and the subspace dimension of each cluster are incrementally adapted with the streaming data to efficiently encode the acquired skill.
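
For readers unfamiliar with the small variance asymptotic limit, the sketch below illustrates the isotropic-Gaussian baseline that the abstract contrasts against, not the DP-MPPCA method itself. In that limit, inference in a Dirichlet process mixture of isotropic Gaussians reduces to a hard-assignment rule: a point joins its nearest cluster unless the squared distance exceeds a penalty, in which case it starts a new cluster. The class name `OnlineDPMeans`, the `penalty` parameter, and the one-pass running-mean update are assumptions made for illustration; they are not taken from the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class OnlineDPMeans:
    """Hypothetical one-pass clustering with a distance penalty.

    Illustrates the isotropic small-variance limit only; it ignores the
    per-cluster subspace structure that DP-MPPCA is meant to capture.
    """

    def __init__(self, penalty):
        self.penalty = penalty  # squared-distance threshold for a new cluster
        self.means = []         # one mean vector per cluster
        self.counts = []        # number of points assigned to each cluster

    def update(self, x):
        """Assign one streaming point and return its cluster index."""
        x = [float(v) for v in x]
        if self.means:
            dists = [sq_dist(x, m) for m in self.means]
            k = min(range(len(dists)), key=dists.__getitem__)
            if dists[k] <= self.penalty:
                # Incremental running-mean update of the nearest cluster.
                self.counts[k] += 1
                n = self.counts[k]
                self.means[k] = [m + (v - m) / n
                                 for m, v in zip(self.means[k], x)]
                return k
        # No cluster is close enough: open a new one at this point.
        self.means.append(x)
        self.counts.append(1)
        return len(self.means) - 1


model = OnlineDPMeans(penalty=1.0)
stream = [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.0, 5.2)]
labels = [model.update(x) for x in stream]
print(labels)  # [0, 0, 1, 1]
```

Because every cluster here is a sphere described only by its mean, correlations between dimensions are lost, which is the limitation the abstract attributes to the direct isotropic limit.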