In this paper, we develop a stochastic-gradient learning algorithm for situations involving streaming data that arise from an underlying clustered structure. In such settings, the variance of the gradient noise decomposes into the in-cluster variance σ²_in plus the between-cluster variance σ²_bet. We develop a cluster-based online variance-reduced method (COVER) that eliminates σ²_bet and improves the mean-square deviation (MSD) performance of stochastic-gradient descent (SGD) to the order of O(σ²_in). We establish the convergence of COVER and derive a tight closed-form MSD expression. Our simulations illustrate the improved steady-state performance of COVER.
Volkan Cevher, Efstratios Panteleimon Skoulakis, Luca Viano
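The variance decomposition above can be illustrated with a minimal sketch. The following toy simulation is not the authors' COVER recursion; it is a simplified stand-in that uses the same idea: when cluster labels are observed, re-centering each streaming sample at a running global mean (via running per-cluster means) removes the between-cluster component of the gradient noise, so the steady-state MSD scales with σ²_in rather than σ²_in + σ²_bet. The quadratic cost, cluster means, and step size are all illustrative choices.

```python
import random

random.seed(0)

# Illustrative streaming setup: samples arrive from K clusters whose means
# differ (between-cluster variance) on top of in-cluster noise.
K = 3
cluster_means = [-2.0, 0.5, 3.0]
sigma_in = 0.1
mu = 0.05          # step size
T = 20000
w_star = sum(cluster_means) / K   # minimizer of E[(w - x)^2 / 2]

w_sgd = 0.0        # plain SGD iterate
w_cov = 0.0        # cluster-corrected iterate (simplified COVER-style)
counts = [0] * K
xbar = [0.0] * K   # running per-cluster sample means
acc_sgd = acc_cov = 0.0
n_avg = 0

for t in range(T):
    c = random.randrange(K)                        # observed cluster label
    x = cluster_means[c] + random.gauss(0.0, sigma_in)

    counts[c] += 1
    xbar[c] += (x - xbar[c]) / counts[c]           # update per-cluster mean
    gbar = sum(xbar) / K                           # running global mean

    # Plain SGD: gradient noise carries both variance components.
    w_sgd -= mu * (w_sgd - x)
    # Corrected step: re-center the sample at the global mean so the
    # between-cluster spread cancels; only in-cluster noise remains.
    w_cov -= mu * (w_cov - (x - xbar[c] + gbar))

    if t >= T // 2:                                # steady-state MSD estimate
        acc_sgd += (w_sgd - w_star) ** 2
        acc_cov += (w_cov - w_star) ** 2
        n_avg += 1

msd_sgd = acc_sgd / n_avg
msd_cov = acc_cov / n_avg
print(f"steady-state MSD  plain SGD: {msd_sgd:.4f}  corrected: {msd_cov:.6f}")
```

Averaging the squared deviation over the second half of the run gives a stable steady-state MSD estimate; the corrected iterate's MSD should be orders of magnitude smaller, matching the O(σ²_in) scaling claimed above.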