We prove that every online learnable class of functions of Littlestone dimension d admits a learning algorithm with finite information complexity. To this end, we use the notion of a globally stable algorithm. Generally, the information complexity of ...
How does the geometric representation of a dataset change after the application of each randomly initialized layer of a neural network? The celebrated Johnson-Lindenstrauss lemma answers this question for linear fully-connected neural networks (FNNs), stat ...
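The Johnson-Lindenstrauss lemma invoked above says that a random linear map into a much lower dimension approximately preserves pairwise distances. A minimal sketch of such a random Gaussian projection (the function names and parameters here are illustrative, not taken from the paper):

```python
import math
import random

def random_projection(points, k, seed=0):
    """Project points from R^n into R^k with a random Gaussian matrix,
    in the style of the Johnson-Lindenstrauss lemma."""
    rng = random.Random(seed)
    n = len(points[0])
    # Entries ~ N(0, 1/k), so squared norms are preserved in expectation.
    A = [[rng.gauss(0, 1 / math.sqrt(k)) for _ in range(n)] for _ in range(k)]
    # Apply the linear map A to each point (one dot product per row of A).
    return [[sum(a * x_j for a, x_j in zip(row, x)) for row in A] for x in points]

def dist(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

With a target dimension k on the order of log(m)/eps^2 for m points, all pairwise distances are preserved up to a factor 1 ± eps with high probability; the abstract asks how this picture changes once nonlinear layers are interleaved.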
This work provides an additional step in the theoretical understanding of neural networks. We consider neural networks with one hidden layer and show that when learning symmetric functions, one can choose initial conditions so that standard SGD training ef ...
This paper considers "delta-almost Reed-Muller codes", i.e., linear codes spanned by evaluations of all but a delta fraction of monomials of degree at most d. It is shown that for any delta > 0 and any epsilon > 0, there exists a family of delta-almost Ree ...
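The codes above are variants of classical Reed-Muller codes, whose generator matrix consists of evaluations of all monomials of degree at most d over F_2^m; a delta-almost Reed-Muller code would simply omit a delta fraction of these monomial rows. A minimal sketch of the standard construction (function name is illustrative):

```python
from itertools import combinations, product

def reed_muller_generator(d, m):
    """Generator matrix of the Reed-Muller code RM(d, m) over F_2.

    Rows are evaluations of the multilinear monomials of degree <= d
    at all 2^m points of F_2^m.  An "almost Reed-Muller" code would be
    spanned by all but a delta fraction of these rows.
    """
    points = list(product([0, 1], repeat=m))
    rows = []
    for deg in range(d + 1):
        for vars_ in combinations(range(m), deg):
            # Monomial prod_{v in vars_} x_v, evaluated at every point.
            rows.append([int(all(p[v] for v in vars_)) for p in points])
    return rows
```

For example, RM(1, 3) has length 2^3 = 8 and dimension 1 + 3 = 4 (the all-ones row plus one row per coordinate).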