Stochastic gradient descent (SGD) and randomized coordinate descent (RCD) are two of the workhorses for training modern automated decision systems. Intriguingly, convergence properties of these methods are not well-established as we move away from the spec ...
EPFL2021