Martin Jaggi, Thijs Vogels
We study lossy gradient compression methods to alleviate the communication bottleneck in data-parallel distributed optimization. Despite the significant attention received, current compression schemes either do not scale well or fail to achieve the target ...
NEURAL INFORMATION PROCESSING SYSTEMS (NIPS) 2019