Faster Distributed Machine Learning for Free
Xiaorui Liu
Artificial Intelligence, Machine Learning, Data Science
Parallelizing data analysis and machine learning tasks across distributed computing systems is a promising way to harness the collective power of many devices: ideally, a task should finish ten times faster when using ten computers. However, our research reveals that the synchronization cost of coordinating many devices is the major bottleneck to achieving such speedup, because synchronization requires communication between devices and can be notably slow. To overcome this challenge, we developed two generally useful strategies, communication compression and decentralization, which together yield a highly efficient distributed learning algorithm. Communication compression shrinks the information exchanged during transmission, cutting more than 95% of the communicated bits (for example, from 1 GB to roughly 50 MB). Decentralization equips learning algorithms with flexible communication topologies, which further reduces the synchronization cost. Importantly, these solutions bring remarkable speedup and make machine learning at scale practical without hurting the effectiveness of the learning tasks. The proposed algorithm achieves state-of-the-art performance both theoretically and empirically compared with existing algorithms in the literature. The paper has been published in a top-tier machine learning conference (ICLR 2021), and its impact is covered by a recent newsletter from ICER at MSU.
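The sketch below is a minimal illustration of the two ideas, not the published algorithm itself: top-k sparsification as one common form of communication compression, and a single gossip-averaging step over a ring topology as a simple instance of decentralized, neighbor-only synchronization. All function names and parameters are assumptions made for illustration.

```python
import numpy as np

def compress_top_k(grad: np.ndarray, ratio: float = 0.05):
    """Keep only the top `ratio` fraction of entries by magnitude; a worker
    transmits (indices, values) instead of the full dense gradient."""
    flat = grad.ravel()
    k = max(1, int(ratio * flat.size))
    idx = np.argpartition(np.abs(flat), -k)[-k:]  # indices of k largest entries
    return idx, flat[idx]

def decompress_top_k(idx, values, shape):
    """Rebuild a dense gradient from the transmitted (index, value) pairs."""
    flat = np.zeros(int(np.prod(shape)))
    flat[idx] = values
    return flat.reshape(shape)

def ring_mixing_matrix(n: int) -> np.ndarray:
    """Doubly stochastic mixing matrix for a ring: each worker averages only
    with its two neighbors rather than synchronizing with every device."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = W[i, (i - 1) % n] = W[i, (i + 1) % n] = 1.0 / 3.0
    return W

# Compression: a 1M-entry gradient reduced to 5% of its values, in the
# spirit of the >95% reduction in communicated bits described above.
grad = np.random.randn(1_000_000)
idx, vals = compress_top_k(grad, ratio=0.05)
restored = decompress_top_k(idx, vals, grad.shape)

# Decentralization: one gossip step moves each worker's model toward the
# network average using neighbor-only communication on the ring.
n_workers, dim = 8, 10
models = np.random.randn(n_workers, dim)  # one model vector per worker
models = ring_mixing_matrix(n_workers) @ models
```

In practice, compression reduces how many bits each synchronization sends, while decentralization reduces how many devices each worker must synchronize with; the two can be combined, which is the setting studied in the paper.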