Neural Networks (NNs) have grown substantially over the past few years, both in size and in the computation required to train them. In many cases, a network grows so large that it can no longer fit on a single machine. A model-parallel approach, which partitions the network and places its operators on devices in a distributed system, offers a solution to this problem. In this thesis, we motivate the case for device placement in Neural Networks. We propose, analyze, and evaluate m-SCT, a polynomial-time algorithm for this placement problem. Additionally, we formulate an Integer Linear Program (ILP) that models the placement problem and yields an optimal solution in exponential time. We summarize our contributions as follows:
1. We propose a theoretical solution to the memory-constrained placement problem with makespan and approximation-ratio guarantees.
2. We compare m-SCT with other state-of-the-art scheduling algorithms in a simulation environment and show that it consistently performs well on real-world graphs across a variety of network bandwidths and memory constraints.
3. We lay the foundation for the experimental evaluation of the proposed solutions in existing Machine Learning frameworks.
Exploring model parallelism in distributed scheduling of neural network frameworks