Applied Sciences
A Method for Gradient Differentiable Network Architecture Search by Selecting and Clustering Candidate Operations
Ha Yoon Song [1]
[1] Department of Computer Engineering, Hongik University, Seoul 04066, Korea
Keywords: DARTS; DG-DARTS; neural architecture search; operation clustering; vote dispersion problem
DOI: 10.3390/app112311436
Source: DOAJ
Abstract
The current evolution of deep learning requires further optimization in terms of accuracy and time. AutoML, and in particular its neural architecture search (NAS) subfield, is an area that could address these requirements. DARTS is a widely used, gradient-descent-based approach to NAS, but it has several drawbacks. In this study, we attempted to overcome some of these drawbacks by improving accuracy and decreasing search cost. The DARTS algorithm uses a mixed operation that combines all candidate operations in the search space. The architecture parameter of each operation composing the mixed operation is trained by gradient descent, and the operation with the largest architecture parameter is then selected. This selection scheme causes a problem called vote dispersion: similar operations split architecture-parameter weight among themselves during gradient descent, so a group of operations that is collectively most important can lose the selection to a single dissimilar operation, degrading DARTS performance. To cope with this problem, we propose DG-DARTS, a new algorithm based on DARTS that introduces two search stages and applies clustering to the candidate operations. In summary, DG-DARTS achieves an error rate of 2.51% on the CIFAR-10 dataset at a search cost of 0.2 GPU days, since the search space of the second stage is reduced by half. The speed-up factor of DG-DARTS over DARTS is 6.82, which indicates that the search cost of DG-DARTS is only 13% that of DARTS.
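The vote-dispersion effect and the cluster-based remedy can be illustrated numerically. The sketch below, in plain NumPy, uses operation names from the standard DARTS search space, but the architecture-parameter values and the cluster assignment are invented for illustration; the cluster-aware selection at the end is only a sketch in the spirit of DG-DARTS, not the paper's exact two-stage procedure.

```python
import numpy as np

# Hypothetical architecture parameters (alphas) for one edge after search.
# The operation names follow the standard DARTS search space; the alpha
# values are made up purely to illustrate vote dispersion.
ops = ["sep_conv_3x3", "sep_conv_5x5", "dil_conv_3x3", "skip_connect", "max_pool_3x3"]
alpha = np.array([1.10, 1.05, 1.00, 1.30, 0.20])

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Mixing weights of the mixed operation: each candidate operation's output
# is weighted by the softmax of its architecture parameter.
w = softmax(alpha)

# Standard DARTS selection: keep the single operation with the largest weight.
print(ops[int(w.argmax())])  # -> skip_connect (w ~ 0.27)

# Vote dispersion: the three convolutions jointly hold most of the weight
# (~0.64), but because they split it among themselves, skip_connect wins
# the per-operation argmax.
clusters = {"conv": [0, 1, 2], "skip": [3], "pool": [4]}  # illustrative grouping
print({name: float(w[idx].sum()) for name, idx in clusters.items()})

# Cluster-aware selection (sketch): pick the strongest cluster first,
# then the strongest operation inside that cluster.
best = max(clusters, key=lambda c: w[clusters[c]].sum())
print(ops[clusters[best][int(w[clusters[best]].argmax())]])  # -> sep_conv_3x3
```

Summing the softmax weights within each cluster restores the collective vote of similar operations, which is the intuition behind using operation clustering to counter the dispersion described above.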
License: Unknown