Optimizing Training Resources Used in Neural Architecture Search- A Balancing Act Between Performance and Training Units.
Authors
Armah, Lovis
Published Date
2024-05
Type
Thesis or Dissertation
Abstract
Recent works have introduced novel training methods in neural architecture search (NAS) aimed at reducing computational costs. So, Le, and Liang (ICML 2019) introduced the Progressive Dynamic Hurdles evolution (PDH-E) method for training the Evolved Transformer on the computationally expensive WMT 2014 English-German translation task, reporting significant reductions in the number of models evaluated. Other studies have explored partial training, which trains candidate models for fewer training steps and yields notable gains in computational efficiency. However, a comparative analysis of the performance and computational efficiency of these methods against full training approaches, which use the entire dataset to train each candidate model, has yet to be thoroughly explored in the literature. In this study, we employ a micro-genetic algorithm to compare three training approaches: partial training, full training, and PDH-E. We define training units (TU) as the cumulative number of training steps spent evaluating candidate models during NAS to determine the best candidate. Using the MNIST dataset, we demonstrate that PDH-E, compared to full training, can achieve a performance improvement of 0.2%, reaching an accuracy of 98.71% while realizing a 48% reduction in TU. However, we also show that poorly chosen training-step configurations in PDH-E can consume more TUs than full training, with a 3.6% surplus, while achieving an accuracy of 98.74%. Moreover, we illustrate that partial training can achieve an accuracy of 98.05% with a configuration that yields a 96% reduction in TU.
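To make the TU accounting concrete, the sketch below (not taken from the thesis) shows how training units can be tallied for full training, partial training, and a PDH-E-style staged scheme in a toy search loop. The synthetic train_and_evaluate helper, the mean-fitness hurdle rule, and the step schedules are illustrative assumptions, not the thesis's actual configuration.

import random

def train_and_evaluate(candidate, total_steps, budget=1000):
    # Toy stand-in for real training: the synthetic fitness approaches the
    # candidate's latent quality as the cumulative step count nears the budget.
    return candidate["quality"] * min(1.0, total_steps / budget)

def full_training(candidates, full_steps=1000):
    # Full training: every candidate receives the complete step budget.
    tu, scores = 0, {}
    for i, cand in enumerate(candidates):
        scores[i] = train_and_evaluate(cand, full_steps)
        tu += full_steps
    return max(scores, key=scores.get), tu

def partial_training(candidates, partial_steps=200):
    # Partial training: every candidate receives a reduced step budget,
    # trading final accuracy for a large reduction in TU.
    tu, scores = 0, {}
    for i, cand in enumerate(candidates):
        scores[i] = train_and_evaluate(cand, partial_steps)
        tu += partial_steps
    return max(scores, key=scores.get), tu

def pdh_style_training(candidates, stage_steps=(300, 700)):
    # Staged training in the spirit of PDH-E: all candidates get the first
    # block of steps; after each stage, only candidates whose fitness clears
    # the hurdle (here, the survivors' mean) continue training. Survivors
    # resume from their cumulative step count, so only the additional steps
    # of each stage are charged to TU.
    tu = 0
    trained = {i: 0 for i in range(len(candidates))}
    scores = {}
    survivors = set(trained)
    for steps in stage_steps:
        for i in survivors:
            trained[i] += steps
            scores[i] = train_and_evaluate(candidates[i], trained[i])
            tu += steps
        hurdle = sum(scores[i] for i in survivors) / len(survivors)
        survivors = {i for i in survivors if scores[i] >= hurdle}
    return max(survivors, key=scores.get), tu

if __name__ == "__main__":
    random.seed(0)
    population = [{"quality": random.uniform(0.90, 0.99)} for _ in range(20)]
    for name, search in [("full", full_training),
                         ("partial", partial_training),
                         ("pdh-style", pdh_style_training)]:
        best, tu = search(population)
        print(f"{name:10s} best candidate {best:2d}  TU spent {tu}")

For intuition on why the schedule matters: with 20 candidates and a full budget of 1,000 steps per model, full training costs 20,000 TU; a two-stage schedule of 300 + 700 steps in which half the candidates clear the hurdle costs 20 x 300 + 10 x 700 = 13,000 TU (a 35% reduction), whereas a schedule such as 800 + 700 steps with a weak hurdle can exceed the full-training budget, mirroring the surplus case reported in the abstract.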
Description
University of Minnesota M.S. thesis. May 2024. Major: Computer Science. Advisor: Andrew Sutton. 1 computer file (PDF); i, 42 pages.
Suggested citation
Armah, Lovis. (2024). Optimizing Training Resources Used in Neural Architecture Search- A Balancing Act Between Performance and Training Units. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/269178.