Optimizing Training Resources Used in Neural Architecture Search - A Balancing Act Between Performance and Training Units.

Published Date

2024-05

Type

Thesis or Dissertation

Abstract

Recent work has introduced novel training methods for neural architecture search (NAS) aimed at reducing computational cost. So, Liang, and Le (ICML 2019) introduced the PDH Evolution (PDH-E) method for training the Evolved Transformer on the computationally expensive WMT 2014 English-German translation task and reported significant reductions in the number of models that must be evaluated. Other studies have explored partial training, in which candidate models are trained for fewer training steps, yielding notable improvements in computational efficiency. However, a comparative analysis of the performance and computational efficiency of these methods against full training, which trains every candidate model on the entire dataset, has not yet been thoroughly explored in the literature. In this study, we employ a micro-genetic algorithm to compare three training approaches: partial training, full training, and PDH-E. We define training units (TU) as the cumulative number of training steps required to evaluate the NAS candidate models and determine the best one. Using the MNIST dataset, we demonstrate that PDH-E, compared to full training, can achieve a performance improvement of 0.2%, reaching an accuracy of 98.71%, while also realizing a 48% reduction in TU. However, we also show that poorly chosen training-step configurations in PDH-E can consume more TU than necessary, with a 3.6% surplus over full training while achieving an accuracy of 98.74%. Moreover, we illustrate that partial training can achieve an accuracy of 98.05% with a configuration that yields a 96% reduction in TU.
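
To make the TU bookkeeping concrete, the sketch below is a minimal illustration, not code from the thesis: the step budgets, hurdle thresholds, and simulated "accuracy" are all hypothetical. It only shows how cumulative training units accumulate differently under full training, partial training, and a PDH-style hurdle scheme while searching for the best candidate.

# Minimal sketch, not from the thesis: all names, budgets, hurdle thresholds,
# and the simulated "accuracy" below are hypothetical, purely to illustrate
# how training units (TU) accumulate under the three schemes in the abstract.
import random

FULL_STEPS = 1000                  # hypothetical budget for full training
PARTIAL_STEPS = 40                 # hypothetical budget for partial training
HURDLE_STEPS = [100, 400, 1000]    # hypothetical PDH-style step hurdles
HURDLE_THRESHOLDS = [0.90, 0.95]   # accuracy needed to advance past each hurdle


def evaluate_candidate(candidate, steps):
    """Stand-in for training one candidate architecture for `steps` steps
    and returning its validation accuracy (simulated, deterministic per candidate)."""
    rng = random.Random(candidate)
    ceiling = rng.uniform(0.90, 0.99)            # candidate-specific accuracy ceiling
    return ceiling * (1 - 0.5 ** (steps / 200))  # accuracy improves with more steps


def full_training(candidates):
    """Every candidate receives the full step budget."""
    tu, best = 0, (-1.0, None)
    for c in candidates:
        acc = evaluate_candidate(c, FULL_STEPS)
        tu += FULL_STEPS
        best = max(best, (acc, c))
    return best, tu


def partial_training(candidates):
    """Every candidate is trained for only a small, fixed number of steps."""
    tu, best = 0, (-1.0, None)
    for c in candidates:
        acc = evaluate_candidate(c, PARTIAL_STEPS)
        tu += PARTIAL_STEPS
        best = max(best, (acc, c))
    return best, tu


def pdh_style_training(candidates):
    """PDH-style idea: all candidates get a small budget first; only those that
    clear an accuracy hurdle receive additional training steps."""
    tu, best = 0, (-1.0, None)
    survivors = list(candidates)
    prev_steps = 0
    for i, steps in enumerate(HURDLE_STEPS):
        next_round = []
        for c in survivors:
            acc = evaluate_candidate(c, steps)
            tu += steps - prev_steps          # only the additional steps are spent
            best = max(best, (acc, c))
            if i < len(HURDLE_THRESHOLDS) and acc >= HURDLE_THRESHOLDS[i]:
                next_round.append(c)
        survivors = next_round
        prev_steps = steps
    return best, tu


if __name__ == "__main__":
    population = range(20)  # 20 hypothetical candidate architectures
    for name, scheme in [("full", full_training),
                         ("partial", partial_training),
                         ("PDH-style", pdh_style_training)]:
        (acc, cand), tu = scheme(population)
        print(f"{name:10s} best acc={acc:.4f} (candidate {cand}), TU={tu}")

In this toy setup, full training spends FULL_STEPS on every candidate, partial training spends a small fixed budget on each, and the PDH-style scheme spends the full budget only on candidates that keep clearing hurdles; the actual trade-offs reported in the abstract (e.g. the 48% and 96% TU reductions) come from the thesis experiments on MNIST, not from this sketch.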

Description

University of Minnesota M.S. thesis. May 2024. Major: Computer Science. Advisor: Andrew Sutton. 1 computer file (PDF); i, 42 pages.

Suggested citation

Armah, Lovis. (2024). Optimizing Training Resources Used in Neural Architecture Search - A Balancing Act Between Performance and Training Units. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/269178.