Optimizing Training Resources Used in Neural Architecture Search- A Balancing Act Between Performance and Training Units.
Authors
Armah, Lovis
Published Date
2024-05
Type
Thesis or Dissertation
Abstract
Recent works have introduced novel training methods in neural architecture search (NAS) aimed at reducing computational costs. So, Le, and Liang (ICML 2019) introduced the Progressive Dynamic Hurdles evolution (PDH-E) method for training the Evolved Transformer on the computationally expensive WMT 2014 English-German translation task, reporting significant reductions in the number of models evaluated. Other studies have explored partial training, which trains candidate models for fewer training steps and yields notable gains in computational efficiency. However, a comparative analysis of the performance and computational efficiency of these methods against full training approaches, which use the entire dataset to train each candidate model, has yet to be thoroughly explored in the literature. In this study, we employ a micro-genetic algorithm to compare three training approaches: partial training, full training, and PDH-E. We define training units (TU) as the cumulative number of training steps spent evaluating candidate models during NAS to determine the best candidate. Using the MNIST dataset, we demonstrate that PDH-E, compared to full training, can achieve a performance improvement of 0.2%, reaching an accuracy of 98.71% while realizing a 48% reduction in TU. However, we also show that poorly chosen training-step configurations in PDH-E can consume more TUs than full training, with a 3.6% surplus, while achieving an accuracy of 98.74%. Moreover, we illustrate that partial training can achieve an accuracy of 98.05% with a configuration that yields a 96% reduction in TU.
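To make the TU accounting concrete, the sketch below (not taken from the thesis) shows how training units can be tallied for full training, partial training, and a PDH-E-style staged scheme in a toy search loop. The synthetic train_and_evaluate helper, the mean-fitness hurdle rule, and the step schedules are illustrative assumptions, not the thesis's actual configuration.

import random

def train_and_evaluate(candidate, total_steps, budget=1000):
    # Toy stand-in for real training: the synthetic fitness approaches the
    # candidate's latent quality as the cumulative step count nears the budget.
    return candidate["quality"] * min(1.0, total_steps / budget)

def full_training(candidates, full_steps=1000):
    # Full training: every candidate receives the complete step budget.
    tu, scores = 0, {}
    for i, cand in enumerate(candidates):
        scores[i] = train_and_evaluate(cand, full_steps)
        tu += full_steps
    return max(scores, key=scores.get), tu

def partial_training(candidates, partial_steps=200):
    # Partial training: every candidate receives a reduced step budget,
    # trading final accuracy for a large reduction in TU.
    tu, scores = 0, {}
    for i, cand in enumerate(candidates):
        scores[i] = train_and_evaluate(cand, partial_steps)
        tu += partial_steps
    return max(scores, key=scores.get), tu

def pdh_style_training(candidates, stage_steps=(300, 700)):
    # Staged training in the spirit of PDH-E: all candidates get the first
    # block of steps; after each stage, only candidates whose fitness clears
    # the hurdle (here, the survivors' mean) continue training. Survivors
    # resume from their cumulative step count, so only the additional steps
    # of each stage are charged to TU.
    tu = 0
    trained = {i: 0 for i in range(len(candidates))}
    scores = {}
    survivors = set(trained)
    for steps in stage_steps:
        for i in survivors:
            trained[i] += steps
            scores[i] = train_and_evaluate(candidates[i], trained[i])
            tu += steps
        hurdle = sum(scores[i] for i in survivors) / len(survivors)
        survivors = {i for i in survivors if scores[i] >= hurdle}
    return max(survivors, key=scores.get), tu

if __name__ == "__main__":
    random.seed(0)
    population = [{"quality": random.uniform(0.90, 0.99)} for _ in range(20)]
    for name, search in [("full", full_training),
                         ("partial", partial_training),
                         ("pdh-style", pdh_style_training)]:
        best, tu = search(population)
        print(f"{name:10s} best candidate {best:2d}  TU spent {tu}")

For intuition on why the schedule matters: with 20 candidates and a full budget of 1,000 steps per model, full training costs 20,000 TU; a two-stage schedule of 300 + 700 steps in which half the candidates clear the hurdle costs 20 x 300 + 10 x 700 = 13,000 TU (a 35% reduction), whereas a schedule such as 800 + 700 steps with a weak hurdle can exceed the full-training budget, mirroring the surplus case reported in the abstract.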
Description
University of Minnesota M.S. thesis. May 2024. Major: Computer Science. Advisor: Andrew Sutton. 1 computer file (PDF); i, 42 pages.
Suggested citation
Armah, Lovis. (2024). Optimizing Training Resources Used in Neural Architecture Search- A Balancing Act Between Performance and Training Units. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/269178.