Author: Armah, Lovis
Date issued: 2024-05 (record dates: 2025-01-07, 2025-01-07)
URI: https://hdl.handle.net/11299/269178
Description: University of Minnesota M.S. thesis. May 2024. Major: Computer Science. Advisor: Andrew Sutton. 1 computer file (PDF); i, 42 pages.

Abstract: Recent works have introduced novel training methods in neural network architecture search (NAS) aimed at reducing computational costs. So, Le, and Liang (ICML 2019) introduced the Progressive Dynamic Hurdles Evolution (PDH-E) method for training the Evolved Transformer on the computationally expensive WMT 2014 English-German translation task, and reported significant reductions in the number of models evaluated. Other studies have explored partial training, in which candidate models are trained for fewer training steps, yielding notable improvements in computational efficiency. However, the literature has yet to thoroughly compare the performance and computational efficiency of these methods against full training, which uses the entire dataset to train each candidate model. In this study, we employ a micro-genetic algorithm to compare three training approaches: partial training, full training, and PDH-E. We define Training Units (TU) as the cumulative number of training steps required to evaluate the NAS candidate models and determine the best one. Using the MNIST dataset, we demonstrate that PDH-E, compared to full training, can achieve a performance improvement of 0.2% with an accuracy of 98.71%, while also realizing a 48% reduction in TU. However, we also show that poorly chosen training-step configurations can cause PDH-E to use more TUs than necessary, a 3.6% surplus over full training, while achieving an accuracy of 98.74%. Moreover, we illustrate that partial training can achieve an accuracy of 98.05% with a configuration that yields a 96% reduction in TU.

Language: en
Keywords: Artificial neural networks; Evolutionary algorithms; Machine Learning Optimization; Machine Learning Training; Model selection; Neural Network Architecture Search
Title: Optimizing Training Resources Used in Neural Architecture Search: A Balancing Act Between Performance and Training Units
Type: Thesis or Dissertation
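For concreteness, the TU measure defined in the abstract can be written as a simple sum; the notation below (N candidates, s_i steps per candidate) is illustrative shorthand, not the thesis's own symbols:

\[ \mathrm{TU} \;=\; \sum_{i=1}^{N} s_i \]

where N is the number of candidate models evaluated during the search and s_i is the number of training steps spent on candidate i. Under full training, every s_i equals the full step budget S; under partial training, s_i < S for all candidates; under PDH-E, s_i varies per candidate depending on how many hurdles it clears.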
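The comparison in the abstract turns on how each regime spends training steps. The following is a minimal, hypothetical Python sketch, not the thesis's code, of hurdle-style candidate evaluation in the spirit of PDH-E, counting every training step toward TU; the stage sizes, the mean-fitness hurdle rule, and the toy fitness function are all assumptions made for illustration.

import random

STAGES = [200, 400, 800]  # assumed training steps granted at each hurdle stage

def train_and_score(candidate, steps):
    """Stand-in for real training: returns a noisy 'accuracy' whose
    noise shrinks as more training steps are spent."""
    return candidate + random.gauss(0, 0.01) * (1000 / (steps + 1000))

def pdh_style_evaluate(population):
    """Evaluate a population stage by stage, dropping candidates that
    fall below the population's mean fitness (the 'hurdle'), and
    accumulating every training step spent into a TU total."""
    tu = 0
    alive = {i: 0.0 for i in range(len(population))}
    for steps in STAGES:
        for i in alive:
            alive[i] = train_and_score(population[i], steps)
            tu += steps  # every training step spent counts toward TU
        hurdle = sum(alive.values()) / len(alive)
        alive = {i: s for i, s in alive.items() if s >= hurdle}
    best = max(alive, key=alive.get)
    return best, tu

random.seed(0)
pop = [random.random() for _ in range(8)]  # toy 'candidates' = base fitness
best, tu = pdh_style_evaluate(pop)
print(f"best candidate {best}, TU spent = {tu}")

Under these same assumptions, full training would cost 8 x (200 + 400 + 800) = 11,200 TU, since every candidate receives the whole budget, whereas the hurdle schedule above stops spending steps on candidates that fall below the running mean; this is the mechanism behind the TU reductions the abstract reports.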