Title: Classification with Mixture of Experts Models
Author: Mooney, James
Type: Thesis or Dissertation
Issued: December 2022
Available: 2023-02-16
URI: https://hdl.handle.net/11299/252470
Language: en
Description: University of Minnesota M.S. thesis. December 2022. Major: Computer Science. Advisor: Dongyeop Kang. 1 computer file (PDF); vi, 35 pages.

Abstract: Mixture of experts (MoE) layers allow for an increase in model parameters without a corresponding increase in computational cost by utilizing sparse dynamic computation across "expert" modules during both inference and training. In this work we study whether these sparse activations of expert modules are semantically meaningful in classification tasks; in particular, we investigate whether experts develop specializations that reveal semantic relationships among classes. This work replaces the classification head of selected deep networks on classification tasks with an MoE layer. We find that the MoE layers allow experts to specialize in ways that are qualitatively intuitive, and that quantitatively match structural descriptions of the relationships among classes better than the classification heads of the original networks.
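
To make the setup described in the abstract concrete, the sketch below shows one common way to replace a dense classification head with a sparsely routed MoE layer in PyTorch. This is an illustrative assumption, not the thesis's actual implementation: the class name `MoEClassificationHead` and the parameters `num_experts` and `top_k` are placeholders, and the expert and gate architectures here are deliberately minimal.

```python
# Minimal sketch (assumed, not the thesis's code) of a sparse MoE classification head.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MoEClassificationHead(nn.Module):
    """Replaces a dense classification head with sparsely routed expert classifiers."""

    def __init__(self, in_features: int, num_classes: int,
                 num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Gating network: one routing score per expert for each example.
        self.gate = nn.Linear(in_features, num_experts)
        # Each expert is a small classifier over the backbone features.
        self.experts = nn.ModuleList(
            nn.Linear(in_features, num_classes) for _ in range(num_experts)
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, in_features) produced by the backbone network.
        gate_logits = self.gate(features)                       # (batch, num_experts)
        topk_vals, topk_idx = gate_logits.topk(self.top_k, dim=-1)
        topk_weights = F.softmax(topk_vals, dim=-1)             # renormalize over selected experts

        # For clarity this evaluates every expert and then keeps only the top-k
        # outputs per example; an efficient implementation would dispatch each
        # example to its selected experts only, which is where the compute savings come from.
        expert_out = torch.stack([e(features) for e in self.experts], dim=1)  # (batch, E, C)
        batch_idx = torch.arange(features.size(0)).unsqueeze(-1)
        selected = expert_out[batch_idx, topk_idx]               # (batch, top_k, C)
        return (topk_weights.unsqueeze(-1) * selected).sum(dim=1)  # (batch, C) class logits


# Usage example: route 2048-d backbone features to 10 classes.
head = MoEClassificationHead(in_features=2048, num_classes=10)
logits = head(torch.randn(4, 2048))
print(logits.shape)  # torch.Size([4, 10])
```

Under this kind of setup, the per-class routing patterns of the gate (which experts fire for which classes) are what one would inspect to ask whether expert specialization reflects semantic relationships among classes.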