Diagnostics, Cooperation, and Model Selection for Modern Machine Learning
2023-05
Title
Diagnostics, Cooperation, and Model Selection for Modern Machine Learning
Authors
Zhang, Jiawei
Published Date
2023-05
Type
Thesis or Dissertation
Abstract
Rapid developments in data collection, modeling, and computation tools have offered unlimited opportunities for problem-solving through data-driven modeling. However, data can be very complicated, noisy, or manipulated, which brings many difficulties to modeling in machine learning. To address this issue, we develop tools to handle the challenges in three fundamental and interconnected aspects of machine learning: diagnostics, improvement, and model selection.
For “diagnostics,” we focus on classification problems. To the best of our knowledge, no existing method can assess the goodness-of-fit of general classification procedures, including Random Forest, Boosting, and neural networks. Indeed, the lack of a parametric assumption makes it challenging to construct proper tests. To overcome this difficulty, we propose a model-free goodness-of-fit assessment tool called BAGofT based on data-splitting.
For “improvement,” we focus on model training with external assistance. The advancements in data collection methods have brought opportunities for model improvement from an external party with different but related datasets. Nevertheless, communication between different parties can be costly and restricted due to the large data size, bandwidth limitation, and privacy regulations. To facilitate modeling in this scenario, we develop a decentralized learning framework called additive-effect assisted learning.
For “selection,” we focus on model selection problems where the distribution of the training data may differ from the distribution on which the models are to be evaluated. To address this issue, we develop a model selection method named targeted CV, which uses a problem-specific weighting function.
Description
University of Minnesota Ph.D. dissertation. May 2023. Major: Statistics. Advisors: Yuhong Yang, Jie Ding. 1 computer file (PDF); x, 202 pages.
Suggested citation
Zhang, Jiawei. (2023). Diagnostics, Cooperation, and Model Selection for Modern Machine Learning. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/258701.