Robustness and safety of deep learning models.
Persistent link to this item
https://hdl.handle.net/11299/275904
Authors
Liang, Hengyue
Published Date
2025
Abstract
Deep learning (DL) refers to a data-driven machine learning technique in which neural networks with many layers are used to model complex patterns and relationships in data. DL models have revolutionized numerous complex real-world tasks, ranging from image recognition to natural language processing, demonstrating significant performance gains over 'traditional' approaches. Despite this impressive performance, concerns about the robustness and reliability of DL models persist: they are known to be sensitive to adversarial attacks, data distribution shifts, and other perturbations, which can lead to significant performance degradation.
As a result, the adoption of DL models in high-stakes real-world applications remains limited today, and addressing the robustness of DL models is an emerging and critical research area. In this thesis, we present our findings on the robustness of DL models. First, we point out a robustness challenge for DL classifiers: current adversarial robustness evaluation may not be rigorous, and robustness conclusions drawn from such evaluation may not be trustworthy. Based on our analysis, we take the pessimistic view that universal robustness for DL classifiers is too ambitious a goal to achieve. Next, we discuss the robustness challenge in DL-based watermarking. Although existing DL-based watermarking systems have been shown to be robust to traditional digital corruptions (e.g., JPEG compression, additive noise), we show that small but carefully crafted perturbations can easily break them, requiring no knowledge of the watermarking system itself. We also show that incorporating low-frequency components into the image watermark is necessary for watermark robustness. Then, we discuss the idea of selective classification (prediction with a reject option) as a way to accept the imperfection of DL models while making the best use of them. We propose a confidence score based on the raw logit output of DL classifiers and show its superior potential for selective classification, reducing the liability of mistakes made by DL models. Lastly, we discuss future research directions based on our work, including potential ways to make DL classifiers more robust, how to develop more reliable DL-based watermarking systems, and ways to achieve reliable selective classification in practice.
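To illustrate the selective-classification idea, the sketch below thresholds a classifier's maximum raw logit and abstains on low-confidence inputs. This is a minimal sketch of the general technique, not the dissertation's implementation; the random logits standing in for a real model's output and the threshold value are hypothetical.

```python
# Minimal sketch: selective classification via a max-raw-logit
# confidence score. The classifier abstains (returns -1) whenever
# its largest raw (pre-softmax) logit falls below a threshold.
import torch


def selective_predict(logits: torch.Tensor, threshold: float) -> torch.Tensor:
    """Return class predictions, with -1 marking rejected inputs.

    logits: (batch, num_classes) raw classifier output.
    threshold: abstain when the max logit is below this value.
    """
    confidence, prediction = logits.max(dim=1)  # max raw logit per sample
    prediction = prediction.clone()
    prediction[confidence < threshold] = -1     # reject low-confidence inputs
    return prediction


# Usage: random logits stand in for a real classifier's output.
logits = torch.randn(8, 10)                      # batch of 8, 10 classes
print(selective_predict(logits, threshold=1.5))  # -1 entries were rejected
```

In practice the threshold would be calibrated on held-out data to trade off coverage (fraction of inputs accepted) against the error rate on accepted inputs.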
Description
University of Minnesota Ph.D. dissertation. May 2025. Major: Electrical/Computer Engineering. Advisor: Ju Sun. 1 computer file (PDF); xx, 150 pages.
Suggested citation
Liang, Hengyue. (2025). Robustness and safety of deep learning models. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/275904.