Enhancing Summarization and Causal Discovery: Topic Awareness, Normalizing Flows, and Hierarchical Ensembles
2023-06
Type
Thesis or Dissertation
Abstract
This doctoral thesis addresses abstractive summarization and causal discovery in complex systems. I present new methods that show how topic awareness and normalizing flows can improve text summarization, and how hierarchical ensemble techniques can improve causal discovery. The first part of the thesis investigates abstractive summarization. It introduces PA-TAM, a model that uses a hierarchical approach to incorporate topic information at both the document and sentence levels, along with a penalized attention mechanism that reduces textual repetition. Together, these techniques yield coherent and informative summaries. I also propose FlowSUM, a normalizing flows-based variational encoder-decoder framework tailored for Transformer-based summarization models. FlowSUM addresses two challenges, capturing complex semantic structures and avoiding posterior collapse during training, and thereby enriches the latent posterior distribution and improves summary quality. FlowSUM also shows strong potential for transferring knowledge from large language models. The second part of the thesis focuses on causal discovery, with the wafer manufacturing domain as its target. I propose a hierarchical ensemble approach that leverages temporal and domain constraints while handling high-dimensional, mixed, and imbalanced data as well as irregular missing patterns. Simulations and a real-world application to Seagate Technology's wafer manufacturing data substantiate the efficacy of this approach and provide valuable insights for process optimization and real-time root cause tracing.
Description
University of Minnesota Ph.D. dissertation. May 2023. Major: Statistics. Advisor: Xiaotong Shen. 1 computer file (PDF); x, 132 pages.
Suggested citation
Yang, Yu. (2023). Enhancing Summarization and Causal Discovery: Topic Awareness, Normalizing Flows, and Hierarchical Ensembles. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/258692.