Enhancing Summarization and Causal Discovery: Topic Awareness, Normalizing Flows, and Hierarchical Ensembles

2023-06

Authors

Yang, Yu

Type

Thesis or Dissertation

Abstract

This doctoral thesis addresses abstractive summarization and causal discovery in complex systems. I present new methods that tackle open challenges in both areas, showing how topic awareness and normalizing flows can improve text summarization and how hierarchical ensemble techniques can improve causal discovery.

The first part of the thesis investigates abstractive summarization. It introduces PA-TAM, a model that uses a hierarchical approach to incorporate topic information at both the document and sentence levels, together with a penalized attention mechanism that reduces textual repetition. These techniques yield coherent and informative summaries. I also propose FlowSUM, a normalizing flows-based variational encoder-decoder framework tailored for Transformer-based summarization models. FlowSUM addresses the difficulty of capturing complex semantic structures and the problem of posterior collapse during training, enriching the latent posterior distribution and improving summary quality. FlowSUM is also shown to have strong potential for transferring knowledge from large language models.

The second part of the thesis focuses on causal discovery, particularly in the wafer manufacturing domain. I propose a hierarchical ensemble approach that leverages temporal and domain constraints while handling high-dimensional, mixed, and imbalanced data as well as irregular missing patterns. The approach is validated through simulations and a real-world application to Seagate Technology's wafer manufacturing data, providing insights for process optimization and real-time root cause tracing.

Description

University of Minnesota Ph.D. dissertation. May 2023. Major: Statistics. Advisor: Xiaotong Shen. 1 computer file (PDF); x, 132 pages.

Suggested citation

Yang, Yu. (2023). Enhancing Summarization and Causal Discovery: Topic Awareness, Normalizing Flows, and Hierarchical Ensembles. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/258692.
