Incorporating Clinical Domain Knowledge into Causality Deep Learning








Thesis or Dissertation


Applying Deep Learning (DL) models to graphical causal learning has brought outstanding effectiveness and efficiency, yet related applications are still far from widespread practical use in domain sciences such as health informatics. During my Ph.D. program, I have spent years researching the causality hidden in healthcare data, with a strong interest in finding the underlying reason for the "gap" between statistical analytics and advanced Machine Learning (ML). In my research on EHR (Electronic Healthcare Records), I realized that some confounding bias exists inherently in the data and that DL cannot automatically detect or adjust for it. The underlying reason eventually traces back to the geometric meaning of causal graphs: causal graphs can be multi-dimensional because of their relatively independent timelines. Various individual-level features can lead to inter-timeline inconsistency and ultimately result in confounding bias. This bias is first named Causal Representation Bias (CRB) in this thesis. Its existence, however, has rarely been noticed. More importantly, this blind spot has become the fundamental obstacle to cross-application between the two fields. DL models tend to overlook CRBs and thus fall short on model generalizability, while statistical methods lack a global geometric view and are blind to model individualization, which is defined by inter-timeline transformations. This thesis investigates the existence of CRB and its underlying theoretical scheme. It starts with an introduction (Chapter 1), which includes a discussion of the dilemma in causal research, a geometric explanation of CRB, and a proposal of Causal Representation Learning (CRL). Next, Chapter 2 reviews causal learning methods in classical statistics and state-of-the-art ML.
The following three chapters present my related work, ordered by relevance: Chapter 3 aims to experimentally verify the existence of CRB that is "invisible" to DL models; Chapter 4 focuses on recovering causally missed data values while accounting for potentially existing CRBs; Chapter 5 analyzes an instance of CRB from the health informatics perspective. Finally, in Chapter 6, I share my line of thinking in investigating CRB, conclude with the value of this work, and summarize directions for future work.


University of Minnesota Ph.D. dissertation. January 2023. Major: Computer Science. Advisors: Vipin Kumar, Shashi Shekhar. 1 computer file (PDF); viii, 124 pages.

Suggested citation

Li, Jia. (2023). Incorporating Clinical Domain Knowledge into Causality Deep Learning. Retrieved from the University Digital Conservancy,
