Deep learning-based instrumental variable regression for nonlinear causal inference in Transcriptome-Wide Association Studies

Abstract

Instrumental Variable (IV) regression is a foundational approach to causal inference, particularly in contexts where randomized experiments are not feasible. An important and novel application of IV regression is in Transcriptome-Wide Association Studies (TWAS), which identify causal genes (as exposures) for a trait (as an outcome) using genetic variants as IVs. Despite their widespread use, TWAS methodologies are constrained by several limitations: they predominantly rely on linear models or univariable approaches, thereby neglecting nonlinear gene-trait relationships and the confounding or pleiotropic effects of genetic variants. This gap impedes the accurate and robust identification of causal genetic mechanisms, a critical need for advancing the understanding of complex diseases. In this dissertation, we introduce several novel approaches to address these limitations. First, we propose DeLIVR, a novel deep learning (DL) framework tailored for TWAS applications. DeLIVR improves upon existing DL-based IV regression approaches by estimating a different target function and incorporating a hypothesis-testing framework. Second, we introduce MV-DeLIVR, which extends DeLIVR to multivariate settings, accounting for violations of model assumptions due to genetic pleiotropy. Finally, we explore the application of DeLIVR to imputed traits to enhance statistical power for diseases with limited cases, using Alzheimer’s disease (AD) as a motivating example. Late-onset AD cases are underrepresented in biobank datasets because participants are younger than the typical onset age, which presents a significant challenge for TWAS. In response to this challenge, we train DeLIVR on a large biobank study with imputed AD cases and test the gene-trait associations using a smaller dataset with observed AD status. We validate the accuracy and robustness of our proposed methods through comprehensive simulations and real-data applications.
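For context, the linear baseline that the abstract contrasts against can be illustrated with two-stage least squares (2SLS). The sketch below is not DeLIVR and is not taken from the dissertation. The simulated data, the function names `two_stage_least_squares` and `simulate`, and every parameter value are assumptions made purely for illustration. It shows how genetic variants used as instruments can recover a causal effect of expression on a trait when an unobserved confounder biases ordinary regression.

```python
import numpy as np


def two_stage_least_squares(Z, x, y):
    """Linear 2SLS: regress exposure x on instruments Z, then y on fitted x."""
    n = len(x)
    # Stage 1: predict the exposure (e.g. gene expression) from the instruments.
    Z1 = np.column_stack([np.ones(n), Z])
    gamma, *_ = np.linalg.lstsq(Z1, x, rcond=None)
    x_hat = Z1 @ gamma
    # Stage 2: regress the outcome on the predicted exposure.
    X1 = np.column_stack([np.ones(n), x_hat])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef[1]  # slope on the predicted exposure


def simulate(n=20000, p=5, beta=0.5, seed=0):
    """Hypothetical data: genotypes Z, confounded expression x, trait y."""
    rng = np.random.default_rng(seed)
    Z = rng.binomial(2, 0.3, size=(n, p)).astype(float)  # minor-allele counts
    u = rng.normal(size=n)                               # unobserved confounder
    x = Z @ np.full(p, 0.4) + u + rng.normal(size=n)     # exposure
    y = beta * x + u + rng.normal(size=n)                # outcome, linear effect
    return Z, x, y


true_beta = 0.5
Z, x, y = simulate(beta=true_beta)
beta_ols = np.polyfit(x, y, 1)[0]            # naive regression, confounded by u
beta_iv = two_stage_least_squares(Z, x, y)   # instrumented estimate
print(f"true={true_beta:.2f}  OLS={beta_ols:.3f}  2SLS={beta_iv:.3f}")
```

In this simulated setting the trait depends linearly on expression, so 2SLS is adequate. If the second-stage relationship were nonlinear, this linear fit would be misspecified, which is the limitation the abstract describes.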

Description

University of Minnesota Ph.D. dissertation. December 2024. Major: Statistics. Advisors: Wei Pan, Xiaotong Shen. 1 computer file (PDF); xi, 113 pages.

Suggested citation

He, Ruoyu. (2024). Deep learning-based instrumental variable regression for nonlinear causal inference in Transcriptome-Wide Association Studies. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/270561.
