Statistical Methods for Variable Selection in Causal Inference

Published Date

2018-07

Type

Thesis or Dissertation

Abstract

Estimating the causal effect of a binary intervention or action (referred to as a "treatment") on a continuous outcome is often an investigator's primary goal. Randomized trials are ideal for estimating causal effects because randomization eliminates selection bias in treatment assignment. However, randomized trials are not always ethically or practically possible, and observational data must then be used to estimate the causal effect of treatment. Unbiased estimation of causal effects with observational data requires adjustment for confounding variables, that is, variables related to both the outcome and treatment assignment. Adjusting for all measured covariates in a study protects against bias, but including covariates unrelated to the outcome may increase the variability of the estimated causal effect. Standard variable selection techniques aim to maximize the predictive ability of a model for the outcome and are used to decrease the variability of the estimated causal effect, but they ignore covariate associations with treatment and may fail to adjust for important confounders that are only weakly associated with the outcome. We propose two approaches for estimating causal effects that simultaneously consider models for both outcome and treatment assignment. The first approach is a variable selection technique for identifying confounders and predictors of outcome using an adaptive group lasso that simultaneously performs coefficient selection, regularization, and estimation across the treatment and outcome models. In the second approach, two methods are proposed that simultaneously model outcome and treatment assignment using a Bayesian formulation with spike and slab priors on each covariate coefficient; the Spike and Slab Causal Estimator (SSCE) aims to achieve minimum bias of the causal effect estimator, while Bilevel SSCE (BSSCE) aims to minimize its mean squared error.
We also propose TEHTrees, a new method that combines matching and conditional inference trees to characterize treatment effect heterogeneity. One of its main virtues is that, by employing formal hypothesis testing procedures in constructing the tree, TEHTrees preserves the Type I error rate.
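
The abstract's point that a confounder only weakly associated with the outcome still matters can be illustrated with a small simulation. The sketch below is not one of the dissertation's methods (it implements neither the adaptive group lasso, SSCE, BSSCE, nor TEHTrees); it only contrasts an unadjusted difference in means with an ordinary least squares estimate that adjusts for the confounder. The sample size, the coefficients (2.0 on treatment assignment, 0.3 on outcome), the true effect of 1.0, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def simulate(n=20000, seed=0):
    """Simulate observational data with one confounder and a true effect of 1.0.

    All coefficients here are illustrative assumptions, not values from the thesis.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                      # confounder
    p_treat = 1.0 / (1.0 + np.exp(-2.0 * x))    # strong link to treatment assignment
    t = rng.binomial(1, p_treat)                # binary treatment indicator
    y = 1.0 * t + 0.3 * x + rng.normal(size=n)  # weak link to the continuous outcome
    return y, t, x


def naive_estimate(y, t):
    """Unadjusted difference in mean outcome between treated and untreated."""
    return y[t == 1].mean() - y[t == 0].mean()


def adjusted_estimate(y, t, x):
    """Treatment coefficient from a least squares fit of y on (1, t, x)."""
    design = np.column_stack([np.ones_like(y), t, x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


y, t, x = simulate()
print("naive estimate:   ", naive_estimate(y, t))
print("adjusted estimate:", adjusted_estimate(y, t, x))
```

Under these assumed settings the unadjusted estimate is pulled away from the true effect because treated units tend to have larger values of the confounder, whereas the adjusted estimate is not. A selection procedure that looks only at the outcome model could drop such a covariate because its outcome coefficient is small, which is the failure mode the abstract describes.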

Description

University of Minnesota Ph.D. dissertation. July 2018. Major: Biostatistics. Advisors: Julian Wolfson, David Vock. 1 computer file (PDF); x, 102 pages.

Suggested citation

Koch, Brandon Lee. (2018). Statistical Methods for Variable Selection in Causal Inference. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/200318.
