University Digital Conservancy :: Browsing by Subject "causal inference"

Browsing by Subject "causal inference"


Fairness Estimation For Small And Intersecting Subgroups In Clinical Applications
(2024-03) Wastvedt, Solvejg
Along with the increasing availability of health data has come the rise of data-driven models to inform decision-making and policy. These models have the potential to benefit both patients and health care providers but can also exacerbate health inequities. Existing "algorithmic fairness" methods for measuring and correcting model bias fall short of what is needed for health policy in several ways that we address in this dissertation. First, in clinical applications, risk prediction is typically used to guide treatment, creating distinct statistical issues that invalidate most existing techniques. Second, methods typically focus on a single grouping along which discrimination may occur rather than considering multiple, intersecting groups. Third, most existing techniques are only usable for relatively large subgroups. Finally, most existing algorithmic fairness methods require complete data on the grouping variables, such as race or gender, along which fairness is to be assessed. However, in many clinical settings, this information is missing or unreliable. In this dissertation, we address each of these challenges and propose methods that expand the possibilities for algorithmic fairness work in clinical settings.
Impacts of Recreational Cannabis Legalization: Substance Use Development, Pre-Existing Vulnerability, and Psychosocial Outcomes
(2022-05) Zellers, Stephanie
Alcohol, tobacco, and cannabis are three of the most commonly consumed substances in the United States. The development of substance use is influenced by genes, the familial environment, and unique environmental exposures. One such exposure is legal policy surrounding the purchase and consumption of these substances, and cannabis policies in particular are changing dramatically across the United States, raising concerns about the potential for public health consequences associated with substance use. In the present work, we focus on the development of substance use in a normative community sample, as well as perturbations to substance use development and substance-related outcomes as a consequence of recreational legalization, using causally informative genetic longitudinal designs. Study 1 explores normative developmental trends of cannabis use in recreationally illegal environments and its relationship to the development of alcohol and tobacco consumption from adolescence through mid-adulthood. Study 1 also investigates the genetic and environmental influences underlying all three substances and influences unique to each substance over time. Study 2 evaluates the causal impact of recreational legalization on cannabis frequency with a co-twin control model, as well as the changes to the magnitude of genetic and environmental influences on cannabis use in a longitudinal gene-environment interaction model. Study 3 expands on this to evaluate the impact of recreational legalization on a broad range of psychiatric and psychosocial outcomes associated with cannabis use, and further examines whether vulnerable individuals are at exacerbated risk for negative outcomes due to legalization. Together, these studies use rigorous designs to expand on the existing literature and provide evidence consistent with a causal impact of cannabis legalization on cannabis use. Furthermore, our results suggest that cannabis legalization may perturb normative adult decreases in substance intake, but this is not coupled with negative psychosocial outcomes in adulthood.
Kernel Estimators in Complex Data Analysis
(2021-11) Weng, Guangwei
Kernel estimators, including kernel density estimators and kernel regression estimators, have drawn great research interest in both theoretical studies and applications since their invention, due to their easy interpretation and their flexibility in modeling data with complicated density curves or conditional mean curves. Even in this information age, when the datasets confronting us have become larger and more complicated, which seems to disfavor the use of kernel estimators because of the so-called “curse of dimensionality” phenomenon in nonparametric statistical methods, statisticians continue to propose sophisticated methods based on kernel estimators to handle complex data analysis problems. In this thesis, we discuss two such newly developed methodologies related to applications of kernel estimators. In Chapter 2, we develop a bandwidth (matrix) selector for multivariate kernel density estimators of level sets and highest density regions. We consider a different loss function from the one used in the classical bandwidth selection problem and derive an asymptotic approximation to the corresponding risk function. A multi-stage plug-in bandwidth selection procedure is proposed to estimate the unknown quantity in the risk function and solve for the optimal bandwidth. In Chapter 3, we propose a nonparametric doubly robust test for a continuous treatment effect, which involves applying a nonparametric test based on a local polynomial estimator to pseudo-outcomes constructed for causal effect estimation. We will also see, under this framework, how classical nonparametric statistical methods can work together with modern machine learning models to drive effective and reliable learning from complex real data.
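As a rough illustration of the objects this abstract refers to, the sketch below computes a univariate Gaussian kernel density estimate and a crude level-set estimate on a grid. It is a minimal sketch under stated assumptions, not the thesis's method: the bandwidth is fixed by hand rather than chosen by the multi-stage plug-in selector, the data are one-dimensional rather than multivariate, and the function names `gaussian_kde` and `level_set_on_grid` are hypothetical.

```python
import math


def gaussian_kde(samples, x, bandwidth):
    """Kernel density estimate at x with a Gaussian kernel.

    Computes (1 / (n * h)) * sum_i K((x - x_i) / h), where K is the
    standard normal density and h is the bandwidth.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    n = len(samples)
    if n == 0:
        raise ValueError("samples must be non-empty")
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(
        math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in samples
    )


def level_set_on_grid(samples, grid, bandwidth, level):
    """Grid points whose estimated density is at least `level`.

    Assumption: a plain grid search stands in for a proper level-set
    estimator; the bandwidth is supplied by the caller.
    """
    return [g for g in grid if gaussian_kde(samples, g, bandwidth) >= level]


# Illustrative data only (not from the thesis).
data = [-1.2, -0.4, 0.0, 0.3, 0.9, 1.1]
grid = [i / 10.0 for i in range(-30, 31)]
dense_region = level_set_on_grid(data, grid, bandwidth=0.5, level=0.2)
```

The choice of `bandwidth` controls how smooth the estimate is, which is why the estimated level set can change noticeably with it; selecting it well for level sets and highest density regions is the problem the abstract describes.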
Optimal Treatment Regimes Estimation with Censored Data and Related Topics
(2021-06) Sengupta, Sanhita
The thesis is divided into three sections of interconnected topics. Motivated by applications from precision medicine, we consider the problem of estimating an optimal treatment regime (or individual optimal decision rule) based on right-censored survival data. We consider a non-parametric approach that maximizes the expected mean restricted survival time of the potential outcome distribution. Compared with existing methods, our approach does not need to assume the decision rule belongs to a restricted class (e.g., class of index rules) and can accommodate high-dimensional covariates. We investigate the theory of the estimated optimal treatment regime. Monte Carlo studies and a real data example are used to demonstrate the performance of our proposed method. Random forests are widely used today for various purposes such as regression, classification, and survival analysis; however, their theoretical properties are not yet completely explored. We propose a quantile random forest estimator which uses sub-sampling instead of complete bootstrap samples as in Meinshausen [2006]. We study the pointwise asymptotics of the proposed quantile random forest estimator by rendering it in the framework of U-statistics. We prove pointwise weak convergence to normality and also propose a consistent estimator of the variance. We further explore the asymptotic behavior of the proposed estimator via a simulation study. Measuring the efficacy of a treatment or policy can involve data heterogeneity. In such cases, the entire conditional distributional impact of the treatment is important rather than just a discrete metric such as the average treatment effect. Quantiles inform more about the distribution than an average, and multiple quantiles can be used together to get an idea of the entire distribution. In the context of survival analysis with censored data, we propose a quantile regression model estimated using a survival random forest. We further extend this to estimate quantile treatment effects under censoring. We show the efficacy of the proposed method via simulations. We also demonstrate the use of this method and the interpretation of quantile effects by analysing a colon cancer dataset.
Statistical Methods for Organ Transplant
(2021-07) McKearnan, Shannon
In this dissertation, we propose novel statistical methods to improve clinical decision support for organ transplant donors and recipients, using data from the United Network for Organ Sharing national registry. In our first project, we develop a feature selection method for support vector regression in order to benefit from the method’s flexibility while combating overfitting. Support vector regression is advantageous due to its use of a kernel for flexibility and computational efficiency; penalized methods for feature selection limit the choice in kernel to finite dimensional transformations and are thus insufficient. We propose a novel feature selection method for support vector regression based on a genetic algorithm that iteratively searches across potential subsets of covariates to find those that yield the best performance according to a user-defined fitness function. We apply our method to predict donor kidney function one year after transplant. In our second project, we develop an estimator for marginal survival under a dynamic treatment regime for organ transplant, where treatment is defined as the patient’s decision to accept or decline an organ when it is offered to them. We apply our method to kidney transplant patients to recommend thresholds of the quality of organ for acceptance. In our third project, we again utilize the genetic algorithm’s flexible optimization, this time to identify optimal treatment regimes. We define the treatment regime as a decision list in order to develop our method. We apply our method to identify treatment regimes for liver transplant patients who may wish to undergo a simultaneous kidney transplant. Overall, we develop novel methods in diverse fields of statistics tailored for the organ transplantation context, and we demonstrate their performance and meaningful clinical implications via simulations and real data examples.
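The genetic-algorithm search over covariate subsets described in the first project can be sketched roughly as follows. This is a generic illustration under stated assumptions, not the dissertation's algorithm: the selection, crossover, and mutation choices, the defaults, and the names `ga_feature_selection` and `toy_fitness` are all hypothetical, and the toy fitness function stands in for a real performance measure such as the validation error of a support vector regression fit on the selected covariates.

```python
import random


def ga_feature_selection(n_features, fitness, pop_size=20, generations=30,
                         mutation_rate=0.1, seed=0):
    """Search over 0/1 feature masks with a simple genetic algorithm.

    `fitness` is a user-defined function mapping a mask (tuple of 0/1)
    to a score, higher being better. Assumes n_features >= 2 and
    pop_size >= 2. Returns the best mask found and the best score seen
    at each generation.
    """
    rng = random.Random(seed)
    population = [tuple(rng.randint(0, 1) for _ in range(n_features))
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        parents = ranked[: max(2, pop_size // 2)]  # keep the fitter half
        children = [ranked[0]]  # elitism: carry the current best forward
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n_features - 1)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = tuple(bit ^ 1 if rng.random() < mutation_rate else bit
                          for bit in child)
            children.append(child)
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


def toy_fitness(mask):
    """Hypothetical score: rewards features 0 and 2, penalizes 1 and 3."""
    return mask[0] + mask[2] - 0.5 * (mask[1] + mask[3])


best_mask, best_scores = ga_feature_selection(4, toy_fitness)
```

Because the search only needs a score for each candidate subset, the fitness function can wrap any model, including one with an infinite-dimensional kernel, which is the flexibility the abstract contrasts with penalized feature selection.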
