Browsing by Subject "Rare variants"
Item Bayesian Hierarchical Models For Multi-Variant and Multi-Trait Genome-Wide Association Studies (2020-06) Yang, Yi
While genome-wide association studies (GWASs) have been widely used to identify associations between complex diseases and genetic variants, standard single-variant and single-trait analyses often have limited power when applied to scenarios in which variants are in linkage disequilibrium, occur at low frequency, or are associated with multiple correlated traits. In this dissertation, we propose three Bayesian hierarchical models for multi-variant and multi-trait GWASs based on the hierarchically structured variable selection (HSVS) framework: the generalized fused HSVS (HSVS-GF), the adaptive HSVS (HSVS-A), and the multivariate HSVS (HSVS-M). HSVS is a discrete mixture prior composed of a point mass at zero and a multivariate scale-mixing normal distribution for modeling the effects of variants. As an extension and development of the HSVS framework, the proposed methods have the flexibility to account for various correlation structures, which allows them to extensively borrow strength from multiple correlated variants and traits. As Bayesian methods, they can also integrate complex genetic information into the priors and thus boost the power by leveraging information from various sources. In addition to testing associations, the proposed methods in the Bayesian framework also produce posterior effect estimates for individual variants simultaneously, a distinctive and useful feature that most of the competing methods do not possess. Specifically, HSVS-GF is a pathway-based method that uses summary statistics and pathway structural information to identify the association of a disease with variants in a pathway. HSVS-A is a set-based method that tests the association of a continuous or dichotomous trait with rare variants in a set and estimates the effects of individual rare variants.
HSVS-M is a multi-variant and multi-trait method that uses summary statistics both to test the association of variants in a gene with multiple correlated traits and to estimate the strength of association of the gene with each trait. Through analysis of simulated data in various scenarios and GWAS data from the Wellcome Trust Case Control Consortium and the Global Lipids Genetics Consortium, we show that the proposed methods can substantially outperform the competing methods and identify novel causal variants.

Item Tests for detection of rare variants and gene-environment interaction in cohort and twin family studies (2016-08) Coombes, Brandon
Complex diseases are caused by a combination of environmental and genetic factors. While we have estimated that genetic factors explain a large proportion of variance in many of these diseases, current strategies using only the genotyped common variants (CVs) have failed to explain all of this heritability. There are many hypotheses for this so-called “missing heritability.” We study two such hypotheses by extending a sequential algorithm that was initially proposed to test for genetic main effects of a candidate gene. We first extend the model selection test using the sequential algorithm to a model-averaging test. We use these tests to study how rare variants (RVs) rather than CVs may explain a larger proportion of the disease risk and apply our methods to a candidate gene study of obesity that has sequenced CVs and RVs. It is also thought that the effect of the variants is moderated by environmental factors. Thus, gene-environment interactions may explain why we are not able to identify genes that cause disease. To improve power to detect gene-environment interactions for variants within a candidate gene in studies with unrelated or related subjects, we extend the sequential algorithm for the model selection test for genetic main effects to instead test for these interactions.
For studies with unrelated subjects, we extend the sequential algorithm to create summary measures for either the genetic main effect or the interaction and show that these tests are often valid under realistic scenarios. We use a combination of the main effect and interaction summary measures to powerfully test for gene-environment interaction in a variety of situations. We apply our method to test whether candidate genes interact with family climate to influence alcohol consumption among a parent population. Lastly, we extend the tests of gene-environment interaction for unrelated subjects to families. We model the family data using a linear mixed model (LMM) framework to account for shared genetic and environmental effects within a family. In order to reduce the number of parameters we need to estimate, we propose using a ridge penalty on the genetic main effect re-expressed as a random effect within the LMM. We also develop a test that is a weighted version of a previous test using the sum of powers of the score vector for interaction, where the weights are chosen with our sequential algorithm. We show that this test can be more powerful than the previous test when there is a mix of positive and negative interaction effects. We apply our test to a twin study to identify significant interactions between the CVs of candidate genes and a set of environmental factors that influence alcohol consumption.
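
The ridge penalty "re-expressed as a random effect" mentioned in the abstract above rests on a standard equivalence, which can be sketched as follows. The notation ($y$, $X$, $G$, $\beta$, $\gamma$, $\sigma^2$, $\tau^2$, $\lambda$) is assumed here for illustration and is not taken from the dissertation. The sketch also treats the variance components as known and omits the family-level random effects for shared genetic and environmental effects that the full LMM would include.

```latex
% Assumed notation (hypothetical, for illustration only):
%   y     = trait vector
%   X     = covariate matrix with fixed effects beta
%   G     = genotype matrix for the variants in the candidate gene
%   gamma = genetic main effects, treated as random
\begin{align*}
y &= X\beta + G\gamma + \varepsilon,
  \qquad \gamma \sim N(0, \tau^2 I),
  \qquad \varepsilon \sim N(0, \sigma^2 I), \\
(\hat\beta, \hat\gamma) &= \arg\min_{\beta,\gamma}\;
  \frac{1}{\sigma^2}\,\lVert y - X\beta - G\gamma \rVert^2
  + \frac{1}{\tau^2}\,\lVert \gamma \rVert^2, \\
\hat\gamma &= \left(G^\top G + \lambda I\right)^{-1} G^\top \left(y - X\hat\beta\right),
  \qquad \lambda = \sigma^2 / \tau^2 .
\end{align*}
```

Under this view, the single variance component $\tau^2$ stands in for one free coefficient per variant, which is one way to read the abstract's statement that the penalty reduces the number of parameters to estimate.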