Baller, Joshua
2017-11-27
2017-11-27
2012-02
https://hdl.handle.net/11299/191399
University of Minnesota Ph.D. dissertation. February 2012. Major: Biomedical Informatics and Computational Biology. Advisors: Chad Myers, Daniel Voytas. 1 computer file (PDF); viii, 136 pages.

Chromatin plays a major role in the regulation and evolution of genomic DNA. The advent of high-throughput sequencing, and the subsequently increasing availability of sequencing data from chromatin immunoprecipitation experiments, is leading to a comprehensive view of the chromatin landscape in key model organisms such as S. cerevisiae. To date, little has been done to exploit the availability of such data. My work develops a logistic regression based framework capable of dissecting the observed distribution of a particular chromosomal modification. This framework models the observed distribution in terms of other known chromosomal features in the organism. I have applied this approach to the distributions of Ty5 and Ty1 retrotransposons, identifying previously unknown integration patterns. For Ty5, I identified integration, independent of the canonical mechanism, at sites of open DNA. For Ty1, I identified precise integration events on a single surface of nucleosomes found near Polymerase III transcribed genes. Additionally, a similar logistic regression approach was developed to predict origins of replication in terms of nucleosome patterning. This resulted in a 200-fold enrichment for origin sites and over 7000-fold enrichment when ORC occupancy data was considered. Together these studies present a general model capable of utilizing the available chromosomal data to provide either mechanistic models or site predictions in a variety of organisms.

en
LASSO; Logistic Regression; Machine Learning; Origin of Replication; Ty1; Ty5
Modeling Distributions of Chromosomal Modifications Using Chromosomal Features
Thesis or Dissertation