Sarangi, Vivekananda
2024-01-19
2024-01-19
2023-10
https://hdl.handle.net/11299/260142
University of Minnesota Ph.D. dissertation. October 2023. Major: Biomedical Informatics and Computational Biology. Advisor: Alexej Abyzov. 1 computer file (PDF); viii, 68 pages.
Mutations acquired in each cell during and after embryogenesis are passed to the descendant cells such that, within the same individual, different populations of somatic cells have slightly different DNA, resulting in genomic mosaicism. These mosaic mutations might give the cells a proliferative advantage and ultimately cause cancer, or they can affect cellular functions without a proliferative effect, as in the case of diverse neurological diseases. This makes the detection of mosaic mutations important for understanding the mechanisms of various diseases. Although whole genome sequencing of bulk tissue has been used for detecting mosaic mutations, it is not sensitive enough to detect mosaic mutations present in less than 2% of the cells. This hurdle has been overcome by single-cell DNA sequencing (scDNA-seq), which has recently emerged as an efficient tool for discovering and analyzing mosaic mutations. However, scDNA-seq has pitfalls and drawbacks that need to be addressed. First, the amount of DNA in a single cell is not sufficient for sequencing and needs to be amplified. The most commonly used amplification method, multiple displacement amplification (MDA), has drawbacks such as non-uniform amplification and a high error rate. Some cells are amplified more uniformly and with fewer errors than others, so it is important to assess the quality of the data prior to mosaic mutation discovery. In Chapter 2, I discuss a method we have developed that ranks amplification quality using low coverage (1X) sequencing data, giving an estimate of the uniformity of amplification that can be used to select single cells for high coverage (>30X) sequencing.
Second, for detection of mosaic mutations in single cells, the most frequently used approach is to compare single-cell genomes to that of a matched reference bulk. While this approach works well for finding mutations private to a cell, it misses mutations that are present at higher frequency and are consequently present in multiple cells in the reference bulk. To address this, in Chapter 3 I describe a bioinformatic tool I have developed to detect somatic mosaicism, including SNVs and INDELs, using pair-wise comparison of single-cell data, and I provide data demonstrating that the method outperforms existing methods.
en
Mosaicism
Single cell amplification
Single cell DNA sequencing
Somatic mutation
Method for comprehensive detection of somatic mosaicism using single cell sequencing data.
Thesis or Dissertation