Browsing by Subject "Dimensionality Reduction"
Item Addressing geometric abnormalities and algorithmic shortcomings in metagenomic analysis. (2024) Hoops, Susan
Our work aims to identify shortcomings in microbiome analysis methods, making maximal use of sequencing data output to accurately represent sample relationships in a microbiome dataset. We introduce LMdist, a dimensionality reduction method for adjusting pairwise distances to more accurately represent distances along a sampling gradient. This method can be applied to a range of microbiome applications, from soil to human gut microbiome studies. Applications beyond the microbiome may benefit as well, given the ubiquity of dimensionality reduction in high-dimensional datasets. We then implement the MAGEnTa pipeline, which tracks engraftment of microbes following fecal microbiome transplants in the human gut. By combining the whole-genome reads produced by shotgun sequencing with Bayesian estimation, we can more accurately estimate strain-level engraftment following microbiome transplants. This pipeline could prove useful in expanding personalized medicine for microbiome therapies, allowing researchers and clinicians to more accurately predict successful pairs of donor and recipient microbiomes prior to transplant. Finally, we apply longitudinal analysis approaches, including LMdist, to a large infant microbiome dataset. The infant microbiome is well known to diversify over the first two years of life, but the influence of intrapartum and pediatric antibiotics on the infant microbiome and growth is still largely unknown. We explore the interactions between microbiome, growth, and antibiotics over time to determine how clinical choices may impact infant development.

Item Data-Driven Exploratory Interfaces for Contextualizing Parameter Spaces: Adding Intuition to Big Data (2021-08) Orban, Daniel
This dissertation investigates how to understand the complexities of large parameter spaces through user interaction.
Both researchers and practitioners have embraced the hope of using Big Data (extremely large, heterogeneous, and unstructured datasets) to solve complex problems using statistical methods like artificial intelligence or machine learning. Unfortunately, for many disciplines (e.g., the scientific fields), these computational algorithms are black boxes that provide answers without the explanations required for developing hypotheses. Interactive visualization offers the promise of adding intuition to Big Data; however, current approaches do not scale to large data sizes, high dimensionality, and ill-defined complexity. To address these challenges, we introduce Data-Driven Exploratory Interfaces (DDEIs), interfaces that are scalable, enable contextual navigation, and use meaningful feature interaction for intuitive exploration. Using DDEIs, we analyze Big Data in the scientific context using three separate applications, each focusing on a different "Big" aspect of Big Data. These include medical device design (memory intensive, spatially complex), shock physics (high-dimensional, many instances), and cell migration (memory intensive, spatially complex, high-dimensional, and many instances). In each case, we first look at traditional methods for visualizing and understanding these large datasets, then we overcome their limitations with key concepts introduced by DDEIs (specifically, enabling users to define the contexts with respect to their questions and then explore the relevant trends in the parameter space). In parallel with studying the "Big" aspects of Big Data (Challenge 1), we study two other challenges relevant to understanding Big Data (Challenge 2 and Challenge 3). The second challenge involves investigating approaches to solving high-dimensional sparsity problems. Here we use prediction and sampling strategies to analyze the gaps in knowledge.
In the third challenge, we consider how to use interaction to navigate and understand ill-defined features in the data. Using DDEIs, we combine natural exploration techniques with data-driven algorithms to understand application-specific features across parameter space sampling properties (input, output, local, global). In this dissertation, we show that DDEIs can overcome the limitations of traditional visualization approaches while creating new intuitive views into otherwise ill-defined and complex phenomena.