Aggarwal, Karan
2023-09-19
2023-09-19
2023-04
https://hdl.handle.net/11299/257035
University of Minnesota Ph.D. dissertation. April 2023. Major: Computer Science. Advisor: Jaideep Srivastava. 1 computer file (PDF); xii, 135 pages.
Time series arise in nearly every natural or man-made phenomenon. Time-series analysis has applications in critical domains like healthcare, meteorology, and finance. Recently, the nature of collected time-series data has shifted considerably with the popularity of cheaper consumer-grade sensors, e.g., smartwatches. This has provided us with a plethora of lower-quality but high-volume data. Modeling time-varying data is challenging owing to its high dimensionality and complex patterns. These challenges are compounded by issues like missing data, which have detrimental effects on downstream tasks like classification. Feature engineering has been an important part of time-series analysis, using features like seasonality or frequency transforms. The complexity of time-series data makes feature engineering challenging, which makes deep learning a promising alternative. Recently, there has been a lot of work on time series using deep learning architectures, which require access to labeled examples. Labeling is an expensive operation, especially in areas requiring specialized knowledge like healthcare. In this thesis, we focus on utilizing the limited labeled data efficiently. We propose solutions that leverage: 1) unlabeled data; 2) data with missing time-series observations; and 3) effective use of scarce labels. We primarily showcase these techniques on applications in sleep science, where data from consumer-grade devices like smartwatches is becoming available. First, we present a method for unsupervised representation learning to create representations for human activity and sleep data. We exploit the context and content, and reduce subject-specific noise using adversarial training. 
These representations can be exploited to boost the performance of supervised learning models in low-labeled-data settings, unlike traditional time-series models. Empirical evaluation demonstrates that our proposed method performs better than many strong baseline methods, and that adversarial learning helps improve the generalizability of our representations. Second, we use conditional random fields (CRFs) with deep neural networks to capture longer-term dependencies in the dynamics of output labels for time series segmentation tasks. This allows us to capture longer-term context while performing the segmentation labeling, allowing for more efficient usage of limited labels. Our method shows significant improvement over the baseline methods. We apply the proposed method to the detection of sleep stages from signals collected by Continuous Positive Airway Pressure (CPAP) machines, at-home therapy devices for sleep apnea. Ours is the first work to detect a patient's sleep stages from CPAP-collected data with reasonable accuracy. Third, we present a novel semi-supervised method for time series data imputation. Missing data is common in time series because of issues like data drops or sensor malfunctions. Imputation methods are used to fill in these values, and the quality of imputation has a significant impact on downstream tasks like classification. Our proposed semi-supervised approach uses unlabeled data as well as the downstream task's labeled data. Our results indicate that the proposed method outperforms existing supervised and unsupervised time series imputation methods, measured both on imputation quality and on the downstream tasks ingesting the imputed time series. Last, we adapt MixUp, a simple data augmentation technique, to time series data. We show that a simple modification in the training process can improve the performance of time series classification methods. 
We perform data augmentation on both the raw time series and the latent space of time series classification models. The improvement in performance is observed consistently in low-labeled-data regimes as well as higher-data regimes.
en
machine learning
sleep apnea
sleep science
sleep staging
time series
wearables
Models for Limited Labeled Time Series Data with Applications in Sleep Science
Thesis or Dissertation