Models for Limited Labeled Time Series Data with Applications in Sleep Science
2023-04
Type
Thesis or Dissertation
Abstract
Time series arise in nearly every natural and man-made phenomenon, and time-series analysis has applications in critical domains like healthcare, meteorology, and finance. Recently, the nature of collected time-series data has shifted with the popularity of cheaper consumer-grade sensors, e.g., smartwatches, which provide a plethora of lower-quality but high-volume data. Modeling time-varying data is challenging owing to its high dimensionality and complex patterns. These challenges are compounded by issues like missing data, which have detrimental effects on downstream tasks like classification. Feature engineering, using features like seasonality or frequency transforms, has been an important part of time-series analysis. The complexity of time-series data makes feature engineering difficult, which makes deep learning a promising alternative. However, much of the recent work on time series with deep learning architectures requires access to labeled examples, and labeling is expensive, especially in areas requiring specialized knowledge like healthcare. In this thesis, we focus on using limited labeled data efficiently. We propose solutions that leverage: 1) unlabeled data; 2) data with missing time-series observations; and 3) effective use of scarce labels. We primarily showcase these techniques on applications in sleep science, where data from consumer-grade devices like smartwatches is becoming available.

First, we present a method for unsupervised representation learning to create representations for human activity and sleep data. We exploit the context and content, and reduce subject-specific noise using adversarial training. Unlike traditional time-series models, these representations can be exploited to boost the performance of supervised learning models in low-labeled data settings. Empirical evaluation demonstrates that our proposed method performs better than many strong baseline methods, and that adversarial learning helps improve the generalizability of our representations.

Second, we combine conditional random fields (CRFs) with deep neural networks to capture longer-term dependencies in the dynamics of output labels for time-series segmentation tasks. Modeling this longer-term context during segmentation labeling makes more efficient use of limited labels. Our method shows significant improvement over the baseline methods. We apply the proposed method to the detection of sleep stages from the signals of Continuous Positive Airway Pressure (CPAP) machines, at-home therapy devices for sleep apnea. Ours is the first work to detect a patient's sleep stages from CPAP-collected data with reasonable accuracy.

Third, we present a novel semi-supervised method for time-series data imputation. Missing data is common in time series because of issues like data drops or sensor malfunctions. Imputation methods fill in these values, and the quality of imputation has a significant impact on downstream tasks like classification. Our proposed semi-supervised approach uses unlabeled data as well as the downstream task's labeled data. Our results indicate that the proposed method outperforms existing supervised and unsupervised time-series imputation methods, measured both on imputation quality and on the downstream tasks ingesting the imputed time series.

Last, we adapt MixUp, a simple data augmentation technique, to time-series data. We show that a simple modification to the training process can improve the performance of time-series classification methods. We perform data augmentation both on raw time series and in the latent space of time-series classification models. The improvement in performance is observed consistently in low-labeled data regimes as well as higher-data regimes.
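The MixUp augmentation mentioned in the abstract can be illustrated with a minimal sketch. This shows the standard MixUp formulation (a convex combination of two examples and of their labels, with a weight drawn from a Beta distribution), not the thesis's specific modification, which the abstract does not describe. The function name `mixup_pair`, the default `alpha=0.2`, and the toy series are assumptions made for illustration only.

```python
import random


def mixup_pair(x_a, y_a, x_b, y_b, alpha=0.2, rng=random):
    """Blend two equal-length series and their one-hot labels.

    alpha is the Beta(alpha, alpha) concentration; 0.2 is an
    illustrative default, not a value taken from the thesis.
    """
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix, lam


# Toy univariate series (hypothetical values) with one-hot labels.
rng = random.Random(0)
x_a, y_a = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0], [1.0, 0.0]
x_b, y_b = [5.0, 5.0, 4.0, 4.0, 3.0, 3.0], [0.0, 1.0]

x_mix, y_mix, lam = mixup_pair(x_a, y_a, x_b, y_b, alpha=0.2, rng=rng)
```

The same function could be applied to latent feature vectors instead of raw samples, which corresponds to the latent-space variant the abstract mentions; the mixed label `y_mix` is then used as a soft target during training.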
Description
University of Minnesota Ph.D. dissertation. April 2023. Major: Computer Science. Advisor: Jaideep Srivastava. 1 computer file (PDF); xii, 135 pages.
Suggested citation
Aggarwal, Karan. (2023). Models for Limited Labeled Time Series Data with Applications in Sleep Science. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/257035.