Valovage, Mark
2020-02-26
2020-02-26
2019-11
https://hdl.handle.net/11299/211812
University of Minnesota Ph.D. dissertation. November 2019. Major: Computer Science. Advisor: Maria Gini. 1 computer file (PDF); xii, 139 pages.
Recent advances in machine learning have significant, far-reaching potential in electrical time series applications. However, many methods cannot currently be implemented in real world applications due to multiple challenges. This thesis explores solutions to many of these challenges in an effort to realize the full potential of applying machine learning to dynamic electrical systems. This thesis focuses on two areas: electricity disaggregation and time series shapelets. However, the contributions below can be applied to dozens of other domains. Electricity disaggregation identifies individual appliances from one or more aggregate data streams. In first world countries, disaggregation has the potential to eliminate billions of dollars of waste each year, while in developing countries, disaggregation could reduce costs enough to help provide electricity to over a billion people who currently have no access to it. Existing disaggregation methods cannot be applied to real-world households because they are too sensitive to varying noise levels, require parameters to be tuned to individual houses or appliances, make incorrect assumptions about real-world data, or are too resource intensive for inexpensive hardware. This thesis details label correction, a process to automatically correct user-labeled training samples, to increase classification accuracy. It also details an approach to unsupervised learning that is scalable to hundreds of millions of buildings using two novel approaches: event detection without parameter tuning and iterative discovery without appliance models. Time series shapelets are small subsequences of time series used for classification of unlabeled time series.
While shapelets can be used for electricity disaggregation, they have applications to dozens of other domains. However, little research has been done on the distance metric used by shapelets. This distance metric is critical, as it is the sole feature a shapelet uses to discriminate between samples from different classes. This thesis details two contributions to time series shapelets. The first, selective z-normalization, is a technique that increases the shapelet classification accuracy by discovering a combination of z-normalized and non-normalized shapelets. The second is computing shapelet-specific distances, a technique to increase accuracy by finding a unique distance metric for each shapelet.
en
Classification
Electricity Disaggregation
Shapelets
Supervised Learning
Time Series
Unsupervised Learning
Enhancing Machine Learning Classification for Electrical Time Series with Additional Domain Applications
Thesis or Dissertation