Bandyopadhyay, Sunayan
2016-09-19
2016-09-19
2016-06
https://hdl.handle.net/11299/182297

University of Minnesota Ph.D. dissertation. June 2016. Major: Computer Science. Advisors: Chad Myers, Paul Johnson. 1 computer file (PDF); vii, 124 pages.

Cardiovascular (CV) disease is one of the leading causes of death in the United States; it is therefore vital that it be managed and treated effectively. Such treatment requires information for determining optimal strategies for treating complex patients so as to minimize their risk of a CV event. Creating such information requires predictive models that can estimate the probability of a CV event occurring over a fixed time horizon. Currently available predictive models are limited because they are constructed from carefully curated cohorts, which may not be representative of the population currently under care. This limitation can largely be overcome by using more representative data. Electronic health records (EHR) provide observations that are representative of the population currently being treated by physicians, and they are an attractive platform on which to construct a predictive model. However, EHR data have weaknesses, including missing data and incomplete follow-up. As a result, traditional machine learning algorithms cannot be applied without modification to construct a predictive model. In this thesis we show how to adapt probabilistic graphical models (PGMs) to censored data with missing observations. In addition, we construct variants of the adapted PGMs that take advantage of the different types of historical observations available in the EHR to better predict the risk of CV events.

en

Bayesian networks; Cardiovascular risk model; dynamic Bayesian network; Electronic health record data; structure learning

Cardiovascular risk prediction from Electronic Health Records using probabilistic graphical models.

Thesis or Dissertation