Browsing by Subject "data quality"
Item: A Data Quality Framework for the Secondary Use of Electronic Health Information (2016-04) Johnson, Steven
Electronic health record (EHR) systems are designed to replace paper charts and facilitate the delivery of care. Since EHR data is now readily available in electronic form, it is increasingly used for other purposes. This is expected to improve health outcomes for patients; however, the benefits will only be realized if the data that is captured in the EHR is of sufficient quality to support these secondary uses. This research demonstrated that a healthcare data quality framework can be developed that produces metrics that characterize underlying EHR data quality and it can be used to quantify the impact of data quality issues on the correctness of the intended use of the data. The framework described in this research defined a Data Quality (DQ) Ontology and implemented an assessment method. The DQ Ontology was developed by mining the healthcare data quality literature for important terms used to discuss data quality concepts and these terms were harmonized into an ontology. Four high-level data quality dimensions (CorrectnessMeasure, ConsistencyMeasure, CompletenessMeasure and CurrencyMeasure) categorized 19 lower level Measures. The ontology serves as an unambiguous vocabulary and allows more precision when discussing healthcare data quality. The DQ Ontology is expressed with sufficient rigor that it can be used for logical inference and computation. The data quality framework was used to characterize data quality of an EHR for 10 data quality Measures. The results demonstrate that data quality can be quantified and Metrics can track data quality trends over time and for specific domain concepts. The DQ framework produces scalar quantities which can be computed on individual domain concepts and can be meaningfully aggregated at different levels of an information model.
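As a rough illustration of a scalar metric that is computed per domain concept and then aggregated, the sketch below scores completeness as the fraction of non-missing values. It is not the framework's actual CompletenessMeasure definition: the record layout, the concept names, and the functions `completeness_by_concept` and `overall_completeness` are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical records: (domain_concept, value). None marks a missing value.
records = [
    ("blood_pressure", 120),
    ("blood_pressure", None),
    ("heart_rate", 72),
    ("heart_rate", 80),
    ("heart_rate", None),
    ("heart_rate", 64),
]


def completeness_by_concept(rows):
    """Return {concept: fraction of rows whose value is not missing}."""
    present = defaultdict(int)
    total = defaultdict(int)
    for concept, value in rows:
        total[concept] += 1
        if value is not None:
            present[concept] += 1
    return {concept: present[concept] / total[concept] for concept in total}


def overall_completeness(rows):
    """Aggregate over all concepts, weighting every row equally."""
    rows = list(rows)
    if not rows:
        raise ValueError("no records to score")
    return sum(value is not None for _, value in rows) / len(rows)


per_concept = completeness_by_concept(records)
overall = overall_completeness(records)
print(per_concept)
print(round(overall, 3))
```

Weighting every row equally is only one possible aggregation choice; averaging the per-concept scores would give each concept equal weight instead, and the two can differ when concepts have very different record counts.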
The data quality assessment process was also used to quantify the impact of data quality issues on a task. The EHR data was systematically degraded and a measure of the impact on the correctness of CMS178 eMeasure (Urinary Catheter Removal after Surgery) was computed. This information can help healthcare organizations prioritize data quality improvement efforts to focus on the areas that are most important and determine if the data can support its intended use.

Item: Managing Data Quality in Observational Citizen Science (2017-12) Sheppard, S.
Observational citizen science is an effective way to supplement the environmental datasets compiled by professional scientists. Involving volunteers in data collection has the added educational benefits of increased scientific awareness and local ownership of environmental concerns. This thesis provides an in-depth exploration of observational citizen science and the associated challenges and opportunities for HCI research. We focus on data quality as a key lens for understanding observational citizen science, and how it differs from the related domains of crowdsourcing, open collaboration, and volunteered geographic information. In order to understand data quality, we performed a qualitative analysis of data quality assurance practices in River Watch, a regional water quality monitoring program. We found that data quality in River Watch is primarily maintained through universal adherence to standard operating procedures, rather than through a computable notion of “accuracy”. We also found that rigorous data quality assurance practices appear to enhance rather than hinder the educational goals of the program participants. In order to measure data quality, we conducted a quantitative analysis of CoCoRaHS, a multinational citizen science project for observing precipitation. Given the importance of long-term participation to data consumers, we focused on volunteer retention as our primary metric for data quality.
Through survival analysis, we found that participant age is a significant predictor of retention. Compared to all other age groups, participants aged 60-70 are much more likely to sign up for CoCoRaHS, and to remain active for several years. We propose that the nature of the task can profoundly influence the types of participants attracted to a project. In order to improve data quality, we derived a general workflow model for observational citizen science, drawing on our findings in River Watch, CoCoRaHS, and similar programs. We propose a data model for preserving provenance metadata that allows for ongoing data exchange between disparate technical systems and participant skill levels. We conclude with general principles that should be taken into consideration when designing systems and protocols for managing citizen science data.
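The abstract does not say which survival-analysis method was used for retention. As one common option, a Kaplan-Meier estimate of the fraction of volunteers still active over time can be sketched as follows. The function name `kaplan_meier` and the sample durations are hypothetical and are not taken from the CoCoRaHS data.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier survival estimate.

    durations: time each volunteer was followed (e.g. years active).
    observed:  True if the volunteer dropped out at that time,
               False if still active when last seen (censored).
    Returns a list of (time, survival_probability) pairs, one per
    distinct time at which at least one dropout occurred.
    """
    if len(durations) != len(observed):
        raise ValueError("durations and observed must have equal length")

    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Volunteers still being followed just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Dropouts that happened exactly at time t.
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Made-up example: five volunteers, two of them still active (censored).
years_active = [1, 2, 2, 3, 4]
dropped_out = [True, True, False, True, False]
curve = kaplan_meier(years_active, dropped_out)
for t, s in curve:
    print(t, round(s, 3))
```

Comparing such curves across age groups is one way to see whether a group, such as participants aged 60-70, stays active longer; testing whether age is a significant predictor would need an additional method such as a log-rank test or a regression model.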