The Open Government Initiative began an era of information sharing by publishing data that is accessible to the public. HealthData.gov is a data portal that was developed by the U.S. Federal Government to publish metadata to disseminate information about healthcare datasets to the American people. Despite the growth in the number of datasets published, there has been limited public participation in the use of the data, which has been attributed to the currently implemented methods for data storage and retrieval. An automated assessment of the HealthData.gov metadata was conducted to assess completeness, accuracy, and consistency of metadata published from 2012 to 2014. Also, a method for indexing the datasets using Medical Subject Headings (MeSH) was evaluated using a term coverage study. The results of these studies demonstrated that metadata published in earlier years were less complete, of lower quality, and less consistent. Also, metadata that underwent modifications following their original creation were of higher quality. MeSH offered adequate coverage of the metadata concepts, thereby lending support for the adoption of the terminology for indexing purposes. The results suggested that greater standardization is needed when publishing metadata. This research contributed to the development of automated metrics for assessing metadata quality, design recommendations for a framework that supports high-quality metadata, and recommendations for expanding MeSH to offer greater coverage of concepts from HealthData.gov.
University of Minnesota Ph.D. dissertation. February 2016. Major: Health Informatics. Advisors: Laël Gatewood, Rui Zhang. 1 computer file (PDF); viii, 132 pages.
Assessing Metadata Quality and Terminology Coverage of a Federally Sponsored Health Data Repository.
Retrieved from the University of Minnesota Digital Conservancy,