Assessing Metadata Quality and Terminology Coverage of a Federally Sponsored Health Data Repository
2016-02
Authors
Marc, David
Type
Thesis or Dissertation
Abstract
The Open Government Initiative began an era of information sharing by publishing data that is accessible to the public. HealthData.gov is a data portal developed by the U.S. Federal Government to publish metadata that disseminates information about healthcare datasets to the American people. Despite growth in the number of datasets published, public participation in the use of the data has been limited, which has been attributed to the currently implemented methods for data storage and retrieval. An automated assessment of the HealthData.gov metadata was conducted to evaluate the completeness, accuracy, and consistency of metadata published from 2012 to 2014. In addition, a method for indexing the datasets using Medical Subject Headings (MeSH) was evaluated using a term coverage study. The results of these studies demonstrated that metadata published in earlier years were less complete, of lower quality, and less consistent, and that metadata modified after their original creation were of higher quality. MeSH offered adequate coverage of the metadata concepts, lending support for adopting the terminology for indexing purposes. The results suggested that greater standardization is needed when publishing metadata. This research contributed to the development of automated metrics for assessing metadata quality, design recommendations for a framework that supports high-quality metadata, and recommendations for expanding MeSH to offer greater coverage of concepts from HealthData.gov.
Description
University of Minnesota Ph.D. dissertation. February 2016. Major: Health Informatics. Advisors: Laël Gatewood, Rui Zhang. 1 computer file (PDF); viii, 132 pages.
Suggested citation
Marc, David. (2016). Assessing Metadata Quality and Terminology Coverage of a Federally Sponsored Health Data Repository. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/178942.