Semantic Relatedness and Similarity Reference Standards for Medical Terms

Title

Semantic Relatedness and Similarity Reference Standards for Medical Terms

Published Date

2018-05-03

Author Contact

Pakhomov, Serguei
pakh0002@umn.edu

Type

Dataset

Abstract

This is a collection of reference standards created to test and validate computerized approaches to quantifying the degree of semantic relatedness and similarity between medical terms. Each dataset consists of a list of term pairs that have been evaluated by various healthcare professionals (e.g., medical coders, residents, clinicians) to determine the degree of semantic relatedness and similarity. The details pertaining to each dataset are provided in the referenced publications.

Description

1. MayoSRS.csv: A set of 101 medical concept pairs manually rated by medical coders for semantic relatedness.
2. MiniMayoSRS.csv: A subset of 29 medical concept pairs manually rated by medical coders for semantic relatedness with high inter-rater agreement.
3. UMNSRS_similarity.csv: A set of 566 UMLS concept pairs manually rated for semantic similarity using a continuous response scale.
4. UMNSRS_relatedenss.csv: A set of 588 UMLS concept pairs manually rated for semantic relatedness using a continuous response scale.
5. UMNSRS_similarity_mod449_word2vec.csv: A modification of the UMNSRS-Similarity dataset that excludes control samples and pairs that did not match text in clinical, biomedical, and general English corpora. The exact modifications are detailed in the referenced paper. The resulting dataset contains 449 pairs.
6. UMNSRS_relatedness_mod458_word2vec.csv: A modification of the UMNSRS-Relatedness dataset that excludes control samples and pairs that did not match text in clinical, biomedical, and general English corpora. The exact modifications are detailed in the referenced paper. The resulting dataset contains 458 pairs.
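As an illustration of how a reference standard of this kind might be used, the sketch below compares scores from a computerized measure against human ratings using Spearman rank correlation. It is a minimal example under stated assumptions, not the procedure from the referenced publications: the column names `Term1`, `Term2`, and `Mean` are hypothetical and should be checked against the headers of the actual CSV files, the rows in `SAMPLE` are placeholders rather than values from the dataset, and `toy_score` stands in for whatever similarity or relatedness measure is being evaluated.

```python
import csv
import io
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks.

    Assumes neither input is constant (otherwise the denominator is zero).
    """
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def evaluate(csv_text, score_fn,
             term1_col="Term1", term2_col="Term2", rating_col="Mean"):
    """Correlate score_fn(term1, term2) with the human rating column.

    The default column names are assumptions, not confirmed headers.
    """
    human, predicted = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        human.append(float(row[rating_col]))
        predicted.append(score_fn(row[term1_col], row[term2_col]))
    return spearman_rho(human, predicted)


# Placeholder rows for illustration only; not taken from the dataset.
SAMPLE = """Term1,Term2,Mean
term_a,term_b,9.0
term_c,term_d,5.5
term_e,term_f,2.0
term_g,term_h,7.0
"""

# Hypothetical model scores keyed by term pair.
TOY_SCORES = {
    ("term_a", "term_b"): 0.9,
    ("term_c", "term_d"): 0.4,
    ("term_e", "term_f"): 0.1,
    ("term_g", "term_h"): 0.6,
}


def toy_score(term1, term2):
    """Stand-in for a real similarity or relatedness measure."""
    return TOY_SCORES[(term1, term2)]


rho = evaluate(SAMPLE, toy_score)
print(f"Spearman rho: {rho:.3f}")
```

To run this against one of the files, read its contents (for example with `open(path, encoding="utf-8").read()`) and pass the text to `evaluate`, overriding the column-name arguments to match the file's header row. Rank correlation is used here because the human ratings and model scores are typically on different scales.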

Referenced by

Mayo Medical Coders Set (MayoSRS and MiniMayoSRS): Measures of semantic similarity and relatedness in the biomedical domain. Pedersen T., Pakhomov S.V.S., Patwardhan S., and Chute C.G. Journal of Biomedical Informatics. 2007;40(3):288-299.
https://doi.org/10.1016/j.jbi.2006.06.004
UMN Medical Residents Similarity/Relatedness Set (UMNSRS-Similarity and UMNSRS-Relatedness): Semantic Similarity and Relatedness between Clinical Terms: An Experimental Study. Pakhomov S., McInnes, B., Adams, T., Liu, Y., Pedersen, T. and Melton, G.B. Proceedings of the Annual Symposium of the American Medical Informatics Association. Washington, D.C. November, 2010.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3041430/
Towards a Framework for Developing Semantic Relatedness Reference Standards. Pakhomov, Serguei V.S. and Pedersen, Ted and McInnes, Bridget and Melton, Genevieve B. and Ruggieri, Alexander and Chute, Christopher G. J. of Biomedical Informatics. 44 (2): 251-265.
https://doi.org/10.1016/j.jbi.2010.10.004
Modified Medical Residents Similarity/Relatedness Set (UMNSRS-Similarity-mod and UMNSRS-Relatedness-mod): Corpus Domain Effects on Distributional Semantic Modeling of Medical Terms. Serguei V.S. Pakhomov, Greg Finley, Reed McEwan, Yan Wang, and Genevieve B. Melton. Bioinformatics. 2016; 32(23):3635-3644.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5181540/
Mayo Medical Coders Set (MayoSRS and MiniMayoSRS): UMLS-Interface and UMLS-Similarity: Open source software for measuring paths and semantic similarity. McInnes B.T., Pedersen T., and Pakhomov S.V. Proceedings of the Annual Symposium of the American Medical Informatics Association. San Francisco, CA. 2009;431-435.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2815481/

Funding information

NIH National Library of Medicine R01 grant (LM009623)

Suggested citation

Pakhomov, Serguei. (2018). Semantic Relatedness and Similarity Reference Standards for Medical Terms. Retrieved from the Data Repository for the University of Minnesota (DRUM), https://doi.org/10.13020/D6CX04.
