Semantic Relatedness and Similarity Reference Standards for Medical Terms
2018-05-03
Title
Semantic Relatedness and Similarity Reference Standards for Medical Terms
Published Date
2018-05-03
Authors
Pakhomov, Serguei
Author Contact
pakh0002@umn.edu
Type
Dataset
Abstract
This is a collection of reference standards created to test and validate computerized approaches to quantifying the degree of semantic relatedness and similarity between medical terms. Each dataset consists of a list of term pairs that have been evaluated by various healthcare professionals (e.g., medical coders, residents, clinicians) to determine the degree of semantic relatedness and similarity. The details pertaining to each dataset are provided in the referenced publications.
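A common way to use a reference standard of this kind is to correlate the scores produced by a computerized measure with the human ratings for the same term pairs. The sketch below computes a Spearman rank correlation in plain Python. It is only an illustration: the function names are hypothetical, the numbers in the usage lines are made-up placeholders rather than values from these datasets, and the evaluation protocols actually used are described in the referenced publications.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(human: Sequence[float], model: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(human) != len(model) or len(human) < 2:
        raise ValueError("need two equally long sequences of at least 2 items")
    rx, ry = average_ranks(human), average_ranks(model)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / (vx * vy) ** 0.5


# Placeholder values for illustration only (not taken from the datasets).
human_ratings = [4.0, 3.5, 1.0, 2.0]
model_scores = [0.9, 0.7, 0.1, 0.3]
rho = spearman(human_ratings, model_scores)
```

A rank-based coefficient is used here because the human ratings and the model scores are usually on different scales, so only their ordering is compared.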
Description
1. MayoSRS.csv: A set of 101 medical concept pairs manually rated by medical coders for semantic relatedness.
2. MiniMayoSRS.csv: A subset of 29 medical concept pairs, manually rated by medical coders for semantic relatedness, with high inter-rater agreement.
3. UMNSRS_similarity.csv: A set of 566 UMLS concept pairs manually rated for semantic similarity using a continuous response scale.
4. UMNSRS_relatedness.csv: A set of 588 UMLS concept pairs manually rated for semantic relatedness using a continuous response scale.
5. UMNSRS_similarity_mod449_word2vec.csv: Modification of the UMNSRS-Similarity dataset to exclude control samples and those pairs that did not match text in clinical, biomedical and general English corpora. Exact modifications are detailed in the referenced paper. The resulting dataset contains 449 pairs.
6. UMNSRS_relatedness_mod458_word2vec.csv: Modification of the UMNSRS-Relatedness dataset to exclude control samples and those pairs that did not match text in clinical, biomedical and general English corpora. Exact modifications are detailed in the referenced paper. The resulting dataset contains 458 pairs.
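The modified sets (items 5 and 6) keep only pairs whose terms could be matched in the corpora. A similar coverage filter is often needed when evaluating a model with a fixed vocabulary. The sketch below shows one way to do it. The column names Term1, Term2 and Mean are assumptions about the CSV layout (check readme.txt for the actual headers), the sample rows are placeholders rather than data from the files, and this is not the exact procedure used to build the modified sets, which is detailed in the referenced paper.

```python
import csv
import io
from typing import Iterable, List, TextIO, Tuple

Pair = Tuple[str, str, float]

# Placeholder rows; the header names are assumed, not confirmed by the source.
SAMPLE_CSV = """Mean,Term1,Term2
10.0,term_a,term_b
2.5,term_a,term_c
7.0,term_d,term_b
"""


def load_pairs(handle: TextIO, term1_col: str = "Term1",
               term2_col: str = "Term2", score_col: str = "Mean") -> List[Pair]:
    """Read (term1, term2, rating) triples from an open CSV file."""
    reader = csv.DictReader(handle)
    return [
        (row[term1_col].strip().lower(),
         row[term2_col].strip().lower(),
         float(row[score_col]))
        for row in reader
    ]


def keep_covered(pairs: Iterable[Pair], vocabulary: Iterable[str]) -> List[Pair]:
    """Keep only the pairs in which both terms occur in the vocabulary."""
    vocab = {term.lower() for term in vocabulary}
    return [p for p in pairs if p[0] in vocab and p[1] in vocab]


# io.StringIO stands in for a real file opened with open(path, newline="").
pairs = load_pairs(io.StringIO(SAMPLE_CSV))
covered = keep_covered(pairs, {"term_a", "term_b", "term_d"})
```

Reporting the number of retained pairs alongside any correlation makes results comparable across models with different vocabularies.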
Referenced by
Mayo Medical Coders Set (MayoSRS and MiniMayoSRS): Measures of semantic similarity and relatedness in the biomedical domain. Pedersen T., Pakhomov S.V.S., Patwardhan S., and Chute C.G. Journal of Biomedical Informatics. 2007;40(3):288-299.
https://doi.org/10.1016/j.jbi.2006.06.004
UMN Medical Residents Similarity/Relatedness Set (UMNSRS-Similarity and UMNSRS-Relatedness): Semantic Similarity and Relatedness between Clinical Terms: An Experimental Study. Pakhomov S., McInnes B., Adams T., Liu Y., Pedersen T., and Melton G.B. Proceedings of the Annual Symposium of the American Medical Informatics Association. Washington, D.C. November, 2010.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3041430/
Towards a Framework for Developing Semantic Relatedness Reference Standards. Pakhomov S.V.S., Pedersen T., McInnes B., Melton G.B., Ruggieri A., and Chute C.G. Journal of Biomedical Informatics. 2011;44(2):251-265.
https://doi.org/10.1016/j.jbi.2010.10.004
Modified Medical Residents Similarity/Relatedness Set (UMNSRS-Similarity-mod and UMNSRS-Relatedness-mod): Corpus Domain Effects on Distributional Semantic Modeling of Medical Terms. Pakhomov S.V.S., Finley G., McEwan R., Wang Y., and Melton G.B. Bioinformatics. 2016;32(23):3635-3644.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5181540/
Mayo Medical Coders Set (MayoSRS and MiniMayoSRS): UMLS-Interface and UMLS-Similarity: Open source software for measuring paths and semantic similarity. McInnes B.T., Pedersen T., and Pakhomov S.V. Proceedings of the Annual Symposium of the American Medical Informatics Association. San Francisco, CA. 2009;431-435.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2815481/
Funding information
NIH National Library of Medicine R01 grant (LM009623)
Suggested citation
Pakhomov, Serguei. (2018). Semantic Relatedness and Similarity Reference Standards for Medical Terms. Retrieved from the Data Repository for the University of Minnesota (DRUM), https://doi.org/10.13020/D6CX04.
Files
MayoSRS.csv: MayoSRS Data (5.74 KB)
MiniMayoSRS.csv: Mini Mayo SRS Data (1.76 KB)
UMNSRS_similarity.csv: UMN SRS (Similarity) Data (38.36 KB)
UMNSRS_relatedness.csv: UMN SRS (Relatedness) Data (39.91 KB)
UMNSRS_similarity_mod449_word2vec.csv: UMN SRS (Similarity) Modified Data (24.63 KB)
UMNSRS_relatedness_mod458_word2vec.csv: UMN SRS (Relatedness) Modified Data (25.24 KB)
readme.txt: Description of Data (8.45 KB)