Peterson, Kevin
2021-04-12
2021-04-12
2020-12
https://hdl.handle.net/11299/219300
University of Minnesota Ph.D. dissertation. December 2020. Major: Biomedical Science. Advisors: Hongfang Liu, Yuk Sham. 1 computer file (PDF); xi, 175 pages.
Free-text clinical problem descriptions are used throughout the medical record to communicate patients’ pertinent conditions. These summary-level representations of diagnoses and other clinical concerns underpin critical aspects of the modern patient record such as the problem list, and are key inputs to predictive models and clinical decision support applications. Given their importance to both clinical care and downstream analytics, representations of these clinical problems must be amenable to both human interpretation and machine processing. While free-text is expressive and provides the most transparent and unbiased view into the intent of the clinician, standardized and consistent representations of the semantics of these problem descriptions are necessary for contemporary data-driven healthcare systems. Free-text problem descriptions may be standardized and structured in a variety of ways. First, they may be encoded using a controlled terminology such as Systematized Nomenclature of Medicine – Clinical Terms (SNOMED CT). Even though a single code may inadequately capture the context, modifiers, and related information of a problem, codes may be combined, or “post-coordinated” into more complex structures called SNOMED CT Expressions. Next, alignment to standardized semantic and data models such as Health Level 7 (HL7) Fast Healthcare Interoperability Resources (FHIR) allows for the most structured representation, but with higher implementation complexity. Competing usage priorities introduce a fundamental optimization problem in representing these entries – free-text is the most natural and useful form for clinicians, while structured and codified forms are computable and better suited for data analytics and interoperability.
In this study, we introduce methods to minimize this conflict between structured and unstructured forms by proposing a framework for capturing the semantics of free-text clinical problems and transforming them into codified, structured formats using Natural Language Processing (NLP) techniques.
en
FHIR
informatics
NLP
SNOMED CT
standardization
A Corpus-Driven Standardization Framework for Encoding Clinical Problems with SNOMED CT Expressions and HL7 FHIR
Thesis or Dissertation