Browsing by Subject "NLP"

A Corpus-Driven Standardization Framework for Encoding Clinical Problems with SNOMED CT Expressions and HL7 FHIR
(2020-12) Peterson, Kevin
Free-text clinical problem descriptions are used throughout the medical record to communicate patients’ pertinent conditions. These summary-level representations of diagnoses and other clinical concerns underpin critical aspects of the modern patient record such as the problem list, and are key inputs to predictive models and clinical decision support applications. Given their importance to both clinical care and downstream analytics, representations of these clinical problems must be amenable to both human interpretation and machine processing. While free text is expressive and provides the most transparent and unbiased view into the intent of the clinician, standardized and consistent representations of the semantics of these problem descriptions are necessary for contemporary data-driven healthcare systems. Free-text problem descriptions may be standardized and structured in a variety of ways. First, they may be encoded using a controlled terminology such as Systematized Nomenclature of Medicine – Clinical Terms (SNOMED CT). Even though a single code may inadequately capture the context, modifiers, and related information of a problem, codes may be combined, or “post-coordinated,” into more complex structures called SNOMED CT Expressions. Next, alignment to standardized semantic and data models such as Health Level 7 (HL7) Fast Healthcare Interoperability Resources (FHIR) allows for the most structured representation, but with higher implementation complexity. Competing usage priorities introduce a fundamental optimization problem in representing these entries: free text is the most natural and useful form for clinicians, while structured and codified forms are computable and better suited for data analytics and interoperability.
In this study, we introduce methods to minimize this conflict between structured and unstructured forms by proposing a framework for capturing the semantics of free-text clinical problems and transforming them into codified, structured formats using Natural Language Processing (NLP) techniques.
Develop informatics solutions to deliver relevant information for clinical decision making that improve the management of cardiovascular disease risk of breast cancer patients
(2025-01) Zhou, Sicheng
In breast cancer treatment, balancing the efficacy of therapy against the risk of treatment toxicity, particularly cardiotoxicity, one of the more morbid complications of treatment, poses a significant clinical challenge. This thesis presents an innovative approach to addressing the issue of breast cancer-related cardiotoxicity by integrating health informatics with real-world data from Electronic Health Records (EHRs) and other health information technology (e.g., imaging reports from cardiovascular information systems) to improve cardiovascular disease risk management in breast cancer patients. Our approach leverages deep learning algorithms and a range of natural language processing (NLP) approaches to extract, analyze, and interpret complex clinical data and so enhance decision-making processes in healthcare settings for this vulnerable group of patients.
At the core of this body of research is the development and application of transformer-based deep learning methods, which are specifically tailored to extract targeted information from clinical texts in EHRs. By creating a specialized cancer domain vocabulary, the study demonstrates the enhanced performance of these models in accurately identifying relevant clinical data, such as patient demographics, treatment details, and cancer phenotypes. This approach significantly advances the precision and reliability of extracting important clinical information, which is often a crucial step in developing robust predictive models. The thesis further explores the generalizability of these NLP algorithms across different healthcare institutions via external validations, a vital consideration given the varied nature of EHRs. Through cross-institutional evaluations, the research establishes the portability of these models, ensuring their effectiveness in diverse clinical environments.
This aspect of the study is critical in validating the broader applicability of the developed methodologies in various healthcare settings. Central to the thesis is the creation of predictive models for assessing the risk of heart disease in breast cancer patients. Utilizing a deep learning approach, specifically LSTM-D models, the study effectively harnesses longitudinal EHR data to predict cardiovascular risks associated with cancer treatments. We find that these models outperform traditional methods, offering a more nuanced understanding of patient-specific risk factors and temporal patterns in treatment responses. The thesis' findings underscore the potential of integrating advanced data analysis tools in clinical decision-making, particularly in the context of breast cancer treatment. By providing a more detailed and personalized risk assessment, the research contributes significantly to the field of personalized medicine, enhancing the quality of patient care and treatment outcomes. Overall, this thesis bridges a critical gap in healthcare informatics by developing and validating innovative methodologies for extracting and analyzing EHR data. The research marks a significant step towards more informed and personalized breast cancer treatment, highlighting the transformative potential of health informatics in managing complex disease interactions and improving patient outcomes.
Gender/Genre: Gender difference in disciplinary communication
(2015-05) Larson, Brian
Within the professions, writers are expected to express themselves in certain ways, often within genres that are bound by conventions, including linguistic register. The student entering a profession learns those genres as if they are mandatory and static, and conforming or failing to conform to conventions is believed to have career consequences. However, new members of a profession come to it with other habitual language practices affected, according to previous research, by the writer’s gender. Rhetorical genre theory and disciplinary, professional, and technical communication theory do not offer a full account of the ways in which these old habits and new conventions must interact, and previous research in gender and language does not fully account for how gendered persons write when confronted with high-stakes convention-bound writing tasks. I used tools from statistics and natural language processing (NLP) to assess stylistic features that previous research has associated with gender differences in written language. I applied those tools to texts created by law students near the end of their first year of study in the genre of a court memorandum, and I found there was no pattern of difference between male and female writers in these texts. I propose a “cognitive pragmatic rhetorical” (CPR) theory, grounded in the work of Straßheim (2010), who attempted to bridge the relevance philosophy of Alfred Schutz (Schutz, 1964, 1966, 1973) and the Relevance Theory of Sperber and Wilson (1995); I have extended Straßheim’s work with insights from rhetoric and cognitive science. CPR theory explains that these apprentice members of a professional community will expend great effort to conform to its conventions and genres because of the students’ goals and the practical effects that depend on conformity. Consequently, we expect them to abandon gendered linguistic habits, at least while they are engaged in early training.
This dissertation demonstrates a methodologically rigorous gender-difference study; offers evidence for an “anti-essentialist” view of gender differences in communication; and gives insight into the process by which apprentice members of a profession may adjust their communicative processes in response to their training. It demonstrates the utility of CPR theory and NLP tools in scholarly inquiries in rhetoric and disciplinary, professional, and technical communication.
Hypernym Discovery over WordNet and English Corpora - using Hearst Patterns and Word Embeddings
(2018-07) Vallabhajosyula, Manikya Swathi
Languages evolve over time. With new technical innovations, new terms get created and new senses are added to existing words. Dictionaries like WordNet, which act as a database for English vocabulary, should be updated with these new concepts. WordNet organizes these concepts in sets of synonyms and interlinks them by using semantic relations. Many Natural Language Processing applications like Machine Translation and Word Sense Disambiguation rely on WordNet for their functionality. WordNet was last updated in 2006. If WordNet is not updated with new vocabulary, the performance of applications which rely on WordNet would drop. The objective of our research is to automatically update WordNet with the new senses by using resources like online dictionaries and text corpora available over the internet. We use the ISA hierarchy structure of WordNet to insert new senses. In an ISA hierarchy, the concepts higher in the hierarchy (called hypernyms) are more abstract representations of the concepts lower in the hierarchy (called hyponyms). To improve the coverage of our solution, we rely on two complementary techniques, traditional pattern matching and modern vector space models, to extract candidate hypernyms from WordNet for a new sense. Our system was ranked 4th among the systems that participated in SemEval 2016 Task 14: Semantic Taxonomy Enrichment. We also evaluate our system by participating in SemEval 2018 Task 09: Hypernym Discovery. In this task, we apply our system to the huge UMBC WebBase text corpus to extract candidate hypernyms for a given input term. Our system was ranked 3rd among the systems which find hypernyms for Concepts.
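The pattern-matching half of the approach can be illustrated with a minimal sketch of one classic Hearst pattern, "X such as Y, Z and W", which suggests that Y, Z, and W are hyponyms of X. This is not the thesis's implementation: the function name `hearst_such_as`, the single-word noun-phrase simplification, and the regular expressions are assumptions made for illustration only. A real system would use a chunker or parser to find noun phrases and would apply several patterns.

```python
import re

# Assumption: noun phrases are simplified to single words.
_NP = r"[A-Za-z][A-Za-z-]*"

# One Hearst pattern: "<hypernym> such as <hyponym>, <hyponym> and <hyponym>"
_SUCH_AS = re.compile(
    rf"(?P<hyper>{_NP}) such as "
    rf"(?P<hypos>{_NP}(?:(?:, (?:and |or )?| and | or ){_NP})*)"
)

# Separators between items in the hyponym list.
_LIST_SEP = re.compile(r",\s*(?:and |or )?|\s+(?:and|or)\s+")


def hearst_such_as(text):
    """Return (hyponym, hypernym) candidate pairs found by the 'such as' pattern."""
    pairs = []
    for match in _SUCH_AS.finditer(text):
        hypernym = match.group("hyper").lower()
        for hyponym in _LIST_SEP.split(match.group("hypos")):
            if hyponym:
                pairs.append((hyponym.lower(), hypernym))
    return pairs


if __name__ == "__main__":
    sentence = "She plays instruments such as guitars, violins and drums."
    for hyponym, hypernym in hearst_such_as(sentence):
        print(f"{hyponym} ISA {hypernym}")
```

Each extracted pair is only a candidate: the pattern over-generates on sentences where the words after "and" start a new clause, so candidates would still need to be filtered or ranked before being attached to a hierarchy such as WordNet's.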

University Digital Conservancy
