GraphMatch: Knowledge Graphs for Allogeneic Stem Cell Matching
Abstract
Allogeneic bone marrow and umbilical cord stem cell transplants often offer the best hope for curing patients with blood cancers and other blood diseases. We establish the historical context by starting with World War II and the initial use of nuclear weapons, when the world recognized the dangers of radiologic events and the public health threats that followed. These threats included blood cancers on an unimaginable scale. At that time, little was known about survivability or available treatments. Researchers at the University of Chicago, associated with the Manhattan Project, were among the first to identify the preservation of hematopoiesis as a primary therapy for otherwise deadly exposures. We introduce hematopoietic stem cell (HSC) transplantation, which has two distinct types: autologous and allogeneic.
Next, we discuss how matching patients to unrelated donors requires flexible and timely searches as the matching criteria evolve. Matching systems should scale to accommodate the diversity in patient and donor typing resolution as well as the increasing number of donors. We begin by describing the context of this work and the challenges associated with timely, scalable, and adaptable platform solutions. We introduce a current patient/donor matching method called HapLogic at the National Marrow Donor Program. HapLogic is a leading platform for this work, offering nearly interactive access to transplant patients worldwide.
In the second chapter of the thesis, we propose a novel approach to matching patients and donors for stem cell transplants. GraphMatch (GM) is a scalable graph database solution for storing and searching variable-resolution HLA genotype markers. For our test set, we expanded the World Marrow Donor Association (WMDA) validation set based on version 2.16 of the IPD-IMGT/HLA Database to create a synthetic production dataset comprising 1 million patients and 10 million donors. Single-patient identity search times range from 218.5 milliseconds per patient with 2 million donors to 1201.4 milliseconds per patient with 10 million donors. Search performance timing remained linear relative to the number of edges, even at a production scale.
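As a quick arithmetic check on the two timings quoted above, the following minimal Python sketch compares how search time grows against how the donor count grows. It uses only the two figures given in this abstract; the variable and function names are illustrative, not taken from the dissertation. Note that the abstract reports linearity relative to the number of edges, which is not stated here, so this comparison against donor count is only a rough proxy.

```python
# Figures quoted in the abstract: (donor count, search time in ms per patient).
reported = [(2_000_000, 218.5), (10_000_000, 1201.4)]


def ms_per_million(donors: int, ms: float) -> float:
    """Search time normalized to milliseconds per million donors."""
    return ms / (donors / 1_000_000)


# Normalized cost at each reported scale.
rates = [ms_per_million(donors, ms) for donors, ms in reported]

# Growth factors between the smaller and larger configuration.
donor_ratio = reported[1][0] / reported[0][0]
time_ratio = reported[1][1] / reported[0][1]

print(f"ms per million donors: {rates[0]:.2f} -> {rates[1]:.2f}")
print(f"donors grew {donor_ratio:.1f}x, search time grew {time_ratio:.2f}x")
```

Under these figures the time per million donors stays within roughly 10% across a fivefold increase in donors, which is broadly in line with near-linear scaling; the edge counts reported in the dissertation itself would be needed to check the stated claim directly.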
In the third chapter, we anticipate practical extensions to the GraphMatch platform, allowing horizontally scalable performance and a flexible schema to accommodate additional search criteria. GraphMatch can also simulate additional matching algorithms, such as GRIMM and Hap-E. Ultimately, GM demonstrates the usefulness of graph databases as a flexible platform for scalable matching solutions.
Description
University of Minnesota Ph.D. dissertation. May 2025. Major: Computer Science. Advisor: Chad Myers. 1 computer file (PDF); viii, 57 pages.
Suggested Citation
Kunau, Timothy. (2025). GraphMatch: Knowledge Graphs for Allogeneic Stem Cell Matching. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/275903.
