Cancer is one of the leading causes of death worldwide, accounting for around 13 % of all deaths. Oral cancer is one of the more common cancers, occurring more frequently than leukemia, brain, stomach, or ovarian cancer. Unfortunately, the 5-year survival rate for oral cancer has not significantly improved in the past 30 years and remains at approximately 50 %, in part due to the lack of reliable diagnostic biomarkers for early detection. It is estimated that, if diagnosed and treated early, survival rates for oral cancer would improve significantly, to between 80 % and 90 %. We need reliable biomarkers for diagnosis and early detection of oral cancer.
Recent developments in high-throughput proteomics techniques have made it possible to detect and identify low-abundance proteins in complex biological fluids such as saliva. These low-abundance proteins could be a source of the elusive reliable biomarkers needed to improve survival rates for oral cancer. The widespread use of these proteomics techniques is limited by the lack of an accurate protein relative quantification technique.
A typical high-throughput experiment identifies several thousand proteins, of which several hundred are differentially abundant. The cost of validation makes it impractical to validate every differentially abundant protein in order to identify promising candidate biomarkers. We need computational techniques to identify promising candidate biomarkers.
This two-part dissertation presents (1) a new technique for accurate protein relative quantification, implemented in freely available, open-source software (LTQ-iQuant), and (2) relational database operators for analyzing differentially abundant proteins to identify promising candidate biomarkers.
Linear ion trap mass spectrometers, such as the hybrid LTQ-Orbitrap, are a popular choice for isobaric-tag-based shotgun proteomics because of their advantages in analyzing complex biological samples. Coupled with orthogonal fractionation techniques, they can be used to detect low-abundance proteins, extending the range for detecting possible biomarkers. The widespread use of this combination for quantitative proteomics studies is limited by the lack of a technique that is tailored to LTQ-type instruments, accurately reports protein abundance ratios, and is implemented in an automated software pipeline. This thesis presents a new technique, implemented in freely available, open-source software, that fulfills this need.
A major limitation of existing computational techniques for analyzing high-throughput data is that they produce results too broad to be practically useful. Many of the "potential" disease-specific biomarkers discovered have been found not to be specific to the disease being studied. They either belong to biological categories that change in response to infection or tissue injury, or are proteins whose changes are induced by other stresses such as medication and diet. This thesis extends the relational database engine to enable the use of biological pathways to identify promising candidate biomarkers. Using biological pathways to analyze high-throughput data avoids results that are too broad to be practically useful.
Protein differential abundance is often the criterion used to identify candidate biomarkers in high-throughput, discovery-based biomarker studies. However, protein quantity by itself might not be the salient marker parameter. Protein function is often dependent on post-translational modifications such as phosphorylation and glycosylation. By using only differential abundance to identify candidate biomarkers, we are limiting our ability to identify reliable biomarkers. We further develop new operators that, in addition to using user-specified pathways, use post-translational modification information to analyze high-throughput data. For the first time, we demonstrate the feasibility of using post-translational modifications with relational database operators to analyze high-throughput proteomics data.
Collectively, this work will facilitate the search for reliable biomarkers. LTQ-iQuant will make LTQ instruments and isobaric peptide tagging accessible to more proteomics researchers, providing a new window into complex biological fluids. The relational database operators will provide a systematic way of bridging the gap between unbiased, data-driven approaches and hypothesis-driven approaches to prioritizing candidate biomarkers.
University of Minnesota Ph.D. dissertation. May 2011. Major: Computer science. Advisors: Dr. John V. Carlis & Dr. Timothy J. Griffin. 1 computer file (PDF); xvi, 193 pages, appendices A-D.
Onsongo, Getiria Innocent.
Identifying candidate salivary oral cancer biomarkers: accurate protein quantification and analysis on LTQ type mass spectrometers.
Retrieved from the University of Minnesota Digital Conservancy,