With the growing capabilities of high-throughput gene methods, a critical issue in using these methods is how to interpret the results. For example, it is possible to evaluate all of the genes of yeast (Saccharomyces cerevisiae) at once to see how they react to a particular chemical. As a result of such an experiment, a researcher might get a list of genes that all respond similarly. The question then becomes how to understand what these genes have in common that explains their response. Complicating this issue is the real possibility that there may be more than one explanation. In this work we look at a method for automatically annotating groups of genes with keyphrases (i.e., short groups of words that describe the genes) to help a user understand what the genes might have in common. As part of this process we consider how to deal with cases where there is more than one explanation. To address this problem we make use of a biclustering method called SAMBA (Tanay et al., 2002), which was developed to solve a similar problem for genes and measured conditions. We generate keyphrases by treating candidate keyphrases as conditions and attempting to bicluster the genes of interest with keyphrases that are strongly associated with subgroups of those genes. We perform experiments using genes associated with known terms to see whether our method can extract useful keyphrases and separate out subgroups of the genes.
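The idea of treating keyphrases as conditions can be illustrated with a minimal sketch. This is not SAMBA and not the thesis implementation: it is a simplified stand-in that finds exact biclusters (groups of genes that all share the same set of keyphrases) in a small binary gene-to-keyphrase association. The gene names, keyphrases, the function `find_biclusters`, and its `min_genes` and `min_phrases` parameters are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical gene -> keyphrase associations (illustrative only, not thesis data).
gene_phrases = {
    "geneA": {"dna repair", "cell cycle"},
    "geneB": {"dna repair", "cell cycle"},
    "geneC": {"dna repair", "heat shock"},
    "geneD": {"heat shock", "protein folding"},
    "geneE": {"heat shock", "protein folding"},
}


def find_biclusters(gene_phrases, min_genes=2, min_phrases=2):
    """Return exact biclusters as (genes, phrases) pairs of frozensets.

    Simplified stand-in for a real biclustering method: every gene in a
    returned bicluster is associated with every keyphrase in it.
    """
    found = set()
    for g1, g2 in combinations(sorted(gene_phrases), 2):
        # Seed a candidate with the keyphrases shared by a pair of genes.
        shared = gene_phrases[g1] & gene_phrases[g2]
        if len(shared) < min_phrases:
            continue
        # Expand to every gene that carries all of the shared keyphrases.
        genes = frozenset(g for g, p in gene_phrases.items() if shared <= p)
        # Close the keyphrase set: keep everything common to those genes.
        phrases = frozenset(set.intersection(*(gene_phrases[g] for g in genes)))
        if len(genes) >= min_genes:
            found.add((genes, phrases))
    # Sort by gene names so the output order is deterministic.
    return sorted(found, key=lambda b: sorted(b[0]))


for genes, phrases in find_biclusters(gene_phrases):
    print(sorted(genes), "<->", sorted(phrases))
```

On this toy input the gene list separates into two subgroups, each with its own keyphrases, which mirrors the case of more than one explanation for a single list of genes. A method such as SAMBA additionally tolerates noisy or missing associations and scores biclusters statistically, which this sketch does not attempt.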
University of Minnesota M.S. thesis. August 2010. Major: Computer science. Advisor: Dr. Richard Maclin. vii, 177 pages, appendix p.64-177. tables (some col.)
A biclustering method for extracting keyphrases to describe groups of yeast genes.
Retrieved from the University of Minnesota Digital Conservancy,