Browsing by Subject "Annotation"


Genomic and transcriptomic approaches for the advancement of CHO cell bioprocessing
(2014-06) Vishwanathan, Nandita
Recombinant protein therapeutics have transformed healthcare by paving the way for the treatment of refractory illnesses like cancer and arthritis. Chinese hamster ovary (CHO) cells are the major workhorse for the production of these therapeutics. In the drive for continual improvements in the productivity and quality of protein produced in CHO cells, many process enhancements have been successfully implemented. However, many processes are still empirical, and the mechanisms behind these methods are poorly understood. The availability of genomic resources for CHO cells has ushered in a 'genomics' era in bioprocessing. Genomic resources can now be employed to understand and improve cell lines and processes to enhance the productivity and quality of protein therapeutics produced by CHO cells. To develop genomic resources for CHO cells, the Chinese hamster genome and transcriptome were sequenced, assembled and annotated. Such transcriptomic resources can be used to study the inherent transcriptomic variability in CHO cells. The genetic cues identified from the study of the variability in the glycosylation pathway genes open up several opportunities to manipulate protein quality. The relative expression of isozymes in CHO cells affects metabolic characteristics, which in turn may impact product quality or even process robustness. The comparative study of isozymes can give important clues for cell engineering and process development. The isozyme distribution in CHO cells indicates a very high overall glycolytic rate, suggesting the possibility of manipulating glycolytic flux to improve processes. Cell engineering can be used to reduce glycolytic flux in the late stage of the fed-batch culture and thereby reduce lactate accumulation. A novel dynamic promoter was used to drive the expression of a fructose transporter selectively in the late stages of the culture.
By maintaining adequately low fructose levels in the late stage, the glycolytic flux was reduced significantly to induce lactate consumption. Since lactate accumulation is well accepted to be detrimental to productivity, this phenotype is desired for bioprocessing. In addition to such high-productivity processes, high-producing cells are also desired. The lengthy process of cell line development transforms non-producing cells into high producers. The molecular changes in this transformation were elucidated by studying the transcriptome of CHO cells during cell line development. We hypothesize that methotrexate treatment not only increases the transgene copy number, but also enriches cells with superior growth, energy metabolism, and secretion capabilities. This leads to an enriched population of high producers. The maintenance of high productivity over several generations depends on the stability of the integration site of the transgene. Two methods for identifying the cell's transgene integration site were developed and optimized. These methods can be applied for high-throughput investigation of the stability of integration sites. The application of genomics in bioprocessing has sparked a systems approach to investigating genetic regulation. This knowledge paves the way for controlling cellular metabolism and achieving stable, high-producing cell lines and processes. Such genome-scale analyses have great potential to advance the capacity of CHO cells for biopharmaceutical applications.
Leveraging open source web resources to improve retrieval of low text content items
(2014-08) Singhal, Ayush
With the exponential increase in the amount of digital information in the world, search engines and recommendation systems have become the most convenient ways to find relevant information. As an example, the number of web pages on the world wide web was estimated to have passed the one trillion mark by the year 2008. However, search today is no longer limited to documents on the world wide web. New "information needs" such as multimedia items (images, videos) open up challenging avenues for scientific research. Thus the search techniques used to find content-rich items (e.g., documents) no longer hold for items with low-text content. In the literature, several solutions are proposed for developing search frameworks for multimedia items, including the use of the visual or audio content of such items for retrieval purposes. However, there is little research on this problem in the domain of scientific research artifacts. This thesis investigates the problem of retrieval of low-text content items for search and recommendation purposes and proposes novel techniques to improve retrieval of such items. In particular, we focus on scientific research datasets owing to their importance and exponential growth in the last few decades. One of the main challenges in searching research datasets is the lack of text content surrounding the dataset. While the datasets themselves have raw content, the problem of low text content makes conventional text-based search techniques inadequate for their retrieval. In comparison to multimedia items, where visual and audio features have been utilized to enhance search or recommendation based retrieval, scientific research datasets lack a uniform schema for representing their raw content. Although solutions such as curation and annotation by experts or data scientists exist, they are infeasible for practical operation on a large scale.
As a solution, this thesis provides an efficient computational framework for retrieving such low-text content items. We primarily present two retrieval models, namely, (1) a user profile based search, and (2) a keyword-based search. For the user profile based search model, we show that the text content of the item can be derived from the user's profile and that the relevance ranking can also be derived from the user's profile. We find that the proposed approach, which uses open source knowledge for item extraction, outperforms a local content based extraction approach. For the keyword-based search model, we have developed a content-rich database for research datasets. We use novel content generation techniques to overcome the low-text content challenge for datasets. The content information is extracted from open source and crowd-sourced knowledge resources like academic search engines and Wikipedia. In addition to the stand-alone quantitative assessment of the content generated, we evaluate the efficiency of the entire keyword-based search framework via a user study. Based on user responses, the thesis reports positive evidence that the proposed search framework is better than a popular general-purpose search engine for searching datasets with context-based queries. The ideas developed in this thesis are implemented in a real search system, DataGopher.org, an open source search engine for scientific research datasets. Moreover, the approaches developed for research datasets have application to other low-content items such as short text documents, news feeds and Twitter tweets. In summary, the computational approaches proposed in this thesis advance the state-of-the-art in retrieval of low-content items. The extensive evaluations performed on items like scientific research datasets and low text content documents demonstrate the validity of the findings.

University Digital Conservancy
