Browsing by Author "Ning, Xia"
Now showing 1 - 3 of 3
Affinity-based Structure-Activity-Relationship Models: Improving Structure-Activity-Relationship Models by Incorporating Activity Information from Related Targets
(2009-05-29) Ning, Xia; Rangwala, Huzefa; Karypis, George
Structure-activity-relationship (SAR) models are used to inform and guide the iterative optimization of chemical leads, and play a fundamental role in modern drug discovery. In this paper we present a new class of methods for building SAR models, referred to as affinity-based, that utilize activity information from different targets. These methods first identify a set of targets that are related to the target under consideration and then employ various machine-learning techniques that utilize activity information from these targets to build the desired SAR model. We developed different methods for identifying the set of related targets, which take into account the primary sequence of the targets or the structure of their ligands, and we also developed different machine-learning techniques derived from the principles of semi-supervised learning, multi-task learning, and classifier ensembles. The comprehensive evaluation of these methods shows that they lead to considerable improvements over the standard SAR models that are based only on the ligands of the target under consideration. On a set of 117 protein targets obtained from PubChem, these affinity-based methods achieve an ROC score that is on average 7.0% - 7.2% higher than that achieved by the standard SAR models.
Moreover, on a set of targets belonging to six protein families, the affinity-based methods outperform chemogenomics-based approaches by 4.33%.

Improved SAR Models - Exploiting the Target-Ligand Relationships
(2008-04-04) Ning, Xia; Rangwala, Huzefa; Karypis, George
Small organic molecules, by binding to different proteins, can be used to modulate (inhibit/activate) their functions for therapeutic purposes and to elucidate the molecular mechanisms underlying biological processes. Over the decades, structure-activity-relationship (SAR) models have been developed to quantify the bioactivity relationship of a chemical compound interacting with a target protein, with advances focusing on the chemical compound representation and the statistical learning methods. We have developed approaches to improve the performance of SAR models using compound activity information from different targets. The methods developed in the study aim to determine the candidacy of a target to help another target in improving the performance of its SAR model by providing supplemental activity information. Having identified a helping target, we also develop methods to identify a subset of compounds that would improve the sensitivity of the SAR model. Identification of helping targets as well as helping compounds is performed using various nearest-neighbor approaches with similarity measures derived from the targets as well as active compounds. We also developed methods that cross-train a series of SVM-based models to identify the helping set of targets. Our experimental results show that our methods achieve statistically significant results and incorporate the target-ligand activity relationship well.

Machine learning and data mining methods for recommender systems and chemical informatics.
(2012-07) Ning, Xia
This thesis focuses on machine learning and data mining methods for problems arising primarily in recommender systems and chemical informatics.
Although these two areas represent dramatically different application domains, many of the underlying problems have common characteristics, which allows the transfer of ideas and methods between them. The first part of this thesis focuses on recommender systems. Recommender systems represent a set of computational methods that produce recommendations of interesting entities (e.g., products) from a large collection of such entities by retrieving/filtering/learning information from their own properties (e.g., product attributes) and/or the interactions with other parties (e.g., user-product ratings). We have addressed the two core tasks for recommender systems, that is, top-N recommendation and rating prediction. We have developed 1) a novel sparse linear method for top-N recommendation, which utilizes regularized linear regression with sparsity constraints to model user-item purchase patterns; 2) a set of novel sparse linear methods with side information for top-N recommendation, which use side information to regularize sparse linear models or to model user-item purchase behaviors; and 3) a multi-task learning method for rating prediction, which uses multi-task learning methodologies to model user communities and predict personalized ratings. The second part of this thesis is dedicated to chemical informatics, which is an interdisciplinary research area where computational and information technologies are developed to aid the investigation of chemical problems. We have developed computational methods to build two important models in chemical informatics, that is, the Structure-Activity-Relationship (SAR) model and the Structure-Selectivity-Relationship (SSR) model. We have developed 1) a multi-assay-based SAR model, which leverages information from different protein families; and 2) a set of computational methods for better SSR models, which use various learning methodologies including multi-class classification and multi-task learning.
The studies on recommender systems and chemical informatics show that these two areas are closely analogous in terms of the data, the problem formulations and the underlying principles, and that advances in one area could contribute to advances in the other.