Kim, Hyunsoo
Park, Haesun
2020-09-02
2020-09-02
2003-01-27
https://hdl.handle.net/11299/215548
The prediction of protein secondary structure is an important step in the prediction of protein tertiary structure. While the neural network approach has been improved by the use of position-specific scoring matrices (PSSMs), e.g., those generated by PSI-BLAST, the support vector machine approach has recently been introduced. A new protein secondary structure prediction method, SVMpsi, is developed to improve the current level of prediction by incorporating new tertiary classifiers and their jury decision system, efficient methods to handle unbalanced data, a new optimization strategy for maximizing the $Q_3$ measure, and the PSI-BLAST PSSM profiles. SVMpsi produces the highest published $Q_3$ and SOV94 scores on both the RS126 and CB513 data sets to date. For a new KP480 set, the prediction accuracy of SVMpsi was $Q_3=78.5$% and SOV94 = 82.8%. Moreover, the blind test results for 136 non-redundant protein sequences, which do not contain homologues of the training data sets, were $Q_3=77.2$% and SOV94 = 81.8%. The cross-validation tests and the CASP5 experiment show that SVMpsi is another competitive method for predicting protein secondary structure. Multi-classification strategies based on the one-versus-one scheme and the directed acyclic graph (DAG) scheme are also investigated.
en-US
Protein Secondary Structure Prediction Based on an Improved Support Vector Machines Approach
Report
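
The $Q_3$ measure and the DAG scheme named in the abstract can be illustrated with a minimal Python sketch. It assumes the usual three-state labels H (helix), E (strand), and C (coil), and it assumes that each pairwise classifier returns a signed score where a positive value favors the first class of the pair. The names `q3`, `dag_decide`, and the `scores` dictionary are hypothetical and are not taken from the SVMpsi implementation; SOV94 and the jury decision system are not shown.

```python
from typing import Dict, Sequence, Tuple

STATES = ("H", "E", "C")  # assumed labels: helix, strand, coil


def q3(observed: str, predicted: str) -> float:
    """Percentage of residues whose three-state label is predicted correctly."""
    if len(observed) != len(predicted):
        raise ValueError("sequences must have equal length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * correct / len(observed)


def dag_decide(
    scores: Dict[Tuple[str, str], float],
    classes: Sequence[str] = STATES,
) -> str:
    """Pick one class from pairwise scores by successive elimination.

    scores[(a, b)] > 0 is assumed to mean that the binary a-versus-b
    classifier prefers a; otherwise it prefers b.
    """
    remaining = list(classes)
    while len(remaining) > 1:
        a, b = remaining[0], remaining[-1]
        if scores[(a, b)] > 0:
            remaining.pop()    # b loses and is eliminated
        else:
            remaining.pop(0)   # a loses and is eliminated
    return remaining[0]


if __name__ == "__main__":
    print(q3("HHEC", "HHCC"))
    # Hypothetical pairwise scores for a single residue.
    print(dag_decide({("H", "C"): 1.0, ("H", "E"): -0.5, ("E", "C"): 0.3}))
```

With three classes, the DAG walk evaluates only two of the three pairwise classifiers per residue, whereas a one-versus-one vote would evaluate all three and count wins.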