Combining classifiers for improved classification of proteins from sequence or structure (Journal Article)


Authors: Melvin, I.; Weston, J.; Leslie, C. S.; Noble, W. S.
Article Title: Combining classifiers for improved classification of proteins from sequence or structure
Abstract: Background: Predicting a protein's structural or functional class from its amino acid sequence or structure is a fundamental problem in computational biology. Recently, there has been considerable interest in using discriminative learning algorithms, in particular support vector machines (SVMs), for classification of proteins. However, because sufficiently many positive examples are required to train such classifiers, all SVM-based methods are hampered by limited coverage. Results: In this study, we develop a hybrid machine learning approach for classifying proteins, and we apply the method to the problem of assigning proteins to structural categories based on their sequences or their 3D structures. The method combines a full-coverage but lower-accuracy nearest neighbor method with higher-accuracy but reduced-coverage multiclass SVMs to produce a full-coverage classifier with overall improved accuracy. The hybrid approach is based on the simple idea of "punting" from one method to another using a learned threshold. Conclusion: In cross-validated experiments on the SCOP hierarchy, the hybrid methods consistently outperform the individual component methods at all levels of coverage. © 2008 Melvin et al.; licensee BioMed Central Ltd.
Keywords: sequence analysis; molecular genetics; methodology; proteins; accuracy; protein analysis; metabolism; biology; classification; protein; structure activity relation; structure-activity relationship; algorithms; prediction; chemistry; amino acid sequence; molecular sequence data; algorithm; sequence alignment; systems integration; protein structure; ultrastructure; sequence analysis, protein; machine learning; learning algorithm; support vector machine; system analysis
Journal Title: BMC Bioinformatics
Volume: 9
ISSN: 1471-2105
Publisher: BioMed Central Ltd
Date Published: 2008-09-22
Start Page: 389
Language: English
DOI: 10.1186/1471-2105-9-389
PUBMED: 18808707
PROVIDER: scopus
PMCID: PMC2561051
Notes: Cited By (since 1996): 11; Export Date: 17 November 2011; CODEN: BBMIC; Source: Scopus
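
The "punting" idea described in the abstract can be illustrated with a short sketch. This is not the authors' implementation: the abstract does not say which score is thresholded or how the threshold is learned, so the code below assumes a primary classifier that returns a label with a confidence score (standing in for the multiclass SVM), a fallback classifier that always returns a label (standing in for nearest neighbor), and a threshold chosen to maximize accuracy on held-out examples. The names `hybrid_predict` and `learn_threshold` are hypothetical.

```python
from typing import Callable, Hashable, Sequence, Tuple

Label = Hashable
# Assumed interface: the primary classifier returns (label, confidence score),
# where a higher score means a more confident prediction.
Primary = Callable[[object], Tuple[Label, float]]
# Assumed interface: the fallback classifier always returns some label.
Fallback = Callable[[object], Label]


def hybrid_predict(x: object, primary: Primary, fallback: Fallback,
                   threshold: float) -> Label:
    """Keep the primary prediction if its score clears the threshold; else punt."""
    label, score = primary(x)
    if score >= threshold:
        return label
    return fallback(x)


def learn_threshold(examples: Sequence[object], labels: Sequence[Label],
                    primary: Primary, fallback: Fallback) -> float:
    """Pick the threshold that maximizes hybrid accuracy on held-out data.

    Assumption: accuracy on a held-out set is the selection criterion.
    """
    scored = [primary(x) for x in examples]
    fallback_preds = [fallback(x) for x in examples]
    # Candidate thresholds: every observed score, plus +inf (always punt).
    candidates = sorted({s for _, s in scored}) + [float("inf")]
    best_t, best_correct = candidates[0], -1
    for t in candidates:
        correct = sum(
            (p if s >= t else f) == y
            for (p, s), f, y in zip(scored, fallback_preds, labels)
        )
        if correct > best_correct:  # ties keep the lowest threshold
            best_t, best_correct = t, correct
    return best_t


# Toy data: each example is (score, primary label, fallback label).
toy_primary = lambda x: (x[1], x[0])
toy_fallback = lambda x: x[2]
held_out = [(0.9, "a", "b"), (0.8, "b", "a"), (0.2, "a", "b"), (0.1, "b", "a")]
truth = ["a", "b", "b", "a"]
t = learn_threshold(held_out, truth, toy_primary, toy_fallback)
```

In the toy data the primary classifier is right only when its score is high, so the learned threshold separates the two regimes and the hybrid classifies all four held-out examples correctly, while either component alone gets two of four.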
MSK Authors
  1. Christina Leslie