ℓ2 Multiple Kernel Fuzzy SVM-Based Data Fusion for Improving Peptide Identification

Ling Jian, Zhonghang Xia, Xinnan Niu, Xijun Liang, Parimal Samir, Andrew J. Link

Research output: Contribution to journal › Article › peer-review

Abstract

SEQUEST is a database-searching engine that calculates the correlation score between an observed spectrum and a theoretical spectrum deduced from protein sequences stored in a flat text file rather than in a relational or object-oriented repository. Nevertheless, the SEQUEST score functions fail to discriminate accurately between true and false peptide-spectrum matches (PSMs). Some approaches, such as PeptideProphet and Percolator, have been proposed to address the task of distinguishing true from false PSMs. However, most of these methods employ time-consuming learning algorithms to validate peptide assignments [1]. In this paper, we propose a fast algorithm for validating peptide identification by incorporating heterogeneous information from SEQUEST scores and peptide digestion knowledge. To automate the peptide identification process and incorporate additional information, we employ ℓ2 multiple kernel learning (MKL) to carry out the peptide identification task. Results on experimental datasets indicate that, compared with the state-of-the-art methods PeptideProphet and Percolator, our data fusion strategy achieves comparable performance while significantly reducing running time.
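
The abstract does not spell out the model, so the following is only a minimal sketch of the two general ideas it names, not the authors' implementation. It assumes that ℓ2 MKL here means combining one kernel per information source with nonnegative weights of unit ℓ2 norm, and that the fuzzy SVM part follows the usual formulation in which each training sample's membership value scales its misclassification penalty. The feature values, the choice of kernels, the raw weights, and all function names are hypothetical.

```python
import math

def linear_kernel(x, y):
    """Inner product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def gram(kernel, xs):
    """Gram matrix of `kernel` over the samples in `xs`."""
    return [[kernel(a, b) for b in xs] for a in xs]

def l2_normalize(weights):
    """Rescale nonnegative kernel weights so that their l2 norm equals 1."""
    norm = math.sqrt(sum(w * w for w in weights))
    return [w / norm for w in weights]

def combine(grams, weights):
    """Weighted sum of Gram matrices: K = sum_m d_m * K_m."""
    n = len(grams[0])
    return [
        [sum(w * g[i][j] for w, g in zip(weights, grams)) for j in range(n)]
        for i in range(n)
    ]

def fuzzy_penalties(memberships, c=1.0):
    """Per-sample penalties s_i * C, with membership s_i in (0, 1]."""
    return [c * s for s in memberships]

# Hypothetical toy data for three PSMs: one feature set standing in for
# SEQUEST scores, another standing in for digestion-related features.
score_features = [[2.1, 0.30], [0.8, 0.05], [3.0, 0.42]]
digest_features = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]

grams = [gram(linear_kernel, score_features), gram(rbf_kernel, digest_features)]
weights = l2_normalize([3.0, 4.0])   # assumed raw weights, one per kernel
combined = combine(grams, weights)   # fused kernel over both sources
penalties = fuzzy_penalties([1.0, 0.4, 0.9], c=10.0)
```

In a full pipeline, the fused kernel and the per-sample penalties would be handed to an SVM solver that accepts a precomputed kernel and sample weights, and the kernel weights would be learned rather than fixed as they are here.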

Original language: English (US)
Article number: 7272074
Pages (from-to): 804-809
Number of pages: 6
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 13
Issue number: 4
State: Published - Jul 1 2016
Externally published: Yes

Keywords

  • Fuzzy SVM
  • Peptide identification
  • mass spectrometry
  • multiple kernel learning
  • peptide-spectrum matches

ASJC Scopus subject areas

  • Biotechnology
  • Genetics
  • Applied Mathematics
