Immunoinformatics

Innovative Computational Techniques for Identification of Novel T-cell Epitopes

A probabilistic meta-predictor for the MHC class II binding peptides.

Oleksiy Karpenko, Lei Huang and Yang Dai.
Immunogenetics, 60:1, pp.25-36, 2008.
Link to PubMed.

Abstract
Several computational methods for the prediction of major histocompatibility complex (MHC) class II binding peptides embodying different strengths and weaknesses have been developed. To provide reliable prediction, it is important to design a system that enables the integration of outcomes from various predictors. The construction of a meta-predictor of this type based on a probabilistic approach is introduced in this paper. The design permits the easy incorporation of results obtained from any number of individual predictors. It is demonstrated that this integrated method outperforms six state-of-the-art individual predictors based on computational studies using MHC class II peptides from 13 HLA alleles and three mouse MHC alleles obtained from the Immune Epitope Database and Analysis Resource. It is concluded that this integrative approach provides a clearly enhanced reliability of prediction. Moreover, this computational framework can be directly extended to MHC class I binding predictions.

Building a Meta-predictor for MHC Class II Binding Peptides.

Lei Huang, Oleksiy Karpenko, Naveen Murugan and Yang Dai.
Immunoinformatics: Predicting Immunogenicity in silico, Flower, D.R. (ed), Humana Press Inc., Totowa, NJ, pp. 355-364, 2007.
Link to PDF.

Abstract
Prediction of class II major histocompatibility complex (MHC)-peptide binding is a challenging task due to the variable length of binding peptides. Different computational methods have been developed; however, each has its own strengths and weaknesses. In order to provide reliable prediction, it is important to design a system that enables the integration of outcomes from various predictors. In this chapter, the procedure of building such a meta-predictor based on a Naive Bayesian approach is introduced. The system is designed in such a way that results obtained from any number of individual predictors can be easily incorporated. This meta-predictor is expected to give users more confidence in the prediction.
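
The Naive Bayesian combination described above can be illustrated with a minimal sketch. It is not the chapter's implementation: it assumes each individual predictor's output has already been binarized to a binder (1) or non-binder (0) call, that the calls are conditionally independent given the true class, and that Laplace smoothing is used. The function names `fit_naive_bayes` and `posterior_binder` are hypothetical.

```python
import math

def fit_naive_bayes(calls, labels, alpha=1.0):
    """Estimate class priors and per-predictor call rates.

    calls  : list of tuples, one 0/1 call per individual predictor (assumed binarized)
    labels : list of 0/1 true classes (1 = binder)
    alpha  : Laplace smoothing constant (an assumption of this sketch)
    """
    n_pred = len(calls[0])
    model = {"prior": {}, "lik": {}}
    for c in (0, 1):
        rows = [x for x, y in zip(calls, labels) if y == c]
        model["prior"][c] = (len(rows) + alpha) / (len(labels) + 2 * alpha)
        # P(predictor j calls "binder" | true class c), smoothed
        model["lik"][c] = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_pred)
        ]
    return model

def posterior_binder(model, call):
    """Posterior probability that a peptide is a binder given all predictor calls."""
    log_score = {}
    for c in (0, 1):
        s = math.log(model["prior"][c])
        for j, v in enumerate(call):
            p = model["lik"][c][j]
            s += math.log(p if v == 1 else 1.0 - p)
        log_score[c] = s
    # Normalize in log space for numerical stability
    m = max(log_score.values())
    z = sum(math.exp(v - m) for v in log_score.values())
    return math.exp(log_score[1] - m) / z

# Toy data: three hypothetical predictors, six peptides
calls = [(1, 1, 1), (1, 1, 0), (1, 0, 1), (0, 0, 0), (0, 1, 0), (0, 0, 1)]
labels = [1, 1, 1, 0, 0, 0]
model = fit_naive_bayes(calls, labels)
print(posterior_binder(model, (1, 1, 1)))
```

Because the model stores one likelihood per predictor, adding another predictor only adds one more factor to the product, which matches the abstract's point that any number of predictors can be incorporated.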

Direct Prediction of T-cell Epitopes Using Support Vector Machines with Novel Sequence Encoding Schemes.

Lei Huang and Yang Dai.
Journal of Bioinformatics and Computational Biology 2006, Vol. 4(1), pp. 93-107.

Abstract
New peptide encoding schemes are proposed to use with support vector machines for the direct recognition of T cell epitopes. The methods enable the presentation of information on (1) amino acid positions in peptides, (2) neighboring side chain interactions, and (3) the similarity between amino acids through a BLOSUM matrix. A procedure of feature selection is also introduced to strengthen the prediction. The computational results demonstrate competitive performance over previous techniques.
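
As a rough illustration of item (1), position information can be represented by a sparse positional encoding, in which each residue becomes a 20-dimensional indicator block. This is a generic sketch, not the specific schemes proposed in the paper; the alphabet ordering and the name `encode_positions` are assumptions. A similarity-based variant along the lines of item (3) would replace each indicator block with the corresponding row of a BLOSUM matrix, which is not reproduced here.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, alphabetical by one-letter code
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

def encode_positions(peptide):
    """Return a flat 0/1 vector of length 20 * len(peptide).

    Entry pos * 20 + k is 1 when the residue at position pos is AMINO_ACIDS[k].
    """
    vec = [0] * (len(peptide) * len(AMINO_ACIDS))
    for pos, aa in enumerate(peptide):
        if aa not in INDEX:
            raise ValueError(f"non-standard residue {aa!r} at position {pos}")
        vec[pos * len(AMINO_ACIDS) + INDEX[aa]] = 1
    return vec

features = encode_positions("ACD")
print(len(features), sum(features))
```

Vectors of this form can be passed directly to a support vector machine, provided all peptides have the same length.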

Prediction of MHC class II binding peptides based on an iterative learning model.

Naveen Murugan and Yang Dai.
Immunome Research 2005, 1:6.
Link to PubMed.

Abstract
Prediction of the binding ability of antigen peptides to major histocompatibility complex (MHC) class II molecules is important in vaccine development. The variable length of each binding peptide complicates this prediction. Motivated by a text mining model designed for building a classifier from labeled and unlabeled examples, we have developed an iterative supervised learning model for the prediction of MHC class II binding peptides. A linear programming (LP) model was employed for the learning task at each iteration, since it is fast and can re-optimize the previous classifier when the training sets are altered. The performance of the new model has been evaluated with benchmark datasets. The outcome demonstrates that the model achieves an accuracy of prediction that is competitive compared to the advanced predictors (the Gibbs sampler and TEPITOPE). The average areas under the ROC curve obtained from one variant of our model are 0.753 and 0.715 for the original and homology reduced benchmark sets, respectively. The corresponding values are respectively 0.744 and 0.673 for the Gibbs sampler and 0.702 and 0.667 for TEPITOPE. The iterative learning procedure appears to be effective in prediction of MHC class II binders. It offers an alternative approach to this important prediction problem.
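
For readers unfamiliar with the metric reported above, the area under the ROC curve equals the probability that a randomly chosen binder receives a higher score than a randomly chosen non-binder, with ties counted as one half. The sketch below computes it directly from that definition. The scores and labels are made-up toy values, not data from the paper, and `roc_auc` is a hypothetical helper name.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of binder and non-binder scores."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one binder and one non-binder")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # binder ranked above non-binder
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))

# Toy example: two binders (label 1) and two non-binders (label 0)
print(roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation.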

Prediction of MHC class II binders using the ant colony search strategy.

Oleksiy Karpenko, Jianming Shi and Yang Dai.
Artificial Intelligence in Medicine, Vol. 35, pp.147-156, 2005.
Link to PubMed.

Abstract
Predictions of the binding ability of antigen peptides to major histocompatibility complex (MHC) class II molecules are important in vaccine development. The variable length of each binding peptide complicates this prediction. Motivated by the search properties of the ant colony system (ACS), a method for the identification of an alignment for a given set of short protein peptides has been developed. This alignment is further used for the derivation of a position specific scoring matrix. The distinguishing feature of this method is the use of the collective optimized search strategy of ants for the selection of the alignment. The performance of the new model has been evaluated with several benchmark datasets. It achieves results better than or comparable to those of existing methods. The experiments demonstrate that the predictive performance of the scoring matrix embodies several promising characteristics.
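
The step from an alignment to a position specific scoring matrix can be sketched as follows. This example does not implement the ant colony search; it assumes the aligned, equal-length segments are already given. The uniform background frequency, the pseudocount of 1, and the names `build_pssm`, `score`, and `best_core` are assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_pssm(aligned, pseudocount=1.0):
    """Log-odds scoring matrix from equal-length aligned segments.

    Uses a uniform background (1/20) and additive pseudocounts, both assumptions.
    """
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("aligned segments must have equal length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(aligned) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        counts = Counter(s[pos] for s in aligned)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm

def score(pssm, segment):
    """Sum of per-position log-odds scores for a segment as long as the matrix."""
    return sum(col[aa] for col, aa in zip(pssm, segment))

def best_core(pssm, peptide):
    """Slide the matrix along a longer peptide; return (best score, offset)."""
    width = len(pssm)
    return max(
        (score(pssm, peptide[i:i + width]), i)
        for i in range(len(peptide) - width + 1)
    )

# Toy alignment of three-residue segments (real binding cores are typically nine residues)
pssm = build_pssm(["ACD", "ACE", "ACD"])
print(best_core(pssm, "GGACDGG"))
```

Sliding the matrix over every window is one simple way to handle the variable peptide length mentioned in the abstract: the peptide's score is taken from its best-scoring window.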