Improving the Prediction of Protein-Protein Interaction Sites Using a Novel Over-Sampling Approach and Predicted Shape Strings

Lan Anh T. Nguyen *

Graduate School of Natural Science and Technology, Kanazawa University, Japan.

Osamu Hirose

Institute of Science and Engineering, Kanazawa University, Japan.

Xuan Tho Dang

Graduate School of Natural Science and Technology, Kanazawa University, Japan.

Tu Kien T. Le

Graduate School of Natural Science and Technology, Kanazawa University, Japan.

Thammakorn Saethang

Graduate School of Natural Science and Technology, Kanazawa University, Japan.

Vu Anh Tran

Graduate School of Natural Science and Technology, Kanazawa University, Japan.

Mamoru Kubo

Institute of Science and Engineering, Kanazawa University, Japan.

Yoichi Yamada

Institute of Science and Engineering, Kanazawa University, Japan.

Kenji Satou

Institute of Science and Engineering, Kanazawa University, Japan.

*Author to whom correspondence should be addressed.


Abstract

Identification of protein-protein interaction (PPI) sites is one of the most challenging tasks in bioinformatics, and many computational methods based on support vector machines have been developed. However, current methods often fail to predict PPI sites, mainly because of the severe imbalance between the numbers of interface and non-interface residues. In this study, we propose a novel over-sampling method that uses local density distributions to mitigate the class-imbalance problem. We applied the proposed method to a PPI dataset that includes 2,829 interface and 24,616 non-interface residues. The experimental results showed a significant improvement in predictive performance compared with other state-of-the-art methods according to six evaluation measures.

Keywords: Protein-protein interaction sites, shape strings, class imbalance, over-sampling


How to Cite

Nguyen, Lan Anh T., Osamu Hirose, Xuan Tho Dang, Tu Kien T. Le, Thammakorn Saethang, Vu Anh Tran, Mamoru Kubo, Yoichi Yamada, and Kenji Satou. 2013. “Improving the Prediction of Protein-Protein Interaction Sites Using a Novel Over-Sampling Approach and Predicted Shape Strings”. Annual Research & Review in Biology 3 (2):92-106. https://journalarrb.com/index.php/ARRB/article/view/1907.
