2009 IEEE International Conference on Systems, Man, and Cybernetics
Abstract
We present a method for predicting the binding between microRNAs (miRNAs) and their target genes. A novel coding scheme for the miRNAs, the binding sites (i.e., the target genes), and the flanking sequences of the binding sites is adopted to encode the relevant information comprehensively. A feature selection method, Minimum Redundancy Maximum Relevance (mRMR), is used to filter out ineffective and redundant features. Because the data are severely imbalanced, a committee of Nearest Neighbor Algorithm (NNA) classifiers is applied so that the data are distributed more evenly between the classes. The final prediction is obtained by voting among the committee members. As a result, 83.33% of the positive samples are correctly identified, with an overall correct prediction rate of 76.78%. Feature analysis, performed by mRMR feature selection with the classifier committee, shows that the seed region of the miRNAs and the flanking sequences of the binding sites play a significant role in the regulation of miRNA binding.
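The committee idea can be illustrated with a minimal sketch. The abstract does not specify how the data are divided among the committee members, so everything below is an assumption made for illustration: the majority (negative) class is split into chunks, each member is a 1-nearest-neighbor classifier trained on all positives plus one chunk, distance is Hamming distance over toy binary feature vectors, and a strict majority of votes is required to predict the positive class. The function names (`hamming`, `nearest_label`, `build_committee`, `predict`) are hypothetical and not taken from the paper.

```python
from collections import Counter


def hamming(a, b):
    """Number of positions at which two equal-length feature vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest_label(train, query):
    """1-NN: return the label of the training sample closest to the query."""
    return min(train, key=lambda sample: hamming(sample[0], query))[1]


def build_committee(positives, negatives):
    """Build one balanced training set per committee member.

    Assumption: the negative class is the majority class. It is split into
    chunks, and each member sees all positives plus one chunk of negatives.
    """
    n_members = max(1, len(negatives) // max(1, len(positives)))
    committee = []
    for i in range(n_members):
        chunk = negatives[i::n_members]  # every n_members-th negative
        committee.append([(x, 1) for x in positives] + [(x, 0) for x in chunk])
    return committee


def predict(committee, query):
    """Majority vote over the members; ties go to the negative class (assumption)."""
    votes = Counter(nearest_label(member, query) for member in committee)
    return 1 if votes[1] > votes[0] else 0


# Toy data: 2 positives, 6 negatives (imbalanced on purpose).
positives = [(1, 1, 1, 0), (1, 1, 0, 1)]
negatives = [(0, 0, 0, 0), (0, 0, 1, 0), (0, 1, 0, 0),
             (0, 0, 0, 1), (1, 0, 0, 0), (0, 0, 1, 1)]
committee = build_committee(positives, negatives)
print(len(committee), predict(committee, (1, 1, 1, 1)))
```

Because every member sees the full positive class but only a fraction of the negatives, no single nearest-neighbor classifier is dominated by the majority class. This is one common way to handle imbalance with an ensemble, not necessarily the scheme used in the paper.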