Journal
MOLECULAR GENETICS AND GENOMICS
Volume 295, Issue 2, Pages 525-534
Publisher
SPRINGER HEIDELBERG
DOI: 10.1007/s00438-019-01642-z
Keywords
Riboswitches; Feature extraction; Sequential blocks; Block location-based feature extraction; Classification; Performance measures
As knowledge of genetics and genome elements increases, so does the demand for bioinformatics tools to analyze these data. Riboswitches are genetic components, usually located in the untranslated regions of mRNAs, that regulate gene expression. Their interaction with antibiotics has also been suggested recently, implying a role in antibiotic effects and resistance. Building on a previously published sequential block finding algorithm, we report here the development of a new block location-based feature extraction (BLBFE) strategy. This procedure uses the locations of family-specific sequential blocks on riboswitch sequences as features. For comparison, the performance of other feature extraction strategies was also investigated, including mono- and dinucleotide frequencies, k-mer, DAC, DCC, DACC, PC-PseDNC-General and SC-PseDNC-General methods. KNN, LDA, naive Bayes, PNN and decision tree classifiers, evaluated with V-fold cross-validation, were applied to every feature extraction method, and their performances were compared. Accuracy, sensitivity, specificity and F-score were measured for each feature extraction method. The proposed strategy classified riboswitches with an average correct classification rate (CCR) of 90.8%. The developed method also achieved an average accuracy of 96.1%, an average sensitivity of 90.8%, an average specificity of 97.52% and an average F-score of 90.69%. These results indicate that the proposed BLBFE method can classify and discriminate riboswitch families with high CCR, accuracy, sensitivity, specificity and F-score values.
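Two of the ingredients named in the abstract can be illustrated with a minimal Python sketch: a k-mer frequency feature vector (one of the baseline strategies, which reduces to mononucleotide frequencies for k = 1 and dinucleotide frequencies for k = 2) and the four reported performance measures. This is not the authors' implementation. The function names `kmer_frequencies` and `one_vs_rest_measures`, the RNA alphabet `ACGU`, the family labels in the test data and the one-vs-rest treatment of each family are all assumptions made for illustration; the abstract does not state how the measures were computed or averaged. BLBFE itself is not reproduced, because the block finding algorithm is not described here.

```python
from itertools import product


def kmer_frequencies(seq, k=2, alphabet="ACGU"):
    """Return normalized k-mer frequencies in fixed lexicographic order.

    Hypothetical helper: treats T as U and skips windows that contain
    characters outside the alphabet (for example N).
    """
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    return [counts[km] / total if total else 0.0 for km in kmers]


def one_vs_rest_measures(y_true, y_pred):
    """Per-class accuracy, sensitivity, specificity and F-score.

    Assumption: each class is scored against all other classes combined.
    """
    n = len(y_true)
    pairs = list(zip(y_true, y_pred))
    measures = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        tn = n - tp - fp - fn
        sens = tp / (tp + fn) if tp + fn else 0.0
        spec = tn / (tn + fp) if tn + fp else 0.0
        prec = tp / (tp + fp) if tp + fp else 0.0
        f_score = 2 * prec * sens / (prec + sens) if prec + sens else 0.0
        measures[c] = {
            "accuracy": (tp + tn) / n,
            "sensitivity": sens,
            "specificity": spec,
            "f_score": f_score,
        }
    return measures


if __name__ == "__main__":
    # Toy labels, not data from the paper.
    truth = ["TPP", "TPP", "SAM", "SAM"]
    predicted = ["TPP", "SAM", "SAM", "SAM"]
    print(kmer_frequencies("ACGUACGU", k=1))
    print(one_vs_rest_measures(truth, predicted))
```

Under this one-vs-rest reading, per-class accuracy counts true negatives from every other family, which is one way an average accuracy (96.1%) can exceed the overall correct classification rate (90.8%). Whether the paper computed its averages this way is not confirmed by the abstract.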