Journal
PEERJ
Volume 5
Publisher
PEERJ INC
DOI: 10.7717/peerj.3219
Keywords
Repeat detection; Random forest; Machine learning; CRISPR; Data visualization
Funding
- Committee on Faculty Research (CRF) Program, Miami University, Oxford, Ohio, USA
- Department of Biology, Miami University, Oxford, Ohio, USA
- Office for the Advancement of Research & Scholarship (OARS), Miami University, Oxford, Ohio, USA
CRISPRs (clustered regularly interspaced short palindromic repeats) are a particular family of repeat sequences found in a wide range of bacterial and archaeal genomes. Several tools are available for detecting CRISPR arrays in the genomes of both domains. Here we present a new web-based CRISPR detection tool named CRF (CRISPR Finder by Random Forest). Unlike other CRISPR detection tools, CRF uses a random forest classifier to filter out invalid CRISPR arrays from all putative candidates, which improves detection accuracy. In particular, triplet elements that combine sequence content and structure information are extracted from CRISPR repeats to train the classifier. The classifier achieved high accuracy and sensitivity. Moreover, CRF offers a highly interactive web interface for data visualization that is not available in other CRISPR detection tools. After detection, the query sequence, the CRISPR array architecture, and the sequences and secondary structures of CRISPR repeats and spacers can be displayed for visual examination and validation. CRF is freely available at http://bioinfolab.miamioh.edu/crf/home.php.
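The abstract does not spell out how the triplet elements are computed, so the following is only a minimal sketch of one plausible encoding, not the authors' confirmed method. It assumes each repeat comes with a dot-bracket secondary structure, and it pairs the middle nucleotide of every three-position window with the paired/unpaired state of those three positions, giving 4 x 8 = 32 features. The names `triplet_features` and `FEATURE_NAMES` are hypothetical.

```python
from itertools import product

NUCLEOTIDES = "ACGU"
# Assumption: "(" and ")" both mean "paired" and are written "(";
# "." means "unpaired".
STRUCT_STATES = "(."

# 4 nucleotides x 2^3 local structure patterns = 32 feature names, e.g. "A(.("
FEATURE_NAMES = [
    nucleotide + "".join(states)
    for nucleotide in NUCLEOTIDES
    for states in product(STRUCT_STATES, repeat=3)
]


def triplet_features(sequence, structure):
    """Return a 32-element vector of normalized triplet-element frequencies.

    `sequence` is a repeat sequence (DNA or RNA letters) and `structure`
    is its dot-bracket secondary structure of the same length.
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")

    seq = sequence.upper().replace("T", "U")
    struct = structure.replace(")", "(")

    counts = dict.fromkeys(FEATURE_NAMES, 0)
    total = 0
    # Slide a window of three positions; the middle base labels the triplet.
    for i in range(1, len(seq) - 1):
        key = seq[i] + struct[i - 1:i + 2]
        if key in counts:  # skips ambiguous bases such as N
            counts[key] += 1
            total += 1

    # Normalize so repeats of different lengths are comparable.
    return [counts[name] / total if total else 0.0 for name in FEATURE_NAMES]


# Hypothetical toy repeat with a small hairpin structure.
vector = triplet_features("GCGAAAGC", "((....))")
print(len(vector))
```

Vectors of this kind, labeled as valid or invalid repeats, could then be used to train a random forest (for example scikit-learn's `RandomForestClassifier`) that filters putative CRISPR arrays, which is the role the abstract assigns to the classifier.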