Journal
PLANT BIOTECHNOLOGY
Volume 26, Issue 5, Pages 469-477
Publisher
JAPANESE SOC PLANT CELL & MOLECULAR BIOL
DOI: 10.5511/plantbiotechnology.26.469
Keywords
Batch-learning SOM; oligonucleotide frequency; random sequences; genome signature
Funding
- Ministry of Education, Culture, Sports, Science and Technology of Japan
Abstract
Novel tools are needed for comprehensive comparisons of the inter- and intraspecies characteristics of the large number of available genome sequences. An unsupervised neural network algorithm, Kohonen's Self-Organizing Map (SOM), is an effective tool for clustering and visualizing high-dimensional complex data on a single map. We modified the conventional SOM for genome informatics on the basis of Batch-Learning SOM (BLSOM), making the resulting map independent of the order of data input. We generated BLSOMs for oligonucleotide frequencies in fragment sequences (e.g., 10 kb) from 13 plant genomes for which almost complete genome sequences are available. BLSOM recognized species-specific characteristics (key combinations of oligonucleotide frequencies) in most of the fragment sequences, permitting classification (self-organization) of sequences according to species without any species information being supplied during computation. To disclose the sequence characteristics of a single genome independently of other genomes, we constructed BLSOMs for sequence fragments from one genome plus computer-generated random sequences. Genomic sequences were clearly separated from random sequences, revealing the oligonucleotides with characteristic occurrence levels in the genomic sequences. We discuss these oligonucleotides, which are diagnostic for genomic sequences, in connection with genetic signal sequences. Because its classification and visualization power is very high, BLSOM is thought to be an efficient and powerful tool for extracting a wide range of genomic information.
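The two ideas the abstract relies on, oligonucleotide frequency vectors and a batch-style map update that does not depend on input order, can be illustrated with a small sketch. This is not the authors' BLSOM implementation. The function names, the 1-D map, the evenly spaced weight initialization, the Gaussian neighborhood, and the toy sequences are all assumptions made for illustration only.

```python
import itertools
import math
import random


def kmer_frequencies(seq, k=2):
    """Return normalized k-mer frequencies in lexicographic order (A, C, G, T)."""
    kmers = ["".join(p) for p in itertools.product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing N or other symbols
            counts[window] += 1
    total = sum(counts.values())
    return [counts[m] / total if total else 0.0 for m in kmers]


def best_matching_unit(weights, v):
    """Index of the node whose weight vector is closest to v (squared Euclidean)."""
    return min(range(len(weights)),
               key=lambda j: sum((w - x) ** 2 for w, x in zip(weights[j], v)))


def batch_som(data, n_nodes=4, n_epochs=20, sigma_start=2.0, sigma_end=0.5):
    """Train a 1-D batch SOM (requires n_nodes >= 2).

    Weights are replaced only after every input has been assigned to a node,
    so the result does not depend on the order of the inputs.
    """
    dim = len(data[0])
    lo = [min(v[d] for v in data) for d in range(dim)]
    hi = [max(v[d] for v in data) for d in range(dim)]
    # Assumed initialization: nodes spaced evenly between the per-dimension
    # minimum and maximum, which is itself independent of input order.
    weights = [[lo[d] + (hi[d] - lo[d]) * j / (n_nodes - 1) for d in range(dim)]
               for j in range(n_nodes)]
    for epoch in range(n_epochs):
        frac = epoch / max(n_epochs - 1, 1)
        sigma = sigma_start + (sigma_end - sigma_start) * frac  # shrink neighborhood
        num = [[0.0] * dim for _ in range(n_nodes)]
        den = [0.0] * n_nodes
        for v in data:
            bmu = best_matching_unit(weights, v)
            for j in range(n_nodes):
                h = math.exp(-((j - bmu) ** 2) / (2 * sigma ** 2))
                den[j] += h
                for d in range(dim):
                    num[j][d] += h * v[d]
        # Batch step: each node becomes the neighborhood-weighted mean of all inputs.
        weights = [[num[j][d] / den[j] for d in range(dim)] for j in range(n_nodes)]
    return weights


rng = random.Random(0)
# Toy stand-ins for fragments (NOT real genomic data): AT-rich versus uniform.
biased = ["".join(rng.choices("ACGT", weights=[4, 1, 1, 4], k=1000)) for _ in range(10)]
uniform = ["".join(rng.choices("ACGT", k=1000)) for _ in range(10)]
vectors = [kmer_frequencies(s, k=2) for s in biased + uniform]
som = batch_som(vectors)
labels = [best_matching_unit(som, v) for v in vectors]  # node assigned to each fragment
```

In this sketch, `labels` gives the map node for each toy fragment. With real data, fragments of about 10 kb and longer oligonucleotides would take the place of the toy inputs, and a 2-D map would normally be used for visualization.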