BMC Bioinformatics
Freilich, S., The Blavatnik School of Computer Sciences and School of Medicine, Tel-Aviv University, Tel-Aviv 69978, Israel
Goldovsky, L., Computational Genomics Unit, Institute of Agrobiotechnology, Centre for Research and Technology Hellas CERTH, GR-57001 Thessalonica, Greece
Gottlieb, A., School of Physics and Astronomy, Tel-Aviv University, Tel-Aviv 69978, Israel
Blanc, E., King's College Centre for Bioinformatics (KCBI), School of Physical Sciences and Engineering, King's College London, Strand London WC2R 2LS, United Kingdom, MRC Centre for Developmental Neurobiology, New Hunt's House, King's College London, Guy's Campus, London WC2R 2LS, United Kingdom
Tsoka, S., King's College Centre for Bioinformatics (KCBI), School of Physical Sciences and Engineering, King's College London, Strand London WC2R 2LS, United Kingdom
Ouzounis, C.A., Computational Genomics Unit, Institute of Agrobiotechnology, Centre for Research and Technology Hellas CERTH, GR-57001 Thessalonica, Greece, King's College Centre for Bioinformatics (KCBI), School of Physical Sciences and Engineering, King's College London, Strand London WC2R 2LS, United Kingdom
Background: Previous methods of detecting the taxonomic origins of arbitrary sequence collections, which have a significant impact on genome analysis and in particular metagenomics, have primarily focused on compositional features of genomes. The evolutionary patterns of phylogenetic distribution of genes or proteins, represented by phylogenetic profiles, provide an alternative approach for the detection of taxonomic origins, but typically suffer from low accuracy. Herein, we present rank-BLAST, a novel approach for the assignment of protein sequences into genomic groups of the same taxonomic origin, based on the ranking order of phylogenetic profiles of target genes or proteins across the reference database.

Results: The rank-BLAST approach is validated by computing the phylogenetic profiles of all sequences for five distinct microbial species of varying degrees of phylogenetic proximity, against a reference database of 243 fully sequenced genomes. The approach - a combination of sequence searches, statistical estimation and clustering - analyses the degree of sequence divergence between sets of protein sequences and classifies protein sequences according to their species of origin with high accuracy, taxonomically assigning 64% of the proteins studied. In most cases, a main cluster is detected, representing the corresponding species. Secondary, functionally distinct and species-specific clusters exhibit different patterns of phylogenetic distribution, thus flagging gene groups of interest. Detailed analyses of such cases are provided as examples.

Conclusion: Our results indicate that the rank-BLAST approach can capture the taxonomic origins of sequence collections in an accurate and efficient manner. The approach can be useful both for the analysis of genome evolution and the detection of species groups in metagenomics samples. © 2009 Freilich et al; licensee BioMed Central Ltd.
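To make the idea of a ranked phylogenetic profile concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes each protein already has one similarity score per reference genome (for example, a best-hit bit score from a sequence search), turns those scores into a ranking of genomes, and compares two rankings with a normalised Spearman footrule distance. The function names, the genome labels, the scores, and the choice of distance are all hypothetical; the statistical estimation and clustering steps described in the abstract are not reproduced here.

```python
from typing import Dict, List


def ranked_profile(scores: Dict[str, float]) -> List[str]:
    """Order reference genomes from best to worst score (ties broken by name)."""
    return sorted(scores, key=lambda genome: (-scores[genome], genome))


def rank_distance(profile_a: List[str], profile_b: List[str]) -> float:
    """Normalised Spearman footrule distance between two rankings.

    Both rankings must contain the same genomes. Returns 0.0 for identical
    orderings and 1.0 for the most dissimilar ordering.
    """
    position_b = {genome: i for i, genome in enumerate(profile_b)}
    total = sum(abs(i - position_b[genome]) for i, genome in enumerate(profile_a))
    n = len(profile_a)
    max_total = (n * n) // 2  # largest possible footrule distance for n items
    return total / max_total if max_total else 0.0


# Hypothetical scores for three proteins against three reference genomes.
protein_1 = {"genomeA": 250.0, "genomeB": 120.0, "genomeC": 40.0}
protein_2 = {"genomeA": 310.0, "genomeB": 95.0, "genomeC": 60.0}
protein_3 = {"genomeA": 30.0, "genomeB": 80.0, "genomeC": 200.0}

rank_1 = ranked_profile(protein_1)
rank_2 = ranked_profile(protein_2)
rank_3 = ranked_profile(protein_3)

print(rank_distance(rank_1, rank_2))  # same genome ordering
print(rank_distance(rank_1, rank_3))  # reversed genome ordering
```

In this toy setting, proteins whose genome rankings are close would be candidates for the same genomic group, while a protein with a very different ranking would fall into a separate group. Any real grouping would need a proper clustering step and a much larger reference set.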
Stratification of co-evolving genomic groups using ranked phylogenetic profiles
Volume 10
Scientific Publication