Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, V5A 1S6, Canada
OrtholugeDB provides ortholog predictions for completely sequenced bacterial and archaeal genomes from NCBI based on reciprocal best-BLAST hits, supplemented with further evaluation by the more precise Ortholuge method. The OrtholugeDB resource facilitates improved microbial comparative analysis by offering:
A unique approach to ortholog prediction, called Ortholuge, that focuses on identifying the subset of orthologs best suited for comparative genomics applications rather than all evolutionary orthologs. Ortholuge evaluates predicted orthologs, identifying those whose divergence reflects the species level of divergence. These orthologs are more likely to have similar functions. OrtholugeDB makes Ortholuge-based ortholog predictions readily available to the microbial research community. Note, however, that Ortholuge evaluations are available only when a suitable reference outgroup species exists; when none exists, predictions based on reciprocal best-BLAST hits (RBB) are still presented.
In-paralog predictions based on the InParanoid method. In-paralogs are genes that arose by duplication after species divergence. Gene duplication can alter gene function (for example, through subfunctionalization).
Pre-computed ortholog groups constructed at multiple hierarchical levels that represent consistent phylogenetic groups. Instead of relying on taxonomic classifications such as genus or family, which vary widely in evolutionary relatedness, we computed consistent groupings using a distance metric. This approach allows researchers to select the ortholog group level with the desired scope and accuracy (ortholog groups constructed for closely related species are often more accurate).
Features that facilitate the rapid extraction of bacterial and archaeal orthologs. OrtholugeDB provides multiple types of queries that retrieve orthologs for anything from a single gene to multiple genomes. These search interfaces are designed to be straightforward and easy to use. We also provide the ability to perform complex queries that identify genes that have or lack orthologs in multiple other species. Result pages provide visualizations, such as the ortholog group graph view, the phyletic matrix view and the gene context view, that help with evaluating and conceptualizing ortholog data. Data can be downloaded in multiple formats.
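The complex queries mentioned above (genes that have orthologs in some species and lack them in others) can be illustrated with a small sketch over downloaded ortholog data. This is not OrtholugeDB's interface or data schema: the function names, the gene identifiers and the genome names are all hypothetical, and the input is assumed to be a simple mapping from each gene of a reference genome to the set of genomes in which an ortholog was predicted.

```python
def genes_with_profile(orthologs, must_have, must_lack):
    """Return genes that have an ortholog in every genome of `must_have`
    and in none of the genomes of `must_lack`.

    `orthologs` maps a gene id to the set of genome names in which an
    ortholog was predicted (an assumed, simplified input format).
    """
    must_have = set(must_have)
    must_lack = set(must_lack)
    return sorted(
        gene
        for gene, genomes in orthologs.items()
        if must_have <= genomes and not (must_lack & genomes)
    )


def phyletic_matrix(orthologs, genomes):
    """Build a presence/absence row (1 or 0 per genome) for each gene."""
    return {
        gene: [int(genome in present) for genome in genomes]
        for gene, present in orthologs.items()
    }


# Hypothetical toy data: four genes from a reference genome.
orthologs = {
    "geneA": {"genome1", "genome2", "genome3"},
    "geneB": {"genome1", "genome2"},
    "geneC": {"genome3"},
    "geneD": set(),
}

# Genes conserved in genome1 and genome2 but absent from genome3.
selected = genes_with_profile(orthologs, ["genome1", "genome2"], ["genome3"])
matrix = phyletic_matrix(orthologs, ["genome1", "genome2", "genome3"])

print(selected)
for gene, row in sorted(matrix.items()):
    print(gene, row)
```

In this toy data only geneB matches the requested profile. The presence/absence rows are the same kind of information that a phyletic matrix view summarizes graphically.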
OrtholugeDB will be routinely updated to incorporate newly sequenced bacterial and archaeal genomes in NCBI.
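The reciprocal best-BLAST hit criterion that underlies the initial ortholog predictions can be sketched in a few lines. This is a minimal illustration of the general technique, not OrtholugeDB's pipeline: it assumes the BLAST results for each search direction have already been parsed into (query, subject, bit score) tuples, it applies no score or coverage cutoffs, and the gene identifiers and scores are invented.

```python
def best_hits(hits):
    """Map each query gene to its highest-scoring subject gene.

    `hits` is an iterable of (query, subject, bit_score) tuples.
    On a tied score the first hit encountered is kept.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted(
        (a, b) for a, b in best_ab.items() if best_ba.get(b) == a
    )


# Hypothetical toy hits between genome A and genome B.
a_vs_b = [
    ("a1", "b1", 250.0),
    ("a1", "b2", 90.0),
    ("a2", "b2", 180.0),
    ("a3", "b2", 120.0),
]
b_vs_a = [
    ("b1", "a1", 248.0),
    ("b2", "a2", 181.0),
    ("b3", "a3", 75.0),
]

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

Here a3 finds b2 as its best hit, but b2's best hit is a2, so a3 is left without a predicted ortholog. Pairs that pass this reciprocal test are the candidates that a further evaluation step, such as Ortholuge, would then assess.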