
COG - Eukaryotic Orthologous Groups of proteins

NAR Molecular Biology Database Collection entry number 400
Tatusov, R.L., Natale, D.A., Fedorova, N.D., Jackson, J., Jacobs, A., Krylov, D.M., Mekhedov, S.L., Nikolskaya, A.N., Rao, B.S., Wolf, Y.I., Aravind, L., Lanczycki, C., Mazumder, R., Sreekumar, K., Vasudevan, S., Walker, D.R., Tatusova, T.A., Yao, K., Yin, J., Koonin, E.V.
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA

Database Description

A eukaryotic version of the Clusters of Orthologous Groups of proteins (COGs), referred to as KOGs, is currently available for seven (nearly) complete eukaryotic genomes: the yeasts Saccharomyces cerevisiae and Schizosaccharomyces pombe, the microsporidian Encephalitozoon cuniculi, the green plant Arabidopsis thaliana, the nematode Caenorhabditis elegans, the fruit fly Drosophila melanogaster, and Homo sapiens. The current set consists of 4,852 KOGs, which include 60,579 proteins. Detailed analysis of the KOGs revealed various trends in the evolution of eukaryotic genomes, including widely different, lineage-specific propensities for gene loss. Manual validation and annotation of the KOGs, and an update to include additional eukaryotic genomes, are underway.

Recent Developments

The KOG database can now be searched using RPS-BLAST through the Conserved Domain Database web site.


References

1. Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, Krylov DM, Mazumder R, Mekhedov SL, Nikolskaya AN, Rao BS, Smirnov S, Sverdlov AV, Vasudevan S, Wolf YI, Yin JJ, Natale DA. (2003) The COG database: an updated version includes eukaryotes. BMC Bioinformatics 4(1): 41.

Go to the abstract in the NAR 2001 Database Issue.