SCOP2 - Structural Classification Of Proteins

NAR Molecular Biology Database Collection entry number 282
Antonina Andreeva, Dave Howorth, Cyrus Chothia, Eugene Kulesha and Alexey G. Murzin
MRC Centre for Protein Engineering and MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK

Database Description

The SCOP (Structural Classification of Proteins) database [1-3] provides a comprehensive and detailed description of the evolutionary and structural relationships of proteins of known structure. SCOP embodies an evolutionary classification produced by human experts. This allows us to use a theory of protein evolution that encompasses our knowledge of the great variety, and the full extent, of the different types of changes that take place during evolution. So far, automatic classification procedures can measure only a few of these types of change, and not their full extent, which makes them less powerful or less accurate. SCOP extends the ability of humans to analyse and interpret the data and makes invaluable knowledge of the protein evolutionary repertoire broadly available to researchers.

The fundamental unit of classification in the SCOP database is the protein domain. A domain is defined as an evolutionary unit observed in nature either in isolation or in more than one context in multidomain proteins. The protein domains are classified hierarchically into families, superfamilies, folds and classes, whose meaning has been discussed before [1]. The first official SCOP release nine years ago comprised 3179 protein domains grouped into 498 families, 366 superfamilies and 279 folds [1]. The seven main classes in the latest release (1.65) contain 40452 domains organised into 2327 families, 1294 superfamilies and 800 folds. These domains correspond to 20619 entries in the Protein Data Bank (PDB) [4] and one literature reference to a structure with unpublished coordinates. Statistics of the current and previous releases, summaries and full histories of changes and other information are available from the SCOP website, together with parseable files encoding all SCOP data [2]. The sequences and PDB-style structures of SCOP domains are available from the ASTRAL compendium [5], and hidden Markov models of SCOP domains are available from the SUPERFAMILY database [6].
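The hierarchy described above (domains grouped into families, superfamilies, folds and classes) can be illustrated with a minimal Python sketch. It is not the format of the SCOP parseable files and not code from the SCOP authors: the `DomainRecord` type, the `count_groups` helper and every identifier in the sample data are hypothetical placeholders, chosen only to show how per-level totals like those quoted for each release could be derived from a flat list of classified domains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainRecord:
    """One classified domain. All names and values here are illustrative."""
    domain_id: str
    klass: str        # "class" is a reserved word in Python
    fold: str
    superfamily: str
    family: str


# Levels from the top of the hierarchy down to the family.
LEVELS = ("klass", "fold", "superfamily", "family")


def count_groups(records):
    """Count distinct groups per level.

    A group is identified by its full path from the class downwards, so two
    families with the same label under different superfamilies stay distinct.
    """
    seen = {level: set() for level in LEVELS}
    for rec in records:
        path = ()
        for level in LEVELS:
            path += (getattr(rec, level),)
            seen[level].add(path)
    counts = {level: len(paths) for level, paths in seen.items()}
    counts["domain"] = len({rec.domain_id for rec in records})
    return counts


# Made-up records, not real SCOP entries.
records = [
    DomainRecord("dom1", "class-A", "fold-1", "sf-1", "fam-1"),
    DomainRecord("dom2", "class-A", "fold-1", "sf-1", "fam-2"),
    DomainRecord("dom3", "class-A", "fold-2", "sf-2", "fam-3"),
    DomainRecord("dom4", "class-B", "fold-3", "sf-3", "fam-4"),
]

summary = count_groups(records)
print(summary)
```

Keying each group by its full path rather than by its label alone is a design choice of this sketch: it keeps the counts correct even if labels happen to repeat in different branches of the hierarchy.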


1. Murzin, A.G., Brenner, S.E., Hubbard, T.J.P. and Chothia, C. (1995) SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 247, 536-540.
2. Lo Conte, L., Brenner, S.E., Hubbard, T.J.P., Chothia, C. and Murzin, A.G. (2002) SCOP database in 2002: refinements accommodate structural genomics. Nucleic Acids Res. 30, 264-267.
3. Andreeva, A., Howorth, D., Brenner, S.E., Hubbard, T.J.P., Chothia, C. and Murzin, A.G. (2004) SCOP database in 2004: refinements integrate structure and sequence family data. Nucleic Acids Res. 32, [??-??].
4. Westbrook, J., Feng, Z., Jain, S., Bhat, T.N., Thanki, N., Ravichandran, V., Gilliland, G.L., Bluhm, W., Weissig, H., Greer, D.S., Bourne, P.E. and Berman, H.M. (2002) The Protein Data Bank: unifying the archive. Nucleic Acids Res. 30, 245-248.
5. Chandonia, J.M., Hon, G., Walker, N.S., Lo Conte, L., Koehl, P., Levitt, M. and Brenner, S.E. (2004) The ASTRAL compendium in 2004. Nucleic Acids Res. 32, [xx-xx].
6. Madera, M., Vogel, C., Kummerfeld, S.K., Chothia, C. and Gough, J. (2004) The SUPERFAMILY database in 2004: additions and improvements. Nucleic Acids Res. 32, [xx-xx].

Subcategory: Protein structure
