ArchDB
NAR Molecular Biology Database Collection entry number 413
Espadaler, J.1,2, Fernandez-Fuentes, N.1,3, Hermoso, A.1, Querol, E.1, Aviles, F.X.1, Sternberg, M.J.E.3, Oliva, B.2
1 Laboratori de Bioinformàtica. Institut de Biomedicina I Biotecnologia. Universitat Autònoma de Barcelona. Bellaterra 08193. Spain.
2 Laboratori de Bioinformàtica Estructural (GRIB-IMIM). Universitat Pompeu Fabra. Barcelona 08003. Spain.
3 Structural Bioinformatics Group. Biochemistry Building Department of Biological Sciences. Imperial College. London SW7 2AZ. United Kingdom.
Contact boliva@imim.es
Database Description
ArchDB (http://sbi.imim.es/archdb) is a relational database of classified structural motifs, each consisting of one loop plus its bracing secondary structures. Motifs have been classified using an improved version[1] of ArchType[2], an automated loop classification program. ArchType has been successfully applied to the SCOP95 set from the SCOP 1.65 release[3]. The previous non-redundant classification[1], based on the SCOP40 set, has also been updated. Currently, the SCOP90-based classification contains 46894 motifs clustered into 2029 classes and 5111 subclasses, and the SCOP40-based classification contains 24356 motifs clustered into 1221 classes and 3575 subclasses. In addition, a manually curated classification of a subset of kinase proteins, ArchKI, has been included[4]. Functional annotations for residues found within clustered motifs, such as small ligand contacts (within 6.0 Å) and PDB SITE and ACTSITE records[5], have been incorporated into ArchDB. SCOP domain classification and protein functional descriptors such as EC numbers[6] and GO terms[7] are also reported. Subclasses displaying conserved functional or structural annotations (above 75% conservation) are reported.
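The 6.0 Å ligand-contact criterion can be illustrated with a minimal sketch. It is not the ArchDB implementation: the function name, the input layout and the choice of "any protein atom within the cutoff of any ligand atom, cutoff inclusive" are assumptions made here for illustration, since the text above states only the distance.

```python
import math

CONTACT_CUTOFF = 6.0  # Angstrom; the cutoff quoted in the description


def residues_in_contact(protein_atoms, ligand_atoms, cutoff=CONTACT_CUTOFF):
    """Return the sorted residue ids that have at least one atom within
    `cutoff` of any ligand atom (hypothetical helper, inclusive cutoff).

    protein_atoms: iterable of (residue_id, (x, y, z)) tuples
    ligand_atoms:  iterable of (x, y, z) tuples
    """
    ligand_atoms = list(ligand_atoms)
    contacts = set()
    for residue_id, coord in protein_atoms:
        if residue_id in contacts:
            continue  # residue already flagged by another of its atoms
        if any(math.dist(coord, lig) <= cutoff for lig in ligand_atoms):
            contacts.add(residue_id)
    return sorted(contacts)


# Toy coordinates, invented for the example.
protein = [
    (10, (3.0, 4.0, 0.0)),   # 5.0 A from the ligand atom
    (11, (6.1, 0.0, 0.0)),   # 6.1 A, just outside the cutoff
    (12, (0.0, 0.0, 6.0)),   # exactly 6.0 A
]
ligand = [(0.0, 0.0, 0.0)]
contact_ids = residues_in_contact(protein, ligand)
```

With these toy coordinates, residues 10 and 12 are flagged and residue 11 is not. A real pipeline would read the coordinates from PDB ATOM and HETATM records and would likely use a spatial index rather than an all-against-all distance scan.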
Recent Developments
Residues involved in small ligand contacts (within 6.0 Å) are reported for every motif. PDB records have been parsed, and PDB SITE and ACTSITE information concerning residues found within clustered motifs is also reported. Subclasses in which either small ligand contacts or PDB information is conserved in at least 75% of the member motifs are annotated as potential functional subclasses. Conserved protein annotations within a subclass may also suggest a structural or functional link. Therefore, subclasses in which SCOP identifiers, GO terms or EC numbers are conserved above a user-chosen percentage are reported on the web site. Ongoing developments include an enzyme-specific classification set, a bootstrapping-based method to identify putative functional motifs, and a loop search algorithm, ArchFit. The algorithm performs rigid-body fitting of the bracing structures and scores sequences with conformation-specific sequence substitution statistical potentials combined with the subclass-specific sequence profile (manuscript in preparation).
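The conservation rule for subclasses can be sketched as follows. This is an illustrative reading of the text, not the ArchDB code: the function name and the data layout (one set of annotation labels per member motif) are assumptions, and the threshold is exposed as a parameter to mirror the user-chosen percentage mentioned above.

```python
from collections import Counter


def conserved_annotations(motif_annotations, threshold=0.75):
    """Return {annotation: fraction} for every annotation present in at
    least `threshold` of the member motifs of one subclass.

    motif_annotations: list with one collection of labels per motif
    (hypothetical layout; labels may be GO terms, EC numbers, etc.).
    """
    n_motifs = len(motif_annotations)
    if n_motifs == 0:
        return {}
    counts = Counter()
    for annotations in motif_annotations:
        counts.update(set(annotations))  # count each label once per motif
    return {
        label: count / n_motifs
        for label, count in counts.items()
        if count / n_motifs >= threshold
    }


# Toy subclass with four member motifs; the labels are example values only.
subclass = [
    {"GO:0005524", "EC 2.7.11.1"},
    {"GO:0005524"},
    {"GO:0005524", "EC 2.7.11.1"},
    {"GO:0005524", "EC 3.1.1.1"},
]
conserved = conserved_annotations(subclass)
relaxed = conserved_annotations(subclass, threshold=0.5)
```

In this toy subclass, only GO:0005524 (present in all four motifs) passes the default 75% threshold; lowering the threshold to 50% also admits EC 2.7.11.1, which is present in two of the four motifs.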
Acknowledgements
B.O. acknowledges grants from Fundación Areces (Spain) and from MCYT (Ministerio de Ciencia y Tecnología, Spain; ref. BIO2002-03609). F.X.A. and E.Q. acknowledge grants from MCYT (BIO2001-246 and BIO2001-264) and from CERBA (Centre de Referència en Biotecnologia, Generalitat de Catalunya). J.E. and N.F.F. acknowledge support from predoctoral fellowships from the Generalitat de Catalunya and MCYT (Spain).
References
1. Espadaler, J., et al., ArchDB: Automated protein loop classification as a tool for Structural Genomics. Nucleic Acids Res, 2004. 32: p. D185-D188.
2. Oliva, B., et al., An automated classification of the structure of protein loops. J Mol Biol, 1997. 266(4): p. 814-830.
3. Lo Conte, L., et al., SCOP database in 2002: refinements accommodate structural genomics. Nucleic Acids Res, 2002. 30(1): p. 264-267.
4. Fernandez-Fuentes, N., et al., Classification of common functional loops of kinase super-families. Proteins, 2004. 56(3): p. 539-555.
5. Westbrook, J., et al., The Protein Data Bank and structural genomics. Nucleic Acids Res, 2003. 31(1): p. 489-491.
6. Kotyk, A., IUPAC-IUBMB Joint Commission on Biochemical Nomenclature (JCBN) and Nomenclature Committee of IUBMB (NC-IUBMB). Newsletter 1999. Folia Microbiol (Praha), 1999. 44(3): p. 243-246.
7. Ashburner, M., et al., Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet, 2000. 25(1): p. 25-29.
Category: Structure Databases
Subcategory: Protein structure
Go to the abstract in the NAR 2004 Database Issue.