NAR Molecular Biology Database Collection entry number 217
Vlahovicek, K.(1), Kajan, L.(1), Ágoston, V.(2), Pongor, S.(1)
(1) International Centre for Genetic Engineering and Biotechnology, Area Science Park, 34012 Trieste, Italy
(2) National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA

Database Description

SBASE is an on-line collection of protein domain sequences and related computational tools designed to facilitate the detection of domain homologies by simple database search. The tenth, "jubilee" release of the SBASE library of protein domain sequences contains 1,052,904 annotated structural, functional, ligand-binding and topogenic segments of proteins, clustered into over 6000 domain groups. Domain identification and functional prediction are based on comparing BLAST search outputs with a knowledge base of biologically significant similarities within the known domain groups. The knowledge base is generated automatically for each domain group by comparing within-group (“self”) and out-of-group (“non-self”) similarities. This is a memory-based approach in which group-specific similarity functions are learned automatically from the database [Stanfill and Waltz, 1986].
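
The self versus non-self comparison described above can be illustrated with a minimal Python sketch. It is not the SBASE implementation: the entry does not specify the form of the group-specific similarity functions, so the sketch assumes the simplest possible one, a single score threshold per group placed midway between the weakest within-group score and the strongest out-of-group score. The names GroupModel, learn_group_model and predict_domains, the group labels and all score values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class GroupModel:
    """Hypothetical knowledge-base entry for one domain group."""
    group: str
    threshold: float  # minimum score treated as a significant similarity


def learn_group_model(group, self_scores, non_self_scores):
    """Learn a threshold from within-group and out-of-group scores.

    Assumption: the threshold is the midpoint between the weakest
    "self" score and the strongest "non-self" score. The real
    similarity functions may be considerably more elaborate.
    """
    weakest_self = min(self_scores)
    strongest_non_self = max(non_self_scores)
    return GroupModel(group, (weakest_self + strongest_non_self) / 2.0)


def predict_domains(hits, models):
    """Return the groups whose best hit reaches the learned threshold.

    hits: (group, score) pairs, e.g. parsed from a search output.
    models: dict mapping group name to GroupModel.
    """
    best = {}
    for group, score in hits:
        best[group] = max(score, best.get(group, float("-inf")))
    return sorted(
        g for g, s in best.items()
        if g in models and s >= models[g].threshold
    )


# Toy data: all scores are invented for the example.
models = {
    "SH2": learn_group_model("SH2", [120.0, 95.0, 150.0], [40.0, 55.0]),
    "kinase": learn_group_model("kinase", [300.0, 210.0], [80.0, 110.0]),
}
hits = [("SH2", 88.0), ("kinase", 130.0), ("SH2", 60.0)]
print(predict_domains(hits, models))  # ['SH2']
```

In this toy run the SH2 threshold is 75 and the kinase threshold is 160, so only the SH2 hit with score 88 is reported as significant. Because each group gets its own threshold learned from its own score distributions, a score that is significant for one group can be rejected for another.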

Recent Developments

Database system migrated to SQL
Complete rewrite of the WWW interface, including the domain-prediction system


Stanfill, C. and Waltz, D. Communications of the ACM, 29:1213-1228, 1986

