NAR Molecular Biology Database Collection entry number 199
TIGRFAMs is a collection of manually curated protein families consisting of hidden Markov models (HMMs), multiple sequence alignments, Gene Ontology (GO) assignments, commentary, literature references and pointers to related TIGRFAMs, Pfam and InterPro models. These models are designed to support both automated and manually curated annotation of genomes. TIGRFAMs contains models of full-length proteins as well as domains at the levels of superfamilies, subfamilies and equivalogs (sets of homologous proteins that have conserved function since their last common ancestor). TIGRFAMs models may be hierarchically nested to yield the maximum amount of information for the annotation process. TIGRFAMs are thus complementary to Pfam models, which are designed to represent non-overlapping structural domains. The TIGRFAMs database is integrated with the prokaryotic genome annotation pipeline at TIGR and is therefore constantly updated with new information on protein function, model scope and performance. TIGRFAMs currently contains over 1600 protein families, having doubled in size in two years. TIGRFAMs is available for searching or downloading at www.tigr.org/TIGRFAMs.
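As a rough illustration of how nested models can add information during annotation, the sketch below picks the most specific hit when a protein matches models at several levels. It is not the TIGR pipeline: the `Hit` record, the `most_specific_hit` helper, the numeric ranking of levels and the placeholder model names are all assumptions made for this example, and every hit is assumed to have already passed its model's score cutoff.

```python
from dataclasses import dataclass

# Level names come from the description above; the numeric ranks are an
# assumption made for this illustration (higher means more specific).
SPECIFICITY = {"superfamily": 0, "subfamily": 1, "equivalog": 2}


@dataclass(frozen=True)
class Hit:
    model_id: str  # placeholder name, not a real TIGRFAMs accession
    level: str     # one of the keys in SPECIFICITY
    score: float   # HMM score, assumed to be above the model's cutoff


def most_specific_hit(hits):
    """Return the hit at the most specific level, breaking ties by score."""
    if not hits:
        return None
    return max(hits, key=lambda h: (SPECIFICITY[h.level], h.score))


# One protein matching three nested models (hypothetical values).
hits = [
    Hit("MODEL_A", "superfamily", 310.2),
    Hit("MODEL_B", "subfamily", 255.0),
    Hit("MODEL_C", "equivalog", 198.7),
]

best = most_specific_hit(hits)
print(best.model_id, best.level)
```

In this sketch the equivalog hit is chosen even though its score is the lowest, because an equivalog match supports a specific functional assignment while the broader levels do not.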
Since the TIGRFAMs database was first described in the January 2001 database issue of Nucleic Acids Research, the number of models in TIGRFAMs has doubled to over 1600. A large number of entries have been assigned specific Gene Ontology (GO) terms. TIGRFAMs links are now reported in the SwissProt database. TIGRFAMs has been incorporated into InterPro; InterPro entries based on or including TIGRFAMs entries show parent/child and contains/found in relationships with entries from Pfam, SMART, and other protein classification databases. Continued use of TIGRFAMs in microbial annotation at TIGR has provided steady feedback for improving the accuracy of existing models as new genomes and new functional characterizations have become available. TIGRFAMs models now hit nearly twenty per cent of the proteins of typical newly sequenced bacterial genomes. The equivalog subset can be expected to make about 400 high-confidence specific functional assignments for a typical new 4-megabase bacterial genome.
Category: Protein sequence databases
Subcategory: Protein domain databases; protein classification
Go to the abstract in the NAR 2007 Database Issue.