NAR Molecular Biology Database Collection entry number 200
Pagni, M.1, Falquet, L.1, Iseli, C.2, Jongeneel, V.2, Junier, T.3, Bucher, P.3
1Swiss Institute of Bioinformatics, Ch. des Boveresses 155 CH-1066 Epalinges, Switzerland
2Office of Information Technology, Ludwig Institute for Cancer Research, Ch. des Boveresses 155 CH-1066 Epalinges, Switzerland
3Swiss Institute for Experimental Cancer Research, Ch. des Boveresses 155 CH-1066 Epalinges, Switzerland

Database Description

High-throughput genome (HTG) and expressed sequence tag (EST) sequences are currently the most abundant nucleotide sequence classes in the public databases. Their large volume, high degree of fragmentation and lack of gene structure annotations prevent efficient and effective searches of HTG and EST data for protein sequence homologies by standard search methods. trEST and trGEN are regularly regenerated databases of hypothetical protein sequences predicted from EST and HTG sequences, respectively. Hits is a web-based data retrieval and analysis system providing access to precomputed matches between protein sequences (including sequences from trEST and trGEN) and patterns and profiles from Prosite and Pfam. All three resources can be accessed via the Hits home page (http://hits.isb-sib.ch).
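To make the idea of a match between a protein sequence and a Prosite pattern concrete, the following minimal Python sketch converts a pattern written in Prosite syntax into a regular expression and scans a sequence with it. It is an illustration only and does not reflect how Hits computes or stores its matches. The function names prosite_to_regex and find_matches and the example sequence are hypothetical; the pattern N-{P}-[ST]-{P} is the Prosite N-glycosylation site pattern. The sketch handles only the common syntax elements (x, [..], {..}, repetition counts, and the < and > anchors outside brackets).

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a Prosite-style pattern into a Python regular expression.

    Simplified sketch: anchors inside brackets (e.g. [G>]) are not handled.
    """
    pattern = pattern.strip().rstrip(".")
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        # Split an element such as "x(2,4)" into its core and repeat counts.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.group(1), m.group(2), m.group(3)
        if core == "x":                  # any residue
            core = "."
        elif core.startswith("{"):       # forbidden residues
            core = "[^" + core[1:-1] + "]"
        repeat = ""
        if low is not None:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)

def find_matches(pattern: str, sequence: str):
    """Return (0-based start, matched text) pairs, including overlapping hits."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]

if __name__ == "__main__":
    # Made-up sequence, used purely for illustration.
    print(find_matches("N-{P}-[ST]-{P}.", "MNGTANPSNSSA"))
```

Profile matches, such as those from Pfam or Prosite profiles, are score-based and cannot be expressed as regular expressions, so this sketch applies to patterns only.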

Recent Developments

The collection of tools available to query the Hits database is continuously being expanded and improved. New species, including mouse and rat, have been added to the trEST and trGEN databases.


The authors thank the Swiss federal government, the Ludwig Institute for Cancer Research and GlaxoSmithKline for financial support.


Pagni, M., Iseli, C., Junier, T., Falquet, L., Jongeneel, V., Bucher, P. trEST, trGEN and Hits: access to databases of predicted protein sequences. Nucleic Acids Res. 29 (2001) 148-151.
