PIR - Protein Information Resource


NAR Molecular Biology Database Collection entry number 194
Wu C.H. (1), Yeh L.S. (2), Huang H. (1) and Barker W.C. (2)
(1) Department of Biochemistry and Molecular Biology, Georgetown University Medical Center, Washington, DC 20057, USA
(2) National Biomedical Research Foundation, Washington, DC 20057, USA

Database Description

The Protein Information Resource (PIR) is an integrated public bioinformatics resource that supports genomic and proteomic research and scientific studies. PIR has provided many protein databases and analysis tools to the scientific community, including the PIR-International Protein Sequence Database (PSD) of functionally annotated protein sequences. Because the protein sequence database activities of PIR, Swiss-Prot, and TrEMBL are now combined to produce the UniProt databases, the PIR-PSD is no longer being updated. Release 80.00 (31-Dec-2004), the final release of PIR-PSD, is available for FTP download and online searching at the PIR website. All PIR-PSD entries have been merged into the UniProt databases, and PIR-PSD identifiers can be used to retrieve and track these sequences in either UniProtKB or UniParc.

PIR contributes to the functional annotation of UniProtKB protein sequences. Major ongoing annotation efforts include curation of protein families in the PIRSF (SuperFamily) system; definition of classification-driven rules for the propagation of position-specific features, protein names, and GO terms to protein entries; and bibliographic attribution of experimental features.

A new PIR resource, iProLINK, provides multiple annotated literature corpora to facilitate text mining research in the areas of literature-based database curation, named entity recognition, and protein ontology development. PIR continues to enhance iProClass, an integrated database of protein family, function, and structure information and a hub for mapping and integrating protein data from multiple sources, and maintains PIR-NREF, a non-redundant reference database of protein sequences.

The PIR website connects data mining and sequence analysis tools to underlying databases for information retrieval and knowledge discovery, with functions for interactive queries, combined sequence and annotation text searches, and sorting and visual exploration of search results. The FTP site provides free downloads of database releases.

Recent Developments

Recent developments for the UniProt, PIRSF, iProClass, and PIR-NREF databases are described individually in this on-line Database Collection.

Acknowledgements

The PIR is supported by NIH grant U01-HG02712 and NSF grants DBI-0138188 and ITR-0205470.

References

Wu, C.H., Yeh, L.S., Huang, H., Arminski, L., Castro-Alvear, J., Chen, Y., Hu, Z., Kourtesis, P., Ledley, R.S., Suzek, B.E., Vinayaka, C.R., Zhang, J. and Barker, W.C. The Protein Information Resource. Nucleic Acids Res. 2003 Jan 1;31(1):345-7.


Go to the abstract in the NAR 2004 Database Issue.