
MulPSSM


NAR Molecular Biology Database Collection entry number 844
Ramakrishnan G., Mohanty S., Krishnadev O. and Srinivasan N.
Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India

Database Description

Multiple sequence alignments of protein families are commonly represented as Position Specific Scoring Matrices (PSSMs) for the detection of remote homologues. A PSSM is generated with one of the sequences in the multiple sequence alignment serving as the reference. We have shown earlier that using multiple PSSMs for a single alignment, each built with a different sequence of the family as reference, improves the sensitivity of remote homology detection dramatically [1,2]. MulPSSM contains PSSMs for a large number of sequence and structural families of protein domains, with multiple PSSMs for every family [3]. The approach uses a clustering algorithm to identify the most distinct sequences in a family. One PSSM is then generated with each of these distinct sequences as reference.
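The selection of distinct reference sequences can be illustrated with a small Python sketch. It is not the MulPSSM procedure: the description above does not name the clustering algorithm or any cut-off, so the greedy selection, the identity measure, the `max_identity` threshold and the toy alignment below are all assumptions made for illustration.

```python
from typing import List


def pairwise_identity(a: str, b: str) -> float:
    """Fraction of identical residues over columns where neither sequence has a gap."""
    matches = 0
    compared = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    return matches / compared if compared else 0.0


def pick_reference_sequences(aligned: List[str], max_identity: float = 0.4) -> List[int]:
    """Greedy selection (an assumed stand-in for the unnamed clustering step).

    A sequence is kept as a reference only if its identity to every
    reference already kept is at most max_identity.
    """
    chosen: List[int] = []
    for i, seq in enumerate(aligned):
        if all(pairwise_identity(seq, aligned[j]) <= max_identity for j in chosen):
            chosen.append(i)
    return chosen


# Hypothetical toy alignment (rows are aligned sequences of equal length).
alignment = [
    "MKV-LAGT",
    "MKVQLAGT",
    "MRIQ-SGS",
    "AHCDWPYF",
]
refs = pick_reference_sequences(alignment, max_identity=0.5)
print(refs)  # indices of the sequences that would each seed one PSSM
```

In this sketch each returned index marks a sequence that would serve as the reference for one PSSM of the family, so a family with three mutually distant sequences yields three PSSMs.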

Recent Developments

The current release of MulPSSM contains 403,107 PSSMs corresponding to 14,831 sequence-based families from Pfam [4] (version 27.0) and 14,235 PSSMs corresponding to 3,856 structural families in the PALI database [5], which corresponds to SCOP (version 1.75). An RPS-BLAST [6,7] interface allows a query sequence to be searched against the PSSMs of sequence families, structural families, or both. The data are presented using dynamic HTML. The RPS-BLAST results can also be obtained as a dendrogram, which helps users understand the relationships between the query and the various hits and makes it easy to distinguish hits that are closely related to the query from those that are remotely related. An analysis interface allows display and convenient navigation of alignments and domain hits.
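How a dendrogram separates close hits from remote ones can be shown with a short Python sketch. The text above does not say which distance measure or clustering method the MulPSSM server uses, so the average-linkage (UPGMA) clustering, the function name `upgma_newick`, the hit names and the distance values below are all assumptions chosen for illustration.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple


def upgma_newick(names: List[str], dist: Dict[Tuple[str, str], float]) -> str:
    """Average-linkage clustering; returns the tree topology as a Newick string.

    Branch lengths are omitted to keep the sketch short.
    """
    sizes: Dict[str, int] = {n: 1 for n in names}  # cluster label -> leaf count
    d: Dict[FrozenSet[str], float] = {frozenset(k): v for k, v in dist.items()}
    while len(sizes) > 1:
        # Find the closest pair of current clusters.
        a, b = min(combinations(sorted(sizes), 2), key=lambda p: d[frozenset(p)])
        merged = f"({a},{b})"
        na, nb = sizes.pop(a), sizes.pop(b)
        # Distance from the merged cluster to each remaining cluster is the
        # size-weighted average of the two old distances.
        for c in sizes:
            d[frozenset((merged, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        sizes[merged] = na + nb
    return next(iter(sizes)) + ";"


# Hypothetical query and hits with made-up pairwise distances.
names = ["query", "hitA", "hitB", "hitC"]
distances = {
    ("query", "hitA"): 0.2,
    ("query", "hitB"): 0.8,
    ("query", "hitC"): 0.9,
    ("hitA", "hitB"): 0.7,
    ("hitA", "hitC"): 0.9,
    ("hitB", "hitC"): 0.4,
}
tree = upgma_newick(names, distances)
print(tree)
```

In the resulting tree the query is grouped first with the hit nearest to it, while the more distant hits form a separate branch, which is the kind of close-versus-remote reading the dendrogram view is meant to support.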

Acknowledgements

This work is supported by the Department of Biotechnology, New Delhi.

References

1. Anand, B., Gowri, V.S., Srinivasan, N. (2005) Use of multiple profiles corresponding to a sequence alignment enables effective detection of remote homologues. Bioinformatics 21, 2821-2826.

2. Gowri, V.S., Tina, K.G., Krishnadev, O., Srinivasan, N. (2007) Strategies for the effective identification of remotely related sequences in multiple PSSM search approach. Proteins 67, 789-794.

3. Gowri, V.S., Krishnadev, O., Swamy, C.S., Srinivasan, N. (2006) MulPSSM: a database of multiple position-specific scoring matrices of protein domain families. Nucleic Acids Res. 34, D243-246.

4. Bateman, A., Coin, L., Durbin, R., Finn, R.D., Hollich, V., Griffiths-Jones, S., Khanna, A., Marshall, M., Moxon, S., Sonnhammer, E.L.L., et al. (2004) The Pfam Protein Families Database. Nucleic Acids Res. 32, D138-D141.

5. Balaji, S., Sujatha, S., Kumar, S.S.C. and Srinivasan, N. (2001) PALI: A database of Phylogeny and ALIgnment of homologous protein structures. Nucleic Acids Res. 29, 61-65.

6. Schaffer, A.A., Wolf, Y.I., Ponting, C.P., Koonin, E.V., Aravind, L., Altschul, S.F. (1999) IMPALA: matching a protein sequence against a collection of PSI-BLAST-constructed position-specific score matrices. Bioinformatics 15, 1000-1011.

7. Marchler-Bauer, A., Panchenko, A.R., Shoemaker, B.A., Thiessen, P.A., Geer, L.Y., Bryant, S.H. (2002) CDD: a database of conserved domain alignments with links to domain three-dimensional structure. Nucleic Acids Res. 30, 281-283.


Go to the abstract in the NAR 2006 Database Issue.