rSNP Guide

NAR Molecular Biology Database Collection entry number 45

Database Description

We previously developed rSNP_Guide, a computer system and database for predicting transcription factor (TF) binding sites whose alterations are responsible for disease phenotypes [Nucleic Acids Res. 29(1):312-316]. Predictions obtained with this system were confirmed by 70 known relationships between TF sites and diseases, as well as by site-directed mutagenesis data. In the current release, rSNP_Guide is therefore extended to serve as a tool for TF site annotation. Previously analyzed and characterized cases of altered TF sites are used to annotate potential sites of the same type and at the same location in homologous genes. Based on 20 TF sites with known alterations in TF binding to DNA, we have localized 245 potential TF sites in homologous genes. For these potential TF sites, rSNP_Guide estimates TF-DNA interaction according to three categories: 'present', 'weak', and 'absent'. The significance of each assignment is statistically measured. The present rSNP_Guide issue is at URL=

Recent Developments

We have adapted our earlier system, rSNP_Guide, to genome annotation. The main idea of our approach is to bring into TF site annotation both the sequences of known TF sites of different types (not only of the type under investigation) and the available experimental data on alterations in the binding of mutated DNA to unknown TFs. The focus of this recent work is on annotating potential TF sites using models of experimentally characterized altered TF sites, which the rSNP_Guide system derives to recognize the TFs relevant to a known TF site. A set of known transcription factors is theoretically evaluated for their ability to bind the altered DNA, yielding vectors of scores. Experimental data describing mutation-caused alterations in DNA binding to an unknown TF are formalized, yielding vectors of values. The theoretical and experimental vectors are compared using the Euclidean distance measure. A TF from the theoretical set is then assigned to the binding site under study, which has been altered by the mutation(s). For a given TF site with a known SNP-disease association, when it is correctly predicted by rSNP_Guide, all the theoretical and experimental vectors are documented in the database rSNP_Report. From this report entry, a Java applet for examining the phylogenetic footprints of this known TF site alone is automatically generated and stored in the knowledge base rSNP_Tuning. The final results of the annotation of potential TF sites are documented in the TFsite_Annotations database. Based on 20 TF sites with known alterations in TF binding to DNA, we have localized 245 potential TF sites in homologous genes. For these potential TF sites, rSNP_Guide estimates TF-DNA interaction according to three categories: 'present', 'weak', and 'absent'. The significance of each assignment is statistically measured. The rSNP_Guide is available through the Web,
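The vector comparison described above can be illustrated with a minimal sketch. It assumes that each vector holds one value per DNA variant (for example, wild type and mutants) and that the TF whose theoretical vector lies closest to the experimental vector is the one assigned. The function names, TF labels, and numbers below are hypothetical, and any scaling or statistical significance testing performed by rSNP_Guide is not reproduced here.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def assign_tf(experimental, theoretical):
    """Return (tf_name, distance) for the theoretical vector closest
    to the experimental vector.

    `theoretical` maps a TF name to its vector of predicted binding scores.
    """
    distances = (
        (name, euclidean(experimental, scores))
        for name, scores in theoretical.items()
    )
    return min(distances, key=lambda pair: pair[1])


# Hypothetical data: one component per DNA variant
# (wild type, mutant 1, mutant 2). Values are invented for illustration.
experimental = [1.0, 0.2, 0.9]
theoretical = {
    "TF_A": [0.9, 0.3, 0.8],
    "TF_B": [0.2, 0.9, 0.1],
    "TF_C": [0.5, 0.5, 0.5],
}

best_tf, best_distance = assign_tf(experimental, theoretical)
print(best_tf, round(best_distance, 3))
```

In this toy input, TF_A is selected because its predicted binding pattern across the variants tracks the measured pattern most closely.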


This work is supported by the Russian Foundation for Basic Research, grants 01-04-49860 and 02-04-49485.


Benson,D., Karsch-Mizrachi,I., Lipman,D., Ostell,J., Rapp,B. and Wheeler,D. (2002) GenBank. Nucleic Acids Res. 30, 17-20.
Zdobnov,E., Lopez,R., Apweiler,R. and Etzold,T. (2002) The EBI SRS server-new features. Bioinformatics, 18, 1149-1150.
Krawczak,M., Ball,E., Fenton,I., Stenson,P., Abeysinghe,S., Thomas,N. and Cooper,D. (2000) Human gene mutation database-a biomedical information and research resource. Hum. Mutat., 15, 45-51.
Smigielski,E., Sirotkin,K., Ward,M. and Sherry,S.T. (2000) dbSNP: a database of single nucleotide polymorphisms. Nucleic Acids Res., 28, 352-355.
Brookes,A., Lehvaslaiho,H., Siegfried,M., Boehm,J., Yuan,Y., Sarkar,C., Bork,P. and Ortiga,F. (2000) HGBASE: a database of SNPs and other variations in and around human genes. Nucleic Acids Res., 28, 356-360.
Cheung,K., Osier,M., Kidd,J., Pakstis,A., Miller,P. and Kidd,K. (2000) ALFRED: an allele frequency database for diverse populations and DNA polymorphisms. Nucleic Acids Res., 28, 361-363.
McKusick,V. (1998) Mendelian Inheritance in Man. Catalogs of human genes and genetic disorders. Johns Hopkins University Press, Baltimore.
Vasiliev,G., Merkulov,V., Kobzev,V., Merkulova,T., Ponomarenko,M. and Kolchanov,N. (1999) Point mutations within 663-666 bp of intron 6 of the human TDO2 gene, associated with a number of psychiatric disorders, damage the YY-1 transcription factor binding site. FEBS Lett., 462, 85-88.
Knight,J., Udalova,I., Hill,A., Greenwood,B., Peshu,N., Marsh,K. and Kwiatkowski,D. (1999) A polymorphism that affects OCT-1 binding to the TNF promoter region is associated with severe malaria. Nat Genet. 22, 145-150.
Tsutsumi-Ishii,Y., Tadokoro,K., Hanaoka,F. and Tsuchida,N. (1995) Response of heat shock element within the human HSP70 promoter to mutated p53 genes. Cell Growth Differ. 6, 1-8.
Langdon,S. and Kaufman,R. (1998) Gamma-globin gene promoter elements required for interaction with globin enhancers. Blood, 91, 309-318.
Quandt,K., Frech,K., Karas,H., Wingender,E. and Werner,T. (1995) MatInd and MatInspector: new fast and versatile tools for detection of consensus matches in nucleotide sequence data. Nucleic Acids Res. 23, 4878-4884.
Roulet,E., Bucher,P., Schneider,R., Wingender,E., Dusserre,Y., Werner,T. and Mermod,N. (2000) Experimental analysis and computer prediction of CTF/NFI transcription factor DNA binding sites. J Mol Biol. 297, 833-848.
Mandel-Gutfreund,Y., Baron,A. and Margalit,H. (2001) A structure-based approach for prediction of protein binding sites in gene upstream regions. Pac Symp Biocomput., 139-150.
Wasserman,W. and Fickett,J. (1998) Identification of regulatory regions which confer muscle-specific gene expression. J Mol Biol. 278, 167-181.
Wasserman,W., Palumbo,M., Thompson,W., Fickett,J. and Lawrence,C. (2000) Human-mouse genome comparisons to locate regulatory sites. Nat Genet. 26, 225-228.
Ponomarenko,J., Merkulova,T., Vasiliev,G., Levashova,Z., Orlova,G., Lavryushev,S., Fokin,O., Ponomarenko,M., Frolov,A. and Sarai,A. (2001) rSNP_Guide, a database system for analysis of transcription factor binding to target sequences: application to SNPs and site-directed mutations. Nucleic Acids Res. 29, 312-316.
Ponomarenko,J., Merkulova,T., Orlova,G., Fokin,O., Gorshkova,E. and Ponomarenko,M. (2002) Mining DNA sequences to predict sites which mutations cause genetic diseases. Knowledge-Based Systems. 15, 225-233.
Jeunemaitre,X., Soubrier,F., Kotelevtsev,Y., Lifton,R., Williams,C., Charru,A., Hunt,S., Hopkins,P., Williams,R., Lalouel,J. et al. (1992) Molecular basis of human hypertension: role of angiotensinogen. Cell, 71, 169-180.
Zhao,Y., Zhou,J., Narayanan,C., Cui,Y. and Kumar,A. (1999) Role of C/A polymorphism at -20 on the expression of human angiotensinogen gene. Hypertension, 33, 108-115.
Klinge,C. (2001) Estrogen receptor interaction with estrogen response elements. Nucleic Acids Res. 29, 2905-2919.
Narayanan,C., Cui,Y., Zhao,Y., Zhou,J. and Kumar,A. (1999) Orphan receptor Arp-1 binds to the nucleotide sequence located between TATA box and transcriptional initiation site of the human angiotensinogen gene and reduces estrogen induced promoter activity. Mol Cell Endocrinol. 148, 79-86.

Go to the abstract in the NAR 2003 Database Issue.