Database Description

The NCBI dbSNP database of genome variation complements GenBank by providing the resources to build comprehensive catalogs of common genomic variations in humans and other organisms. As of build 121, dbSNP holds over 26 million variation submissions for human and 25 other species, and this number is expected to grow as additional genomes are surveyed for nucleotide diversity. The largest data set in dbSNP is currently the 19.8 million variation submissions for human, which dbSNP staff have clustered into 9.8 million unique variable positions in the human genome. Author-driven submissions of flanking sequence, alleles, population-specific allele frequencies, individual genotypes, validation status and haplotype configurations are associated with this non-redundant reference set of markers, integrated with three public genome assemblies, and indexed with internal NCBI resources such as Entrez Gene, GenBank, UniGene and PubMed. The complete contents of dbSNP are freely available via anonymous FTP in a variety of formats at ftp://ncbi.nlm.nih.gov/snp/

Recent Developments

The SNP research community is currently validating millions of human SNPs by measuring either allele frequencies or individual genotypes in one or more reference population samples. dbSNP has introduced online batch query services, a structured XML report for refSNP clusters, and haplotype submissions for defining unique alleles at the chromosome fragment level. Because the database continues to grow, the database table structure on our FTP site will be subdivided by GenBank taxonomic division.


