Drosophila polymorphism database

NAR Molecular Biology Database Collection entry number 774
Casillas S., Petit N. and Barbadilla A.
Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, 08193 Bellaterra (Barcelona), Spain

Database Description

The Drosophila Polymorphism Database (DPDB) (1) is a secondary database designed to provide a collection of all the existing polymorphic sequences in the Drosophila genus, grouped by organism and gene. It allows, for the first time, searching for any polymorphic set according to different parameter values of nucleotide diversity. For data collection, diversity measures and updating we use PDA (2,3), a pipeline made of a set of Perl modules that automates sequence retrieval, grouping and alignment, and the estimation of diversity parameters from GenBank sequences.
Diversity measures, including polymorphism estimates at synonymous and nonsynonymous sites, linkage disequilibrium and codon bias, are calculated for each polymorphic set in different functional regions. The database also includes the primary information retrieved from different external sources: the publicly available Drosophila nucleotide sequences (excluding ESTs, STSs, GSSs, working drafts and patents) with their annotations and references from GenBank, additional information on genes and aberrations from FlyBase, and cross-references to the PopSet database. The database content is updated daily, and records are assigned unique and permanent DPDB identification numbers to facilitate cross-database referencing.
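As an illustration of the kind of diversity parameters mentioned above, the following minimal Python sketch estimates nucleotide diversity (the mean number of pairwise differences per site) and Watterson's theta from a small aligned set of sequences. It is not the PDA implementation (which is written in Perl): the function names, the toy alignment and the choice to discard any column containing a gap or ambiguous base are assumptions made only for this example.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def clean_columns(alignment):
    """Return alignment columns that contain only A, C, G or T.

    Assumption for this sketch: columns with gaps or ambiguous bases
    are dropped entirely (complete deletion).
    """
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    upper = [seq.upper() for seq in alignment]
    return [col for col in zip(*upper) if set(col) <= VALID_BASES]


def diversity(alignment):
    """Estimate per-site nucleotide diversity and Watterson's theta."""
    n = len(alignment)
    if n < 2:
        raise ValueError("at least two sequences are required")
    columns = clean_columns(alignment)
    sites = len(columns)
    if sites == 0:
        raise ValueError("no analysable sites in the alignment")

    # Segregating sites: columns showing more than one nucleotide.
    seg_sites = sum(1 for col in columns if len(set(col)) > 1)

    # Total pairwise differences, summed over all columns.
    pair_diffs = sum(
        sum(1 for a, b in combinations(col, 2) if a != b) for col in columns
    )
    n_pairs = n * (n - 1) // 2
    pi = pair_diffs / n_pairs / sites

    # Watterson's estimator: S / a_n per site, with a_n = sum_{i=1}^{n-1} 1/i.
    a_n = sum(1.0 / i for i in range(1, n))
    theta_w = seg_sites / a_n / sites

    return {"sites": sites, "S": seg_sites, "pi": pi, "theta_w": theta_w}


# Toy alignment, invented for illustration only.
toy_alignment = [
    "ATGCCGTA",
    "ATGCTGTA",
    "ATGCCGTT",
    "ATG-CGTA",
]

if __name__ == "__main__":
    print(diversity(toy_alignment))
```

In this toy alignment one column is discarded because of the gap, leaving seven analysable sites, two of which are segregating. Real analyses would also need to handle functional regions and synonymous versus nonsynonymous sites, which this sketch does not attempt.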
Online query interfaces allow the data to be interrogated by polymorphism parameter values and keywords (based on SQL searches) and by sequence similarity (with BLAST). Full reports, sequences and alignments in different formats, and polymorphism parameters, can be obtained in both textual and graphical form. The web site also includes software facilities for data analysis and a daily-updated web page with exhaustive statistics on the contents of the database. The database is available online and can be freely downloaded via FTP.
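The idea of an SQL search by polymorphism parameter values can be sketched as follows. This is a hedged illustration only: the table name `polymorphic_set`, its columns, the helper functions and the toy rows are all hypothetical and do not reflect the actual DPDB schema or its contents. The sketch uses Python's standard `sqlite3` module with an in-memory database.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory database with a hypothetical table of polymorphic sets."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE polymorphic_set ("
        "set_id TEXT PRIMARY KEY, species TEXT, gene TEXT, "
        "n_seqs INTEGER, pi REAL)"
    )
    # Placeholder rows: identifiers and values are invented for illustration.
    conn.executemany(
        "INSERT INTO polymorphic_set VALUES (?, ?, ?, ?, ?)",
        [
            ("TOY1", "species_a", "gene_x", 12, 0.004),
            ("TOY2", "species_a", "gene_y", 8, 0.015),
            ("TOY3", "species_b", "gene_x", 20, 0.009),
        ],
    )
    return conn


def find_sets(conn, min_pi, max_pi, species=None):
    """Return sets whose nucleotide diversity lies in [min_pi, max_pi].

    Values are passed as bound parameters rather than being pasted
    into the SQL string.
    """
    sql = (
        "SELECT set_id, species, gene, n_seqs, pi "
        "FROM polymorphic_set WHERE pi BETWEEN ? AND ?"
    )
    params = [min_pi, max_pi]
    if species is not None:
        sql += " AND species = ?"
        params.append(species)
    sql += " ORDER BY pi DESC"
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    conn = build_toy_db()
    for row in find_sets(conn, 0.005, 0.02):
        print(row)
```

Combining a numeric range on a diversity estimate with an optional organism filter mirrors the kind of query the text describes; a keyword search would add a similar condition on a text column.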

Recent Developments

The contents of DPDB have been reanalyzed with the new version of PDA (3) to incorporate its improved methods for data grouping and analysis.


Acknowledgements

Ongoing funding of DPDB is provided by the Ministerio de Educación y Ciencia grants BFU2006-08640/BMC and BES-2003-0416. Initial funding for DPDB was provided by the Ministerio de Educación y Ciencia grants DGICT-PB98-0900-C02, BMC2002-01708 and BES-2003-0416, and E-biointel.


References

1. Casillas S., Petit N. and Barbadilla A. (2005) DPDB: a database for the storage, representation and analysis of polymorphism in the Drosophila genus. Bioinformatics 21: ii26-ii30.
2. Casillas S. and Barbadilla A. (2004) PDA: a pipeline to explore and estimate polymorphism in large DNA databases. Nucleic Acids Res. 32: W166-W169.
3. Casillas S. and Barbadilla A. (2006) PDA v.2: improving the exploration and estimation of nucleotide polymorphism in large datasets of heterogeneous DNA. Nucleic Acids Res. 34: W632-W634.
