Déjà vu

NAR Molecular Biology Database Collection entry number 1171
Errami, M., Trusty, D., Zhaohui S. and Garner, H.R.
McDermott Center for Human Growth and Development and the Department for Translational Research, The University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX 75390-9185, USA

Database Description

Déjà vu is a manually curated repository of duplicate citations found in several databases, including Medline. In the scientific research community, plagiarism and multiple publication of the same data are considered unacceptable practices that undermine public confidence in scientific integrity.

Déjà vu is a database of duplicate citations (abstracts and titles) that are computationally identified with the text similarity search tool eTBLAST and then manually confirmed (2;3). Duplications are recorded along with user-provided commentary and evidence to affirm or deny a given document’s putative categorization. Together, a search tool that compares journal submissions against existing databases and this database of duplicate citations increase the scientific community's ability to identify duplications and provide a means to help prevent multiple publication of the same scientific study. The Déjà vu database is freely accessible at http://spore.swmed.edu/dejavu. The tool eTBLAST is also freely available at http://invention.swmed.edu/etblast/etblast.shtml.
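To make the idea of computationally flagging candidate duplicates concrete, the following is a minimal sketch of pairwise text similarity screening. It is not the eTBLAST algorithm: it substitutes a simple Jaccard similarity over word sets, and the function names, the citation identifiers, and the 0.8 threshold are all assumptions chosen for illustration. As in the workflow described above, any pair it returns would only be a candidate that still requires manual confirmation.

```python
import re
from itertools import combinations


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets: shared tokens over all tokens."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def candidate_duplicates(citations, threshold=0.8):
    """Return (id1, id2, score) for pairs whose similarity meets the threshold.

    `citations` maps a hypothetical citation id to its title/abstract text.
    The threshold is an arbitrary illustrative value, not one taken from
    the database described here.
    """
    tokens = {cid: tokenize(text) for cid, text in citations.items()}
    pairs = []
    for (id1, t1), (id2, t2) in combinations(tokens.items(), 2):
        score = jaccard(t1, t2)
        if score >= threshold:
            pairs.append((id1, id2, score))
    # Highest-scoring pairs first, so reviewers see the likeliest duplicates.
    return sorted(pairs, key=lambda p: p[2], reverse=True)
```

A comparison of every pair scales quadratically with the number of citations, so a sketch like this is only practical for small sets; screening a collection the size of Medline would need an indexed search step first.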

Recent Developments

To extend our database content, we are currently identifying duplicate citations across all of Medline. At the current rate of discovery and manual validation, we expect to find around 10,000 duplicates with different authors (potential cases of plagiarism) and a total of 70,000 duplicates in other categories. These duplicates are being manually verified and are added to the database daily.


Acknowledgements

This work was supported by the P.O’B. Montgomery Distinguished Chair (HG) and the Hudson Foundation (HG), as well as the National Institutes of Health/National Library of Medicine grant R01 LM009758-01 (HG). We wish to thank Dr Wayne Fisher, Tara Long and Dr Jonathan D. Wren for helpful comments and discussions.


References

1. Errami,M., Hicks,M.J., Fisher,W., Trusty,D., Wren,J.D. and Garner,H.R. (2007). Déjà vu: a study of duplicate citations in Medline. In press.
2. Errami,M., Wren,J.D., Hicks,J.M. and Garner,H.R. (2007). eTBLAST: a web server to identify expert reviewers, appropriate journals and similar publications. Nucleic Acids Res. 35: W12-W15.
3. Lewis,J., Ossowski,S., Hicks,J., Errami,M. and Garner,H.R. (2006). Text similarity: an alternative way to search MEDLINE. Bioinformatics 22: 2298-2304.

Go to the abstract in the NAR 2009 Database Issue.
See also the abstract in the 2007 NAR Web Server issue.