The Chromosome 7 Annotation Project

NAR Molecular Biology Database Collection entry number 596
MacDonald J.R., Cheung J., Zhang J., Khaja R., Duggan G., Kanematsu E. and Scherer S.W.
Program in Genetics and Genomic Biology, The Hospital for Sick Children, 555 University Ave, Toronto ON M5G 1X8 Canada

Database Description

The objective of this project is to generate the most comprehensive description of human chromosome 7 to facilitate biological discovery, disease gene research and medical genetic applications. The DNA sequence assembly, which encompasses 158,329,839 nucleotides (nt) of DNA, represents the most complete description of this chromosome available. It was generated using Celera whole genome shotgun information, clone-based sequence from GenBank, and targeted sequence from The Centre for Applied Genomics (TCAG). In comparison to the first version release (Science, 2003), an additional 376,050 nt are present and 23 additional gaps have been filled. This assembly (named CRA_TCAGchr7v2) contains, among other differences, 704,297 nt of verified sequence not found in NCBI's Build 34 of chromosome 7.
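To put the figures above in proportion, the following minimal Python sketch expresses the added and Build 34-absent sequence as percentages of the assembly. The constants are taken from the description; the function name `percent_of_assembly` is illustrative only, and the estimate of the first-release length assumes that the 376,050 nt figure is a net difference in assembly length, which the description does not state explicitly.

```python
# Figures quoted in the database description (nucleotides).
ASSEMBLY_NT = 158_329_839       # CRA_TCAGchr7v2 assembly length
ADDED_SINCE_V1_NT = 376_050     # additional sequence relative to the first release
NOT_IN_BUILD34_NT = 704_297     # verified sequence absent from NCBI Build 34


def percent_of_assembly(nt: int, total: int = ASSEMBLY_NT) -> float:
    """Return nt as a percentage of the total assembly length."""
    return 100.0 * nt / total


# Assumption: the added sequence is a net change in assembly length.
v1_length_estimate = ASSEMBLY_NT - ADDED_SINCE_V1_NT

if __name__ == "__main__":
    print(f"Added since first release: {percent_of_assembly(ADDED_SINCE_V1_NT):.2f}%")
    print(f"Not in NCBI Build 34:      {percent_of_assembly(NOT_IN_BUILD34_NT):.2f}%")
    print(f"Estimated first-release length: {v1_length_estimate:,} nt")
```

Under these assumptions, both quantities amount to well under one percent of the chromosome, so the differences between assemblies are concentrated in a small fraction of the sequence.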

To fulfill the objective of having every biologically and medically relevant feature annotated along the sequence assembly, all annotation records have been updated and new ones added. For example, the CRA_TCAGchr7v2 assembly displays information from the most current RefSeq, Ensembl, Celera published, UCSC, TIGR Human Gene Index, dbEST and mRNA datasets. Unique TCAG annotations have also been added, including a new 'experimentally-confirmed' gene track, all known structural features, and 619 new clinical cases of defined phenotype. A new proprietary segmental duplication browser has also been developed for the chromosome 7 browser.

Recent Developments

The Chromosome 7 database is continually updated, and all updates and additions are posted in the News section of the website.


Acknowledgements

We thank The Centre for Applied Genomics at the Hospital for Sick Children (HSC) and Celera Genomics, as well as clinical collaborators and families. Supported by Genome Canada, the Canadian Institutes of Health Research, the Canadian Genetic Diseases Network, the Howard Hughes Medical Institute International Scholar Program (to S.W.S.) and the HSC Foundation.


References

1. Scherer SW, Cheung J, MacDonald JR, et al. Human chromosome 7: DNA sequence and biology. Science. 2003 May 2;300(5620):767-72. Epub 2003 Apr 10.
2. Iafrate AJ, Feuk L, Rivera MN, Listewnik ML, Donahoe PK, Qi Y, Scherer SW, Lee C. Detection of large-scale variation in the human genome. Nat Genet. 2004 Aug 1 [Epub ahead of print]
3. Armengol L, Pujana MA, Cheung J, Scherer SW, Estivill X. Enrichment of segmental duplications in regions of breaks of synteny between the human and mouse genomes suggest their involvement in evolutionary rearrangements. Hum Mol Genet. 2003 Sep 1;12(17):2201-8. Epub 2003 Jul 08.
4. Cheung J, Wilson MD, Zhang J, Khaja R, MacDonald JR, Heng HH, Koop BF, Scherer SW. Recent segmental and gene duplications in the mouse genome. Genome Biol. 2003;4(8):R47. Epub 2003 Jul 09.
5. Hudek AK, Cheung J, Boright AP, Scherer SW. Genescript: DNA sequence annotation pipeline. Bioinformatics. 2003 Jun 12;19(9):1177-8.
6. Cheung J, Estivill X, Khaja R, MacDonald JR, Lau K, Tsui LC, Scherer SW. Genome-wide detection of segmental duplications and potential assembly errors in the human genome sequence. Genome Biol. 2003;4(4):R25. Epub 2003 Mar 17.
