CCDS
2University of California, Santa Cruz, CBSE/ITI-UCSC, Ste 501 E2 Bld., Santa Cruz, CA 95064, USA
3Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK
4European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK
Database Description
The CCDS set is calculated after a coordinated whole-genome annotation update by the collaborating annotation groups: NCBI, EBI, and WTSI. The CCDS set includes only those protein-coding annotations that agree precisely in every exon placement across the entire coding region. UCSC provides independent and intensive quality control procedures, augmented by work from the other groups. Once a coding gene structure has been agreed upon, it is maintained in future releases unless there is good evidence to modify or remove it. The collaboration has improved the groups' independent annotation methods (both computational analysis and manual annotation), and we expect continued growth in the CCDS set as more gene structures and more organisms are included in the collection.
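The agreement criterion described above can be illustrated with a minimal sketch. This is not the CCDS pipeline itself; it only shows one way to test whether two annotation sets contain coding structures with identical exon placement. The function names, the coordinate representation, and the identifiers in the usage note are all hypothetical and chosen for illustration.

```python
from typing import Dict, List, Tuple

# One coding exon as (start, end); the coordinate convention is an assumption.
Exon = Tuple[int, int]
# A coding structure: chromosome, strand, and its ordered coding exons.
CDS = Tuple[str, str, Tuple[Exon, ...]]


def normalize(chrom: str, strand: str, exons: List[Exon]) -> CDS:
    """Return a hashable key for a coding structure, independent of exon order."""
    return (chrom, strand, tuple(sorted(exons)))


def consensus_cds(set_a: Dict[str, CDS], set_b: Dict[str, CDS]) -> List[Tuple[str, str]]:
    """Pair annotation IDs whose coding structures are identical in both sets."""
    # Index the first set by structure so each lookup is an exact match.
    by_structure: Dict[CDS, List[str]] = {}
    for ann_id, cds in set_a.items():
        by_structure.setdefault(cds, []).append(ann_id)

    matches = []
    for ann_id_b, cds in set_b.items():
        # Any difference in chromosome, strand, or a single exon
        # boundary produces a different key, so no match is found.
        for ann_id_a in by_structure.get(cds, []):
            matches.append((ann_id_a, ann_id_b))
    return sorted(matches)
```

For example, with hypothetical annotations `A1 = normalize("chr1", "+", [(100, 200), (300, 400)])` in one set and `B1 = normalize("chr1", "+", [(300, 400), (100, 200)])` in the other, `consensus_cds` pairs `A1` with `B1`. An annotation whose last exon ends one base later than its counterpart is not paired, which mirrors the requirement that every exon placement agree exactly.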
The long-term goal of the CCDS project is to converge toward a full set of standard gene annotations for finished, high-quality genomes. We anticipate that the CCDS set will become more complete through continued manual review, experimental validation of weakly supported genes, and ongoing improvement of automatic annotation methods.
The CCDS web site (http://www.ncbi.nlm.nih.gov/CCDS/) provides information about CCDS and sequence identifiers, the CDS data as both coordinates and sequence, and links to genome browsers. WTSI, UCSC, EBI and NCBI web sites indicate genome annotation data that have a CCDS status.