ENCODE Project at UCSC

Rosenbloom, Kate1; Dreszer, Timothy1; Malladi, Venkat1; Long, Jeffrey1; Sloan, Cricket1; Raney, Brian1; Cline, Melissa1; Karolchik, Donna1; Barber, Galt1; Clawson, Hiram1; Diekhans, Mark1; Fujita, Pauline1; Goldman, Mary1; Gravell, Robert1; Kuhn, Robert1; Harte, Rachel1; Hinrichs, Angie1; Kirkup, Vanessa1; Learned, Katrina1; Maddren, Morgan1; Meyer, Laurence1; Pohl, Andy1,2; Rhead, Brooke1; Wong, Mathew1; Zweig, Ann1; Haussler, David1,3; Kent, Jim1
1Center for Biomolecular Science and Engineering, School of Engineering, University of California, Santa Cruz (UCSC), Santa Cruz, CA 95064, USA
2Centre for Genomic Regulation (CRG), Barcelona, Spain and 3Howard Hughes Medical Institute, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA

Database Description

The Encyclopedia of DNA Elements (ENCODE) project is an international consortium of investigators funded to analyze the human genome with the goal of producing a comprehensive catalog of functional elements. Initially a pilot project focused on 1% of the human genome sequence, ENCODE was scaled up in September 2007 to cover the entire genome. The ENCODE Data Coordination Center (DCC) at the University of California, Santa Cruz (UCSC) is the primary repository for experimental results generated by ENCODE investigators. These results are captured in the UCSC Genome Bioinformatics database and download server for visualization and data mining via the UCSC Genome Browser and companion tools (Rhead et al. The UCSC Genome Browser Database: update 2010, in this issue). To support the breadth and volume of ENCODE data, new features for organizing and selecting datasets have been added to the browser. Additionally, metadata describing the experiments are now captured more formally via controlled vocabularies that are associated with browser tracks and download files. The ENCODE web portal at UCSC provides information about the ENCODE data and convenient links for access. As of September 2009, the ENCODE investigators have completed four production-level data freezes, consisting of 341 experiments submitted to the ENCODE DCC, with 207 of these released in a total of 18 tracks on the UCSC Genome Browser. These experiments represent a major methodology shift from microarray to high-throughput sequencing assays, and include a broad range of experiment types (e.g. ChIP-seq, RNA-seq, Methyl-seq) in 7 designated common cell types and 51 additional human cell lines and tissues.


We would like to thank the systems administration staff at the Center for Biomolecular Science and Engineering, Jorge Garcia, Erich Weiler, Victoria Lin and Gary Moro, for their dedication and support in keeping high-volume ENCODE data flowing to our public site while ensuring that our servers are reliable and available. Thanks also to members of the ENCODE Consortium for providing these valuable datasets.


