NAR Molecular Biology Database Collection entry number 889
Hsu C.-N.1, Lai J.-M.2, Liu C.-H.1, Tseng H.-H.3, Lin C.-Y.4, Lin K.-T.1, Yeh H.-H.3, Sung T.-Y.1, Hsu W.-L.1, Su L.-J.4, Lee S.-A.4,5, Chen C.-H.3,4, Lee G.-C.1,5, Lee DT1,5, Shiue Y.-L.6, Yeh C.-W.7, Chang C.-H.4, Kao C.-Y.5, Huang C.-Y.F.3,4,5,8,9,10
1Institute of Information Science, Academia Sinica, Taipei 115, Taiwan.
2Department of Life Science, Fu-Jen Catholic University, Taipei Hsien 242, Taiwan.
3Division of Molecular and Genomic Medicine, National Health Research Institutes, Miaoli County 350, Taiwan.
4Institute of Cancer Research, National Health Research Institutes, Taipei 114, Taiwan.
5Department of Computer Science and Information Engineering, National Taiwan University, Taipei 106, Taiwan.
6Institute of Biomedical Science, National Sun Yat-Sen University, Kaohsiung 804, Taiwan.
7National Center for High-performance Computing, Hsinchu 300, Taiwan.
8Institute of Bio-Pharmaceutical Sciences, National Yang-Ming University, Taipei 112, Taiwan.
9Institute of Biotechnology in Medicine, National Yang-Ming University, Taipei 112, Taiwan.
10Institute of Clinical Medicine, National Yang-Ming University, Taipei 112, Taiwan.

Database Description

The significant advances in microarray and proteomics analyses have resulted in an exponential increase in potential new targets and promise to shed light on the identification of disease markers and cellular pathways. We aim to collect and decipher hepatocellular carcinoma (HCC)-related genes at the systems level. Here, we build an integrative platform, the Encyclopedia of Hepatocellular Carcinoma genes Online, dubbed EHCO, to systematically collect, organize and compare the accumulating, unsorted HCC-related studies by using natural language processing and softbots. The eight gene set collections, ranging across PubMed, SAGE, microarray, and proteomics data, contain 2,906 genes in total; however, more than 77% of these genes are included only once, suggesting that tremendous efforts are still needed to characterize the relationship between HCC and these genes. In Gene Ontology analysis of these HCC inventories, protein binding represents the largest proportion (~25%). In fact, many differentially expressed gene sets in EHCO can form interaction networks (e.g. an HBV-associated HCC network) when combined with available human protein-protein interaction datasets. To further highlight potential new targets in the network inferred from EHCO, we combine comparative genomics and interactomics approaches to analyze 120 evolutionarily conserved genes that are overexpressed in HCC. Of these 120 queries, 47 form a highly interactive network, with 18 queries serving as hubs. This architectural map may represent a first step toward deciphering hepatocarcinogenesis at the systems level. Targeting hubs and/or disrupting network formation might reveal novel strategies for HCC treatment. EHCO can be freely accessed through
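The two counts quoted in the description (the share of genes that appear in only one collection, and the number of hubs in the interaction network) can be illustrated with a minimal Python sketch. It is not EHCO's actual pipeline: the function names, the toy gene names, and the `min_degree` threshold used to call a gene a hub are all assumptions made for illustration, since the entry does not state how hubs were defined.

```python
from collections import Counter


def fraction_seen_once(gene_sets):
    """Return the fraction of distinct genes that occur in exactly one gene set."""
    # Count each gene at most once per collection.
    counts = Counter(gene for genes in gene_sets.values() for gene in set(genes))
    if not counts:
        return 0.0
    once = sum(1 for n in counts.values() if n == 1)
    return once / len(counts)


def find_hubs(edges, min_degree):
    """Return genes with at least `min_degree` distinct interaction partners.

    `min_degree` is a hypothetical cut-off chosen by the caller.
    """
    neighbours = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return sorted(g for g, ns in neighbours.items() if len(ns) >= min_degree)


# Toy data with placeholder gene names (not taken from EHCO).
toy_sets = {
    "pubmed": ["GENE_A", "GENE_B", "GENE_C"],
    "sage": ["GENE_A", "GENE_D"],
    "microarray": ["GENE_A", "GENE_B", "GENE_E"],
}
toy_edges = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("GENE_A", "GENE_D"),
    ("GENE_B", "GENE_C"),
]

print(fraction_seen_once(toy_sets))
print(find_hubs(toy_edges, min_degree=3))
```

In the toy data, three of the five distinct genes occur in a single collection, and only `GENE_A` reaches the assumed threshold of three partners.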


This project was supported in part by grants from the National Health Research Institutes and National Science Council (Taiwan) (NSC95-2320-B-400-009-MY3) to C. F. Huang, and by the National Research Program for Genomic Medicine (NRPGM), National Science Council (Taiwan) (NSC95-3112-B-011-013-Y, Advanced Bioinformatics Core) to C. Hsu.


1. Hsu CN, Lai JM, Liu CH, Tseng HH, Lin CY, Lin KT, Yeh HH, Sung TY, Hsu WL, Su LJ, Lee SA, Chen CH, Lee GC, Lee DT, Shiue YL, Yeh CW, Chang CH, Kao CY, Huang CY. (2007) Detection of the inferred interaction network in hepatocellular carcinoma from EHCO (Encyclopedia of Hepatocellular Carcinoma genes Online). BMC Bioinformatics, 8:66.

Subcategory: Cancer gene databases
