NAR Molecular Biology Database Collection entry number 1589
Frenkel-Morgenstern M1, Gorohovski A1, Lacroix V2, Rogers M3, Ibanez K1, Boullosa C1, Andres Leon E4, Ben-Hur A3 and Valencia A1.
1Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain

2UMR CNRS 5558, Laboratoire de Biométrie et Biologie Evolutive, INRIA Bamboo, Université Claude Bernard, Villeurbanne 69100, France

3Department of Computer Science, Colorado State University, Fort Collins, CO 80523-1873, USA

4Bioinformatics Unit, Structural Biology and BioComputing Program, Spanish National Cancer Research Centre (CNIO), Madrid 28029, Spain

Database Description

Chimeric RNAs, by definition, contain parts of protein-coding sequences from normally disparate genomic locations. In contrast to conventional alternatively spliced RNA isoforms, chimeric RNAs are generated either by the trans-splicing of pre-mRNAs [1-11] or through gene fusion following chromosomal translocations or rearrangements [3, 8, 12]. Chimeric transcripts can contribute to various cellular processes and thus increase the combinatorial number of functional gene products encoded by a given genome [8, 13]. The ChiTaRS database comprises more than 16,000 putative human, mouse and fly chimeric transcripts described in Frenkel-Morgenstern et al. [14], including the subset of chimeras confirmed by RNA-seq reads in 16 different human tissues.
Chimeric transcripts were confirmed using paired-end RNA-seq data, by mapping the RNA-seq reads to the sequence of each chimera and, specifically, to its junction site [14]. For humans, we used the Human Body Map 2.0 data generated on the HiSeq 2000 by Illumina in 2010. Briefly, the following mapping protocol was used to ensure that a read could be unambiguously assigned to a chimera rather than to a specific location in the genome. First, we mapped the RNA-seq reads to the reference genome to identify which reads could be linearly assigned to genomic regions. Subsequently, we selected the reads that were not mapped in this first stage and attempted to map them to the chimeric transcripts. Finally, we retained only reads that mapped precisely to either side of the junction of the chimera, with at least six nucleotides inside the junction. In this way, 175 chimeric transcripts were confirmed by at least two RNA-seq reads covering the gene-gene junction site [14].
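
The final screening step of the protocol above can be illustrated with a minimal Python sketch. It is not the pipeline used in [14]: the `Alignment` record, the function names and the coordinate convention are hypothetical, and the six-nucleotide requirement is interpreted here, as an assumption, as at least six aligned nucleotides on each side of the junction. Reads that already mapped linearly to the reference genome are excluded before counting.

```python
from dataclasses import dataclass

MIN_OVERHANG = 6  # nucleotides required on each side of the junction (assumed reading of the text)
MIN_READS = 2     # supporting reads required to call a chimera confirmed


@dataclass(frozen=True)
class Alignment:
    """Hypothetical record: a read aligned to a chimera sequence (0-based, end-exclusive)."""
    read_id: str
    start: int
    end: int


def spans_junction(aln, junction, min_overhang=MIN_OVERHANG):
    """Return True if the alignment covers the junction with enough bases on both sides.

    `junction` is the index of the first base contributed by the second gene.
    """
    left = junction - aln.start   # aligned bases upstream of the junction
    right = aln.end - junction    # aligned bases downstream of the junction
    return left >= min_overhang and right >= min_overhang


def confirm_chimera(alignments, junction, genome_mapped_ids, min_reads=MIN_READS):
    """Count junction-spanning reads that did not map linearly to the genome."""
    supporting = {
        a.read_id
        for a in alignments
        if a.read_id not in genome_mapped_ids and spans_junction(a, junction)
    }
    return len(supporting) >= min_reads, supporting


# Toy example: junction at position 100 of a hypothetical chimera sequence.
alignments = [
    Alignment("r1", 80, 130),  # spans the junction with ample overhang
    Alignment("r2", 94, 106),  # exactly six bases on each side
    Alignment("r3", 96, 140),  # only four bases upstream: rejected
    Alignment("r4", 50, 150),  # spans, but already mapped to the genome
]
confirmed, support = confirm_chimera(alignments, 100, genome_mapped_ids={"r4"})
```

In this toy example only `r1` and `r2` count as supporting reads, which meets the two-read threshold described in the text.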

In the ChiTaRS database we provide detailed information about more than 500 unique breakpoints in cancers reported in the TICdb and dbCrid databases, based on the Mitelman database (see "Breakpoints"). To the best of our knowledge, this is the first catalogue that enables cross-referencing between chimeric transcripts found in GenBank, relevant PubMed articles about putative breakpoints, the two incorporated genes, their respective genomic loci and RNA-seq evidence. Moreover, the entries in ChiTaRS are linked from the UniProt Knowledgebase (UniProtKB), which contains a broad catalogue of information on proteins from laboratories around the world.

An additional feature of ChiTaRS is that it provides visualization of chimeric transcripts and their genomic context, including the junction site. These figures were produced using the SpliceGrapher package, which was designed for the analysis and visualization of RNA-seq data [15]. The figures highlight the genes on either side of a chimeric junction, making it possible to visualize the potential transcripts that could arise from each chimera.

In summary, the ChiTaRS database [16] may be useful for biologists looking for chimeras and their corresponding proteins, for genome researchers interested in regions of chromosomal aberrations and their DNA sequences, and for biomedical studies of protein fusions related to cancer translocations. This database represents a valuable tool for the large-scale study of chimeric RNAs, chimeric proteins and their potential functions in human cancers.


Acknowledgements

The authors thank Begoña Aguado, Alberto Rastrojo, Jaime Prilusky, Roderic Guigo and David Pisano for valuable discussions. The authors also thank the authors of ChimerDB, dbCrid, TICdb and the Mitelman database of cancer aberrations for making available many human chimeric transcripts and cancer breakpoints, and MPLabs LTD for the graphical design of the ChiTaRS home and full collection pages.


References

1 Horiuchi, T. and Aigaki, T. (2006) Alternative trans-splicing: a novel mode of pre-mRNA processing. Biol Cell 98, 135-140.

2 Robertson, H.M., et al. (2007) The bursicon gene in mosquitoes: an unusual example of mRNA trans-splicing. Genetics 176, 1351-1353.

3 Herai, R.H. and Yamagishi, M.E. (2010) Detection of human interchromosomal trans-splicing in sequence databanks. Brief Bioinform 11, 198-209.

4 Douris, V., et al. (2010) Evidence for multiple independent origins of trans-splicing in Metazoa. Mol Biol Evol 27, 684-693.

5 Pettitt, J., et al. (2010) The evolution of spliced leader trans-splicing in nematodes. Biochem Soc Trans 38, 1125-1130.

6 Allen, M.A., et al. (2010) A global analysis of C. elegans trans-splicing. Genome Res

7 McManus, C.J., et al. (2010) Global analysis of trans-splicing in Drosophila. Proc Natl Acad Sci U S A 107, 12975-12979.

8 Gingeras, T.R. (2009) Implications of chimaeric non-co-linear transcripts. Nature 461, 206-211.

9 Li, H., et al. (2008) A neoplastic gene fusion mimics trans-splicing of RNAs in normal human cells. Science 321, 1357.

10 McManus, C.J., et al. (2010) Regulatory divergence in Drosophila revealed by mRNA-seq. Genome Res 20, 816-825.

11 Pirrotta, V. (2002) Trans-splicing in Drosophila. Bioessays 24, 988-991.

12 Maher, C.A., et al. (2009) Chimeric transcript discovery by paired-end transcriptome sequencing. Proc Natl Acad Sci U S A 106, 12353-12358.

13 Birney, E., et al. (2007) Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 447, 799-816.

14 Frenkel-Morgenstern, M., et al. (2012) Chimeras taking shape: potential functions of proteins encoded by chimeric RNA transcripts. Genome Res, published in advance May 15, 2012, doi: 10.1101/gr.130062.111.

15 Rogers, M.F., et al. (2012) SpliceGrapher: detecting patterns of alternative splicing from RNA-seq data in the context of gene models and EST data. Genome Biol 13, R4.

16 Frenkel-Morgenstern, M., et al. (2013) ChiTaRS: a database of human, mouse and fruit fly chimeric transcripts and RNA-sequencing data. Nucleic Acids Res 41 (Database issue), D142-D151.
