COLOMBOS v2.0: An ever expanding collection of bacterial expression compendia


NAR Molecular Biology Database Collection entry number 1731
Pieter Meysman1,2, Paolo Sonego3, Luca Bianco3, Qiang Fu4, Daniela Ledezma-Tejeida5, Socorro Gama-Castro5, Veerle Liebens4, Jan Michiels4, Kris Laukens1,2, Kathleen Marchal4,6,7, Julio Collado-Vides5 and Kristof Engelen3,4
1 Department of Mathematics and Computer Science, University of Antwerp, B-2020 Antwerp, Belgium
2 Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp / Antwerp University Hospital, B-2650 Edegem, Belgium
3 Department of Computational Biology, Fondazione Edmund Mach, San Michele all'Adige, Trento (TN), 38010, Italy
4 Department of Microbial and Molecular Sciences, KU Leuven, Leuven, B-3001, Belgium
5 Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos 62210, Mexico
6 Department of Plant Biotechnology and Bioinformatics, Ghent University, Gent, 9052, Belgium
7 Department of Information Technology, IMinds, Ghent University, Gent, 9052, Belgium

Database Description

The COLOMBOS database (http://www.colombos.net) features comprehensive, organism-specific, cross-platform gene expression compendia for several bacterial model organisms and is supported by a fully interactive web portal and an extensive web API. COLOMBOS, whose name is derived from the acronym 'COLlections Of Microarrays for Bacterial OrganismS', was originally published in PLoS ONE (1) and hosts several large expression compendia derived from high-throughput expression experiments, with an explicit focus on bacterial organisms. The expression experiments available in COLOMBOS are collected from public resources, such as the Gene Expression Omnibus (2) and ArrayExpress (3) repositories. The expression values themselves, however, come from a reanalysis that starts from the raw hybridization intensities for microarrays, or the short read sequences for RNA-seq, and applies a consistent and robust normalization pipeline with stringent quality controls at each step. This procedure yields high-quality expression compendia that directly integrate high-throughput expression data from different technological platforms.

Recent Developments

The current version (v2.0) of COLOMBOS updates both the expression data and the surrounding functionality. The data update expands the previously available compendia and adds compendia for several new species. The functionality update brings improved search and visualization options and novel tools for programmatic access to the database. The scope of the database has also been extended to incorporate RNA-seq data into our compendia through a dedicated analysis pipeline. We demonstrate the validity and robustness of this approach by comparing the same RNA samples measured using both microarrays and RNA-seq. To our knowledge, COLOMBOS currently hosts the largest homogenized gene expression compendia available for seven bacterial model organisms.

References

1. Engelen,K., Fu,Q., Meysman,P., Sánchez-Rodríguez,A., De Smet,R., Lemmens,K., Fierro,A.C. and Marchal,K. (2011) COLOMBOS: Access Port for Cross-Platform Bacterial Expression Compendia. PLoS ONE, 6, e20938.
2. Barrett,T., Wilhite,S.E., Ledoux,P., Evangelista,C., Kim,I.F., Tomashevsky,M., Marshall,K.A., Phillippy,K.H., Sherman,P.M., Holko,M., et al. (2013) NCBI GEO: archive for functional genomics data sets--update. Nucleic acids research, 41, D991-D995.
3. Rustici,G., Kolesnikov,N., Brandizi,M., Burdett,T., Dylag,M., Emam,I., Farne,A., Hastings,E., Ison,J., Keays,M., et al. (2013) ArrayExpress update--trends in database growth and links to data analysis tools. Nucleic acids research, 41, D987-D990.


Go to the abstract in the NAR 2014 Database Issue.