BioCreative Virtual Issue

This Database virtual issue is themed around BioCreative: Critical Assessment of Information Extraction in Biology, an international community-wide effort for evaluating text mining and information extraction systems applicable to the biological field. The aim is to drive the development of practically relevant text mining systems that make information more accessible to biologists and that provide tools which can be integrated into biocuration workflows and into the search processes conducted by databases.
The following articles were subject to the journal’s normal peer review process, and are collected together here as a ‘virtual issue’.

BioCreative V

EXTRACT: interactive extraction of environment metadata and term suggestion for metagenomic sample annotation
Evangelos Pafilis, Pier Luigi Buttigieg, Barbra Ferrell, Emiliano Pereira, Julia Schnetzer, Christos Arvanitidis and Lars Juhl Jensen
Database 2016: baw005 doi:10.1093/database/baw005
FREE Full Text

BioCreative 2014

EDITORIAL: BioCreative-IV virtual issue
Cecilia N. Arighi, Cathy H. Wu, Kevin B. Cohen, Lynette Hirschman, Martin Krallinger, Alfonso Valencia, Zhiyong Lu, John W. Wilbur, and Thomas C. Wiegers
Database 2014: bau039 doi:10.1093/database/bau039
FREE Full Text

A robust data-driven approach for gene ontology annotation
Yanpeng Li and Hong Yu
Database 2014: bau113 doi:10.1093/database/bau113
FREE Full Text

Unsupervised gene function extraction using semantic vectors
Ehsan Emadzadeh, Azadeh Nikfarjam, Rachel E. Ginn, and Graciela Gonzalez
Database 2014: bau084 doi:10.1093/database/bau084
FREE Full Text

Integrating information retrieval with distant supervision for Gene Ontology annotation
Dongqing Zhu, Dingcheng Li, Ben Carterette, and Hongfang Liu
Database 2014: bau087 doi:10.1093/database/bau087
FREE Full Text

RLIMS-P: an online text-mining tool for literature-based extraction of protein phosphorylation information
Manabu Torii, Gang Li, Zhiwen Li, Rose Oughtred, Francesca Diella, Irem Çelen, Cecilia N. Arighi, Hongzhan Huang, K. Vijay-Shanker, and Cathy H. Wu
Database 2014: bau081 doi:10.1093/database/bau081
FREE Full Text

Overview of the gene ontology task at BioCreative IV
Yuqing Mao, Kimberly Van Auken, Donghui Li, Cecilia N. Arighi, Peter McQuilton, G. Thomas Hayman, Susan Tweedie, Mary L. Schaeffer, Stanley J. F. Laulederkind, Shur-Jen Wang, Julien Gobeill, Patrick Ruch, Anh Tuan Luu, Jung-jae Kim, Jung-Hsien Chiang, Yu-De Chen, Chia-Jung Yang, Hongfang Liu, Dongqing Zhu, Yanpeng Li, Hong Yu, Ehsan Emadzadeh, Graciela Gonzalez, Jian-Ming Chen, Hong-Jie Dai, and Zhiyong Lu
Database 2014: bau086 doi:10.1093/database/bau086
FREE Full Text

Closing the loop: from paper to protein annotation using supervised Gene Ontology classification
Julien Gobeill, Emilie Pasche, Dina Vishnyakova, and Patrick Ruch
Database 2014: bau088 doi:10.1093/database/bau088
FREE Full Text

LiverCancerMarkerRIF: a liver cancer biomarker interactive curation system combining text mining and expert annotations
Hong-Jie Dai, Johnny Chi-Yang Wu, Wei-San Lin, Aaron James F. Reyes, Mira Anne C. dela Rosa, Shabbir Syed-Abdul, Richard Tzong-Han Tsai, and Wen-Lian Hsu
Database 2014: bau085 doi:10.1093/database/bau085
FREE Full Text

tmBioC: improving interoperability of text-mining tools with BioC
Ritu Khare, Chih-Hsuan Wei, Yuqing Mao, Robert Leaman, and Zhiyong Lu
Database 2014: bau073 doi:10.1093/database/bau073
FREE Full Text

BC4GO: a full-text corpus for the BioCreative IV GO task
Kimberly Van Auken, Mary L. Schaeffer, Peter McQuilton, Stanley J. F. Laulederkind, Donghui Li, Shur-Jen Wang, G. Thomas Hayman, Susan Tweedie, Cecilia N. Arighi, James Done, Hans-Michael Müller, Paul W. Sternberg, Yuqing Mao, Chih-Hsuan Wei, and Zhiyong Lu
Database 2014: bau074 doi:10.1093/database/bau074
FREE Full Text

Assisting manual literature curation for protein–protein interactions using BioQRator
Dongseop Kwon, Sun Kim, Soo-Yong Shin, Andrew Chatr-aryamontri, and W. John Wilbur
Database 2014: bau067 doi:10.1093/database/bau067
FREE Full Text

Web services-based text-mining demonstrates broad impacts for interoperability and process simplification
Thomas C. Wiegers, Allan Peter Davis, and Carolyn J. Mattingly
Database 2014: bau050 doi:10.1093/database/bau050
FREE Full Text

BioC interoperability track overview
Donald C. Comeau, Riza Theresa Batista-Navarro, Hong-Jie Dai, Rezarta Islamaj Doğan, Antonio Jimeno Yepes, Ritu Khare, Zhiyong Lu, Hernani Marques, Carolyn J. Mattingly, Mariana Neves, Yifan Peng, Rafal Rak, Fabio Rinaldi, Richard Tzong-Han Tsai, Karin Verspoor, Thomas C. Wiegers, Cathy H. Wu, and W. John Wilbur
Database 2014: bau053 doi:10.1093/database/bau053
FREE Full Text

Assisted curation of regulatory interactions and growth conditions of OxyR in E. coli K-12
Socorro Gama-Castro, Fabio Rinaldi, Alejandra López-Fuentes, Yalbi Itzel Balderas-Martínez, Simon Clematide, Tilia Renate Ellendorff, Alberto Santos-Zavaleta, Hernani Marques-Madeira, and Julio Collado-Vides
Database 2014: bau049 doi:10.1093/database/bau049
FREE Full Text

Processing biological literature with customizable Web services supporting interoperable formats
Rafal Rak, Riza Theresa Batista-Navarro, Jacob Carter, Andrew Rowley, and Sophia Ananiadou
Database 2014: bau064 doi:10.1093/database/bau064
FREE Full Text

Text-mining-assisted biocuration workflows in Argo
Rafal Rak, Riza Theresa Batista-Navarro, Andrew Rowley, Jacob Carter, and Sophia Ananiadou
Database 2014: bau070 doi:10.1093/database/bau070
FREE Full Text

BioC implementations in Go, Perl, Python and Ruby
Wanli Liu, Rezarta Islamaj Doğan, Dongseop Kwon, Hernani Marques, Fabio Rinaldi, W. John Wilbur, and Donald C. Comeau
Database 2014: bau059 doi:10.1093/database/bau059
FREE Full Text

Natural language processing pipelines to annotate BioC collections with an application to the NCBI disease corpus
Donald C. Comeau, Haibin Liu, Rezarta Islamaj Doğan, and W. John Wilbur
Database 2014: bau056 doi:10.1093/database/bau056
FREE Full Text

Egas: a collaborative and interactive document curation platform
David Campos, Jóni Lourenço, Sérgio Matos, and José Luís Oliveira
Database 2014: bau048 doi:10.1093/database/bau048
FREE Full Text

Finding abbreviations in biomedical literature: three BioC-compatible modules and four BioC-formatted corpora
Rezarta Islamaj Doğan, Donald C. Comeau, Lana Yeganova, and W. John Wilbur
Database 2014: bau044 doi:10.1093/database/bau044
FREE Full Text

tagtog: interactive and text-mining-assisted annotation of gene mentions in PLOS full-text articles
Juan Miguel Cejuela, Peter McQuilton, Laura Ponting, Steven J. Marygold, Raymund Stefancsik, Gillian H. Millburn, Burkhard Rost, and the FlyBase Consortium
Database 2014: bau033 doi:10.1093/database/bau033
FREE Full Text

iSimp in BioC standard format: enhancing the interoperability of a sentence simplification system
Yifan Peng, Catalina O. Tudor, Manabu Torii, Cathy H. Wu and K. Vijay-Shanker
Database 2014: bau038 doi:10.1093/database/bau038
FREE Full Text

BioC: a minimalist approach to interoperability for biomedical text processing
Donald C. Comeau, Rezarta Islamaj Doğan, Paolo Ciccarese, Kevin Bretonnel Cohen, Martin Krallinger, Florian Leitner, Zhiyong Lu, Yifan Peng, Fabio Rinaldi, Manabu Torii, Alfonso Valencia, Karin Verspoor, Thomas C. Wiegers, Cathy H. Wu, and W. John Wilbur
Database 2013: bat064 doi:10.1093/database/bat064
FREE Full Text

BioCreative 2012


BioCreative-2012 Virtual Issue
Cathy H. Wu, Cecilia N. Arighi, Kevin B. Cohen, Lynette Hirschman, Martin Krallinger, Zhiyong Lu, Carolyn Mattingly, Alfonso Valencia, Thomas C. Wiegers, and W. John Wilbur
Database Vol. 2012, bas049; doi:10.1093/database/bas049
FREE Full Text


Collaborative biocuration—text-mining development task for document prioritization for curation
Thomas C. Wiegers, Allan Peter Davis, and Carolyn J. Mattingly
Database Vol. 2012, bas037; doi:10.1093/database/bas037
FREE Full Text

Prioritizing PubMed articles for the Comparative Toxicogenomic Database utilizing semantic information
Sun Kim, Won Kim, Chih-Hsuan Wei, Zhiyong Lu, and W. John Wilbur
Database Vol. 2012, bas042; doi:10.1093/database/bas042
FREE Full Text

Using the OntoGene pipeline for the triage task of BioCreative 2012
Fabio Rinaldi, Simon Clematide, Simon Hafner, Gerold Schneider, Gintarė Grigonytė, Martin Romacker, and Therese Vachon
Database Vol. 2013, bas053; doi:10.1093/database/bas053
FREE Full Text

Using binary classification to prioritize and curate articles for the Comparative Toxicogenomics Database
Dina Vishnyakova, Emilie Pasche, and Patrick Ruch
Database Vol. 2012, bas050; doi:10.1093/database/bas050
FREE Full Text


Biocuration workflows and text mining: overview of the BioCreative 2012 Workshop Track II
Zhiyong Lu and Lynette Hirschman
Database Vol. 2012, bas043; doi:10.1093/database/bas043
FREE Full Text

Text mining in the biocuration workflow: applications for literature curation at WormBase, dictyBase and TAIR
Kimberly Van Auken, Petra Fey, Tanya Z. Berardini, Robert Dodson, Laurel Cooper, Donghui Li, Juancarlos Chan, Yuling Li, Siddhartha Basu, Hans-Michael Muller, Rex Chisholm, Eva Huala, Paul W. Sternberg and the WormBase Consortium
Database Vol. 2012, bas040; doi:10.1093/database/bas040
FREE Full Text

Building an efficient curation workflow for the Arabidopsis literature corpus
Donghui Li, Tanya Z. Berardini, Robert J. Muller, and Eva Huala
Database Vol. 2012, bas047; doi:10.1093/database/bas047
FREE Full Text

Manual Gene Ontology annotation workflow at the Mouse Genome Informatics Database
Harold J. Drabkin, Judith A. Blake, for the Mouse Genome Informatics Database
Database Vol. 2012, bas045; doi:10.1093/database/bas045
FREE Full Text

The Xenbase literature curation process
Jeff B. Bowes, Kevin A. Snyder, Christina James-Zorn, Virgilio G. Ponferrada, Chris J. Jarabek, Kevin A. Burns, Bishnu Bhattacharyya, Aaron M. Zorn, and Peter D. Vize
Database Vol. 2013, bas046; doi:10.1093/database/bas046
FREE Full Text

Opportunities for text mining in the FlyBase genetic literature curation workflow
Peter McQuilton and the FlyBase Consortium
Database Vol. 2012, bas039; doi:10.1093/database/bas039
FREE Full Text

Developing a biocuration workflow for AgBase, a non-model organism database
Lakshmi Pillai, Philippe Chouvarine, Catalina O. Tudor, Carl J. Schmidt, K. Vijay-Shanker, and Fiona M. McCarthy
Database Vol. 2012, bas038; doi:10.1093/database/bas038
FREE Full Text


An overview of the BioCreative 2012 Workshop Track III: interactive text mining task
Cecilia N. Arighi, Ben Carterette, K. Bretonnel Cohen, Martin Krallinger, W. John Wilbur, Petra Fey, Robert Dodson, Laurel Cooper, Ceri E. Van Slyke, Wasila Dahdul, Paula Mabee, Donghui Li, Bethany Harris, Marc Gillespie, Silvia Jimenez, Phoebe Roberts, Lisa Matthews, Kevin Becker, Harold Drabkin, Susan Bello, Luana Licata, Andrew Chatr-aryamontri, Mary L. Schaeffer, Julie Park, Melissa Haendel, Kimberly Van Auken, Yuling Li, Juancarlos Chan, Hans-Michael Muller, Hong Cui, James P. Balhoff, Johnny Chi-Yang Wu, Zhiyong Lu, Chih-Hsuan Wei, Catalina O. Tudor, Kalpana Raja, Suresh Subramani, Jeyakumar Natarajan, Juan Miguel Cejuela, Pratibha Dubey, and Cathy Wu
Database Vol. 2013, bas056; doi:10.1093/database/bas056
FREE Full Text

PPInterFinder — a mining tool for extracting causal relations on human proteins from literature
Kalpana Raja, Suresh Subramani, and Jeyakumar Natarajan
Database Vol. 2013, bas052; doi:10.1093/database/bas052
FREE Full Text

The eFIP system for text mining of protein interaction networks of phosphorylated proteins
Catalina O. Tudor, Cecilia N. Arighi, Qinghua Wang, Cathy H. Wu, and K. Vijay-Shanker
Database Vol. 2012, bas044; doi:10.1093/database/bas044
FREE Full Text

T-HOD: a literature-based candidate gene database for hypertension, obesity and diabetes
Hong-Jie Dai, Johnny Chi-Yang Wu, Richard Tzong-Han Tsai, Wen-Harn Pan, and Wen-Lian Hsu
Database Vol. 2013, bas061; doi:10.1093/database/bas061
FREE Full Text

Accelerating literature curation with text-mining tools: a case study of using PubTator to curate genes in PubMed abstracts
Chih-Hsuan Wei, Bethany R. Harris, Donghui Li, Tanya Z. Berardini, Eva Huala, Hung-Yu Kao, and Zhiyong Lu
Database Vol. 2012, bas041; doi:10.1093/database/bas041
FREE Full Text