
Text-mining assisted regulatory annotation

by: Stein Aerts, Maximilian Haeussler, Steven van Vooren, Obi Griffith, Paco Hulpiau, Steven Jones, Stephen Montgomery, Casey Bergman, The
Genome Biology, Vol. 9, No. 2. (2008)





Abstract

BACKGROUND: Decoding transcriptional regulatory networks and the genomic cis-regulatory logic implemented in their control nodes are fundamental challenges in genome biology. High-throughput computational and experimental analyses of regulatory networks and sequences rely heavily on positive control data from prior small-scale experiments, but the vast majority of previously discovered regulatory data remains locked in the biomedical literature.

RESULTS: We develop text-mining strategies to identify relevant publications and extract sequence information to assist the regulatory annotation process. Using a vector space model to identify Medline abstracts from papers likely to have high cis-regulatory content, we demonstrate that document relevance ranking can assist the curation of transcriptional regulatory networks and estimate that over 30,000 papers harbour unannotated cis-regulatory data. Additionally, we show that DNA sequences can be automatically extracted from full-text articles with high cis-regulatory content and accurately mapped to genome sequences as a means of identifying the location, organism and target gene information that is critical to the cis-regulatory annotation process.

CONCLUSIONS: Our results demonstrate that text-mining technologies can be successfully integrated with genome annotation systems, thereby increasing the availability of annotated cis-regulatory data needed to catalyze advances in the field of gene regulation.
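
The relevance ranking mentioned in the results can be illustrated with a minimal vector space sketch. This is not the authors' implementation: the abstract does not give their weighting scheme or training set, so the TF-IDF weighting, the centroid-of-seeds query, and the function names (`tokenize`, `tfidf_vectors`, `rank_by_relevance`) are all assumptions chosen for illustration, and the example texts are invented.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    # Build one sparse TF-IDF vector (a dict term -> weight) per document.
    # Assumption: smoothed IDF; the paper's actual weighting is not stated.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]


def cosine(a, b):
    # Cosine similarity between two sparse vectors.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_relevance(seed_abstracts, candidates):
    # Score each candidate against the centroid of known-relevant abstracts
    # and return (score, text) pairs, most similar first.
    vectors = tfidf_vectors(seed_abstracts + candidates)
    seed_vecs = vectors[:len(seed_abstracts)]
    cand_vecs = vectors[len(seed_abstracts):]
    centroid = Counter()
    for vec in seed_vecs:
        for term, weight in vec.items():
            centroid[term] += weight / len(seed_vecs)
    scored = [(cosine(centroid, vec), text) for vec, text in zip(cand_vecs, candidates)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Invented toy data, for illustration only.
seeds = [
    "Enhancer and promoter binding sites for a transcription factor in Drosophila",
    "A cis-regulatory enhancer drives transcription through factor binding sites",
]
candidates = [
    "Soil moisture sensors for crop irrigation scheduling",
    "A novel enhancer element contains transcription factor binding sites",
]
ranked = rank_by_relevance(seeds, candidates)
for score, text in ranked:
    print(f"{score:.3f}  {text}")
```

In practice the seed set would be abstracts of papers already curated for cis-regulatory content, and the top-ranked unannotated abstracts would be passed to curators.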

