Tools - Resources - Repositories

BioLexicon

The BioLexicon is a large-scale lexical resource designed specifically to contain and manage data from bio-databases. The lexicon data model was conceived to comply with the ISO-ratified international standards for lexicons; its associated data categories also reflect the ISO Data Category conceptual model available in the ISO Registry.

BioOntology

The Gene Regulation Ontology (GRO) is a formal ontology, crafted with methodological rigour, that covers the domain of gene regulation. It integrates knowledge that is partially assembled in the alternative ontological resources we relied upon, namely the Gene Ontology (GO), the Sequence Ontology (SO), Chemical Entities of Biological Interest (ChEBI), INOH Molecule Role (IMR), INOH Event Ontology (IEV), the NCBI Taxonomy and TransFac.

E. coli Corpora

Four corpora relevant to E. coli are available:

Text Analytics Toolkit

The BOOTStrep Text Analytics Toolkit is a collection of almost forty human language technology modules that cover virtually all phases of text analytics: text segmentation (sentence splitting, tokenisation), morpho-lexical analysis (stemming, lemmatisation, acronym and abbreviation resolution, term recognition), syntactic analysis (part-of-speech tagging, coordination resolution, chunking, parsing), semantic analysis (named entity recognition and interpretation, relation and event extraction) and discourse-level analysis (co-reference resolution). Some modules could be used as-is (e.g., statistical term recognition systems such as TerMine, or rule-based parsers such as Enju), but the machine learning-based systems in particular had to be re-trained. Re-training required training material, so the BOOTStrep partners developed several text corpora annotated with formal text structure, syntactic, semantic and discourse information.
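
To make the first phase listed above concrete, here is a minimal sketch of text segmentation (sentence splitting followed by tokenisation) in Python. It is not the toolkit's code or API: the function names split_sentences, tokenise and segment are hypothetical, and the regular expressions are deliberately naive stand-ins for the trained or rule-based modules a real pipeline would use.

```python
import re


def split_sentences(text):
    """Naive sentence splitter (illustrative assumption, not a toolkit module).

    Splits on whitespace that follows '.', '!' or '?' and precedes an
    uppercase letter, so "E. coli" is not split after the abbreviation.
    """
    parts = re.split(r"(?<=[.!?])\s+(?=[A-Z])", text)
    return [p.strip() for p in parts if p.strip()]


def tokenise(sentence):
    """Naive tokeniser: hyphenated words stay whole, punctuation is separate."""
    return re.findall(r"\w+(?:-\w+)*|[^\w\s]", sentence)


def segment(text):
    """Run both segmentation steps and return one token list per sentence."""
    return [tokenise(s) for s in split_sentences(text)]


sample = "The protein binds DNA. Expression is then repressed."
sentences = split_sentences(sample)
tokens = segment(sample)
```

Later phases (morpho-lexical, syntactic, semantic and discourse-level analysis) would consume these token lists; they are omitted here because they depend on trained models and annotated corpora.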

BioFactStore

The BOOTStrep BioFactStore is a database for biological researchers and developers that contains factual information (empirical assertions, statements) about gene regulation in E. coli. It merges RegulonDB (http://regulondb.ccg.unam.mx/), the most authoritative manually curated database of regulatory networks in E. coli, with the factoids on gene regulation in E. coli that were harvested automatically by the Knowledge Reaper (see (5)). It reports not only on selected types of events and the agents and patients involved, but also on the polarity of the relation and on physical contact data. Modality (the certainty of the information) is covered as well, along with additional parameters such as environmental conditions:
  • Event type: currently Regulation of Transcription (ROT), Regulation of Gene Expression (ROGE), and Transcription Factor Binding to Regulatory Region (TFBRR). All events are represented in the BioOntology (GRO).
  • Involved Agent and Patient: the biological entities involved are genes, proteins and transcription factors. They are distinguished by their role in the event. Entities are linked to the BioLexicon and, via the BioLexicon, to other external data resources.
  • Polarity: the direction of the interaction in the event, i.e. which participant is acting on its partners.
  • Physical contact: whether or not the entities involved interact directly, i.e. through a physical connection.
  • Additional constraints: further parameters that may influence the event and may need to be taken into account, for example the sigma factor, temperature and experimental settings.
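
The fields listed above can be pictured as a single record. The following Python sketch is an assumption made for illustration and does not reflect the actual BioFactStore schema: the class and field names (Entity, Fact, lexicon_id, certainty, constraints), the free-text polarity value and the placeholder entities tfX and geneY are all hypothetical. Only the three event types are taken from the description.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class EventType(Enum):
    # The three event types named in the description.
    ROT = "Regulation of Transcription"
    ROGE = "Regulation of Gene Expression"
    TFBRR = "Transcription Factor Binding to Regulatory Region"


@dataclass
class Entity:
    name: str
    kind: str                         # "gene", "protein" or "transcription factor"
    lexicon_id: Optional[str] = None  # hypothetical key into the BioLexicon


@dataclass
class Fact:
    event_type: EventType
    agent: Entity                            # entity acting in the event
    patient: Entity                          # entity acted upon
    polarity: Optional[str] = None           # hypothetical free-text encoding
    physical_contact: Optional[bool] = None  # None when not reported
    certainty: Optional[str] = None          # modality of the statement
    constraints: Dict[str, str] = field(default_factory=dict)


def summarize(fact):
    """Render a fact as a one-line, human-readable summary."""
    contact = {True: "direct", False: "indirect", None: "unknown"}[fact.physical_contact]
    return f"{fact.event_type.name}: {fact.agent.name} -> {fact.patient.name} ({contact} contact)"


# Placeholder entities, not real biological assertions.
example = Fact(
    event_type=EventType.ROT,
    agent=Entity("tfX", "transcription factor"),
    patient=Entity("geneY", "gene"),
    physical_contact=True,
    constraints={"temperature": "37 C"},
)
summary = summarize(example)
```

Keeping the additional constraints in an open dictionary rather than fixed fields mirrors the open-ended list of parameters (sigma factor, temperature, experimental settings) in the description.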