S1: A data warehousing approach with TargetMine

A data warehousing approach with TargetMine to candidate gene prioritization

S1: A data warehousing approach with TargetMine, OpenTox Asia 2017

Yayoi Natsume


National Institutes of Biomedical Innovation, Health and Nutrition


Computational Biologist


In the post-genomic era, extracting knowledge from data to discover target biomolecules for characterization can be one of the most challenging and critical steps. Recent technological progress lets us catch a glimpse of life as a system by producing huge amounts of data. However, interpreting such data is still a non-trivial task. One reason for this difficulty is the complexity of life: the biological roles of biomolecules are greatly affected by their cellular environment. Because the power of data from a single type of source is often limited, integrating data from multiple sources is expected to overcome this limitation and give more insight into how life is maintained and how the system is disrupted under certain conditions such as disease.

For such integration to succeed, several hurdles must be overcome. Biological data are produced, processed and stored by a variety of approaches, which can increase heterogeneity and reduce compatibility among scattered data resources. Commonly used approaches for integrating such resources can be broadly classified into two categories. One is to collect the available data and store them locally, for example, in a relational database. The other, a more recent trend, is to gather the necessary data on the fly using semantic web technology (RDF, SPARQL and so on).
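As a rough illustration of the first category, the sketch below loads two small sources into one local relational database and answers a cross-source question with a single join. It is not TargetMine's or InterMine's implementation: the table names, the helper functions build_warehouse and genes_for, and all gene, disease and pathway labels are hypothetical placeholders chosen only for this example.

```python
import sqlite3

# Toy records standing in for two independent data sources.
# All identifiers are made-up placeholders, not real annotations.
GENE_DISEASE = [("GENE_A", "disease_X"), ("GENE_B", "disease_X"), ("GENE_C", "disease_Y")]
GENE_PATHWAY = [("GENE_A", "pathway_1"), ("GENE_B", "pathway_2"), ("GENE_C", "pathway_1")]


def build_warehouse():
    """Copy both sources into one local (in-memory) relational database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE gene_disease (gene TEXT, disease TEXT)")
    conn.execute("CREATE TABLE gene_pathway (gene TEXT, pathway TEXT)")
    conn.executemany("INSERT INTO gene_disease VALUES (?, ?)", GENE_DISEASE)
    conn.executemany("INSERT INTO gene_pathway VALUES (?, ?)", GENE_PATHWAY)
    conn.commit()
    return conn


def genes_for(conn, disease, pathway):
    """Return genes linked to `disease` in one source and `pathway` in the other."""
    rows = conn.execute(
        "SELECT d.gene FROM gene_disease AS d "
        "JOIN gene_pathway AS p ON p.gene = d.gene "
        "WHERE d.disease = ? AND p.pathway = ? "
        "ORDER BY d.gene",
        (disease, pathway),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    warehouse = build_warehouse()
    print(genes_for(warehouse, "disease_X", "pathway_1"))
```

Because both tables live in the same database, the join runs locally without contacting the original sources at query time; the trade-off is that the local copy has to be refreshed whenever a source is updated.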

Against this background, we have developed TargetMine (http://targetmine.mizuguchilab.org), an integrated data warehouse specifically designed for candidate gene prioritization and target discovery. TargetMine is powered by InterMine, a biological data warehouse framework that delivers good query performance by storing all the data in one location. In this talk, we will introduce the data integration approach taken in TargetMine and discuss methodological and practical differences between TargetMine and other related resources.