Session 3: Structures, substances, nanomaterials

On chemical structures, substances, nanomaterials and data exploration
OpenTox Euro 2014 speaker: Nina Jeliazkova
PRESENTING AUTHOR: 

Nina Jeliazkova

INSTITUTION / COMPANY : 

Ideaconsult Ltd.

AUTHOR(S): 

Nina Jeliazkova, Vedrin Jeliazkov

WHEN: 
Tue, 23. Sep. 2014

 

REFERENCES:
1. Jeliazkova, N.; Jeliazkov, V. Journal of Cheminformatics, 2011, 3:18.
2. Jeliazkova, N.; Jeliazkov, V. Current Topics in Medicinal Chemistry, 2012, 12(18), 1987-2001.

BIOGRAPHY:
Nina received an M.Sc. in Computer Science from the Institute for Fine Mechanics and Optics, St. Petersburg, Russia, in 1991, followed by a PhD in Computer Science (thesis "Novel computer methods for molecular modelling") in 2001 in Sofia, Bulgaria, and a postdoctoral position at the Central Product Safety department, Procter & Gamble, Brussels, Belgium (2002 - 2003). She started her professional career as a software developer, first at the Neftochim oil refinery in Burgas, Bulgaria (1991 - 1995), then at the Laboratory for Mathematical Chemistry, Burgas, Bulgaria (1996 - 2001). She joined the Bulgarian Academy of Sciences in 1996 as a researcher and network engineer at the Network Operating Centre of the Bulgarian National Research and Education Network. She is founder and co-owner of Ideaconsult Ltd. and has been the company's technical manager since 2009. She has participated in a number of R&D projects in Belgium and Bulgaria, and has authored or co-authored about 40 scientific papers in Bulgarian and international textbooks, as well as several book chapters. Nina received the Blue Obelisk Award in 2010 for achievements in promoting Open Data, Open Source and Open Standards.

ABSTRACT CONTENT / DETAILS:
The status quo of chemical and bioinformatics databases is changing fast: an increasing number of online databases offer programmatic access (mostly via REST APIs) alongside a GUI. This brings both opportunities and challenges for integrating information that originates from diverse systems, because these interfaces are unique and mutually incompatible, reflecting the incompatible data models underneath them. We present a short summary highlighting the pros and cons of the existing integration approaches. We argue that the structure-centric approach adopted by the majority of chemical databases is biased towards the modelling community and fails to properly represent the complexity of chemical substances and materials as produced, which is itself of substantial regulatory and scientific interest. Adopting a data model that describes the materials and the measurements instead provides a common ground for integration.

Besides retaining data provenance, the focus on measurements provides insight into how to extend chemical structures and address the challenges of representing the identity of chemical substances and nanomaterials. We illustrate the approach with the latest developments of the AMBIT web services [1] and the OpenTox API, in the context of supporting read-across for mono- and multi-constituent substances with impurities and additives, as well as in the context of integrating nanomaterial characterization and safety data.
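As an illustration only, a substance-and-measurement-centric model of this kind could be sketched in Python as follows. The class names, field names and toy values are hypothetical and are not taken from AMBIT or the OpenTox API; the sketch merely shows how a substance can carry a composition (main constituents, impurities, additives) and measurements that each keep a pointer to their source.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Constituent:
    structure: str    # structure identifier, e.g. a SMILES string
    role: str         # "main", "impurity" or "additive"
    fraction: float   # typical weight fraction in [0, 1]


@dataclass
class Measurement:
    endpoint: str
    value: float
    unit: str
    source: str       # provenance: where the value was reported


@dataclass
class Substance:
    name: str
    composition: List[Constituent] = field(default_factory=list)
    measurements: List[Measurement] = field(default_factory=list)

    def kind(self) -> str:
        """Classify the substance by its number of main constituents."""
        mains = sum(1 for c in self.composition if c.role == "main")
        if mains == 1:
            return "mono-constituent"
        if mains > 1:
            return "multi-constituent"
        return "unspecified"


# Toy record with made-up values, for illustration only.
sample = Substance(
    name="example substance",
    composition=[
        Constituent("CCO", "main", 0.96),
        Constituent("O", "impurity", 0.04),
    ],
    measurements=[
        Measurement("example endpoint", 1.0, "mg/L", "hypothetical study report"),
    ],
)
print(sample.kind())
```

In this sketch the measurements are attached to the substance as produced rather than to a single structure, so two batches with the same main constituent but different impurities remain distinct records.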

An exercise in data integration is only useful if it yields better insight into the subjects studied than inspecting the data points alone.

While the OpenTox API was designed with the goal of supporting a wide variety of machine learning methods, the emphasis of this talk is on interactive data exploration and visualization. Different web browser data views will be presented with the help of a set of newly developed, embeddable JavaScript widgets acting as clients of the AMBIT REST services. One use case is chemical landscape analysis and visualisation of diverse datasets, based on a recently proposed, efficient method for identifying activity cliffs and visualizing activity landscapes [2].

The method is applicable to large datasets, as it does not require storing the pairwise similarity matrix. Techniques for detecting discontinuities in property-activity landscapes are potentially useful as a supporting tool for read-across, in modelling any chemical property, and in guiding compound selection based on both structural and biological similarity, as we demonstrate with examples.
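To make the memory argument concrete, the sketch below scores compound pairs one at a time and keeps only the top k candidates in a small heap, so no pairwise similarity matrix is ever stored. It is not the method of [2]: as a stand-in it uses a simple SALI-like ratio (activity difference divided by Tanimoto distance), and the function names, the `eps` parameter and the toy data are assumptions made for illustration.

```python
import heapq
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity of two fingerprints given as sets of 'on' bits."""
    if not a and not b:
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def top_cliffs(compounds, k=3, eps=1e-6):
    """Return the k pairs with the largest SALI-like score, best first.

    compounds: list of (name, fingerprint_bits, activity) tuples.
    Each pair is scored and then discarded; only a heap of size k is kept.
    """
    heap = []  # min-heap of (score, name_i, name_j)
    for (n1, f1, a1), (n2, f2, a2) in combinations(compounds, 2):
        sim = tanimoto(f1, f2)
        # eps avoids division by zero for identical fingerprints
        score = abs(a1 - a2) / (1.0 - sim + eps)
        item = (score, n1, n2)
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            heapq.heapreplace(heap, item)
    return sorted(heap, reverse=True)


# Toy data with made-up fingerprints and activities.
compounds = [
    ("A", frozenset({1, 2, 3, 4}), 5.0),
    ("B", frozenset({1, 2, 3, 5}), 9.0),
    ("C", frozenset({10, 11, 12}), 5.1),
    ("D", frozenset({1, 2, 3, 4, 6}), 5.2),
]

for score, first, second in top_cliffs(compounds, k=2):
    print(first, second, round(score, 2))
```

The loop still visits every pair, so the running time stays quadratic in the number of compounds; only the memory footprint is reduced, to the size of the heap.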