S2: Data Management in nanosafety research
Nanosafety research is rapidly transforming into a highly competitive and data-intensive field. Translational research, the introduction of nanoinformatics and the shift towards big data create a need for complex data capture, storage, handling and analysis. Together, these activities constitute data management, which is arguably the most neglected aspect of scientific research: few laboratories have established processes to automatically capture, store and make data (primary data and metadata) available on demand. Data management, if done correctly, ensures that collected data stay safe, can be easily handled and analysed, and are continuously available via data repositories. It has become even more significant because Horizon 2020 (H2020) project data are required to be FAIR (Findable, Accessible, Interoperable and Reusable) and compatible with the EU Observatory for Nanomaterials (EUON).
The H2020 infrastructure project NanoCommons will design, test and implement detailed data management processes that can be offered to the wider nanosafety community. The NanoCommons data management process starts at the data generation step, rather than being a post-processing step (a chore whose value is not apparent to researchers). Capturing data upfront will facilitate research and the subsequent analysis and publication. To this end, a critical first step is the use of online notebooks that capture data as they are created and link anything from a few instruments in one laboratory to laboratories around the world through local area (LAN) or wide area (WAN) networks. Online notebooks can facilitate data harmonisation and enhance data quality through templates that are easy to set up and implement, versatile (they can import and/or modify existing protocols and tools) and fully version controlled, making them suitable for assay and method development. Templates can also enhance data standardisation and quality control (QC) for regulatory testing by linking to standard operating procedures and associated data capture formats. While the proposed process substantially decreases the time needed to gather, sort, manipulate and analyse experimental data, and is an enormous step forward, it does not replace the dedicated data curators who integrate and QC larger or historic datasets; rather, it should allow them to focus on the earlier steps of template generation and data harmonisation.
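The idea of template-driven capture at the point of data generation can be illustrated with a minimal Python sketch. It is not the NanoCommons implementation or any online notebook's API: the `Template` class, the `capture` function, the field names and the identifier `SOP-EXAMPLE-001` are all hypothetical and chosen only for illustration. The sketch shows a template that carries a version number and a reference to a standard operating procedure, and a capture step that rejects incomplete records and attaches metadata automatically.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Template:
    """Hypothetical capture template: required fields, version and linked SOP."""
    name: str
    version: int
    sop_reference: str      # placeholder identifier, not a real SOP
    required_fields: dict   # field name -> expected Python type

    def revise(self, **extra_fields):
        """Return a new template version; the earlier version is left unchanged."""
        merged = {**self.required_fields, **extra_fields}
        return Template(self.name, self.version + 1, self.sop_reference, merged)


def capture(template, values, operator):
    """Check values against the template at capture time and attach metadata."""
    problems = []
    for key, expected in template.required_fields.items():
        if key not in values:
            problems.append(f"missing field: {key}")
        elif not isinstance(values[key], expected):
            problems.append(f"wrong type for {key}: expected {expected.__name__}")
    if problems:
        # Reject the record immediately instead of leaving the gap to later curation.
        raise ValueError("; ".join(problems))
    return {
        "data": dict(values),
        "metadata": {
            "template": template.name,
            "template_version": template.version,
            "sop_reference": template.sop_reference,
            "operator": operator,
            "captured_at": datetime.now(timezone.utc).isoformat(),
        },
    }


# Illustrative use: version 1 of a template, then a revision adding one field.
v1 = Template("size_measurement", 1, "SOP-EXAMPLE-001",
              {"sample_id": str, "mean_diameter_nm": float})
v2 = v1.revise(dispersant=str)
record = capture(
    v2,
    {"sample_id": "sample-001", "mean_diameter_nm": 120.5, "dispersant": "water"},
    operator="analyst-1",
)
```

Because every record stores the template version and SOP reference it was captured under, a curator can later group or harmonise records by template rather than reconstructing the context by hand.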