Session 3: PubChem HTS data

PubChem's practice for hosting high throughput screening (HTS) data

Yanli Wang, Ph.D.


National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health


Lead Scientist


The PubChem BioAssay database was created in 2004 by the National Center for Biotechnology Information (NCBI) at NIH and serves as a public repository for small molecule bioactivity and toxicity data generated by high throughput screening (HTS), chemical biology and medicinal chemistry experiments.

It was later extended to support literature-derived experimental results and RNAi screening data. The BioAssay database currently contains one million bioassay records, three million tested substances, and 200 million bioactivity outcomes against thousands of protein and gene targets, contributed by researchers worldwide. The database provides a flexible data model that accommodates bioactivity data produced by diverse experimental procedures, including summary results, multiple data points, and experimental replicates.

The BioAssay data model provides multiple mechanisms for recording molecular information for the corresponding protein and nucleotide targets in each submission. It supports structured annotations for assay targets, cell lines, and assay experimental information, as well as cross-links to other biomedical databases, and it continues to expand to support new types of information generated as experimental methodology and technology evolve.
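
To make the shape of such a data model concrete, the following is a minimal sketch in Python, not PubChem's actual schema. All class and field names (AssayRecord, SubstanceResult, DataPoint, Target, and their attributes) are hypothetical and chosen only for illustration. It shows how one record might carry target annotations, a summary outcome per tested substance, multiple data points, and replicate measurements.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List, Optional


@dataclass
class Target:
    """A protein or nucleotide target (hypothetical fields)."""
    name: str
    kind: str                         # "protein" or "nucleotide"
    accession: Optional[str] = None   # cross-link to another database, if known


@dataclass
class DataPoint:
    """One tested condition with its replicate measurements."""
    concentration_um: float
    replicates: List[float]

    def mean_response(self) -> float:
        # Average the replicate measurements for this condition.
        return mean(self.replicates)


@dataclass
class SubstanceResult:
    """Result for one tested substance within one assay."""
    substance_id: int
    outcome: str                      # summary result, e.g. "active" or "inactive"
    data_points: List[DataPoint] = field(default_factory=list)


@dataclass
class AssayRecord:
    """One deposited assay with annotations and per-substance results."""
    assay_id: int
    title: str
    targets: List[Target] = field(default_factory=list)
    cell_line: Optional[str] = None
    results: List[SubstanceResult] = field(default_factory=list)

    def active_substances(self) -> List[int]:
        # Identifiers of substances whose summary outcome is "active".
        return [r.substance_id for r in self.results if r.outcome == "active"]


# Illustrative values only; none of this is real PubChem data.
example = AssayRecord(
    assay_id=1,
    title="Example inhibition screen",
    targets=[Target(name="Example kinase", kind="protein")],
    results=[
        SubstanceResult(
            substance_id=101,
            outcome="active",
            data_points=[DataPoint(concentration_um=1.0,
                                   replicates=[10.0, 12.0, 14.0])],
        ),
        SubstanceResult(substance_id=102, outcome="inactive"),
    ],
)
```

Keeping the summary outcome separate from the optional data points lets a record hold only a summary call, or a full dose-response series with replicates, under the same structure.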

The database infrastructure supports seamless storage of submitted BioAssay records, tracking and versioning of subsequent updates, and data retrieval and analysis. PubChem can be freely accessed and downloaded through the NCBI information retrieval system Entrez and through a suite of services provided by PubChem.
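
As an illustration of what tracking and versioning updates can mean in practice, here is a small sketch of an append-only store that keeps every submitted revision of a record. It is an assumption-laden toy, not a description of PubChem's infrastructure: the names VersionedStore, Version, submit, latest, and get are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Version:
    """One stored revision of a record (hypothetical structure)."""
    number: int
    payload: dict


class VersionedStore:
    """Append-only store: updates add a new version, old ones are kept."""

    def __init__(self) -> None:
        self._history: Dict[int, List[Version]] = {}

    def submit(self, record_id: int, payload: dict) -> int:
        # Store a copy so later changes by the caller do not alter history.
        history = self._history.setdefault(record_id, [])
        version = Version(number=len(history) + 1, payload=dict(payload))
        history.append(version)
        return version.number

    def latest(self, record_id: int) -> Version:
        # Most recent revision of the record.
        return self._history[record_id][-1]

    def get(self, record_id: int, number: int) -> Version:
        # Version numbers start at 1.
        return self._history[record_id][number - 1]
```

A depositor would call submit once for the initial record and again for each update; earlier versions stay retrievable through get, which is the property that makes updates traceable.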

PubChem welcomes feedback on the usability of the information platform, as well as community contributions of experimental and annotation data. Chemical structures and assay results can be deposited via the PubChem submission system, which offers an enhanced user interface and new functions for managing and sharing on-hold data.