S1: Garuda/Shoe and Percellome analytic workflow
Sony Computer Science Laboratories Inc.
The ‘Percellome’ database is a unique source of toxicity response data for more than 100 chemicals, obtained from various organs of mice (Kanno group, NIHS). Its original “per cell” readout, expressed in mRNA copies, measures gene expression with high precision, which allows every single molecule to be incorporated in the analysis. Such high-precision but highly complex data (40,000^n-order) require extensive support from software tools. These tools are needed at each step of a computational workflow, which typically consists of data handling, clustering, network inference, and transcription regulation analysis, followed by knowledge-based curation. We have developed an analytic pipeline that provides an interactive presentation of the available evidence of transcription regulation and gene expression in Percellome data. SHOE (Sequence HOmology in Eukaryotes) analyzes transcription factor binding motifs in orthologous promoters of humans, mice, and rats by combining an original multiple alignment (MA) score with publicly available position-specific scoring matrix (PSSM) libraries, such as Transfac32, Jaspar, HOCOMOCO, ChIP-seq, SELEX, PBM, and iPS-reprogramming factor. To increase the confidence of the sequence analysis, the Pearson correlation of genes is calculated on 112 “eigen cell” synthetic expression profiles available from the Cell Montage Database, which are based on 1714 normal human cells selected from the Gene Expression Omnibus database. SHOE visualizes the results as a network of genes and transcription factors in its original GeneViz gadget and in the CellDesigner pathway analyzer. SHOE is connected to the Garuda platform (www.garuda-alliance.org) and is freely available for download.
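The two generic calculations named above, scoring a promoter sequence against a PSSM and computing the Pearson correlation between expression profiles, can be illustrated with a minimal sketch. This is not SHOE's implementation: the function names `best_pssm_hit` and `pearson`, the three-column matrix `TOY_PSSM`, and all numeric values are hypothetical and chosen only for illustration. The sketch assumes log-odds PSSM entries and sequences containing only A, C, G, and T, and it does not cover the multiple alignment (MA) score.

```python
import math

# Hypothetical toy PSSM: one dict of log-odds scores per motif position.
# The values are invented for illustration and come from no real library.
TOY_PSSM = [
    {"A": 1.2, "C": -0.8, "G": -0.5, "T": -1.0},
    {"A": -0.9, "C": 1.1, "G": -0.7, "T": -0.4},
    {"A": -1.0, "C": -0.6, "G": 1.3, "T": -0.8},
]


def best_pssm_hit(sequence, pssm):
    """Slide the PSSM along the sequence and return (offset, score)
    of the highest-scoring window, or None if the sequence is shorter
    than the motif. Assumes the sequence contains only A, C, G, T."""
    width = len(pssm)
    best = None
    for offset in range(len(sequence) - width + 1):
        window = sequence[offset:offset + width]
        # Sum the per-position scores of the bases in this window.
        score = sum(column[base] for column, base in zip(pssm, window))
        if best is None or score > best[1]:
            best = (offset, score)
    return best


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Invented promoter fragment and expression profiles.
    print(best_pssm_hit("TTACGTT", TOY_PSSM))
    print(pearson([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 8.0]))
```

In a setting like the one described, each profile passed to `pearson` would hold one gene's values across the 112 “eigen cell” profiles, so that a high correlation between a transcription factor and a candidate target gene adds support to a motif hit found in the promoter.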