Image-based cellular phenotypic profiling and machine learning for predicting toxicological modes of action
The knowledge gap regarding the occurrence, modes of toxic action (MOA), and risks of new and/or emerging risk chemicals (NERCs) is large. The NERCs we focus on in this project are the thousands of per- and polyfluoroalkyl substances and thousands of polyaromatic compounds found in the environment. A sensitive approach is needed for the early discovery of NERCs. The aim of this project is to develop a sensitive effect-based monitoring method for NERCs based on non-target, high-content cellular phenotypic profiling with the open-source CellProfiler software (cellprofiler.org). Multi-parametric measures of cellular responses are summarized as phenotypic profiles, which can be used to group compounds by profile similarity and to predict multiple modes of action. Each profile contains 2000 morphological descriptors from eight broadly relevant cellular components or organelles per cell. The project will use a substantial number of open-source images from the Broad Institute databank (Bray et al., 2017) to create a library of reference phenotypic profiles. These phenotypic profiles will additionally be combined with other data sources, e.g. gene expression data, molecular descriptors, and toxicological data, for toxicity prediction. The combined datasets will be used to train machine learning models that connect morphological fingerprints with the chemical properties of the tested compounds, their MOA, and their toxicity. The models will then be used to predict the toxicity and MOA of chemicals in fractionated environmental samples, thereby contributing to the identification of toxicity-driving chemicals using the effect-directed analysis (EDA) methodology.
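As an illustration of grouping compounds by profile similarity, the following minimal sketch aggregates per-cell feature vectors into one profile per treatment and assigns the MOA of the most similar reference profile. It is not the project's actual pipeline: median aggregation, Pearson correlation as the similarity measure, the nearest-reference rule, the function names, the labels `MOA_A` and `MOA_B`, and the four-feature toy data are all assumptions made for this example.

```python
import math
from statistics import median


def aggregate_profile(cells):
    """Collapse per-cell feature vectors into one profile (feature-wise median)."""
    return [median(column) for column in zip(*cells)]


def pearson(a, b):
    """Pearson correlation between two equal-length profiles."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (norm_a * norm_b)


def predict_moa(query, reference_library):
    """Return (moa, similarity) of the reference profile most similar to the query."""
    best = max(reference_library, key=lambda ref: pearson(query, ref["profile"]))
    return best["moa"], pearson(query, best["profile"])


# Hypothetical toy data: 4 features stand in for the 2000 descriptors in the text.
reference_library = [
    {"moa": "MOA_A", "profile": [1.0, -0.5, 2.0, 0.1]},
    {"moa": "MOA_B", "profile": [-1.2, 0.8, -0.3, 1.5]},
]

# Three hypothetical cells measured for one treatment.
query_cells = [
    [0.9, -0.4, 1.8, 0.0],
    [1.1, -0.6, 2.2, 0.2],
    [1.0, -0.5, 1.9, 0.1],
]

query_profile = aggregate_profile(query_cells)
moa, similarity = predict_moa(query_profile, reference_library)
print(moa, round(similarity, 3))
```

With real data, the features would typically be normalized against control wells before any similarity is computed; that step is omitted here for brevity.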
Morphological data will be extracted from the cell images using CellProfiler (https://www.cellprofiler.org/). To build the machine learning models, we will use the TensorFlow (https://www.tensorflow.org/) and PyTorch (https://pytorch.org/) platforms. All of these tools are open source, perform well, and can take advantage of parallel architectures.
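The model-building step could be sketched in PyTorch roughly as follows. This is a toy example under stated assumptions, not the project's model: the data are random placeholders, the data sources are combined by simple concatenation, and the number of molecular descriptors, MOA classes, and samples, the network size, and the training settings are all hypothetical. The code runs on a GPU when one is available and otherwise falls back to the CPU.

```python
import torch
from torch import nn

torch.manual_seed(0)  # reproducible toy run

N_MORPH = 2000   # morphological descriptors per profile (figure from the text)
N_CHEM = 64      # hypothetical number of molecular descriptors
N_MOA = 5        # hypothetical number of mode-of-action classes
N_SAMPLES = 32   # hypothetical number of reference compounds

# Use a GPU when present; otherwise run on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Random placeholders standing in for real profiles, descriptors, and labels.
morph = torch.randn(N_SAMPLES, N_MORPH)
chem = torch.randn(N_SAMPLES, N_CHEM)
labels = torch.randint(0, N_MOA, (N_SAMPLES,)).to(device)

# Combine the two data sources by concatenating their features (an assumption).
features = torch.cat([morph, chem], dim=1).to(device)

# Small fully connected classifier mapping combined features to MOA classes.
model = nn.Sequential(
    nn.Linear(N_MORPH + N_CHEM, 128),
    nn.ReLU(),
    nn.Linear(128, N_MOA),
).to(device)

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Full-batch training loop; the loss is recorded after each step.
losses = []
for epoch in range(20):
    optimizer.zero_grad()
    logits = model(features)
    loss = loss_fn(logits, labels)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

# Predicted MOA class index for each compound.
with torch.no_grad():
    predicted_moa = model(features).argmax(dim=1)

print(predicted_moa.shape, losses[0], losses[-1])
```

A real study would evaluate on held-out compounds rather than on the training data, since a network of this size can fit random labels.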
Bray, M.-A., et al. A dataset of images and morphological profiles of 30 000 small-molecule treatments using the Cell Painting assay. GigaScience, Volume 6, Issue 12, 2017, giw014. https://doi.org/10.1093/gigascience/giw014