Our main research theme is quantifying and mining the rich information present in cellular images to yield biological discoveries, often using deep learning. We work on high-throughput projects (100,000-1,000,000 images) probing a variety of biological processes and diseases, with a particular focus on psychiatric research, infectious disease, and cancer.
High-throughput imaging experiments generate extremely large, multidimensional data sets with quantifiable phenotypic information for every individual cell. Using machine learning, including deep learning, we mine this rich, latent information to identify patterns resulting from chemical or genetic perturbations, and use those patterns to probe the causes of, and cures for, various diseases. For example:
- Predicting how new chemical compounds act in cells
- Identifying and classifying toxicity of compounds destined for clinical trials
- Identifying differences in cell structure in cells from patients affected by bipolar disorder or schizophrenia
- Discerning the functional impact of gene variants associated with human disease
- Identifying gene function from large-scale genome sequencing studies
We developed the Cell Painting assay to carry out high-throughput morphological profiling experiments.
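As a rough illustration of what morphological profiling involves computationally, the sketch below collapses per-cell measurements into one profile per perturbation, expresses each profile relative to a control, and compares perturbations by cosine similarity. It is a toy example under stated assumptions, not our actual pipeline and not the CellProfiler API: the function names, the three features, the perturbation labels, and all numbers are hypothetical. Real analyses use hundreds to thousands of features per cell and standardize each feature before comparison so that large-valued features do not dominate.

```python
from math import sqrt
from statistics import median


def aggregate_profiles(cells):
    """Collapse per-cell feature vectors into one median profile per perturbation.

    `cells` is a list of (perturbation_name, feature_vector) pairs.
    """
    grouped = {}
    for name, features in cells:
        grouped.setdefault(name, []).append(features)
    # zip(*rows) iterates over feature columns across all cells in a group.
    return {
        name: [median(column) for column in zip(*rows)]
        for name, rows in grouped.items()
    }


def normalize_to_control(profiles, control):
    """Express each non-control profile as its difference from the control."""
    reference = profiles[control]
    return {
        name: [value - ref for value, ref in zip(vector, reference)]
        for name, vector in profiles.items()
        if name != control
    }


def cosine_similarity(a, b):
    """Cosine of the angle between two profiles (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical per-cell measurements: [nucleus_area, cell_area, mean_intensity].
cells = [
    ("control", [100, 300, 1.0]),
    ("control", [102, 310, 1.1]),
    ("control", [98, 290, 0.9]),
    ("compound_A", [150, 300, 2.0]),
    ("compound_A", [152, 305, 2.2]),
    ("compound_A", [148, 295, 1.8]),
    ("compound_B", [140, 302, 1.8]),
    ("compound_B", [141, 300, 1.9]),
    ("compound_B", [139, 298, 1.7]),
    ("compound_C", [100, 400, 1.0]),
    ("compound_C", [101, 405, 1.0]),
    ("compound_C", [99, 395, 1.0]),
]

profiles = normalize_to_control(aggregate_profiles(cells), control="control")
sim_ab = cosine_similarity(profiles["compound_A"], profiles["compound_B"])
sim_ac = cosine_similarity(profiles["compound_A"], profiles["compound_C"])
print(f"A vs B: {sim_ab:.3f}")
print(f"A vs C: {sim_ac:.3f}")
```

In this toy data, compounds A and B shift the same features in the same direction and so score as highly similar, while compound C changes a different feature and scores near zero against A. Grouping perturbations by profile similarity in this way is one simple route to the kinds of predictions listed above, such as inferring how a new compound acts from the known compounds it resembles.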
CellProfiler and other bioimage analysis software
Algorithms developed in our group are made readily usable by the scientific community via our user-friendly software, CellProfiler and CellProfiler Analyst (cellprofiler.org). CellProfiler is versatile, open-source software for quantifying a variety of phenotypes in biological images. Since its release in 2005, it has become well established and is used by thousands of biologists worldwide [citations]. The software evolves within an active research environment involving dozens of diverse image-based assays, resulting in rich functionality as we continue to improve its capabilities, interface, and support.
We are now leading the community in bridging the gap between biologists and advancements in deep learning. Our first major projects in this area are Piximi, for cell classification, and the Data Science Bowl, which yielded a robust trained model that segments nuclei in images across diverse microscopy types and cell types.
Image analysis for high-content screens
Example projects include the identification of genetic regulators (glioblastoma differentiation, breast cancer cells' response to heregulin, meiosis) and chemical regulators (leukemic differentiation, mitochondrial function, tuberculosis infection).
Imaging flow cytometry
Imaging flow cytometry combines the high-throughput nature of flow cytometry with the high-resolution nature of fluorescence microscopy. For each experimental sample, it yields hundreds of thousands of images of individual cells. We are developing methods to mine these large datasets [NSF project page].
Impact on human health
Our research has yielded discoveries in several translational projects, some of which have already had a direct impact on the treatment of disease. For example, CellProfiler has been used to identify several small molecules that are effective in treating particular diseases in mouse models. In some cases, discoveries made using CellProfiler have even led to clinical trials in humans and directly improved patient outcomes [more details].