Image analysis is the process of converting the digital information in images into measurements that characterize the status of every cell in an experiment. A digital image is essentially an n-dimensional array of pixels, each holding its own value (e.g., 0-255 for an 8-bit image). By mapping these values, we can extract quantitative information from the image. A basic task of a bioimage analyst is to process the image so that it can be thresholded, separating the depicted objects from each other and from the background. This is called segmentation.
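For illustration, here is a minimal sketch of this idea in Python with scikit-image; the file name and the intensity cutoff are arbitrary assumptions, not values from this work:

```python
from skimage import io

# Load an 8-bit grayscale image; it is simply a 2D array of pixel values.
image = io.imread("cells.tif")   # hypothetical file name
print(image.shape, image.dtype)  # e.g. (512, 512) uint8, values 0-255

# A crude segmentation: pixels above a fixed cutoff become foreground.
mask = image > 100               # arbitrary threshold, for illustration only
```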
For complex, multi-object segmentation problems, we can use automated image analysis, employing machine learning models that are either trained for our specific images or pre-trained to segment images for our needs. Automated image analysis is objective, because it removes the human from the profiling, and it produces quantitative data amenable to statistics. It offers the advantage of measuring multiple properties at once, and it helps quantify heterogeneity in single-cell measurements. It can distinguish subtle changes, even those undetectable to the naked eye. Finally, the process is faster and less tedious than doing it by hand.
Illumination correction
No optical system is perfect, and this is especially true in light microscopy, where inhomogeneous illumination is present in every image a microscope produces. Dust, nonuniform light sources, intensity falloff from the center of the image (vignetting), and misaligned optics all contribute to inhomogeneous illumination of the sample. This effect can disrupt accurate segmentation and intensity measurements.
The most widely used techniques for retrospectively correcting inhomogeneous illumination are multi-image methods, which calculate the correction model from multiple images acquired in the experiment. In principle, you should create a separate correction model, or illumination function, for each imaging set or sample preparation condition, since either can cause variations in illumination. For instance, if you use different staining methods for a batch of images or alter any components or settings in the microscope's optical path, the illumination pattern will vary.
We can use a method that estimates an illumination correction function (ICF) by combining information across multiple images. The ICF is calculated by averaging all images in an experimental batch (that is, all images for a particular channel from a particular multi-well plate) and then smoothing the result with a median filter. Each image is then corrected by dividing it by the ICF.
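A minimal sketch of this procedure in Python, using NumPy and SciPy, is shown below; the median-filter window size is an arbitrary assumption:

```python
import numpy as np
from scipy.ndimage import median_filter

def correct_illumination(images, filter_size=51):
    """Estimate an ICF from a batch of same-channel images and apply it."""
    # Average across the batch, then smooth with a median filter.
    icf = median_filter(np.mean(np.stack(images), axis=0), size=filter_size)
    icf /= icf.mean()             # normalize so mean intensity is preserved
    icf = np.maximum(icf, 1e-6)   # guard against division by zero
    # Correct each image by dividing it by the ICF.
    return [img / icf for img in images]
```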
Most Cell Painting collections are of good quality; however, for visual clarity, we used a brightfield image of red blood cells here.
Segmentation
Accurate cell segmentation is the foundation for analyzing images of complex cellular phenotypes at the single-cell level, for instance, for identifying cellular compartments or extracting features based on cell shape, intensity, or texture. Its purpose is to classify pixels as either foreground or background and to distinguish between different cells. This can be based on one of two established approaches. The first is the model-based approach, which mostly involves histogram-based methods such as thresholding, edge detection, and watershed transformation. It typically entails choosing an appropriate algorithm and manually optimizing its parameters on the basis of visual inspection of the segmentation results, identifying the nuclei and cell outlines. This approach is commonly used with the CellProfiler software, where the choice of modules depends on the image you use and the segmentation objectives of your study.
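As a rough sketch of such a model-based pipeline (here Otsu thresholding followed by a distance-transform watershed, using scikit-image rather than CellProfiler; the smoothing and peak-distance parameters are arbitrary assumptions):

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.feature import peak_local_max
from skimage.filters import gaussian, threshold_otsu
from skimage.segmentation import watershed

def segment_nuclei(image):
    # Smooth, then threshold into foreground/background.
    smoothed = gaussian(image, sigma=2)
    mask = smoothed > threshold_otsu(smoothed)
    # Split touching objects: seed a watershed from distance-map peaks.
    distance = ndi.distance_transform_edt(mask)
    coords = peak_local_max(distance, min_distance=10, labels=mask)
    markers = np.zeros(mask.shape, dtype=int)
    markers[tuple(coords.T)] = np.arange(1, len(coords) + 1)
    return watershed(-distance, markers, mask=mask)  # labeled nuclei
```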
For our example of segmentation in CellProfiler, we used an image from the Human MCF7 cells – compound-profiling experiment, available in the Broad Bioimage Benchmark Collection. Nuclei segmentation (Figures 6 and 7) was done on a DAPI-stained image, and cell segmentation on a tubulin-stained image (Figures 8 and 9).
The second approach involves machine learning, where ground-truth data, created by manually specifying which pixels in an image correspond to distinct types of objects, is provided to train a classifier that can then find an optimal segmentation solution. This typically involves applying different transformations to the image in order to capture different patterns in the local pixel region. Finally, segmentation is carried out by applying the trained model to new images and classifying pixels accordingly. This approach is used in Cellpose, which employs a deep learning algorithm to automatically identify and outline individual cells in an image, allowing you to analyze large datasets quickly and accurately.

Figure 11. Segmentation workflow in Cellpose
As shown in Figure 11, Cellpose required only the tubulin-stained image of MCF7 cells for segmentation; everything else was handled by the program. In this case, you do not need to preprocess the image or segment the nuclei beforehand, because the program works directly from an image in which whole cells are visible.
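A minimal sketch of such a run with the cellpose Python package is shown below; it assumes the v2-style API (models.Cellpose with the pre-trained "cyto" model) and a hypothetical file name, and is not the exact configuration used for Figure 11:

```python
from cellpose import models
from skimage import io

# Load the tubulin-channel image (hypothetical file name).
img = io.imread("mcf7_tubulin.tif")

# Pre-trained "cyto" model; diameter=None lets Cellpose estimate cell size.
model = models.Cellpose(model_type="cyto")
masks, flows, styles, diams = model.eval(
    img,
    diameter=None,
    channels=[0, 0],  # grayscale: segment channel 0, no nuclear channel
)
# "masks" is a labeled image: 0 = background, 1..N = individual cells.
```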
Overall, the segmentation techniques you choose will depend on the goals of your analysis and the characteristics of your images.
Feature extraction
Feature extraction involves measuring phenotypic characteristics of individual cells, which can then be used for profiling. The most common types of features are shape features, intensity-based features, texture features, and microenvironment and context features.
Shape features are usually computed on the boundaries of segmented cell compartments and include the perimeter, area, and roundness of the object. Intensity-based features are computed from the intensity values in individual channels of the image, per compartment of a single cell; they mostly comprise simple statistics such as mean and maximum intensity. Periodic changes and the regularity of intensities in single-cell analysis can be quantified using mathematical functions such as cosines and correlation matrices; these features are often termed texture features. Microenvironment and context features include counts and spatial interactions among cells in the visual field, based on the number of cells, the distances among them, and a cell's location relative to a cell colony. They also cover subcellular structures, such as speckles within a nucleus or distances between the nucleus and individual cytoplasmic vesicles.
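As a rough illustration, shape and intensity-based features of the kind described above can be tabulated per object from a labeled segmentation mask; this is a sketch using scikit-image (>= 0.19) and pandas, not the CellProfiler measurements themselves:

```python
import pandas as pd
from skimage.measure import regionprops_table

# labels: labeled mask from segmentation (0 = background, 1..N = cells)
# intensity: the matching fluorescence channel as a 2D array
features = regionprops_table(
    labels,
    intensity_image=intensity,
    properties=(
        "label",
        "area", "perimeter", "eccentricity",  # shape features
        "intensity_mean", "intensity_max",    # intensity-based features
    ),
)
df = pd.DataFrame(features)  # one row per cell, one column per feature
```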
The CellProfiler software is a suitable choice for feature extraction, since it includes a variety of measurement modules. You can measure an object's granularity with the MeasureGranularity module, which outputs spectra of size measurements of the textures in the image, and you can measure the texture of objects (MeasureTexture), their intensity (MeasureObjectIntensity), and their size and shape (MeasureObjectSizeShape).