Digitization of Pap smear images is a complex task. A Pap smear prepared using the conventional method covers a surface area of about 10 cm². A typical Pap smear contains between 10,000 and 100,000 cells, with a typical cell having a diameter of about 35 μm. This implies that, under ideal conditions, roughly 100,000 cells could be packed into an area of 1 cm². To obtain reliable data for classification, the cells should be digitized at the highest practical resolution (a pixel size of about 0.25 μm). This resolution is quite difficult to achieve with conventional imaging modalities. Hence, several Pap smear image acquisition techniques have been developed to provide efficient high-resolution digital Pap smear images. Three examples of such acquisition techniques are:
⁃ Flying spot scanners. These measure one point at a time, either by moving the entire sample in a raster fashion or by moving the measuring point or the illumination source over the sample.
⁃ Continuous motion imaging. Single-line integration systems use a linear photosensor array that is moved across the sample orthogonally to the direction of the line. For good results, the speed at which the array is swept across the sample, or at which the sample moves beneath the array, must be synchronized with the electronic scanning speed.
⁃ TV-scanners. These integrate light from a whole rectangular area at once. The rectangular area is moved in order to capture the entire area of the sample. However, with this technique, for each captured image the microscope sensor has to be moved, the image has to stabilize and be focused, and the illumination has to be optimized.
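The figures quoted above, and the scanning geometry behind the line-scan and TV-scanner approaches, can be checked with a few lines of arithmetic. The following is a minimal sketch rather than a description of any particular scanner: the cell diameter, smear area and pixel size come from the text, while the function names, the line rate, the sensor dimensions and the 25 mm × 40 mm sample outline are hypothetical values chosen only for illustration.

```python
import math

UM2_PER_CM2 = 1e8  # 1 cm = 10^4 um, so 1 cm^2 = 10^8 um^2


def max_cells_per_cm2(cell_diameter_um):
    """Idealized bound: 1 cm^2 divided by one circular cell footprint (gaps ignored)."""
    cell_area_um2 = math.pi * (cell_diameter_um / 2.0) ** 2
    return UM2_PER_CM2 / cell_area_um2


def pixels_for_area(area_cm2, pixel_size_um):
    """Number of square pixels needed to cover the area at the given pixel size."""
    return area_cm2 * UM2_PER_CM2 / pixel_size_um ** 2


def line_scan_stage_speed(pixel_size_um, line_rate_hz):
    """Stage speed (um/s) at which the sample advances one pixel per line readout."""
    return pixel_size_um * line_rate_hz


def fields_needed(sample_w_um, sample_h_um, sensor_w_px, sensor_h_px, pixel_size_um):
    """Rectangular fields required to tile the sample with no overlap between fields."""
    fov_w_um = sensor_w_px * pixel_size_um
    fov_h_um = sensor_h_px * pixel_size_um
    return math.ceil(sample_w_um / fov_w_um) * math.ceil(sample_h_um / fov_h_um)


if __name__ == "__main__":
    # Values from the text: 35 um cells, 10 cm^2 smear, 0.25 um pixels.
    print(f"cells per cm^2 (ideal): {max_cells_per_cm2(35.0):,.0f}")
    print(f"pixels for whole smear: {pixels_for_area(10.0, 0.25):.2e}")
    # Hypothetical hardware: 10 kHz line rate, 2048 x 2048 sensor, 25 mm x 40 mm sample.
    print(f"line-scan stage speed:  {line_scan_stage_speed(0.25, 10_000):.0f} um/s")
    print(f"fields for TV-scanner:  {fields_needed(25_000, 40_000, 2048, 2048, 0.25)}")
```

With the values from the text, the packing bound comes out at roughly 104,000 cells per cm², consistent with the estimate of about 100,000, and a 10 cm² smear at 0.25 μm per pixel requires about 1.6 × 10^10 pixels. Under the assumed sensor and sample dimensions, a TV-scanner would need 3871 separate fields, each with its own move, settle and focus step, which illustrates why that approach is slow.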
Preprocessing is needed for background extraction and for noise/debris removal in the Pap smear images. It also helps to define the regions containing cells, or the regions without cells, in order to reduce the area to be searched. Furthermore, preprocessing helps in determining the colour model to be used during image analysis. Preprocessing techniques include contrast stretching, noise filtering and histogram equalization. Malm et al. proposed a sequential classification scheme focused on removing debris in Pap smear images before segmentation. Other preprocessing approaches have been proposed by several researchers [17–19].
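The three preprocessing techniques named above can be illustrated on a small grayscale image. This is a minimal sketch using only the Python standard library, assuming 8-bit intensities stored as a list of rows; the function names are hypothetical, the median filter stands in for "noise filtering" in general, and none of this is the specific method of any cited work.

```python
from statistics import median_low


def contrast_stretch(img, out_min=0, out_max=255):
    """Linearly map the image's own [min, max] range onto [out_min, out_max]."""
    flat = [p for row in img for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # constant image: nothing to stretch
        return [row[:] for row in img]
    scale = (out_max - out_min) / (hi - lo)
    return [[round(out_min + (p - lo) * scale) for p in row] for row in img]


def median_filter3(img):
    """3x3 median filter; border pixels use only the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = median_low(window)  # always an actual pixel value
    return out


def equalize_histogram(img, levels=256):
    """Remap intensities through the normalized cumulative histogram."""
    flat = [p for row in img for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)  # CDF at the darkest level present
    n = len(flat)
    if n == cdf_min:  # only one intensity level present
        return [row[:] for row in img]
    lut = [max(0, round((c - cdf_min) / (n - cdf_min) * (levels - 1))) for c in cdf]
    return [[lut[p] for p in row] for row in img]


if __name__ == "__main__":
    noisy = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]  # single bright speck
    print(median_filter3(noisy))
    print(contrast_stretch([[50, 60], [70, 100]]))
    print(equalize_histogram([[0, 0], [1, 1]]))
```

In practice these operations would normally be taken from an image-processing library; the pure-Python versions are shown only to make the arithmetic of each step explicit.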
Informatics in Medicine Unlocked 14 (2019) 23–33
Segmentation is needed to define the regions of interest (ROI) in the image and is foundational to an automated cervical cancer screening system. Effective image segmentation facilitates the extraction of meaningful information and simplifies the image data for later analysis. Poor segmentation leads to poor results during image analysis. Most of the time, due to the fundamentally important role of nuclei in a cervical cancer cell, cytopathologists are interested in the evaluation of the nucleus and cytoplasm parameters in order to facilitate cell-based diagnosis screening. Hence, accurate nucleus and cytoplasm segmentation are paramount. Several segmentation methods have been applied to Pap smear images, including water immersion, active contour models, Hough transform, seed-based region growing and moving k-means clustering [5,20–23]. All of these methods address the same difficulty: images obtained from Pap smears are hard to segment because of the diversity of cell structures, the intensity variation of the background and the overlapping of cell clusters. Hence, an efficient preprocessing technique is essential prior to segmentation.
Recently, Lili et al. proposed a superpixel-based Markov random field (MRF) segmentation framework to segment the nucleus, cytoplasm and image background of cervical cell images. Srikanth et al. presented a method based on Gaussian mixture models (GMM) combined with shape-based identification of the nucleus to segment the nucleus and cytoplasm from cervical cells. Song et al. proposed a multiscale convolutional network (MSCN) and graph-partitioning-based method for accurate segmentation of cervical cytoplasm and nuclei. Specifically, deep learning via the MSCN was explored to extract scale-invariant features, and the segment regions centered at each pixel were then defined by a graph-partitioning method. Zhi et al. presented an algorithm for accurately segmenting the individual cytoplasm and nuclei from a clump of overlapping cervical cells using a joint level set optimization on all detected nucleus and cytoplasm pairs. The optimization is constrained by the length and area of each cell, a prior on cell shape, the amount of cell overlap, and the expected gray values within the overlapping regions.
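To make the idea of clustering-based segmentation concrete, the sketch below partitions pixel intensities into three classes with plain (Lloyd's) k-means. It is an illustration only: it is the standard algorithm, not the moving k-means variant cited above, and it is not the method of any of the works discussed. It assumes a grayscale image in which nuclei are darkest, cytoplasm is intermediate and background is brightest; the function names are hypothetical.

```python
def kmeans_1d(values, k=3, iterations=20):
    """Plain k-means on scalar intensities; returns cluster centres in ascending order."""
    lo, hi = min(values), max(values)
    # Deterministic initialization: centres spread evenly over the intensity range.
    centres = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            groups[nearest].append(v)
        # An empty cluster keeps its previous centre.
        new_centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
        if new_centres == centres:  # assignments have stopped changing
            break
        centres = new_centres
    return sorted(centres)


def segment_by_intensity(img, k=3):
    """Label each pixel with the index of its nearest centre (0 = darkest cluster)."""
    centres = kmeans_1d([p for row in img for p in row], k)
    labels = [[min(range(k), key=lambda i: abs(p - centres[i])) for p in row]
              for row in img]
    return labels, centres


if __name__ == "__main__":
    # Toy image: dark (assumed nucleus), mid-gray (assumed cytoplasm), bright (background).
    toy = [[20, 25, 120],
           [22, 125, 230],
           [118, 235, 240]]
    labels, centres = segment_by_intensity(toy)
    for row in labels:
        print(row)
    print(centres)
```

Intensity alone cannot separate overlapping cells or reject debris, which is why the methods surveyed above add spatial models, shape priors or learned features on top of, or in place of, simple clustering.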