### 1. INTRODUCTION

Biosensors can detect single molecules using nanopores. They can sense unlabeled biopolymers such as DNA and RNA, as well as single proteins. Sensing occurs when the ion current drops sharply because a passing molecule blocks the pore [1]. The nanopore diameter is therefore very important for sensing molecules. The main step in measuring it is segmenting the pores in a scanning electron microscope (SEM) image.

Image segmentation is defined as dividing an image into multiple parts that are homogeneous in pixel intensity, color, or texture [2]. One simple and sometimes useful segmentation method is thresholding. However, it is time-consuming because it relies on trial and error, and a single threshold value sometimes does not work well, especially for a series of video frames. Akhtaruzzaman *et al*. [3] used automated threshold detection on a video of human walking, treated as a series of image frames, to segment the lower limbs. They applied automated threshold detection to convert the image frames into grayscale images, a line fill algorithm to smooth the object edges, and background removal to extract the object.

In general, image enhancement through denoising is an important step before segmenting objects. One denoising filter is the bilateral filter, which reduces noise while preserving the sharp edges of objects. Nguyen *et al*. [4] removed specific artifacts and segmented the full-body bone structure using a 3D bilateral filter and a 3D graph cut, respectively. Sahadevan *et al*. [5] increased the accuracy of a support vector machine classifier using a bilateral filter that merges spatial contextual information into the spectral domain.

Another segmentation method is K-means, which groups the pixels of an image into multiple clusters according to features such as intensity. Chen *et al*. [6] proposed a semiautomatic segmentation method based on K-means to determine an object’s mean temperature and variance by segmenting contours in thermal images taken by an optical camera.

Fu and Wang [7] applied the expectation maximization-Gaussian mixture model (EM-GMM) to segment color images, and their results confirm its effectiveness. The EM-GMM and fuzzy C-means (FCM) methods are widely used in image segmentation. However, their major drawback is sensitivity to noise. Kalti and Mahjoub [8] proposed a variant of these methods to resolve this problem. Their results showed an improvement compared with the standard versions of EM-GMM and FCM.

Several studies have sought the geometrical structure of nanopores. Alexander *et al*. [9] computed nanopore size, perimeter, and other geometric features using histogram equalization, morphological, and statistical operations. Phromsuwan *et al*. [10] obtained the size of nanopores in SEM images using morphological operations and the Canny edge detector. Parashuram and Vidyasagar [11] used morphological operations and global thresholding to obtain nanopore diameter and statistical features. The same authors, with Muralidhara [12], used the same operations to obtain the perimeter of the nanopores. All of these methods rely on parameters that must be tuned by trial and error to give proper results.

This work aims to find a semiautomatic algorithm for measuring the diameter of nanopores in SEM images by examining four segmentation techniques. Their performance is evaluated objectively, and the average nanopore diameter is computed.

### 2. MATERIALS AND METHODS

Three SEM images of nanoporous anodic alumina film [13] were used in this study. They were segmented by our techniques to compute the pore diameters and the number of pores. The images have different pore widening times, namely, 0, 10, and 20 min, as shown in Fig. 1.

**Fig. 1.** Four segmentation methods for three scanning electron microscope images with pore widening times: (a) 0 min, (b) 10 min, and (c) 20 min (Scale bar = 500 nm) [13]

The images were segmented by four methods. The simplest is thresholding, whose output is used here as the ground truth for objective evaluation of the other segmentation methods. The second and third methods use a bilateral filter and K-means, respectively, as the first step and a region selector as the second step. The fourth method segments the images using EM-GMM (Fig. 2).

**Fig. 2.** Image processing steps for different segmentation methods

Thresholding is a technique of selecting an optimum gray level value that separates the region of interest from other regions. It produces a binary image from a gray-level image by setting pixels below (or above) a chosen gray level to zero and the remaining pixels to one. If g(x, y) is the thresholded output of an input f(x, y) at a specific gray level value T, it can be described as follows [14]:
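In the usual formulation, and assuming the convention that pixels brighter than T map to one (the opposite polarity is equally valid and may suit dark pores on a bright film), the rule reads:

$$
g(x, y) =
\begin{cases}
1, & f(x, y) > T \\
0, & f(x, y) \le T
\end{cases}
\tag{1}
$$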

The K-means method divides pixels into a number of separate clusters. Its algorithm consists of two steps. First, it finds k centroids (k being the number of clusters) for the pixels of the image. Second, it assigns each pixel to a centroid using a measure of the distance between them. The Euclidean distance may be used, and it is defined as follows:
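A standard way to write this distance, assuming the pixel and the center are treated as feature vectors whose components are indexed by i (an index introduced here only for illustration), is:

$$
d = \lVert p(x, y) - c_{k} \rVert = \sqrt{\sum_{i} \left( p_{i}(x, y) - c_{k,i} \right)^{2}}
\tag{2}
$$

For a grayscale image the feature is a single intensity, so the distance reduces to |p(x, y) − c_{k}|.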

Where p(x, y) is an input pixel to be clustered and c_{k} is a cluster center. After the pixels are grouped into k sets (i.e., clusters), the Euclidean distance between each center and the pixels is evaluated again, and each pixel is assigned to the center at the minimum Euclidean distance [15].
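The two steps can be sketched on scalar gray levels, where the Euclidean distance is simply an absolute difference. This is an illustrative sketch, not the implementation used in this study: the function name `kmeans_1d`, the initial centers, and the rule of recomputing each center as the mean of its members (the usual Lloyd iteration) are assumptions.

```python
def kmeans_1d(values, centers, iterations=20):
    """Cluster scalar intensities around the given initial centers."""
    centers = list(centers)
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value goes to the nearest center.
        # For scalars the Euclidean distance is the absolute difference.
        labels = [
            min(range(len(centers)), key=lambda k: abs(v - centers[k]))
            for v in values
        ]
        # Update step (assumed): each center becomes the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [v for v, lab in zip(values, labels) if lab == k]
            new_centers.append(sum(members) / len(members) if members else centers[k])
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return labels, centers


# Toy data: dark "pore" pixels and bright "film" pixels.
pixels = [10, 12, 11, 200, 205, 198]
labels, centers = kmeans_1d(pixels, centers=[0.0, 255.0])
```

On an image, the same loop would run over the flattened pixel array, and the label map would then be reshaped back to the image dimensions.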

Bilateral filtering is a technique for smoothing an image while preserving its edges. It is obtained by combining one Gaussian filter in the spatial domain with another in the intensity domain. The filter output at pixel s is given by the following equation:
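In the notation commonly used for this filter, and assuming I_{p} denotes the intensity of pixel p, Ω the neighborhood of s, and J(s) the filtered output (symbols introduced here, not defined in the text), the output can be written as:

$$
J(s) = \frac{1}{K(s)} \sum_{p \in \Omega} f(p - s)\, g(I_{p} - I_{s})\, I_{p}
\tag{3}
$$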

Where K(s) is the normalization term:
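Assuming I_{p} denotes the intensity at pixel p and Ω the neighborhood of s (symbols introduced here), the normalization is usually the sum of the combined spatial and intensity weights:

$$
K(s) = \sum_{p \in \Omega} f(p - s)\, g(I_{p} - I_{s})
\tag{4}
$$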

Where f and g are Gaussian functions in the spatial domain and in the intensity domain (the range filter), respectively [16].
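The edge-preserving behavior can be seen in a small sketch. For brevity it filters a one-dimensional signal rather than a two-dimensional image; the function name `bilateral_1d`, the window radius, and the two standard deviations are illustrative assumptions, not the settings used in this study.

```python
import math


def bilateral_1d(signal, radius=2, sigma_s=1.0, sigma_r=10.0):
    """Brute-force bilateral filter on a 1D signal (simplified from 2D)."""
    out = []
    for s, i_s in enumerate(signal):
        weighted_sum = 0.0
        norm = 0.0  # plays the role of K(s)
        lo = max(0, s - radius)
        hi = min(len(signal), s + radius + 1)
        for p in range(lo, hi):
            # f: Gaussian on spatial distance; g: Gaussian on intensity difference.
            f = math.exp(-((p - s) ** 2) / (2 * sigma_s ** 2))
            g = math.exp(-((signal[p] - i_s) ** 2) / (2 * sigma_r ** 2))
            weighted_sum += f * g * signal[p]
            norm += f * g
        out.append(weighted_sum / norm)
    return out


# A noisy step edge: small fluctuations on each side of a large jump.
step = [10, 12, 9, 11, 100, 102, 99, 101]
smoothed = bilateral_1d(step)
```

Because g is nearly zero across the jump, samples on one side contribute almost nothing to the other side, so the fluctuations are averaged while the step stays sharp.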

The region selector method uses the roicolor command in MATLAB, which selects the wanted region according to color or intensity levels in a grayscale image.
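For readers without MATLAB, the range form of this selection can be approximated in a few lines. The sketch below assumes the image is a list of rows of gray levels and that the range is inclusive at both ends; the function name `select_region` and the example limits are hypothetical.

```python
def select_region(image, low, high):
    """Return a binary mask that is 1 where low <= pixel <= high, else 0."""
    return [[1 if low <= px <= high else 0 for px in row] for row in image]


# Toy 3 x 3 grayscale patch with a dark pore in the middle.
patch = [
    [200, 190, 210],
    [195, 20, 205],
    [198, 30, 202],
]
mask = select_region(patch, 0, 50)
```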

The GMM consists of Gaussian distributions and is defined as follows:
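In its usual form, assuming K components, with Θ_{k} = {µ_{k}, Σ_{k}} the parameters of component k and Θ collecting all parameters (K and Θ are symbols introduced here), the mixture density of an observation x_{n} is:

$$
p(x_{n} \mid \Theta) = \sum_{k=1}^{K} \pi_{k}\, N(x_{n} \mid \Theta_{k})
\tag{5}
$$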

Where every component N(x_{n}|Θ_{k}) is a Gaussian distribution which, for a D-dimensional vector x, is defined as follows:
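The standard multivariate Gaussian density, written here under the assumption that Θ_{k} = {µ_{k}, Σ_{k}}, is as follows. The bare π in the normalizing factor is the mathematical constant, not the mixing weight π_{k}.

$$
N(x \mid \Theta_{k}) = \frac{1}{(2\pi)^{D/2}\, \lvert \Sigma_{k} \rvert^{1/2}}
\exp\!\left( -\tfrac{1}{2} (x - \mu_{k})^{T} \Sigma_{k}^{-1} (x - \mu_{k}) \right)
\tag{6}
$$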

Where µ and Σ are a D-dimensional mean vector and a D × D covariance matrix, respectively. The prior distribution π_{k} gives the probability that x_{n} belongs to the k^{th} class. It is independent of the observation x_{n}. Moreover, π_{k} must satisfy these constraints:
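For mixing weights these are the usual probability constraints, with K denoting the number of components (a symbol assumed here):

$$
0 \le \pi_{k} \le 1, \qquad \sum_{k=1}^{K} \pi_{k} = 1
\tag{7}
$$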

After the density function for an observation is found, the log-likelihood function of N observations is as follows:
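Assuming the N observations are independent and the mixture has K components (K and the symbol L(Θ) are introduced here), the log-likelihood takes the familiar form:

$$
L(\Theta) = \sum_{n=1}^{N} \log \left( \sum_{k=1}^{K} \pi_{k}\, N(x_{n} \mid \Theta_{k}) \right)
\tag{8}
$$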

According to Equations 5 and 8, the major advantage of the GMM is that its form is simple and it needs few parameters. Moreover, when the GMM is used in image segmentation, correct results are obtained if the observations are independent of each other. To find the parameters (π_{k}, µ_{k}, and Σ_{k}), the EM algorithm is usually applied to maximize the log-likelihood function in Equation 8. The posterior probability in the expectation step of EM is obtained as follows:
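A standard form of this posterior, assuming z_{nk} denotes the probability that x_{n} belongs to component k, the superscript (t) marks the current iteration, and K is the number of components (all three symbols are introduced here), is:

$$
z_{nk}^{(t)} = \frac{\pi_{k}^{(t)}\, N\!\left(x_{n} \mid \Theta_{k}^{(t)}\right)}
{\sum_{j=1}^{K} \pi_{j}^{(t)}\, N\!\left(x_{n} \mid \Theta_{j}^{(t)}\right)}
\tag{9}
$$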

In the maximization step of EM, the parameters (π_{k}, µ_{k}, and Σ_{k}) are updated iteratively through the following formulas:
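The usual update rules, assuming z_{nk}^{(t)} denotes the posterior probability of component k for observation x_{n} at iteration t (a symbol introduced here), are:

$$
\pi_{k}^{(t+1)} = \frac{1}{N} \sum_{n=1}^{N} z_{nk}^{(t)}
\tag{10}
$$

$$
\mu_{k}^{(t+1)} = \frac{\sum_{n=1}^{N} z_{nk}^{(t)}\, x_{n}}{\sum_{n=1}^{N} z_{nk}^{(t)}}
\tag{11}
$$

$$
\Sigma_{k}^{(t+1)} = \frac{\sum_{n=1}^{N} z_{nk}^{(t)} \left(x_{n} - \mu_{k}^{(t+1)}\right) \left(x_{n} - \mu_{k}^{(t+1)}\right)^{T}}{\sum_{n=1}^{N} z_{nk}^{(t)}}
\tag{12}
$$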

Where t denotes the iteration number. The loop stops when the convergence condition is met. The value from Equation 9 is then used with the maximum a posteriori criterion to obtain the class label for each pixel [17].
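The expectation and maximization steps can be sketched for scalar gray levels (D = 1), where the covariance matrix reduces to a variance. This is a minimal illustration under stated assumptions, not the code used in this study: the function names, the initial parameters, the fixed iteration count used in place of a convergence test, and the variance floor `min_var` are all hypothetical choices.

```python
import math


def gaussian(x, mu, var):
    """Univariate Gaussian density."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_gmm_1d(data, mus, variances, weights, iterations=50, min_var=1e-6):
    """Fit a 1D Gaussian mixture by EM and label each sample by maximum posterior."""
    mus, variances, weights = list(mus), list(variances), list(weights)
    components = range(len(mus))
    n = len(data)
    resp = []
    for _ in range(iterations):
        # E-step: posterior probability of each component for each sample.
        resp = []
        for x in data:
            joint = [weights[k] * gaussian(x, mus[k], variances[k]) for k in components]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: re-estimate weights, means, and variances.
        for k in components:
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            mus[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - mus[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var, min_var)  # floor avoids a zero variance
    # Maximum a posteriori label for each sample.
    labels = [max(components, key=lambda k: r[k]) for r in resp]
    return labels, mus, variances, weights


# Toy data: dark "pore" pixels and bright "film" pixels.
pixels = [10, 12, 11, 13, 200, 205, 198, 202]
labels, mus, variances, weights = em_gmm_1d(
    pixels, mus=[0.0, 255.0], variances=[2500.0, 2500.0], weights=[0.5, 0.5]
)
```

For real images, evaluating the densities in the log domain would be more robust, since a pixel far from every component can make all the products underflow to zero.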

### A. Rand Index

The Rand index, introduced by William Rand, is used to compare two arbitrary segmentations using pair-wise label relationships. It is obtained by dividing the number of pixel pairs that have the same label relationship in both segmentations by the total number of pixel pairs. Let n_{uv} be the number of points labeled u in S and labeled v in S’. The number of points labeled u in the first segmentation S and the number labeled v in the second segmentation S’ are denoted n_{u•} and n_{•v}, respectively. Then:
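In terms of these counts, and assuming N is the total number of points, the index is commonly written as one minus the fraction of pairs on which the two segmentations disagree:

$$
R(S, S') = 1 - \frac{\tfrac{1}{2} \left( \sum_{u} n_{u\bullet}^{2} + \sum_{v} n_{\bullet v}^{2} \right) - \sum_{u,v} n_{uv}^{2}}{N(N-1)/2}
\tag{13}
$$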

The R index is 1 when the two segmentations are identical and 0 when they have no similarity. This similarity measure has a short running time when the number of unique labels in S and S’ is much smaller than the total number of data points N [18].
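The count-based form explains the short running time: only the label counts are needed, not an explicit loop over all pixel pairs. A minimal sketch, assuming each segmentation is given as a flat list of labels of equal length (the function name `rand_index` is illustrative):

```python
from collections import Counter


def rand_index(seg_a, seg_b):
    """Rand index of two labelings, computed from label co-occurrence counts."""
    n = len(seg_a)
    if n != len(seg_b) or n < 2:
        raise ValueError("segmentations must have equal length of at least 2")
    n_uv = Counter(zip(seg_a, seg_b))  # points labeled u in S and v in S'
    n_u = Counter(seg_a)               # row totals
    n_v = Counter(seg_b)               # column totals
    sum_uv = sum(c * c for c in n_uv.values())
    sum_u = sum(c * c for c in n_u.values())
    sum_v = sum(c * c for c in n_v.values())
    # Pairs grouped together in one segmentation but apart in the other.
    disagreements = 0.5 * (sum_u + sum_v) - sum_uv
    total_pairs = n * (n - 1) / 2
    return 1.0 - disagreements / total_pairs


same = rand_index([0, 0, 1, 1], [1, 1, 0, 0])   # same partition, renamed labels
mixed = rand_index([0, 0, 1, 1], [0, 1, 0, 1])  # partitions that cross each other
```

The first call shows that the index depends only on how pixels are grouped, not on the label values themselves.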

### 3. RESULTS AND DISCUSSION

All three SEM images were segmented using the threshold technique after a large number of trial-and-error runs to optimize the threshold pixel intensity, the morphological operation, and the removal of small objects. Based on visual inspection, they are considered ground truth images for the objective evaluation of the other segmentation methods (Fig. 1). The results of the other segmentation methods are shown in the same figure.

Fig. 1a shows an SEM image that suffers from some noise. The Wiener filter and adaptive histogram equalization were used for denoising and contrast enhancement before segmenting by the threshold technique. Nevertheless, the noise still affects segmentation by the other methods.

Fig. 1b and c show good segmentation for all methods. Fig. 3 presents all three images segmented by thresholding (the ground truth) and by EM-GMM (the method with the higher R index), with a number labeling each pore. The distribution of pore sizes is shown in the same figure. The distributions can be fitted mainly by a Gaussian distribution, as the charts in Fig. 3 show, which agrees with the results of Macias *et al*. [13].

**Fig. 3.** (a-c) Total nanopore counts and distributions of nanopore sizes for threshold segmenting (ground truth image) and expectation maximization-Gaussian mixture model (higher R index) for all three scanning electron microscope images

The running time, Rand index, number of nanopores, and nanopore diameter for each segmentation method and all three SEM images are presented in Table I. The nanopore diameters obtained by threshold segmenting are in good agreement with the results of Macias *et al*. [13], while those obtained by EM-GMM segmenting are smaller. The method used to analyze the SEM images in that reference is unknown. EM-GMM, a semiautomatic method with a high R index and a relatively short running time, is better than the other segmentation methods studied here.

**TABLE I** Running Time, Rand Index, Nanopore Diameter, and Nanopore Counts for All SEM Images Segmented by the Four Techniques

### 4. CONCLUSION

Four segmentation methods were applied to three SEM images with pore widening times of 0, 10, and 20 min. Threshold segmentation gives good results, but it needs a large number of trial-and-error runs to choose the optimum threshold pixel intensity, and it also needs a morphological operation and the removal of small objects. EM-GMM is superior to the bilateral filter and K-means with region selector, since it has a higher R index. Consequently, its segmentation results were used for counting pores and computing their diameters. It also has a relatively short running time. Accordingly, EM-GMM can be used reliably for segmenting SEM images and finding the number of pores and their diameters.