boring-is-good.com

Binary Image Analysis - Part 1

Published on 2023-02-05

Computer Vision

The human brain is amazing! We can recognize shapes, colors, and patterns almost instantly. Computing devices cannot! That's why we humans have to teach them image analysis algorithms. This post is part of the Computer Vision series.

Demystifying Binary Image

If you are into photography, you will know that every digital image is made up of pixels. For most images, pixel values are integers that range from 0 (black) to 255 (white). For grayscale images, the pixel value is typically an 8-bit value (with a range of 0 to 255) or a 16-bit value (with a range of 0 to 65535). For color images, there are 8-bit, 16-bit, 24-bit, and 30-bit colors. The 24-bit colors are known as true color and consist of three 8-bit channels per pixel, one each for red, green, and blue intensity.
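
To make the bit-depth arithmetic concrete, here is a minimal sketch of how three 8-bit intensities can be packed into a single 24-bit number and taken apart again. The function names pack_rgb and unpack_rgb are made up for this post, and putting red in the highest bits is an assumption (real image formats differ in channel order).

```python
def pack_rgb(red, green, blue):
    """Pack three 8-bit channel values into one 24-bit integer.

    Assumption: red occupies the highest 8 bits, blue the lowest.
    """
    for channel in (red, green, blue):
        if not 0 <= channel <= 255:
            raise ValueError("each channel must fit in 8 bits (0 to 255)")
    return (red << 16) | (green << 8) | blue


def unpack_rgb(value):
    """Split a 24-bit integer back into its (red, green, blue) channels."""
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF


orange = pack_rgb(255, 128, 0)
print(hex(orange))           # the packed 24-bit value
print(unpack_rgb(orange))    # back to three 8-bit channels

# Largest value for 8-bit and 16-bit grayscale pixels
print(2 ** 8 - 1, 2 ** 16 - 1)
```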

A binary image is exactly what it sounds like. Being binary, it contains only two pixel values, often 0 for black and either 1 or 255 for white. You might wonder why we would need to convert a colorful image into an image with only two pixel values. Once the image is reduced to two pixel values (one for the background and one for the foreground), we can clearly see the objects without the distraction of color. Hence, binary images are especially useful for shape identification, document processing, parts counting, and inspection.

Now that we have cleared up the definition of a binary image, we will look at how to get started with image analysis:

1. Converting an image into a binary image, affectionately known as thresholding
2. Basic morphological filtering for eliminating noise
3. Counting objects and labeling (Connected Component Analysis)
4. Feature extraction
   - centroid
   - perimeter
   - area
   - bounding box
   - roundness

1. Converting an image into a binary image

To convert an image into a binary image, i.e. an image that separates foreground from background, we first need to know how often each pixel value occurs in the image. Normally, the background has a different color than the foreground (the objects), and the background pixels fall into one narrow range of pixel values. So, when converting an image into a binary image, we need to know the range of pixel values (color intensities) in the image. If we plot a histogram of the pixel values against their number of occurrences, we can see the distribution. Once we know the distribution, we can decide on a threshold value to convert the image into a binary image!

Reading a PGM

Before all that, how does a computing device read an image? Let's start with PGM (Portable Gray Map). A PGM image is a grayscale image, and in its plain-text form you can open it with a text editor of your choice. It follows the format below.

Line 1: A "magic number" identifying the file type. For a plain-text PGM image this is the two characters "P2" (the binary variant uses "P5"), followed by whitespace (blanks, TABs, CRs, LFs).
Line 2: The width, formatted as ASCII characters in decimal, then whitespace, then the height, again in ASCII decimal.
Line 3: The maximum gray value (Maxval), again in ASCII decimal. It must be less than 65536 and more than zero. It is followed by a single whitespace character (usually a newline).
Remaining lines: The pixel values, in ASCII decimal, separated by whitespace.

[Figure: PGM file example]
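
In case the picture does not load, here is a tiny hand-written plain PGM held in a Python string, together with a quick check that its header and pixel count agree. The pixel values are invented for illustration and are not taken from the sample image used in this post.

```python
# A made-up 4 x 3 plain PGM: magic number, "width height", Maxval, then pixels
SAMPLE_PGM = """P2
4 3
255
0 50 100 150
200 255 200 150
100 50 0 25
"""

lines = SAMPLE_PGM.splitlines()
magic_number = lines[0]
width, height = (int(token) for token in lines[1].split())
max_gray_value = int(lines[2])

# Everything after the third line is pixel data, separated by whitespace
pixel_values = [int(token) for line in lines[3:] for token in line.split()]

print(magic_number, width, height, max_gray_value)
print(len(pixel_values) == width * height)  # header and data should agree
```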

OK, now that you get the idea of PGM, we will look at reading a PGM file with Python. There are many libraries you can use for image processing, and OpenCV is one of the popular ones. But we won't use any of them here, because we want to understand what happens behind the scenes without library abstractions getting in the way.

Alright, here is a code snippet written in pure Python to read a PGM file.

#0. Read the PGM and output the image information in a dict object
# name: the name of the file
# cols: no. of columns in the image file, i.e. the width of the image
# rows: no. of rows in the image file, i.e. the height of the image
# max_gray_value: maximum gray value of the input image file
# pixels: the image's pixel values as a 2D list in the format of [Row][Col]
def readPGMImage(filename):
    with open(filename, "r") as f:
        # Read the first line and check if it's the PGM magic number
        magic_number = f.readline().strip()
        if magic_number != "P2": 
            raise ValueError("Not a PGM file")

        # Read the next line and split it into the width and height of the image
        size = f.readline().strip().split()
        cols = int(size[0]) # width
        rows = int(size[1]) # height

        # Read the next line and use it as the maximum gray value
        max_gray_value = int(f.readline().strip())

        # Read the rest of the file into a list of pixel values
        pixel_raw = [int(x) for x in f.read().strip().split()]

        # Convert the list of pixel values into a 2D list of pixels in the position of pixel = [Row][Column]
        pixels = [pixel_raw[i:i + cols] for i in range(0, len(pixel_raw), cols)]
        return dict(name = filename, cols = cols, rows = rows, max_gray_value = max_gray_value, pixels = pixels)

# Usage example
imgInfo = readPGMImage("image1.pgm")
print(f"\n==== Image Information ====")
print(f"Name of the file: {imgInfo['name']}")
print(f"Pixels Dimension (width x height) = {imgInfo['cols']} x {imgInfo['rows']}")
print(f"Max Gray Value = {imgInfo['max_gray_value']}")

[Figure: Image1.pgm]
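
One caveat: readPGMImage expects the header to sit on exactly three lines. Plain PGM files are allowed to carry comments that start with "#", and some editors add one, which would trip up the line-by-line reading. Below is a minimal sketch of a more tolerant parser that works on the file's text. The names stripPGMComments and parsePGMText are hypothetical helpers for this post, not part of any library.

```python
def stripPGMComments(text):
    """Split PGM text into tokens, ignoring everything after a '#' on a line."""
    tokens = []
    for line in text.splitlines():
        tokens.extend(line.split("#", 1)[0].split())
    return tokens


def parsePGMText(text):
    """Parse plain (P2) PGM text into the same dict layout as readPGMImage."""
    tokens = stripPGMComments(text)
    if len(tokens) < 4 or tokens[0] != "P2":
        raise ValueError("Not a plain PGM file")
    cols, rows, max_gray_value = (int(token) for token in tokens[1:4])
    values = [int(token) for token in tokens[4:]]
    if len(values) != cols * rows:
        raise ValueError("Pixel count does not match width x height")
    pixels = [values[i:i + cols] for i in range(0, len(values), cols)]
    return dict(cols=cols, rows=rows, max_gray_value=max_gray_value, pixels=pixels)


# Usage example with a made-up file body that contains comments
sample = "P2\n# written by some editor\n3 2 # width height\n255\n0 1 2\n3 4 5\n"
info = parsePGMText(sample)
print(info["cols"], info["rows"], info["pixels"])
```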

Image Histogram

Now that we have our pixel values (stored as imgInfo["pixels"]), we can plot the image histogram.

An image histogram is a graph of the pixel-intensity distribution. The counts are usually normalized to the range 0 to 1. From the image histogram, we can judge the contrast and how well the objects stand apart from the background. If the histogram is bunched up at one end of the range (a one-tailed distribution), the image contrast is low (not good!). The other useful aspect of the image histogram is determining the threshold value.
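
As a rough illustration of reading contrast off a histogram, the sketch below builds a normalized histogram for a tiny made-up image and reports the lowest and highest gray levels that actually occur. A narrow occupied range hints at low contrast. The helper names normalizedHistogram and occupiedRange are invented for this post, and the occupied range is only a crude indicator, not a formal contrast measure.

```python
def normalizedHistogram(pixels, max_gray_value):
    """Return the share of pixels at each gray level (shares sum to 1)."""
    counts = [0] * (max_gray_value + 1)
    total = 0
    for row in pixels:
        for value in row:
            counts[value] += 1
            total += 1
    if total == 0:
        raise ValueError("image has no pixels")
    return [count / total for count in counts]


def occupiedRange(histogram):
    """Return the lowest and highest gray levels with at least one pixel."""
    used = [level for level, share in enumerate(histogram) if share > 0]
    return used[0], used[-1]


# A made-up low-contrast image: every value sits between 100 and 120
tiny_image = [[100, 101, 102], [101, 101, 120]]
histogram = normalizedHistogram(tiny_image, 255)
lowest, highest = occupiedRange(histogram)
print(lowest, highest, highest - lowest)
```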

The threshold value is the pixel value that decides which pixels become background and which become foreground. For example, with pixel values in the range 0 to 255 and a threshold value of, say, 125: if a pixel value is greater than the threshold, set it to 0; otherwise, set it to 1. Now we have a binary image with just two pixel values, 0s and 1s.

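The most popular and widely used method for image thresholding is Otsu's method. It finds the threshold value that minimizes the intra-class variance, i.e. the spread of pixel values within the background class and within the foreground class. This is equivalent to maximizing the between-class variance, which is what the code below computes.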
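
For readers who like to see the formula, the relationship between the intra-class variance and the between-class variance can be written as follows. The notation is the usual textbook one and is an assumption on my part, not something defined elsewhere in this post: p_i is the normalized histogram value for gray level i, L is the number of gray levels, and t is a candidate threshold, with class 0 holding the levels up to and including t.

```latex
\begin{aligned}
\omega_0(t) &= \sum_{i=0}^{t} p_i, \qquad \omega_1(t) = 1 - \omega_0(t) \\
\mu_0(t) &= \frac{1}{\omega_0(t)} \sum_{i=0}^{t} i\,p_i, \qquad
\mu_1(t) = \frac{1}{\omega_1(t)} \sum_{i=t+1}^{L-1} i\,p_i \\
\sigma_w^2(t) &= \omega_0(t)\,\sigma_0^2(t) + \omega_1(t)\,\sigma_1^2(t) \\
\sigma_b^2(t) &= \omega_0(t)\,\omega_1(t)\,\bigl(\mu_0(t) - \mu_1(t)\bigr)^2 \\
\sigma^2 &= \sigma_w^2(t) + \sigma_b^2(t)
\end{aligned}
```

Since the total variance on the last line does not depend on t, the threshold that makes the intra-class variance (third line) smallest is the same one that makes the between-class variance (fourth line) largest. The between-class form only needs cumulative sums and means, which makes it the cheaper one to compute.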

Here is a code snippet written in pure Python for finding the threshold value of an input image.

def getThresholdValuefromImgHistogram(pixels, max_gray_value):
    import matplotlib.pyplot as plt #optional

    # Initialization
    size = max_gray_value + 1 
    x = range(size) 

    # Loop over the pixel values and count the occurrences of each value
    y = [0 for i in range(size)] 
    for row in pixels:
        for value in row: 
            y[value] += 1

    # Plot the histogram chart (optional)
    plt.plot(x, y)
    plt.xlabel("Pixel Value")
    plt.ylabel("Number of pixels")
    plt.title("Image Histogram")
    plt.show()

    # Normalization
    _bin = [count / len(pixels) / len(pixels[0]) for count in y]

    # Implementing Otsu's method
    # Compute the cumulative sum
    cumulative_sum = [0] * (size) # initialize an array of length max_gray_value + 1 filled with zeros
    cumulative_sum[0] = _bin[0] # set the first value
    for i in range(1, size): # loop through the _bin and compute the cumulative sum
        cumulative_sum[i] = cumulative_sum[i - 1] + _bin[i] # total sum at the final element is 1! (because of normalization)

    # Compute the cumulative mean
    cumulative_mean = [0] * (size) # initialize an array of length max_gray_value + 1 filled with zeros
    for i in range(1, size):
        cumulative_mean[i] = cumulative_mean[i - 1] + i * _bin[i]

    # Compute the maximum between-class variance
    max_variance = -1
    threshold = -1
    for t in range(size): # consider every gray level, including 0, as a candidate threshold
        if (cumulative_sum[t] != 0 and cumulative_sum[t] != 1):
            background_mean = cumulative_mean[t] / cumulative_sum[t] 
            foreground_mean = (cumulative_mean[max_gray_value] - cumulative_mean[t]) / (1 - cumulative_sum[t])
            variance = cumulative_sum[t] * (1 - cumulative_sum[t]) * (background_mean - foreground_mean) ** 2
            if variance > max_variance:
                max_variance = variance
                threshold = t

    return threshold

Got it? The result of the above code is a plotted image histogram and a threshold value, which for image1 is 175.

[Figure: Image histogram]
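
If you want to convince yourself that maximizing the between-class variance really does minimize the intra-class variance, here is a slow but direct sketch that tries every threshold and measures the intra-class variance itself. The function names and the eight sample values are made up for this check. It is not the implementation used above, just a cross-check that should agree with it on small inputs.

```python
def withinClassVariance(values, t):
    """Weighted variance of the two groups split at threshold t (None if a group is empty)."""
    background = [v for v in values if v <= t]
    foreground = [v for v in values if v > t]
    if not background or not foreground:
        return None

    def variance(group):
        mean = sum(group) / len(group)
        return sum((v - mean) ** 2 for v in group) / len(group)

    n = len(values)
    return (len(background) / n) * variance(background) + (len(foreground) / n) * variance(foreground)


def bruteForceOtsu(values, max_gray_value):
    """Return the first threshold with the smallest intra-class variance."""
    best_threshold, best_score = -1, None
    for t in range(max_gray_value + 1):
        score = withinClassVariance(values, t)
        if score is not None and (best_score is None or score < best_score):
            best_threshold, best_score = t, score
    return best_threshold


# Two clearly separated groups of made-up pixel values
sample_values = [10, 12, 11, 13, 200, 202, 201, 203]
print(bruteForceOtsu(sample_values, 255))
```

Every threshold between the two groups splits the values the same way, so the sketch reports the first of them.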

Thresholding

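Once we have our threshold value, we simply convert each pixel value to one of two values by checking whether the pixel value > threshold_value: pixels above the threshold become 0 (black) and all other pixels become max_gray_value (white).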

def transformIntoBinaryImage(thresholdValue, pixels, max_gray_value):
    return [[0 if x > thresholdValue else max_gray_value for x in row] for row in pixels]

# Example usage
thresholdValue = getThresholdValuefromImgHistogram(imgInfo["pixels"], imgInfo["max_gray_value"])
thresholded_pixels = transformIntoBinaryImage(thresholdValue, imgInfo["pixels"], imgInfo["max_gray_value"])

Let's see what our thresholded image (binary image) looks like:

You will notice that the background is now black and the foreground (hearts, numbers, spade) is white. Do you also see the white dot around the row 150 mark? This is noise that we need to clean from the image.

[Figure: Binary image of Image1.pgm]
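
To look at the result yourself, you can write the pixels back out as a plain PGM file and open it in an image viewer. Here is a minimal sketch. The function name writePGMImage and the output filename are made up for this post, and the sketch writes one image row per line for readability even though the Netpbm plain format recommends lines of at most 70 characters, so a strict reader might complain about wide images.

```python
def writePGMImage(filename, pixels, max_gray_value):
    """Write a 2D list of pixel values ([Row][Col]) as a plain (P2) PGM file."""
    rows = len(pixels)
    cols = len(pixels[0]) if rows else 0
    with open(filename, "w") as f:
        f.write("P2\n")                    # magic number for plain PGM
        f.write(f"{cols} {rows}\n")        # width, then height
        f.write(f"{max_gray_value}\n")     # maximum gray value
        for row in pixels:
            f.write(" ".join(str(value) for value in row) + "\n")


# Usage example with a made-up 2 x 2 binary image
writePGMImage("binary_example.pgm", [[0, 255], [255, 0]], 255)
```

With the variables from this post, the call would look like writePGMImage("image1_binary.pgm", thresholded_pixels, imgInfo["max_gray_value"]).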

In the next post, we will look at basic morphological operations for filtering noise in the image.
