Programming for beginners: Exploring Dilation Operation for Image Enlargement in OpenCV

Dilation enlarges the boundaries of the foreground regions of an image. Watch this video for a quick overview: https://www.youtube.com/watch?v=2LAooUu1IjQ

How is the dilation process performed?

Dilation is performed with the help of a structuring element, a small predefined binary image or matrix. Common shapes of structuring element include circles, squares and rectangles.

Square structuring element

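A square structuring element is used when you want to perform symmetric dilation in all directions around a pixel. Note that the 3x3 example below sets only the middle row and the middle column to 1, which is strictly a cross (plus) shape; a full 3x3 square would contain all 1s.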

Example

0 1 0
1 1 1
0 1 0
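As a minimal sketch (the variable names are only illustrative), a full square and the plus-shaped pattern shown above could be built in NumPy like this. The square covers all 8 neighbors of a pixel, while the cross covers only the 4 direct neighbors.

```python
import numpy as np

# A full 3x3 square: the center pixel plus all 8 surrounding neighbors
square = np.ones((3, 3), dtype=np.uint8)

# A 3x3 cross (plus shape): the center pixel plus its 4 direct neighbors
cross = np.zeros((3, 3), dtype=np.uint8)
cross[1, :] = 1  # middle row
cross[:, 1] = 1  # middle column

print(square)
print(cross)
```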

Circular structuring element

Circular structuring elements are generally defined in terms of their radius 'r'. The following example defines a circular structuring element with a radius of 2.

0 1 1 1 0
1 1 1 1 1
1 1 1 1 1
1 1 1 1 1
0 1 1 1 0

Pixels within a distance of about 2 from the center of the structuring element are marked as 1 and the rest as 0; on a pixel grid the circle can only be approximated. A circular structuring element is used to perform dilation with a circular neighborhood.
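One way to generate such an element for any radius is to threshold the distance from the center. This is only a sketch: disk is a hypothetical helper, and its tolerance parameter is an assumption added here to round the circle outward by half a pixel. With tolerance=0.5 and radius 2 it yields a 5x5 disc with the four corners cut off; with tolerance=0 only pixels at Euclidean distance 2 or less are kept, which gives a diamond-like shape.

```python
import numpy as np

def disk(radius, tolerance=0.5):
    """Return a (2r+1) x (2r+1) array with 1s inside an approximate circle.

    `tolerance` is a hypothetical knob: 0.5 rounds the circle outward by
    half a pixel, 0 keeps only pixels at Euclidean distance <= radius.
    """
    # Coordinate grids centered on the middle pixel
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return (x * x + y * y <= (radius + tolerance) ** 2).astype(np.uint8)

print(disk(2))                # rounded 5x5 disc
print(disk(2, tolerance=0))   # strict distance threshold
```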

Rectangle structuring element

A rectangular structuring element enlarges objects in the horizontal and vertical directions by amounts that depend on its width and height.

Example

1 1 1 1 1
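The effect of the rectangle's shape is easiest to see on a single foreground pixel, because dilating one pixel with a symmetric structuring element simply copies the element around that pixel. The sketch below relies on that shortcut instead of a full dilation; stamp and the canvas size are hypothetical choices made for illustration.

```python
import numpy as np

def stamp(kernel, canvas_shape=(5, 9)):
    """Dilate a canvas that holds one centered foreground pixel.

    For a single foreground pixel and a symmetric kernel, the dilated
    result is the kernel copied around that pixel.
    """
    canvas = np.zeros(canvas_shape, dtype=np.uint8)
    kh, kw = kernel.shape
    top = canvas_shape[0] // 2 - kh // 2
    left = canvas_shape[1] // 2 - kw // 2
    canvas[top:top + kh, left:left + kw] = kernel
    return canvas

row_kernel = np.ones((1, 5), dtype=np.uint8)   # 1 row, 5 columns
rect_kernel = np.ones((3, 5), dtype=np.uint8)  # 3 rows, 5 columns

print(stamp(row_kernel))   # grows only to the left and right
print(stamp(rect_kernel))  # grows in both directions, more so horizontally
```

A single-row kernel like the one in the example above therefore stretches objects only horizontally; the kernel needs more than one row to grow them vertically as well.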

Let us try to understand the dilation process using the simple 3x3 structuring element from the first example.

Dilation does not change foreground pixels (as long as the center of the structuring element is 1); it only acts on pixels whose value is zero.

We place the center of the structuring element on each input image pixel that has the value zero. If any 1 in the structuring element lies on top of a 1 in the input image, the pixel under the center is flipped from 0 to 1.

For example,

a. When I place the center of the structuring element on input_image[2][1], the pixels to its left and top are 1, so this pixel is updated to 1.

b. When I place the center of the structuring element on input_image[3][6], the left, top and bottom values are zero, and the right side falls outside the image and is treated as zero, so input_image[3][6] is left with the value 0.
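The two cases above can be checked with a small sketch. It assumes that input_image is the same 6x7 array used in the NumPy program later in this post and that pixels outside the image count as zero; dilated_value is a hypothetical helper name, not an OpenCV or NumPy function.

```python
import numpy as np

# Assumption: the same 6x7 array as in the NumPy program later in the post
input_image = np.array([
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [1, 0, 0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [1, 0, 0, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0, 0]
], dtype=np.uint8)

kernel = np.array([
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0]
], dtype=np.uint8)

def dilated_value(image, kernel, i, j):
    """Return the dilated value of pixel (i, j), treating outside pixels as 0."""
    if image[i, j] == 1:
        return 1  # foreground pixels stay 1 because the kernel center is 1
    center_row, center_col = kernel.shape[0] // 2, kernel.shape[1] // 2
    for m in range(kernel.shape[0]):
        for n in range(kernel.shape[1]):
            # Image position covered by kernel[m, n]
            row, col = i + m - center_row, j + n - center_col
            inside = 0 <= row < image.shape[0] and 0 <= col < image.shape[1]
            if kernel[m, n] == 1 and inside and image[row, col] == 1:
                return 1
    return 0

print(dilated_value(input_image, kernel, 2, 1))  # case a: prints 1
print(dilated_value(input_image, kernel, 3, 6))  # case b: prints 0
```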

How are the corners handled during the dilation process?

When you place the center of a structuring element on a corner pixel of the input image, a portion of the structuring element extends beyond the bounds of the image.

For example, when I place the center of the structuring element on input_image[0][0], the structuring element cells at indexes [0][0], [0][1], [0][2], [1][0] and [2][0] spill over the boundary. There are a couple of approaches to handle these border values.

a. Zero Padding: Pixels outside the image boundary are considered to have a value of zero.

b. Wrap-Around or Periodic Boundary: For cyclic or periodic data, this boundary condition assumes that the image repeats. When the structuring element crosses an edge, it continues on the opposite side of the image.
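The difference between the two approaches shows up as soon as a foreground pixel touches the border. The sketch below is an illustration only: it hard-codes the 3x3 plus-shaped neighborhood (the pixel itself plus its top, bottom, left and right neighbors), uses np.pad for zero padding and np.roll for wrap-around, and the tiny 4x4 image is made up for the example.

```python
import numpy as np

image = np.zeros((4, 4), dtype=np.uint8)
image[3, 0] = 1  # single foreground pixel in the bottom-left corner

# Zero padding: surround the image with 0s, then OR the shifted views
padded = np.pad(image, 1, mode="constant", constant_values=0)
zero_padded = (padded[1:-1, 1:-1]                        # the pixel itself
               | padded[:-2, 1:-1] | padded[2:, 1:-1]    # top and bottom
               | padded[1:-1, :-2] | padded[1:-1, 2:])   # left and right

# Wrap-around: np.roll shifts the image and re-enters on the opposite side
wrapped = (image
           | np.roll(image, 1, axis=0) | np.roll(image, -1, axis=0)
           | np.roll(image, 1, axis=1) | np.roll(image, -1, axis=1))

print(zero_padded)
print(wrapped)
```

With zero padding only the pixels above and to the right of the corner are switched on. With wrap-around, the top-left and bottom-right corners are switched on as well, because the neighborhood continues on the opposite edges.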

After applying dilation, the input image changes as shown below.

The following NumPy program implements this process.

array_dilation.py

import numpy as np

# Create a binary array (0s and 1s)
binary_array = np.array([
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [1, 0, 0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [1, 0, 0, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0, 0]
], dtype=np.uint8)

# Create a 3x3 cross-shaped structuring element (kernel)
kernel = np.array([
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0]
], dtype=np.uint8)

# Output array for the dilated result, initially all zeros
dilated_array = np.zeros_like(binary_array)

# Get the dimensions of the binary array and kernel
input_array_height, input_array_width = binary_array.shape
kernel_height, kernel_width = kernel.shape
kernel_center = (kernel_height // 2, kernel_width // 2)

# Visit each foreground pixel and stamp the kernel around it
for i in range(input_array_height):
    for j in range(input_array_width):
        if binary_array[i, j] == 1:
            for m in range(kernel_height):
                for n in range(kernel_width):
                    # Output position covered by kernel[m, n]
                    row = i + m - kernel_center[0]
                    col = j + n - kernel_center[1]
                    # Skip positions outside the image (same effect as zero padding)
                    in_bounds = 0 <= row < input_array_height and 0 <= col < input_array_width
                    if in_bounds and kernel[m, n] == 1:
                        dilated_array[row, col] = 1

# Print the original and dilated binary arrays
print("Original Binary Array:")
print(binary_array)

print("\nDilated Binary Array:")
print(dilated_array)

Output

Original Binary Array:
[[0 0 0 0 0 0 0]
 [0 1 1 1 1 1 0]
 [1 0 0 1 0 1 0]
 [0 0 1 0 1 0 0]
 [1 0 0 1 1 1 0]
 [0 1 1 1 1 0 0]]

Dilated Binary Array:
[[0 1 1 1 1 1 0]
 [1 1 1 1 1 1 1]
 [1 1 1 1 1 1 1]
 [1 1 1 1 1 1 0]
 [1 1 1 1 1 1 1]
 [1 1 1 1 1 1 0]]
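As a cross-check, the same result could be obtained without visiting pixels one by one. The sketch below assumes zero padding and an odd-sized kernel that is symmetric about its center; dilate is a hypothetical helper written for this post, not a NumPy function. It ORs together one shifted copy of the image for every 1 in the kernel and compares the result with the output printed above.

```python
import numpy as np

binary_array = np.array([
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [1, 0, 0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [1, 0, 0, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0, 0]
], dtype=np.uint8)

kernel = np.array([
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0]
], dtype=np.uint8)

def dilate(image, kernel):
    """Binary dilation with zero padding for an odd-sized, symmetric kernel."""
    rows, cols = image.shape
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="constant")
    out = np.zeros_like(image)
    for m in range(kh):
        for n in range(kw):
            if kernel[m, n] == 1:
                # OR in the image shifted by the offset of kernel[m, n]
                out |= padded[m:m + rows, n:n + cols]
    return out

# The dilated array printed by the program above
expected = np.array([
    [0, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 0]
], dtype=np.uint8)

print(np.array_equal(dilate(binary_array, kernel), expected))
```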

How to perform the dilate operation in OpenCV?

The cv2.dilate method performs dilation in OpenCV. The examples below import cv2 as cv, so the call appears there as cv.dilate.

Signature

cv2.dilate(src, kernel, dst=None, anchor=None, iterations=1, borderType=None, borderValue=None)

The following table summarizes the parameters of the dilate method.

Parameter	Description
src	Source image on which the dilation is performed.
kernel	Structuring element used for the dilation.
dst	Optional. Destination image in which the result of the dilation is stored.
anchor	Optional. Position of the anchor within the kernel, given as a tuple of two integers. The default value (-1, -1) places the anchor at the center of the kernel.
iterations	Optional. Number of times the dilation is applied. The default is 1.
borderType	Optional. Border mode used when the kernel extends beyond the image boundaries. The default is cv2.BORDER_CONSTANT. Other modes such as cv2.BORDER_REPLICATE and cv2.BORDER_REFLECT are also supported.
borderValue	Optional. Constant used for the border when borderType is cv2.BORDER_CONSTANT. The default is a special value that dilation treats as the lowest possible pixel value, so pixels outside the image never affect the result; for a binary image this behaves like padding with 0.

Example

dilated_image = cv.dilate(image, kernel, iterations=1)

The following program is a complete working example.

dilation.py

import cv2 as cv
import numpy as np

def resize_frame(frame, scale_width=0.5, scale_height=0.5):
    width = int(frame.shape[1] * scale_width)
    height = int(frame.shape[0] * scale_height)
    new_dimensions = (width, height)
    return cv.resize(frame, new_dimensions, interpolation=cv.INTER_AREA)

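# Load the image as grayscale
image = cv.imread('pyramids.png', cv.IMREAD_GRAYSCALE)
if image is None:
    # cv.imread returns None when the file is missing or unreadable
    raise FileNotFoundError("Could not read 'pyramids.png'")
image = resize_frame(image)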

# Define the structuring element (a 3x3 cross in this case)
kernel = np.array([
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0]
], 'uint8')

# Perform dilation
dilated_image = cv.dilate(image, kernel, iterations=1)

# Display the original and dilated images
cv.imshow('Original Image', image)

cv.imshow('Dilated Image', dilated_image)

cv.waitKey(0)
cv.destroyAllWindows()

Output

Original image

Dilated image

References

https://www.youtube.com/watch?v=2LAooUu1IjQ

https://www.youtube.com/watch?v=xO3ED27rMHs


Tuesday, 21 November 2023
