Dilation enlarges the boundaries of the foreground regions of an image. Watch this video for a quick overview: https://www.youtube.com/watch?v=2LAooUu1IjQ
How is the dilation process performed?
Dilation is performed with the help of a structuring element. A structuring element is a small, predefined binary image or matrix. Common shapes include circles, squares, and rectangles.
Square structuring element
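A square structuring element is used when you want to perform symmetric dilation in all directions around a pixel. A full 3x3 square contains only ones; the example below sets just the center and its four direct neighbors, which is a cross (plus) shape inside a 3x3 square.
Example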
0 1 0
1 1 1
0 1 0
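As a quick sketch (the variable names `square` and `cross` are illustrative and not part of any library), both a full 3x3 square and the pattern shown above can be built with NumPy. The full square also covers the four diagonal neighbors, while the pattern above covers only the four direct neighbors.

```python
import numpy as np

# Full 3x3 square: the center plus all 8 surrounding neighbors
square = np.ones((3, 3), dtype=np.uint8)

# Pattern from the example above: the center plus its 4 direct neighbors
cross = np.array([
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0]
], dtype=np.uint8)

print(square)
print(cross)
# Number of active elements in each structuring element
print(int(square.sum()), int(cross.sum()))
```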
Circular structuring element
Circular structuring elements are generally defined in terms of their radius 'r'. The following example defines a circular structuring element with a radius of 2.
0 1 1 1 0
1 1 1 1 1
1 1 1 1 1
1 1 1 1 1
0 1 1 1 0
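Pixels within a distance of about 2 from the center of the structuring element are marked as 1, and the remaining pixels are 0. On a pixel grid this is only an approximation of a circle: here just the four corners, which lie about 2.83 pixels from the center, are 0. A circular structuring element is used to perform dilation with a circular neighborhood.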
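One possible way to generate such an element is to threshold the Euclidean distance from the center. The helper below is a minimal sketch: the function name `disk` and its `threshold` parameter are assumptions made for illustration. With a cutoff of exactly 2 the result is a smaller, diamond-like shape, while the 5x5 pattern above corresponds to a slightly larger cutoff (2.5 is used here).

```python
import numpy as np

def disk(radius, threshold=None):
    """Binary disc: 1 where the distance from the center is <= threshold."""
    if threshold is None:
        threshold = radius
    # Coordinate offsets relative to the center of the element
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return (x * x + y * y <= threshold * threshold).astype(np.uint8)

# Strict cutoff at distance 2
print(disk(2))
# Slightly larger cutoff, which reproduces the 5x5 pattern shown above
print(disk(2, threshold=2.5))
```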
Rectangle structuring element
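A rectangular structuring element is used to enlarge objects by different amounts in the horizontal and vertical directions. The 3x5 example below extends an object by two pixels to the left and right and by one pixel above and below.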
Example
1 1 1 1 1
1 1 1 1 1
1 1 1 1 1
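To illustrate the effect of the 3x5 rectangle, the sketch below stamps it onto a single foreground pixel. The image size and pixel position are arbitrary choices made for this illustration, and the pixel is kept away from the border so that no boundary handling is needed.

```python
import numpy as np

rect = np.ones((3, 5), dtype=np.uint8)

# Hypothetical test image with a single foreground pixel
image = np.zeros((5, 9), dtype=np.uint8)
image[2, 4] = 1

# Half-height and half-width of the structuring element
ci, cj = rect.shape[0] // 2, rect.shape[1] // 2

dilated = np.zeros_like(image)
for i, j in zip(*np.nonzero(image)):
    # Stamp the rectangle centered on each foreground pixel
    dilated[i - ci:i + ci + 1, j - cj:j + cj + 1] |= rect

print(dilated)
```

The single pixel grows into a block that is 3 rows tall and 5 columns wide.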
Let us try to understand the dilation process with a simple example that uses the 3x3 structuring element shown in the square structuring element section.
Dilation does not change foreground pixels (assuming, as here, that the center of the structuring element is 1); it only affects pixels with the value zero.
Place the center of the structuring element on each input image pixel whose value is zero. If at least one of the 1s of the structuring element lies on top of an input pixel with the value 1, the pixel under the center is flipped from 0 to 1.
For example, using the input_image (binary_array) from the NumPy application in this post:
a. When the center of the structuring element is placed on input_image[2][1], the left and top neighbors of input_image[2][1] are 1, so this pixel is updated to 1.
b. When the center of the structuring element is placed on input_image[3][6], the left, top, and bottom neighbors are zero, and the right neighbor falls outside the image and is treated as zero, so input_image[3][6] keeps the value 0.
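The two cases above can be checked with a small sketch. The helper name `dilated_value` is hypothetical, out-of-bounds positions are treated as zero, and the structuring element is assumed to be symmetric (as it is here), so placing it directly on the pixel is sufficient.

```python
import numpy as np

input_image = np.array([
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [1, 0, 0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [1, 0, 0, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0, 0]
], dtype=np.uint8)

kernel = np.array([
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0]
], dtype=np.uint8)

def dilated_value(image, kernel, i, j):
    """Value of pixel (i, j) after dilation, with out-of-bounds pixels as 0."""
    kh, kw = kernel.shape
    ci, cj = kh // 2, kw // 2
    h, w = image.shape
    for m in range(kh):
        for n in range(kw):
            if kernel[m, n] != 1:
                continue
            # Image position covered by kernel element (m, n)
            y, x = i + m - ci, j + n - cj
            if 0 <= y < h and 0 <= x < w and image[y, x] == 1:
                return 1
    return 0

print(dilated_value(input_image, kernel, 2, 1))  # case a: prints 1
print(dilated_value(input_image, kernel, 3, 6))  # case b: prints 0
```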
How are the corners and borders handled during the dilation process?
When you place the center of a structuring element on a corner or border pixel of the input image, part of the structuring element extends beyond the bounds of the image.
For example, when the center of a 3x3 structuring element is placed on input_image[0][0], the structuring element entries at indexes [0][0], [0][1], [0][2], [1][0], and [2][0] spill over the boundary. There are a couple of approaches to handle these values.
a. Zero Padding: Pixels outside the image boundary are considered to have a value of zero.
b. Wrap-Around or Periodic Boundary: For cyclic or periodic data, this boundary condition assumes that the image repeats or wraps around. When the structuring element crosses a border, it continues on the opposite side of the image.
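The two approaches can be compared with NumPy's np.pad, which supports both a constant (zero) border and a wrapped border. This is a minimal sketch: the function name `dilate_padded` and the small test image are assumptions for illustration, and the shifted-OR formulation matches dilation only for symmetric structuring elements such as the one used here.

```python
import numpy as np

def dilate_padded(image, kernel, mode):
    """Binary dilation where np.pad's mode decides the border handling."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode=mode)
    h, w = image.shape
    out = np.zeros_like(image)
    for m in range(kh):
        for n in range(kw):
            if kernel[m, n]:
                # OR in the image shifted by the kernel offset (m, n)
                out |= padded[m:m + h, n:n + w]
    return out

kernel = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]], dtype=np.uint8)

# Hypothetical image with one foreground pixel in the top-right corner
image = np.array([
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0]
], dtype=np.uint8)

zero_padded = dilate_padded(image, kernel, mode="constant")  # zero padding
wrapped = dilate_padded(image, kernel, mode="wrap")          # wrap-around

print(zero_padded)
print(wrapped)
```

With wrap-around, the corner pixel also reaches the opposite edges: the top-left and bottom-right pixels become 1, while with zero padding they stay 0.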
After applying dilation, the input image changes as shown in the output of the following NumPy application.
array_dilation.py
import numpy as np
# Create a binary array (0s and 1s)
binary_array = np.array([
[0, 0, 0, 0, 0, 0, 0],
[0, 1, 1, 1, 1, 1, 0],
[1, 0, 0, 1, 0, 1, 0],
[0, 0, 1, 0, 1, 0, 0],
[1, 0, 0, 1, 1, 1, 0],
[0, 1, 1, 1, 1, 0, 0]
], dtype=np.uint8)
# Create a 3x3 cross-shaped structuring element (kernel)
kernel = np.array([
[0, 1, 0],
[1, 1, 1],
[0, 1, 0]
], dtype=np.uint8)
# Perform binary dilation using NumPy
dilated_array = np.zeros_like(binary_array)
# Get the dimensions of the binary array and kernel
input_array_height, input_array_width = binary_array.shape
kernel_height, kernel_width = kernel.shape
kernel_center = (kernel_height // 2, kernel_width // 2)
# Iterate through each pixel in the binary array
for i in range(input_array_height):
    for j in range(input_array_width):
        if binary_array[i, j] == 1:
            # Stamp the kernel onto the output, centered on this foreground pixel
            for m in range(kernel_height):
                for n in range(kernel_width):
                    # Output coordinates covered by kernel element (m, n)
                    y = i + m - kernel_center[0]
                    x = j + n - kernel_center[1]
                    # Skip positions that fall outside the image
                    if 0 <= y < input_array_height and 0 <= x < input_array_width:
                        if kernel[m, n] == 1:
                            dilated_array[y, x] = 1
# Print the original and dilated binary arrays
print("Original Binary Array:")
print(binary_array)
print("\nDilated Binary Array:")
print(dilated_array)
Output
Original Binary Array:
[[0 0 0 0 0 0 0]
 [0 1 1 1 1 1 0]
 [1 0 0 1 0 1 0]
 [0 0 1 0 1 0 0]
 [1 0 0 1 1 1 0]
 [0 1 1 1 1 0 0]]

Dilated Binary Array:
[[0 1 1 1 1 1 0]
 [1 1 1 1 1 1 1]
 [1 1 1 1 1 1 1]
 [1 1 1 1 1 1 0]
 [1 1 1 1 1 1 1]
 [1 1 1 1 1 1 0]]
How to perform the dilate operation in OpenCV?
OpenCV provides the cv2.dilate method to perform dilation.
Signature
cv2.dilate(src, kernel, dst=None, anchor=None, iterations=1, borderType=None, borderValue=None)
The following table summarizes the parameters of the dilate method.
Parameter | Description
src | Source image on which the dilation is performed.
kernel | Structuring element used for the dilation.
dst | Optional. Destination image where the result of the dilation is stored.
anchor | Optional. Anchor point within the kernel, given as a tuple of two integers that defines the position of the anchor relative to the kernel. The default value is (-1, -1), which places the anchor at the center of the kernel.
iterations | Optional. Number of times the dilation is applied. The default is 1.
borderType | Optional. Type of border to use when the kernel extends beyond the image boundaries. The default is cv2.BORDER_CONSTANT. Other values such as cv2.BORDER_REPLICATE and cv2.BORDER_REFLECT are also supported; the OpenCV documentation lists cv2.BORDER_WRAP as not supported.
borderValue | Optional. Constant value used for the border when borderType is cv2.BORDER_CONSTANT. By default, the border does not contribute to the dilation result, which for 8-bit images behaves like a border value of 0.
Example (assuming OpenCV is imported with import cv2 as cv)
dilated_image = cv.dilate(image, kernel, iterations=1)
The following is a complete working application.
dilation.py
import cv2 as cv
import numpy as np
def resize_frame(frame, scale_width=0.5, scale_height=0.5):
    """Return the frame scaled by the given width and height factors."""
    width = int(frame.shape[1] * scale_width)
    height = int(frame.shape[0] * scale_height)
    new_dimensions = (width, height)
    return cv.resize(frame, new_dimensions, interpolation=cv.INTER_AREA)

# Load the image as grayscale
image = cv.imread('pyramids.png', cv.IMREAD_GRAYSCALE)
if image is None:
    raise FileNotFoundError("Could not read 'pyramids.png'")
image = resize_frame(image)
# Define the structuring element (3x3 cross in this case)
kernel = np.array([
[0, 1, 0],
[1, 1, 1],
[0, 1, 0]
], 'uint8')
# Perform dilation
dilated_image = cv.dilate(image, kernel, iterations=1)
# Display the original and dilated images
cv.imshow('Original Image', image)
cv.imshow('Dilated Image', dilated_image)
cv.waitKey(0)
cv.destroyAllWindows()
Output
Original image
Dilated image
References
https://www.youtube.com/watch?v=2LAooUu1IjQ
https://www.youtube.com/watch?v=xO3ED27rMHs