How to Compare Histograms of Two Images Using OpenCV Python?
Comparing the histograms of two images is a common image processing task. A histogram of an image shows the frequency distribution of its pixel intensities, so comparing the histograms of two images gives a measure of how similar their intensity distributions are. In this article, we will learn how to compare histograms of two images using OpenCV Python.
Prerequisites
To follow along with this article, we need to have the following installed:
- OpenCV for Python
- Matplotlib for Python
- NumPy for Python
We can install these libraries using pip, by running the following commands:
pip install opencv-python
pip install matplotlib
pip install numpy
Understanding Histograms
A histogram of an image represents the distribution of pixel values in that image. The x-axis of a histogram shows the different pixel values, and the y-axis shows the frequency of occurrence of each pixel value. Histograms help in understanding the overall brightness and contrast of an image, and thus they are widely used in image processing tasks.
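To make the idea of "frequency of each pixel value" concrete, here is a minimal sketch that counts pixel values by hand. It uses NumPy's np.bincount rather than OpenCV, and the 3x3 array is made-up data standing in for a real image.

```python
import numpy as np

# A tiny hypothetical 3x3 "image" with 8-bit pixel values (not a real photo).
img = np.array([[0, 0, 255],
                [10, 10, 10],
                [255, 0, 128]], dtype=np.uint8)

# Count how often each of the 256 possible values occurs.
# counts[v] is the number of pixels whose value is v.
counts = np.bincount(img.ravel(), minlength=256)

print(counts[0], counts[10], counts[128], counts[255])
print(counts.sum())  # total equals the number of pixels
```

The array counts plays the role of the y-axis values of the histogram, and its index plays the role of the x-axis.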
Types of Histograms
There are different types of histograms used in image processing. The most commonly used types are grayscale histograms, color histograms, and 2D histograms.
Grayscale Histograms
Grayscale histograms represent the distribution of pixel intensities in a grayscale image. Grayscale images use only one channel for representing pixel intensities, and thus their histograms have only one channel as well. We can use the OpenCV function cv2.calcHist() to calculate grayscale histograms.
Let’s see an example of how to calculate and visualize the grayscale histogram of an image.
import cv2
import numpy as np
from matplotlib import pyplot as plt
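# Load the image in grayscale mode
img = cv2.imread('lena.png', cv2.IMREAD_GRAYSCALE)
# cv2.imread() returns None instead of raising when the file cannot be read
if img is None:
    raise FileNotFoundError("Could not read 'lena.png'")
# Calculate the histogram (a 256x1 float32 array of bin counts)
hist = cv2.calcHist([img], [0], None, [256], [0, 256])
# Visualize the distribution; plt.hist() bins the raw pixel values itself
plt.hist(img.ravel(), bins=256, range=(0, 256))
plt.show()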
In the above code, we first loaded the image in grayscale mode using the cv2.imread() function. Then we calculated the histogram using the cv2.calcHist() function, which takes the following arguments:
- images: The input image(s) for which the histogram needs to be calculated. It should be enclosed in square brackets.
- channels: The channel(s) on which the histogram will be calculated. For grayscale images, there is only one channel, so we pass [0].
- mask: An optional mask used to specify which pixels to include in the histogram calculation. If None, all pixels are included.
- histSize: The number of bins in the histogram, given as a list.
- ranges: The range of pixel values to be included in the histogram. The range is [0, 256] for grayscale images; the upper bound is exclusive.
After calculating the histogram, we visualized the distribution using the plt.hist() function from the Matplotlib library, which computes its own histogram from the flattened pixel values.
Color Histograms
Color histograms represent the distribution of pixel intensities in a color image. Color images use three channels (Red, Green, and Blue) for representing pixel intensities; note that OpenCV loads them in Blue, Green, Red (BGR) order. A histogram computed over all three channels therefore has three dimensions. We can use the OpenCV function cv2.calcHist() to calculate color histograms.
Let’s see an example of how to calculate and visualize the color histogram of an image.
import cv2
import numpy as np
from matplotlib import pyplot as plt
# Load the image in color mode
img = cv2.imread('lena.png', cv2.IMREAD_COLOR)
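# Calculate the joint 3D histogram (256^3 float32 bins, roughly 64 MB)
hist = cv2.calcHist([img], [0, 1, 2], None, [256, 256, 256], [0, 256, 0, 256, 0, 256])
# Visualize the pixel values
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# OpenCV stores channels in BGR order, so cv2.split() returns blue first
b, g, r = cv2.split(img)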
ax.scatter(r.ravel(), g.ravel(), b.ravel(), c='b', marker='o')
ax.set_xlabel('Red')
ax.set_ylabel('Green')
ax.set_zlabel('Blue')
plt.show()
In the above code, we first loaded the image in color mode using the cv2.imread() function. Then we calculated the histogram using the cv2.calcHist() function, which takes the following arguments:
- images: The input image(s) for which the histogram needs to be calculated. It should be enclosed in square brackets.
- channels: The channel(s) on which the histogram will be calculated. For color images, we pass [0, 1, 2] to build a joint histogram over all three channels.
- mask: An optional mask used to specify which pixels to include in the histogram calculation. If None, all pixels are included.
- histSize: The number of bins in each channel of the histogram. In our example, we used [256, 256, 256] to have 256 bins for each channel.
- ranges: The range of pixel values to be included in the histogram for each channel. The range is [0, 256] for each channel of a color image, passed as one flat list.
After calculating the histogram, we visualized the color distribution by plotting each pixel's channel values in a 3D scatter plot from the Matplotlib library.
2D Histograms
2D histograms represent the joint distribution of pixel intensities in two channels of a color image. They help in understanding the correlation between two color channels. We can use the OpenCV function cv2.calcHist() to calculate 2D histograms.
Let’s see an example of how to calculate and visualize the 2D histogram of an image.
import cv2
import numpy as np
from matplotlib import pyplot as plt
# Load the image in color mode
img = cv2.imread('lena.png', cv2.IMREAD_COLOR)
# Calculate the histogram
hist = cv2.calcHist([img], [0, 1], None, [256, 256], [0, 256, 0, 256])
# Visualize the histogram
plt.imshow(hist, interpolation = 'nearest')
plt.show()
In the above code, we first loaded the image in color mode using the cv2.imread() function. Then we calculated the histogram using the cv2.calcHist() function, which takes the following arguments:
- images: The input image(s) for which the histogram needs to be calculated. It should be enclosed in square brackets.
- channels: The channel(s) on which the histogram will be calculated. For a 2D histogram, we pass [0, 1] to use two channels (Blue and Green, given OpenCV's BGR order).
- mask: An optional mask used to specify which pixels to include in the histogram calculation. If None, all pixels are included.
- histSize: The number of bins in each channel of the histogram. In our example, we used [256, 256] to have 256 bins for each channel.
- ranges: The range of pixel values to be included in the histogram for each channel. The range is [0, 256] for each channel of a color image.
After calculating the histogram, we visualized it using the plt.imshow() function from the Matplotlib library.
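To see what a cell of a 2D histogram means, the following sketch builds one by hand for four made-up pixels. It uses NumPy's np.histogram2d as a stand-in for cv2.calcHist(), and only two bins per channel so the whole table fits on screen.

```python
import numpy as np

# Hypothetical Blue and Green values for four pixels.
blue = np.array([10, 10, 200, 200], dtype=np.uint8)
green = np.array([10, 200, 200, 200], dtype=np.uint8)

# 2 bins per channel over [0, 256): bin 0 = 0..127, bin 1 = 128..255.
hist2d, _, _ = np.histogram2d(blue, green, bins=2,
                              range=[[0, 256], [0, 256]])

# hist2d[i, j] counts pixels whose blue value falls in bin i
# and whose green value falls in bin j.
print(hist2d)
```

Here one pixel is dark in both channels, one is dark in blue but bright in green, and two are bright in both, which is exactly what the four cells record.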
Comparing Histograms of Two Images
Now that we know how to calculate histograms of different types, let’s move on to comparing histograms of two images. There are different methods used for comparing histograms, and we will discuss some of them here.
Correlation Method
The correlation method is one of the simplest methods for comparing histograms. It calculates the correlation coefficient between the histograms of two images. The correlation coefficient measures the degree of linear relationship between two variables, and it has a value between -1 and 1. A value of -1 means a perfect negative linear relationship, 0 means no linear relationship, and 1 means a perfect positive linear relationship.
Let’s see an example of how to compare histograms of two images using the correlation method.
import cv2
import numpy as np
from matplotlib import pyplot as plt
# Load the images in grayscale mode
img1 = cv2.imread('lena.png', cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread('lena_modified.png', cv2.IMREAD_GRAYSCALE)
# Calculate the histograms
hist1 = cv2.calcHist([img1], [0], None, [256], [0, 256])
hist2 = cv2.calcHist([img2], [0], None, [256], [0, 256])
# Compare the histograms using the correlation method
corr = cv2.compareHist(hist1, hist2, cv2.HISTCMP_CORREL)
# Print the correlation coefficient
print(corr)
In the above code, we first loaded two images in grayscale mode using the cv2.imread() function. Then we calculated the histograms of both images using the cv2.calcHist() function. After that, we compared the histograms using the cv2.compareHist() function, which takes the following arguments:
- H1: The first histogram to be compared.
- H2: The second histogram to be compared.
- method: The method used for comparing histograms. We used cv2.HISTCMP_CORREL for the correlation method.
With cv2.HISTCMP_CORREL, the cv2.compareHist() function returns a value between -1 and 1, where a value closer to 1 means a better match between the histograms.
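The correlation score can be reproduced with a few lines of NumPy. The sketch below implements the Pearson correlation coefficient described above; the helper name hist_correlation and the sample histograms are illustrative and not part of OpenCV.

```python
import numpy as np

def hist_correlation(h1, h2):
    """Pearson correlation between two histograms of equal length.

    Hypothetical helper for illustration; it does not guard against
    a flat histogram, where the denominator would be zero.
    """
    h1 = np.asarray(h1, dtype=np.float64).ravel()
    h2 = np.asarray(h2, dtype=np.float64).ravel()
    d1 = h1 - h1.mean()
    d2 = h2 - h2.mean()
    return float((d1 * d2).sum() / np.sqrt((d1 ** 2).sum() * (d2 ** 2).sum()))

a = [1, 2, 3, 4]
same_shape = [2, 4, 6, 8]      # same shape as a, scaled by 2
reversed_shape = [4, 3, 2, 1]  # mirror image of a

print(hist_correlation(a, same_shape))      # close to 1
print(hist_correlation(a, reversed_shape))  # close to -1
```

Because the mean is subtracted and the result is normalized, the score does not change when one histogram is a scaled copy of the other, so images of different sizes can be compared directly.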
Chi-Square Method
The Chi-Square method is another method commonly used for comparing histograms. It calculates the Chi-Square distance between the histograms of two images. The Chi-Square distance is a non-negative value that measures the distance between two distributions. A smaller value means a better match between the histograms, and identical histograms give a distance of 0.
Let’s see an example of how to compare histograms of two images using the Chi-Square method.
import cv2
import numpy as np
from matplotlib import pyplot as plt
# Load the images in grayscale mode
img1 = cv2.imread('lena.png', cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread('lena_modified.png', cv2.IMREAD_GRAYSCALE)
# Calculate the histograms
hist1 = cv2.calcHist([img1], [0], None, [256], [0, 256])
hist2 = cv2.calcHist([img2], [0], None, [256], [0, 256])
# Compare the histograms using the Chi-Square method
chi = cv2.compareHist(hist1, hist2, cv2.HISTCMP_CHISQR)
# Print the Chi-Square distance
print(chi)
In the above code, we followed the same steps as in the correlation method, except that we used cv2.HISTCMP_CHISQR for the Chi-Square method in the cv2.compareHist() function.
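As a rough illustration of what the Chi-Square distance measures, the sketch below computes the sum of (H1 - H2)^2 / H1 over all bins in NumPy. The helper name chi_square is hypothetical, and skipping bins where the first histogram is zero is an assumption made here to avoid division by zero; check the OpenCV documentation for the exact behaviour of cv2.HISTCMP_CHISQR.

```python
import numpy as np

def chi_square(h1, h2):
    """Chi-Square distance: sum over bins of (h1 - h2)^2 / h1.

    Hypothetical helper; bins where h1 is zero are skipped (assumption).
    """
    h1 = np.asarray(h1, dtype=np.float64).ravel()
    h2 = np.asarray(h2, dtype=np.float64).ravel()
    nonzero = h1 > 0
    diff_sq = (h1 - h2) ** 2
    return float((diff_sq[nonzero] / h1[nonzero]).sum())

a = [8, 2]
b = [4, 6]

print(chi_square(a, a))  # identical histograms give 0
print(chi_square(a, b))  # 16/8 + 16/2
print(chi_square(b, a))  # 16/4 + 16/6, a different value
```

Note that with this form of the distance the order of the arguments matters, because only the first histogram appears in the denominator.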
Intersection Method
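The Intersection method is another method used for comparing histograms. It calculates the intersection between the histograms of two images by summing, for each bin, the smaller of the two counts. The intersection represents the overlapping area between the two histograms, and a larger intersection means a better match between the histograms. Because the result depends on the total counts, it is most meaningful when both histograms are on the same scale.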
Let’s see an example of how to compare histograms of two images using the Intersection method.
import cv2
import numpy as np
from matplotlib import pyplot as plt
# Load the images in grayscale mode
img1 = cv2.imread('lena.png', cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread('lena_modified.png', cv2.IMREAD_GRAYSCALE)
# Calculate the histograms
hist1 = cv2.calcHist([img1], [0], None, [256], [0, 256])
hist2 = cv2.calcHist([img2], [0], None, [256], [0, 256])
# Compare the histograms using the intersection method
inter = cv2.compareHist(hist1, hist2, cv2.HISTCMP_INTERSECT)
# Print the intersection value
print(inter)
In the above code, we followed the same steps as in the correlation method, except that we used cv2.HISTCMP_INTERSECT for the Intersection method in the cv2.compareHist() function.
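The intersection is simply the sum of bin-wise minima, which the NumPy sketch below reproduces. The helper name intersection and the two sample histograms are illustrative; the second histogram imitates the same image at a larger size to show why normalizing first can help.

```python
import numpy as np

def intersection(h1, h2):
    """Histogram intersection: sum of the bin-wise minima (hypothetical helper)."""
    h1 = np.asarray(h1, dtype=np.float64).ravel()
    h2 = np.asarray(h2, dtype=np.float64).ravel()
    return float(np.minimum(h1, h2).sum())

small = np.array([2.0, 1.0, 1.0])  # histogram of a 4-pixel image
large = small * 100                # same distribution, 400 pixels

# Raw counts: the result is capped by the smaller histogram's total.
raw = intersection(small, large)

# Normalized so each histogram sums to 1: identical shapes give 1.0.
norm = intersection(small / small.sum(), large / large.sum())

print(raw, norm)
```

After normalization the score lies between 0 and 1, which makes results comparable across image pairs of different sizes.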
Conclusion
In this article, we learned how to compare histograms of two images using OpenCV Python. We covered different types of histograms, such as grayscale histograms, color histograms, and 2D histograms. We also discussed some common methods used for comparing histograms, such as the correlation method, the Chi-Square method, and the Intersection method. By using these methods, we can compare histograms of two images and determine how similar or different they are.