Setup¶
We begin by importing the necessary libraries and defining a utility class for timing our code block executions.
import time
from IPython.display import display, HTML
def countdown_timer(minutes, seconds):
    display(HTML(f"""
    <p id="timer" style="font-size:300px; color: red;"></p>
    <script>
    var minutes = {minutes};
    var seconds = {seconds};
    var timer = document.getElementById("timer");
    function updateTimer() {{
        var minutesDisplay = minutes.toString().padStart(2, '0');
        var secondsDisplay = seconds.toString().padStart(2, '0');
        timer.innerHTML = minutesDisplay + ":" + secondsDisplay;
        if (minutes === 0 && seconds === 0) return;
        if (seconds === 0) {{ minutes--; seconds = 59; }} else {{ seconds--; }}
        setTimeout(updateTimer, 1000);
    }}
    updateTimer();
    </script>
    """))
# Run the countdown timer for 6 minutes
countdown_timer(6, 0)
Imports¶
Loading NumPy, Matplotlib, OpenCV, scikit-learn's MeanShift, and segmentation tools from scikit-image.
import cv2
import numpy as np
import matplotlib.pyplot as plt
from skimage import data
from skimage import graph
from skimage.segmentation import slic, mark_boundaries, join_segmentations
from skimage.measure import label
from sklearn.cluster import MeanShift
1. Image Segmentation¶
Segmentation aims to simplify the representation of an image into something more meaningful and easier to analyze.
Image segmentation is the process of dividing an image into meaningful regions or segments, typically based on similarities in color, intensity, or texture. The goal is to identify objects or regions of interest to simplify subsequent analysis.
Traditional segmentation techniques fall into several categories:
- Thresholding: Separating pixels based on their intensity values.
- Clustering-based methods: Grouping pixels with similar characteristics (e.g., K-Means).
- Edge-based methods: Detecting boundaries or discontinuities in the image.
- Region-based methods: Grouping pixels into regions based on some criteria.
- Graph-based methods: Treating segmentation as a graph partitioning problem.
While deep learning has become the standard for semantic and instance segmentation, traditional methods provide foundational intuition for understanding how regions can be mathematically separated.
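To make the first category concrete, here is a small self-contained sketch of Otsu thresholding implemented in NumPy (the synthetic image and helper function are illustrative, not part of the notebook's pipeline; OpenCV's `cv2.threshold` with `THRESH_OTSU` does the same job in one call):

```python
import numpy as np

def otsu_threshold(gray):
    """Pick the threshold that maximizes between-class variance (Otsu)."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    p = hist / hist.sum()
    best_t, best_var = 0, 0.0
    for t in range(1, 256):
        w0, w1 = p[:t].sum(), p[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * p[:t]).sum() / w0
        mu1 = (np.arange(t, 256) * p[t:]).sum() / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t

# Synthetic bimodal image: dark background with a bright square
img = np.full((64, 64), 50, dtype=np.uint8)
img[16:48, 16:48] = 200
t = otsu_threshold(img)
mask = img > t  # binary segmentation into two regions
```

The key idea is that a good threshold separates the histogram into two classes whose weighted between-class variance is maximal, which for a bimodal image lands between the two peaks.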
2. Region Merging (Agglomerative Clustering)¶
Region merging is a bottom-up segmentation strategy.
In Region Merging, individual pixels or small regions are merged together to form larger regions based on certain similarity criteria. It is the opposite of region splitting (a top-down approach).
Agglomerative Clustering is a specific method used in region merging. Each pixel starts as its own cluster, and clusters are successively merged based on similarity. The merging process continues until a stopping condition is reached (e.g., a predefined number of clusters or a similarity threshold).
The algorithm typically follows these steps:
- Initialization: Each pixel or small group of pixels is treated as an individual cluster.
- Similarity Measure: Calculate a similarity measure (such as color, intensity, or texture) between all pairs of neighboring clusters.
- Merge: Merge the two most similar clusters based on the similarity measure.
- Update Similarity: Recalculate the similarity measure between the newly formed cluster and its neighbors.
- Repeat: Continue merging until the stopping condition is met.
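The merging loop above can also be sketched with scikit-learn's `AgglomerativeClustering`, constrained by a pixel-connectivity graph so that only neighboring regions may merge (a toy illustration on a synthetic image, not the filtering-based approach the next cell takes):

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.feature_extraction.image import grid_to_graph

# Synthetic 32x32 image: four quadrants of distinct gray levels plus noise
rng = np.random.default_rng(0)
img = np.zeros((32, 32))
img[:16, :16], img[:16, 16:], img[16:, :16], img[16:, 16:] = 0.0, 0.3, 0.6, 0.9
img += rng.normal(0, 0.02, img.shape)

# Connectivity graph: only spatially adjacent pixels may be merged
connectivity = grid_to_graph(*img.shape)

# Agglomerative merging by intensity similarity until four regions remain
model = AgglomerativeClustering(n_clusters=4, connectivity=connectivity,
                                linkage='ward')
labels = model.fit_predict(img.reshape(-1, 1)).reshape(img.shape)
```

The connectivity constraint is what turns plain hierarchical clustering into *region* merging: without it, distant pixels of similar intensity could be fused into one non-contiguous cluster.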
# Load the astronaut image (scikit-image returns it in RGB order already)
astronaut_image = data.astronaut()
astronaut_image_rgb = astronaut_image
# Step 1: Apply mean shift filtering to smooth the image and enhance regions
spatial_radius = 21 # Spatial radius for mean shift filtering
color_radius = 51 # Color radius for mean shift filtering
filtered_image = cv2.pyrMeanShiftFiltering(astronaut_image_rgb, spatial_radius, color_radius)
# Step 2: Convert the filtered image to grayscale for region merging
gray_image = cv2.cvtColor(filtered_image, cv2.COLOR_RGB2GRAY)
# Step 3: Perform thresholding to separate regions
# Since we use binary thresholding here, we obtain only two regions (foreground and background)
_, binary_image = cv2.threshold(gray_image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
# Step 4: Find connected components (initial regions)
num_labels, labels = cv2.connectedComponents(binary_image)
# Step 5: Merge regions based on color similarity
# We will calculate the mean color of each region and merge similar regions
def get_region_mean_color(image, label_image, label):
    mask = (label_image == label)
    mean_color = cv2.mean(image, mask.astype(np.uint8))[:3]
    return mean_color
# Create a result image with the same shape as the original
result_image = np.zeros_like(astronaut_image_rgb)
# Fill each region with its mean color
for label in range(1, num_labels):
    mean_color = get_region_mean_color(filtered_image, labels, label)
    result_image[labels == label] = mean_color
# Display the original and region-merged images
fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(9, 9))
# Display the original image
ax1.imshow(astronaut_image_rgb)
ax1.set_title('Original Astronaut Image')
ax1.axis('off')
ax2.imshow(filtered_image)
ax2.set_title('Mean Shift Filtered Image')
ax2.axis('off')
ax3.imshow(binary_image)
ax3.set_title('Binary Image')
ax3.axis('off')
ax4.imshow(result_image)
ax4.set_title('Region Merging (Agglomerative Clustering)')
ax4.axis('off')
plt.tight_layout()
plt.show()
3. Watershed Algorithm¶
The Watershed algorithm leverages topological representations to segment images.
The watershed algorithm treats the image as a topographic surface. The brightness (intensity) of each pixel represents its elevation.
The goal is to find the "watershed lines" that separate different regions (catchment basins). The algorithm simulates "flooding" the image from different local minima starting points (markers). Boundaries are built where the waters from different sources meet. This approach is highly effective for segmenting objects that are touching or have overlapping edges.
For a deeper mathematical intuition of the watershed algorithm, you can review the detailed OpenCV Watershed Documentation.
# Load the astronaut image and convert it to grayscale
astronaut_image = data.astronaut()
gray_astronaut = cv2.cvtColor(astronaut_image, cv2.COLOR_RGB2GRAY)
# Threshold processing
ret, bin_img = cv2.threshold(gray_astronaut, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
thresholded_image = bin_img.copy()
# noise removal
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
bin_img = cv2.morphologyEx(bin_img, cv2.MORPH_OPEN, kernel, iterations=2)
noise_reduced_image = bin_img.copy()
# Plot intermediate results to show each segmentation step
fig, axs = plt.subplots(1, 2, figsize=(10, 5))
axs[0].imshow(thresholded_image, cmap='gray')
axs[0].set_title('Thresholded image')
axs[0].axis('off')
axs[1].imshow(noise_reduced_image, cmap='gray')
axs[1].set_title('After removing noise')
axs[1].axis('off')
plt.tight_layout()
plt.show()
# Create subplots with 2 rows and 2 columns
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(8, 8))
# sure background area
sure_bg = cv2.dilate(bin_img, kernel, iterations=3)
axes[0, 0].imshow(sure_bg, cmap='gray')
axes[0, 0].set_title('Sure Background')
axes[0, 0].axis('off')
# Distance transform
dist = cv2.distanceTransform(bin_img, cv2.DIST_L2, 5)
axes[0, 1].imshow(dist, cmap='gray')
axes[0, 1].set_title('Distance Transform')
axes[0, 1].axis('off')
# Sure foreground area
ret, sure_fg = cv2.threshold(dist, 0.5 * dist.max(), 255, cv2.THRESH_BINARY)
sure_fg = sure_fg.astype(np.uint8)
axes[1, 0].imshow(sure_fg, cmap='gray')
axes[1, 0].set_title('Sure Foreground')
axes[1, 0].axis('off')
# unknown area
unknown = cv2.subtract(sure_bg, sure_fg)
axes[1, 1].imshow(unknown, cmap='gray')
axes[1, 1].set_title('Unknown')
axes[1, 1].axis('off')
plt.show()
ret, markers = cv2.connectedComponents(sure_fg)
# Add one to all labels so that background is not 0, but 1
markers += 1
# mark the region of unknown with zero
markers[unknown == 255] = 0
fig, ax = plt.subplots(figsize=(6, 6))
ax.imshow(markers, cmap="tab20b")
ax.axis('off')
plt.show()
# Watershed algorithm
markers = cv2.watershed(astronaut_image, markers)
fig, ax = plt.subplots(figsize=(5, 5))
ax.imshow(markers, cmap="tab20b")
ax.axis('off')
plt.show()
labels = np.unique(markers)
object_contours = []
# Skip the watershed boundary label (-1) and the background label
for label in labels[2:]:
    # Create a binary image in which only the area of the label is in the
    # foreground and the rest of the image is in the background
    target = np.where(markers == label, 255, 0).astype(np.uint8)
    # Perform contour extraction on the created binary image
    contours, hierarchy = cv2.findContours(
        target, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE
    )
    object_contours.append(contours[0])
# Draw the outlines of the segmented regions
img = cv2.drawContours(astronaut_image, object_contours, -1, color=(0, 23, 223), thickness=2)
fig, ax = plt.subplots(figsize=(6, 6))
ax.imshow(img)
ax.axis('off')
plt.show()
4. Graph-based Segmentation¶
Graph-based methods provide a powerful framework for grouping pixels by mapping the image to a mathematical graph structure.
In graph-based segmentation, an image is represented as an undirected graph $G = (V, E)$, where each pixel or region is treated as a node $v \in V$, and the edges $e \in E$ represent the relationships (such as intensity or color differences) between nodes.
The goal is to partition the graph into segments such that we minimize the similarity between distinct regions while maximizing the similarity within a single region.
A typical graph-based pipeline involves:
- SLIC Superpixel Segmentation: The image is first divided into smaller, coherent regions using SLIC (Simple Linear Iterative Clustering). This dramatically reduces the number of nodes in the graph, making the actual graph segmentation computationally tractable.
- Region Adjacency Graph (RAG): A graph is constructed where each superpixel is a node, and edges connect neighboring superpixels. Edge weights denote intensity/color differences.
- Graph-based Segmentation: The graph is segmented by evaluating and cutting edges (e.g., thresholding edge weights with cut_threshold) to merge regions with similar properties.
4.1 SLIC (Simple Linear Iterative Clustering)¶
SLIC clusters pixels based on color similarity and spatial proximity, producing small, contiguous regions known as superpixels.
- Initialization: The algorithm starts by dividing the image into a grid. The number of blocks corresponds to the desired number of superpixels (n_segments). The centers of these blocks are adjusted to the nearest gradient minimum to avoid placing them exactly on an edge.
- Distance Calculation: SLIC uses a 5D distance metric combining color distance ($d_c$) in CIELAB space and spatial distance ($d_s$): $$D = \sqrt{d_{c}^{2} + \left( \frac{d_{s}}{S} \right)^{2} m^{2}}$$ where $S$ is the grid interval and $m$ is the compactness parameter; larger $m$ weights spatial proximity more heavily.
- Iterative Assignment: Pixels in a neighborhood are assigned to the nearest cluster center based on distance $D$.
- Recalculation: Cluster centers are recomputed as the mean of all assigned pixels.
- Repeat: The process iterates until convergence.
- Connectivity: Post-processing ensures all superpixels are fully connected.
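A quick numeric illustration of the combined distance, using the formulation from the original SLIC paper, $D = \sqrt{d_c^2 + (d_s/S)^2 m^2}$ (the sample values are arbitrary):

```python
import numpy as np

def slic_distance(dc, ds, S, m):
    """SLIC combined distance from the original paper:
    D = sqrt(dc^2 + (ds/S)^2 * m^2), with S the grid interval
    and m the compactness parameter."""
    return np.sqrt(dc**2 + (ds / S) ** 2 * m**2)

# Same color/spatial offsets, two compactness settings (illustrative values)
dc, ds, S = 10.0, 20.0, 20.0
loose = slic_distance(dc, ds, S, m=1)     # color difference dominates
compact = slic_distance(dc, ds, S, m=40)  # spatial distance dominates
```

With a small $m$ the color term drives the assignment (irregular, boundary-hugging superpixels); with a large $m$ the spatial term dominates, yielding the squarer, more compact superpixels described above.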
Key SLIC Parameters:
- n_segments: The target number of superpixels. Higher values yield finer segmentation.
- compactness: Balances color and spatial proximity. Higher values produce squarer, more spatially compact superpixels.
- sigma: Gaussian smoothing applied before clustering to reduce noise.
# Load the astronaut image
astronaut_image = data.astronaut()
# Step 1: Apply SLIC superpixel segmentation to divide the image into regions
segments = slic(astronaut_image, n_segments=200, compactness=10, sigma=1, start_label=1)
# Step 2: Construct a Region Adjacency Graph (RAG) based on the mean color of each segment
rag = graph.rag_mean_color(astronaut_image, segments)
# Step 3: Apply graph-based segmentation using threshold-based merging
# Merge regions based on intensity similarity with a threshold
normal_labels = graph.cut_threshold(segments, rag, 29)
# Step 4: Display the original image, segments, and graph-based segmentation results
fig, ax = plt.subplots(1, 3, figsize=(15, 5))
# Original Image
ax[0].imshow(astronaut_image, cmap='gray')
ax[0].set_title('Original Image')
ax[0].axis('off')
# Superpixel Segmentation
ax[1].imshow(mark_boundaries(astronaut_image, segments))
ax[1].set_title('SLIC Superpixels')
ax[1].axis('off')
# Graph-based Segmentation
ax[2].imshow(mark_boundaries(astronaut_image, normal_labels))
ax[2].set_title('Graph-based Segmentation')
ax[2].axis('off')
plt.tight_layout()
plt.show()
4.2 Probabilistic Aggregation¶
Probabilistic aggregation groups regions based on a statistical model of their similarity.
- It treats each pixel as an individual region. Similarity is computed based on color, texture, and spatial properties.
- Regions are merged based on a probabilistic criterion, determining the likelihood that two adjacent regions belong to the same object.
- It typically employs hierarchical clustering techniques.
# Reference:
# https://scikit-image.org/docs/dev/auto_examples/segmentation/plot_rag_merge.html#sphx-glr-auto-examples-segmentation-plot-rag-merge-py
def _weight_mean_color(graph, src, dst, n):
    # Edge weight = Euclidean distance between the mean colors of the regions
    diff = graph.nodes[dst]['mean color'] - graph.nodes[n]['mean color']
    diff = np.linalg.norm(diff)
    return {'weight': diff}
def merge_mean_color(graph, src, dst):
    # Accumulate totals and recompute the merged region's mean color
    graph.nodes[dst]['total color'] += graph.nodes[src]['total color']
    graph.nodes[dst]['pixel count'] += graph.nodes[src]['pixel count']
    graph.nodes[dst]['mean color'] = (
        graph.nodes[dst]['total color'] / graph.nodes[dst]['pixel count']
    )
# Start with SLIC superpixels as before
segments = slic(astronaut_image, n_segments=100, compactness=10, sigma=1, start_label=1)
# Create Region Adjacency Graph (RAG)
rag = graph.rag_mean_color(astronaut_image, segments)
# Perform probabilistic aggregation using hierarchical clustering
pa_labels = graph.merge_hierarchical(segments, rag, thresh=35, rag_copy=False,
in_place_merge=True, merge_func=merge_mean_color,
weight_func=_weight_mean_color)
# Display result
plt.figure(figsize=(8, 8))
plt.imshow(mark_boundaries(astronaut_image, pa_labels))
plt.title('Probabilistic Aggregation Segmentation')
plt.axis('off')
plt.show()
4.3 Mean Shift¶
Mean Shift is a non-parametric, density-based clustering algorithm.
- It treats each pixel as a data point in a feature space (e.g., spatial coordinates + color).
- The algorithm iteratively shifts each point toward the region of the highest density (the mode) using a kernel density estimate.
- Pixels converging to the same mode are assigned to the same segment.
- A major advantage is that it does not require specifying the number of clusters a priori.
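The joint spatial-color feature space mentioned above can be sketched as follows (a toy example on a synthetic image; the intensity scaling factor is an illustrative choice, while the cell below instead clusters grayscale intensities only):

```python
import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth

# Synthetic image: two bright blobs on a dark background
rng = np.random.default_rng(0)
img = np.zeros((40, 40))
img[5:15, 5:15] = 0.9
img[25:35, 25:35] = 0.9
img += rng.normal(0, 0.01, img.shape)

# Feature space: (row, col, intensity); the intensity is rescaled so it
# competes with the spatial coordinates (the factor 40 is an assumption)
rows, cols = np.indices(img.shape)
features = np.stack([rows.ravel(), cols.ravel(), 40 * img.ravel()], axis=1)

# Bandwidth estimated from the data; no cluster count is specified a priori
bandwidth = estimate_bandwidth(features, quantile=0.2, n_samples=500)
labels = MeanShift(bandwidth=bandwidth, bin_seeding=True).fit_predict(features)
labels = labels.reshape(img.shape)
```

Because the features include the pixel coordinates, the two blobs converge to different modes even though they share the same intensity, which pure intensity clustering cannot distinguish.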
# Convert astronaut image to grayscale
gray_astronaut = cv2.cvtColor(astronaut_image, cv2.COLOR_RGB2GRAY)
# Reshape the image to a 2D array of pixel values
flat_image = gray_astronaut.flatten().reshape(-1, 1)
# Apply Mean Shift clustering
# bin_seeding seeds the search from binned points rather than every pixel,
# which keeps clustering all 512x512 intensities tractable
mean_shift = MeanShift(bandwidth=1, bin_seeding=True, n_jobs=-1).fit(flat_image)
# Reshape the labels back to the original image shape
ms_labels = mean_shift.labels_.reshape(gray_astronaut.shape)
# Display result
plt.figure(figsize=(8, 8))
plt.imshow(ms_labels, cmap='nipy_spectral')
plt.title('Mean Shift Segmentation')
plt.axis('off')
plt.show()
# Display result
plt.figure(figsize=(8, 8))
plt.imshow(mark_boundaries(astronaut_image, ms_labels))
plt.title('Mean Shift Segmentation (boundaries)')
plt.axis('off')
plt.show()
4.4 Normalized Cuts¶
Normalized Cuts (N-Cuts) elegantly formalizes segmentation as an eigenvalue problem.
- N-Cuts aims to partition a graph into segments while minimizing the "cut" cost (the sum of the weights of the edges that must be removed to separate the partitions).
- Standard minimum cut algorithms tend to isolate small, outlier regions. N-Cuts prevents this by normalizing the cut cost against the total edge weights of all nodes in the partition, favoring balanced segment sizes.
- The exact solution is NP-hard, but it can be efficiently approximated by solving a generalized eigenvalue problem on the graph's Laplacian matrix.
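A small NumPy example makes the normalization concrete: on a toy four-node graph with two tightly connected pairs, the balanced cut has a much lower N-Cut value than a cut that isolates a single node (the weight matrix is illustrative):

```python
import numpy as np

# Toy weighted graph: two tight pairs {0,1} and {2,3}, weakly bridged
W = np.array([
    [0.0, 1.0, 0.1, 0.0],
    [1.0, 0.0, 0.0, 0.1],
    [0.1, 0.0, 0.0, 1.0],
    [0.0, 0.1, 1.0, 0.0],
])

def ncut(W, A, B):
    """Normalized cut: cut(A,B)/assoc(A,V) + cut(A,B)/assoc(B,V)."""
    cut = W[np.ix_(A, B)].sum()
    assoc_a = W[A, :].sum()
    assoc_b = W[B, :].sum()
    return cut / assoc_a + cut / assoc_b

balanced = ncut(W, [0, 1], [2, 3])   # ~0.18: cheap, balanced partition
lopsided = ncut(W, [0], [1, 2, 3])   # ~1.33: isolating one node is penalized
```

A plain minimum cut would prefer to sever the single node 0 whenever its edges are cheap; dividing the cut cost by each side's total association is what penalizes such lopsided partitions.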
# Reuse SLIC superpixels for initial segmentation
segments = slic(astronaut_image, n_segments=200, compactness=10, sigma=1, start_label=1)
# Create a Region Adjacency Graph (RAG)
rag = graph.rag_mean_color(astronaut_image, segments, mode='similarity')
# Perform Normalized Cut on the RAG
nc_labels = graph.cut_normalized(segments, rag)
# Display result
plt.figure(figsize=(8, 8))
plt.imshow(mark_boundaries(astronaut_image, nc_labels))
plt.title('Normalized Cuts Segmentation')
plt.axis('off')
plt.show()
5. Method Comparison¶
Let's visualize the results of these segmentation methods side by side to compare their characteristics.
fig, axs = plt.subplots(2, 2, figsize=(9, 9))
axs[0, 0].imshow(mark_boundaries(astronaut_image, normal_labels))
axs[0, 0].set_title('Normal Graph-based')
axs[0, 0].axis('off')
axs[0, 1].imshow(mark_boundaries(astronaut_image, pa_labels))
axs[0, 1].set_title('Probabilistic Aggregation')
axs[0, 1].axis('off')
axs[1, 0].imshow(ms_labels, cmap='nipy_spectral')
axs[1, 0].set_title('Mean Shift')
axs[1, 0].axis('off')
axs[1, 1].imshow(mark_boundaries(astronaut_image, nc_labels))
axs[1, 1].set_title('Normalized Cuts')
axs[1, 1].axis('off')
plt.tight_layout()
plt.show()