Setup¶
We begin by importing the necessary libraries and defining a utility class for timing our code block executions.
import time
from IPython.display import display, HTML
def countdown_timer(minutes, seconds):
display(HTML(f"""
<p id="timer" style="font-size:300px; color: red;"></p>
<script>
var minutes = {minutes};
var seconds = {seconds};
var timer = document.getElementById("timer");
function updateTimer() {{
    if (seconds === 0) {{
        if (minutes === 0) {{
            timer.innerHTML = "Time's up!";
            return;
        }}
        minutes--;
        seconds = 59;
    }} else {{
        seconds--;
    }}
    var minutesDisplay = minutes.toString().padStart(2, '0');
    var secondsDisplay = seconds.toString().padStart(2, '0');
    timer.innerHTML = minutesDisplay + ":" + secondsDisplay;
    setTimeout(updateTimer, 1000);
}}
updateTimer();
</script>
"""))
# Run the countdown timer (here: 6 minutes, 0 seconds)
countdown_timer(6, 0)
1. High Performance Computing (HPC) Basics¶
High Performance Computing allows us to process large datasets and train complex models efficiently.
NYU HPC has a documentation site: HPC Documentation
To get HPC access, follow this link: Creating an HPC account
Once you get an HPC account, don't create conda environments directly; instead, follow this link: Using Singularity with Miniconda
Why should you not create a conda environment and work directly? Each filesystem on the cluster has a quota on the number of files it can hold. Installing Python modules directly makes the system download everything into your home directory, and that directory does not have the capacity to hold that many package files.
Once you have HPC access and Singularity working, you will need to schedule your jobs.
I am assuming your code needs considerable time to run (that's the kind of code you would put on HPC anyway), so you need some kind of scheduler to run it for you. NYU HPC uses SLURM jobs for exactly that. Make sure to follow the NYU HPC documentation while creating a SLURM job.
There are limits on how long you can keep a job running. Refer to this documentation for best practices: NYU HPC Best Practices
There have been some updates to NYU HPC, and each SLURM job now requires a project (account) ID for the job to go through. If the project you are building is related to the final project for this class (Computer Vision CS-GY 6643), then add its respective project ID under that class.
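For reference, once you have a job script like the sample below saved on the cluster, the usual SLURM workflow from a login node looks like this (`train.sbatch` is just a placeholder name; these commands only work on the cluster itself):

```shell
sbatch train.sbatch                    # submit; prints "Submitted batch job <job_id>"
squeue -u $USER                        # check the status of your queued/running jobs
scancel <job_id>                       # cancel a job if needed
cat <name_output_file>_<job_id>.out    # inspect the output file once the job has run
```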
1.1 Sample SLURM Job Script¶
Here is a template for a SLURM job script that you can use as a starting point. It includes necessary commands to request resources and execute a Python script within a Singularity container.
#!/bin/bash
# This is the shebang line, specifying the script should be executed using bash.
#SBATCH --nodes=x
# Request x compute node for the job.
#SBATCH --account=pr_130_general
# Specify the account under which the job should be charged (Computer Vision in this case)
#SBATCH --ntasks-per-node=x
# Request x task (process) per node.
#SBATCH --cpus-per-task=x
# Allocate x CPU cores to the task.
#SBATCH --time=hh:mm:ss
# Set the maximum runtime to hh:mm:ss. The job will be killed if it exceeds this time limit.
#SBATCH --mem=xGB
# Request x GB of memory for the job.
#SBATCH --job-name=<job_name>
# Name the job for easy identification.
#SBATCH --mail-type=END
# Send an email when the job ends.
#SBATCH --mail-user=<user_id>@nyu.edu
# Specify the email address to receive notifications about the job.
#SBATCH --output=<name_output_file>_%j.out
# Save the job’s output to a file. The "%j" will be replaced with the job ID.
module purge
# Remove all loaded modules to ensure a clean environment.
cd /scratch/<net_id>/<singularity_directory>/
# Change directory to the working directory on the scratch filesystem where your data is stored.
singularity exec \
--overlay /scratch/mkp6112/LFP/overlay-15GB-500K.ext3:rw \
/scratch/work/public/singularity/cuda12.3.2-cudnn9.0.0-ubuntu-22.04.4.sif \
/bin/bash -c "source /ext3/env.sh; cd region_decoding/script/; python save_features.py"
# This is my script, use that as a base to understand how to create the command
# Use Singularity to run a container. The 'overlay' allows you to write data to the container.
# The Singularity image is a pre-built CUDA 12.3.2 environment with cuDNN 9.
# Inside the container, source the environment and run the 'save_features.py' Python script
# from the 'region_decoding/script/' directory.
echo "Job Over"
# Print a message to indicate the job is done.
Imports¶
import numpy as np
import cv2
import matplotlib.pyplot as plt
import matplotlib_inline.backend_inline
from skimage import data, color
matplotlib_inline.backend_inline.set_matplotlib_formats('pdf', 'svg')
2. Fast Normalized Cross Correlation (NCC)¶
Normalized Cross Correlation is widely used for template matching, but a naive implementation is extremely slow.
This method is one of the ways of solving Q4 of Project 1. Here's the paper (J.P. Lewis, "Fast Normalized Cross-Correlation"): Clicky-Clicky
Basic (Slow) NCC Implementation¶
The traditional method of computing NCC requires calculating the mean intensity of the template and the image patch, subtracting the mean from the intensities, and computing the cross-correlation in a brute-force manner. This is computationally expensive, especially for large images.
# Function to compute traditional NCC
def normalized_cross_correlation(img, template):
h, w = template.shape
img_h, img_w = img.shape
ncc_map = np.zeros((img_h - h + 1, img_w - w + 1))
# Sliding the template over the image
for i in range(img_h - h + 1):
for j in range(img_w - w + 1):
# Extract a patch from the image
img_patch = img[i:i + h, j:j + w]
# Calculate means
img_mean = np.mean(img_patch)
template_mean = np.mean(template)
# Zero-mean versions
img_patch_zero_mean = img_patch - img_mean
template_zero_mean = template - template_mean
# Compute the numerator of the NCC formula (cross-correlation)
numerator = np.sum(img_patch_zero_mean * template_zero_mean)
# Compute the denominator (normalization)
denominator = np.sqrt(np.sum(img_patch_zero_mean ** 2) * np.sum(template_zero_mean ** 2))
# Avoid division by zero
if denominator == 0:
ncc_map[i, j] = 0
else:
ncc_map[i, j] = numerator / denominator
return ncc_map
%%time
# Load an example image
image = color.rgb2gray(data.astronaut())
template = image[25:175, 150:300] # Sample template
# Compute NCC using the traditional slow method
ncc_slow = normalized_cross_correlation(image, template)
# Plot the result
plt.figure(figsize=(12, 6))
plt.subplot(1, 3, 1)
plt.imshow(image, cmap='gray')
plt.title("Original Image")
plt.axis('off')
plt.subplot(1, 3, 2)
plt.imshow(template, cmap='gray')
plt.title("Template Image")
plt.axis('off')
plt.subplot(1, 3, 3)
plt.imshow(ncc_slow, cmap='viridis')
plt.title("Traditional NCC Map")
plt.axis('off')
plt.show()
CPU times: user 39.5 s, sys: 245 ms, total: 39.7 s
Wall time: 43 s
Why Naive NCC is Slow:
- Redundant Calculations: The same values for sums and mean are recalculated multiple times as the template slides over the image.
- Nested Loops: The method iterates over every possible patch in the image, giving $O(N^2 M^2)$ work for an $N \times N$ image and $M \times M$ template (roughly $O(n^4)$ when the template scales with the image).
2.1 Optimization Introduced by J.P. Lewis (Fast NCC)¶
The fast method proposed by J.P. Lewis precomputes running sums over the whole image, so the mean and variance of each patch come from a few table lookups instead of a fresh pass over its pixels. (Lewis's full method also computes the cross-correlation numerator in the frequency domain; the version below accelerates only the normalization.)
Key Optimizations:
- Precompute sum tables (integral images) of pixel intensities and squared intensities.
- Look up each patch's sum and sum of squares in $O(1)$ instead of recomputing them from scratch.
# Function to compute fast NCC using precomputed sums
def fast_ncc(img, template):
h, w = template.shape
img_h, img_w = img.shape
# Precompute template values
template_mean = np.mean(template)
template_zero_mean = template - template_mean
template_sum_square = np.sum(template_zero_mean ** 2)
# Compute integral images (sum of pixel intensities and squared pixel intensities)
integral_img = np.cumsum(np.cumsum(img, axis=0), axis=1)
integral_img_square = np.cumsum(np.cumsum(img**2, axis=0), axis=1)
ncc_map = np.zeros((img_h - h + 1, img_w - w + 1))
for i in range(img_h - h + 1):
for j in range(img_w - w + 1):
# Define the region (i:i+h, j:j+w) using integral image
sum_region = integral_img[i + h - 1, j + w - 1]
sum_square_region = integral_img_square[i + h - 1, j + w - 1]
if i > 0:
sum_region -= integral_img[i - 1, j + w - 1]
sum_square_region -= integral_img_square[i - 1, j + w - 1]
if j > 0:
sum_region -= integral_img[i + h - 1, j - 1]
sum_square_region -= integral_img_square[i + h - 1, j - 1]
if i > 0 and j > 0:
sum_region += integral_img[i - 1, j - 1]
sum_square_region += integral_img_square[i - 1, j - 1]
# Compute mean and variance from the integral image
img_mean = sum_region / (h * w)
img_variance = sum_square_region / (h * w) - img_mean ** 2
# Calculate the normalized cross-correlation
img_patch_zero_mean = img[i:i + h, j:j + w] - img_mean
numerator = np.sum(img_patch_zero_mean * template_zero_mean)
denominator = np.sqrt(template_sum_square * img_variance * h * w)
# Avoid division by zero
if denominator == 0:
ncc_map[i, j] = 0
else:
ncc_map[i, j] = numerator / denominator
return ncc_map
%%time
# Load an example image
image = color.rgb2gray(data.astronaut())
template = image[25:175, 150:300] # Sample template
# Compute NCC using the fast method
ncc_fast = fast_ncc(image, template)
# Plot the fast NCC result
plt.figure(figsize=(12, 6))
plt.subplot(1, 3, 1)
plt.imshow(image, cmap='gray')
plt.title("Original Image")
plt.axis('off')
plt.subplot(1, 3, 2)
plt.imshow(template, cmap='gray')
plt.title("Template Image")
plt.axis('off')
plt.subplot(1, 3, 3)
plt.imshow(ncc_fast, cmap='viridis')
plt.title("Fast NCC Map")
plt.axis('off')
plt.show()
CPU times: user 14.9 s, sys: 449 ms, total: 15.4 s
Wall time: 17.7 s
# Plot the NCC results
plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 1)
plt.imshow(ncc_slow, cmap='viridis')
plt.title("Slow NCC Map")
plt.axis('off')
plt.subplot(1, 2, 2)
plt.imshow(ncc_fast, cmap='viridis')
plt.title("Fast NCC Map")
plt.axis('off')
plt.show()
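Lewis's key remaining trick, which the loop above skips, is computing the numerator for all positions at once with an FFT. Here is a minimal sketch using `scipy.signal.fftconvolve` (SciPy is an extra dependency, not among this notebook's imports; `fft_ncc` is an illustrative helper, not the assignment's required method):

```python
import numpy as np
from scipy.signal import fftconvolve

def fft_ncc(img, template):
    """NCC map: FFT cross-correlation for the numerator, box sums for the denominator."""
    h, w = template.shape
    t0 = template - template.mean()          # zero-mean template
    # Numerator: correlate the image with the zero-mean template.
    # (Flipping the kernel turns convolution into correlation.)
    numerator = fftconvolve(img, t0[::-1, ::-1], mode='valid')
    # Local sums of pixels and squared pixels over every template-sized window.
    box = np.ones((h, w))
    s1 = fftconvolve(img, box, mode='valid')
    s2 = fftconvolve(img ** 2, box, mode='valid')
    # Sum of squared deviations of each patch: sum(x^2) - (sum x)^2 / n
    patch_ss = np.maximum(s2 - s1 ** 2 / (h * w), 0)   # clamp FP round-off
    denom = np.sqrt(patch_ss * np.sum(t0 ** 2))
    out = np.zeros_like(numerator)
    safe = denom > 1e-8
    out[safe] = numerator[safe] / denom[safe]          # avoid division by zero
    return out
```

Because the FFTs replace the per-position `np.sum` calls, the cost drops to a few $O(N^2 \log N)$ transforms regardless of template size.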
3. Image Warping¶
Image warping refers to transforming an image to a new geometry by applying a spatial transformation. This involves changing the positions of pixels based on a mathematical mapping function, essentially 'warping' or distorting the image. It’s widely used in computer vision tasks such as correcting lens distortions, aligning images, and applying artistic effects.
Common types of warping include:
- Affine transformations: Preserves lines and parallelism (translation, rotation, scaling, shearing).
- Projective transformations (Homographies): Maps any quadrilateral to any other quadrilateral, often used for changing perspectives.
# Affine Transformation
# Load the astronaut image
image = data.astronaut()
image_gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
# Get the dimensions of the image
rows, cols = image_gray.shape
# Define points for affine transformation (moving 3 points)
pts1 = np.float32([[50, 50], [200, 50], [50, 200]]) # Before
pts2 = np.float32([[10, 100], [200, 50], [100, 250]]) # After
# Get the affine transformation matrix
M = cv2.getAffineTransform(pts1, pts2)
# Apply affine warp
warped_image = cv2.warpAffine(image_gray, M, (cols, rows))
# Plot original and warped images
plt.figure(figsize=(10, 5))
plt.subplot(1, 2, 1)
plt.imshow(image_gray, cmap='gray')
plt.title('Original Image')
plt.axis('off')
plt.subplot(1, 2, 2)
plt.imshow(warped_image, cmap='gray')
plt.title('Warped Image (Affine)')
plt.axis('off')
plt.show()
4. Image Registration¶
Image registration is the process of aligning two or more images of the same scene taken at different times, from different viewpoints, or by different sensors. The goal is to transform the images to a common coordinate system to facilitate comparison or combination.
Once registered, the images can be stitched together, or used for temporal analysis such as tracking changes over time.
# Load the image and create a transformed version (rotation + translation)
original_image = data.astronaut()
rows, cols, _ = original_image.shape
# Create a rotation matrix and apply transformation
M = cv2.getRotationMatrix2D((cols/2, rows/2), 40, 1) # Rotate 40 degrees
transformed_image = cv2.warpAffine(original_image, M, (cols, rows))
# Plot the images
plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 1)
plt.imshow(original_image)
plt.title("Image 1")
plt.axis('off')
plt.subplot(1, 2, 2)
plt.imshow(transformed_image)
plt.title("Image 2")
plt.axis('off')
plt.show()
# Convert images to grayscale (data.astronaut() is RGB, so use RGB2GRAY)
gray_original = cv2.cvtColor(original_image, cv2.COLOR_RGB2GRAY)
gray_transformed = cv2.cvtColor(transformed_image, cv2.COLOR_RGB2GRAY)
# Initialize ORB detector
orb = cv2.ORB_create()
# Find keypoints and descriptors with ORB
keypoints1, descriptors1 = orb.detectAndCompute(gray_original, None)
keypoints2, descriptors2 = orb.detectAndCompute(gray_transformed, None)
# Visualize keypoints
img1_kp = cv2.drawKeypoints(gray_original, keypoints1, None, color=(0,255,0))
img2_kp = cv2.drawKeypoints(gray_transformed, keypoints2, None, color=(0,255,0))
# Plot the keypoints
plt.figure(figsize=(10,5))
plt.subplot(1, 2, 1)
plt.imshow(cv2.cvtColor(img1_kp, cv2.COLOR_BGR2RGB))
plt.title("Keypoints in Image 1")
plt.axis('off')
plt.subplot(1, 2, 2)
plt.imshow(cv2.cvtColor(img2_kp, cv2.COLOR_BGR2RGB))
plt.title("Keypoints in Image 2")
plt.axis('off')
plt.show()
# Create a BFMatcher object
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
# Match descriptors
matches = bf.match(descriptors1, descriptors2)
# Sort matches based on distance
matches = sorted(matches, key=lambda x: x.distance)
# Draw the matches
matched_image = cv2.drawMatches(original_image, keypoints1, transformed_image, keypoints2, matches[:30], None, flags=cv2.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS)
# Display results
plt.figure(figsize=(12, 6))
plt.imshow(matched_image)
plt.title("Feature Matches")
plt.axis('off')
plt.show()
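The matches above give us corresponding coordinates in both images; registration then reduces to estimating the transform that maps one point set onto the other. For the rotation-plus-translation used in this example, a least-squares (Kabsch/Procrustes) estimate can be sketched with plain NumPy; in practice you would feed the matched keypoints to something like `cv2.estimateAffinePartial2D` with RANSAC, since a raw match set contains outliers. `estimate_rigid` is an illustrative helper, not an OpenCV API:

```python
import numpy as np

def estimate_rigid(src, dst):
    """Least-squares rotation R and translation t with dst_i ≈ R @ src_i + t."""
    src = np.asarray(src, float)
    dst = np.asarray(dst, float)
    sc, dc = src.mean(axis=0), dst.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (src - sc).T @ (dst - dc)
    U, _, Vt = np.linalg.svd(H)
    # Correct for a possible reflection so R is a proper rotation.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, d]) @ U.T
    t = dc - R @ sc
    return R, t
```

For the 40-degree rotation above, `np.degrees(np.arctan2(R[1, 0], R[0, 0]))` extracts the recovered angle, which should come out close to 40 once outliers are filtered.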
5. Image Stitching¶
Stitching in Computer Vision is typically used to combine multiple images into a larger panorama. It involves aligning overlapping areas of images, finding keypoints, matching features, and blending the images seamlessly.
Let's walk through the manual and automated ways to stitch images.
5.1 Manual Image Stitching Pipeline¶
Building a panorama from scratch involves several key steps: slicing, keypoint detection, homography calculation, and blending.
Step 1: Slicing the Image¶
We start by dividing a wide image into overlapping segments to simulate taking multiple photos of a scene.
# Load the astronaut image
astronaut_image = data.astronaut()
# Convert the image to grayscale
gray_astronaut = cv2.cvtColor(astronaut_image, cv2.COLOR_RGB2GRAY)
# Dimensions of the image
height, width = gray_astronaut.shape
# Slice the image into two overlapping halves (the middle 40% of the width appears in both)
slice_position = int(width * 0.7)  # First slice covers the left 70% of the image
image1 = gray_astronaut[:, :slice_position]  # First slice
image2 = gray_astronaut[:, int(width * 0.3):]  # Second slice (covers the right 70%)
# Plot the images
plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 1)
plt.imshow(image1, cmap='gray')
plt.title("Image 1")
plt.axis('off')
plt.subplot(1, 2, 2)
plt.imshow(image2, cmap='gray')
plt.title("Image 2")
plt.axis('off')
plt.show()
Step 2: Keypoint Detection and Feature Matching¶
Using algorithms like SIFT or ORB, we detect distinctive keypoints in both images and match them. RANSAC is often used here to filter out incorrect matches.
# ORB feature detection and description
orb = cv2.ORB_create()
# Find keypoints and descriptors for both slices
keypoints1, descriptors1 = orb.detectAndCompute(image1, None)
keypoints2, descriptors2 = orb.detectAndCompute(image2, None)
# Brute Force matcher
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = bf.match(descriptors1, descriptors2)
# Sort matches by distance (best matches first)
matches = sorted(matches, key=lambda x: x.distance)
# Draw matches
match_img = cv2.drawMatches(image1, keypoints1, image2, keypoints2, matches[:20], None, flags=2)
# Display the matched keypoints
plt.figure(figsize=(12, 6))
plt.imshow(match_img, cmap='gray')
plt.title('Matched Keypoints Between Slices')
plt.axis('off')
plt.show()
Step 3: Homography Calculation and Image Warping¶
Using the matched keypoints, we compute the homography matrix to warp one image so it aligns with the perspective of the other.
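For intuition, `cv2.findHomography` is solving the Direct Linear Transform (DLT) problem underneath (plus RANSAC to reject bad matches): each correspondence contributes two linear constraints on the nine entries of H, and the solution is the null vector of the stacked system. A bare-bones sketch without the robust estimation:

```python
import numpy as np

def dlt_homography(src, dst):
    """Fit H (3x3, up to scale) such that dst ~ H @ src, from >= 4 point pairs."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        # Cross-multiplied projection equations, linear in the entries of H.
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # H is the right singular vector of A with the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A, float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]
```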
# Extract the matched point coordinates in each image
pts1 = np.float32([keypoints1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
pts2 = np.float32([keypoints2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
# Find the homography that maps image2's coordinates into image1's frame
# (note the argument order: we warp image2, so its points are the source)
H, mask = cv2.findHomography(pts2, pts1, cv2.RANSAC, 5.0)
# Warp the second image to align with the first
warped_image2 = cv2.warpPerspective(image2, H, (width, height))
# Plot the warped image
plt.figure(figsize=(10, 5))
plt.imshow(warped_image2, cmap='gray')
plt.title("Image 2 Warped into Image 1's Frame")
plt.axis('off')
plt.show()
Step 4: Image Blending¶
Finally, we blend the overlapping regions. Simple overwriting can cause visible seams, so gradient blending or feathering is preferred for a smooth transition.
# Create a canvas for the panorama
stitched_image = np.zeros((height, width), dtype=np.uint8)
# Place the first image, then fill the remaining columns from the aligned second image
stitched_image[:, :slice_position] = image1
stitched_image[:, slice_position:] = warped_image2[:, slice_position:]
# Display the final stitched result
plt.figure(figsize=(10, 5))
plt.imshow(stitched_image, cmap='gray')
plt.title('Stitched Image from Sliced Astronaut Image')
plt.axis('off')
plt.show()
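The feathering mentioned above can be sketched in a few lines: ramp the blend weight linearly across the overlap so neither seam is visible. (`feather_blend` and its argument layout are an illustrative helper, not an OpenCV API.)

```python
import numpy as np

def feather_blend(left, right, overlap):
    """Blend two horizontally overlapping grayscale images with a linear alpha ramp.

    `left` and `right` are assumed to share their last/first `overlap` columns.
    """
    h, lw = left.shape
    rw = right.shape[1]
    out = np.zeros((h, lw + rw - overlap), dtype=float)
    out[:, :lw - overlap] = left[:, :-overlap]        # left-only region
    out[:, lw:] = right[:, overlap:]                  # right-only region
    alpha = np.linspace(1.0, 0.0, overlap)            # weight for the left image
    out[:, lw - overlap:lw] = alpha * left[:, -overlap:] + (1 - alpha) * right[:, :overlap]
    return out
```

When the overlapping columns agree exactly, the blend reproduces the original image; when they differ slightly (exposure, small misalignment), the ramp hides the seam instead of leaving a hard edge.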
5.2 Automated Stitching with OpenCV¶
OpenCV provides a highly optimized Stitcher class that encapsulates all the manual steps above into a single robust pipeline.
# Load the astronaut image
astronaut_image = data.astronaut()
# Convert the image to BGR for OpenCV compatibility
astronaut_bgr = cv2.cvtColor(astronaut_image, cv2.COLOR_RGB2BGR)
# Dimensions of the image
height, width, _ = astronaut_bgr.shape
# Slice the image into two overlapping halves (the middle 40% of the width appears in both)
slice_position = int(width * 0.7)  # First slice covers the left 70% of the image
image1 = astronaut_bgr[:, :slice_position]  # First slice
image2 = astronaut_bgr[:, int(width * 0.3):]  # Second slice (covers the right 70%)
# Plot the images
plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 1)
plt.imshow(cv2.cvtColor(image1, cv2.COLOR_BGR2RGB))
plt.title("Image 1")
plt.axis('off')
plt.subplot(1, 2, 2)
plt.imshow(cv2.cvtColor(image2, cv2.COLOR_BGR2RGB))
plt.title("Image 2")
plt.axis('off')
plt.show()
# Create a stitcher object
stitcher = cv2.Stitcher_create()
# Stitch the images and check the status code before using the result
status, stitched_image = stitcher.stitch([image1, image2])
if status != cv2.Stitcher_OK:
    raise RuntimeError(f"Stitching failed with status {status}")
# Convert back to RGB
stitched_image_rgb = cv2.cvtColor(stitched_image, cv2.COLOR_BGR2RGB)
# Display the stitched image
plt.figure(figsize=(10, 5))
plt.imshow(stitched_image_rgb)
plt.title('Stitched Image using cv2.Stitcher')
plt.axis('off')
plt.show()
Saving notebook as pdf¶
from google.colab import drive
drive.mount('/content/drive')
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
%%capture
!sudo apt-get install texlive-xetex texlive-fonts-recommended texlive-plain-generic pandoc
%%capture
!jupyter nbconvert --to pdf /content/drive/MyDrive/Colab\ Notebooks/5.\ Warping,\ Stitching\ and\ Image\ Registration.ipynb