Date: September 26, 2024
Lecture Duration: 1.5 hours
Topic Overview: In this lecture, we move beyond spatial pixel manipulation to explore images in the Frequency Domain. We also tackle the fundamental challenge of matching parts of images using Local Features (Keypoints) and explore how to represent images at multiple scales and in different color spaces to solve specific vision tasks.


1. The Frequency Domain (Fourier Transforms)

We often think of an image as a grid of intensities indexed by spatial coordinates \((x, y)\). However, an image can also be viewed as a sum of sinusoidal waves of different frequencies and orientations. We introduced the Fourier Transform (FT), a mathematical tool that decomposes a signal into its constituent frequencies.
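To make the decomposition concrete, here is a minimal sketch using NumPy's FFT routines. It assumes a synthetic test image (a horizontal cosine with 4 cycles across 32 columns), and the helper name `dominant_frequency` is hypothetical, not taken from the class notebook. It only illustrates that a pure wave in the spatial domain shows up as a single peak in the frequency domain, and that the inverse transform recovers the original pixels.

```python
import numpy as np

def dominant_frequency(image):
    """Return the (row, column) frequency index with the largest magnitude,
    ignoring the mean (DC) term at index (0, 0)."""
    # rfft2 is the 2D FFT for real input; it keeps only the
    # non-negative frequencies along the last (column) axis.
    spectrum = np.fft.rfft2(image)
    magnitude = np.abs(spectrum)
    magnitude[0, 0] = 0.0  # drop the DC term so it cannot win the argmax
    fy, fx = np.unravel_index(np.argmax(magnitude), magnitude.shape)
    return int(fy), int(fx)

# Synthetic image (an assumption for illustration): every row is the same
# cosine wave with 4 full cycles across the image width.
size = 32
columns = np.arange(size)
wave = np.cos(2 * np.pi * 4 * columns / size)
image = np.tile(wave, (size, 1))

peak = dominant_frequency(image)

# The transform is invertible: going to the frequency domain and back
# reproduces the original image up to floating-point error.
reconstructed = np.fft.irfft2(np.fft.rfft2(image), s=image.shape)
```

Here `peak` picks out column frequency 4 and row frequency 0, which matches how the image was built: it varies only from left to right.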

2. Feature Detection and Matching

To stitch panoramas or track objects, we need to find “interesting” points: points that are locally distinctive and can be found again reliably when the viewpoint or lighting changes.
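One way to score how "interesting" a point is, is the Harris corner response, which the class notebook also covers. The sketch below is a simplified NumPy-only version and not the notebook's implementation: the function names, the box window (instead of a Gaussian), and the values `k=0.05` and `radius=1` are assumptions chosen for illustration.

```python
import numpy as np

def box_sum(values, radius):
    """Sum values over a (2*radius + 1)^2 window around each pixel (zero padded)."""
    padded = np.pad(values, radius)
    out = np.zeros(values.shape, dtype=float)
    window = 2 * radius + 1
    for dy in range(window):
        for dx in range(window):
            out += padded[dy:dy + values.shape[0], dx:dx + values.shape[1]]
    return out

def harris_response(image, k=0.05, radius=1):
    """Harris score per pixel: positive at corners, negative on edges, ~0 in flat areas."""
    iy, ix = np.gradient(image.astype(float))  # derivatives along rows, then columns
    # Entries of the structure tensor, summed over a local window.
    sxx = box_sum(ix * ix, radius)
    syy = box_sum(iy * iy, radius)
    sxy = box_sum(ix * iy, radius)
    det = sxx * syy - sxy ** 2
    trace = sxx + syy
    return det - k * trace ** 2

# Synthetic test image (an assumption): a bright square on a dark background.
image = np.zeros((20, 20))
image[5:15, 5:15] = 1.0

response = harris_response(image)
peak = np.unravel_index(np.argmax(response), response.shape)
peak = (int(peak[0]), int(peak[1]))
corners = [(5, 5), (5, 14), (14, 5), (14, 14)]
```

On this image the strongest response sits at a corner of the square, the midpoint of an edge scores negative, and the flat interior scores zero. That ordering is what lets us keep only the points where the image changes in two directions at once.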

3. Multi-Scale Representations (Pyramids)

Real-world objects appear at different sizes depending on their distance from the camera. To handle this, we introduced Scale Space Theory via Image Pyramids.
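A Gaussian pyramid can be sketched in a few lines: blur the image, keep every second pixel, and repeat. The version below is a minimal NumPy sketch, assuming a 5-tap binomial kernel as the blur and reflected borders. The names `blur` and `gaussian_pyramid` are hypothetical and not taken from the class notebook.

```python
import numpy as np

# 5-tap binomial kernel, a common approximation of a Gaussian (sums to 1).
KERNEL = np.array([1, 4, 6, 4, 1], dtype=float) / 16

def blur(image):
    """Separable blur: filter along columns, then along rows."""
    height, width = image.shape
    padded = np.pad(image.astype(float), 2, mode="reflect")
    # Horizontal pass: weighted sum of 5 horizontally shifted copies.
    horizontal = sum(w * padded[:, i:i + width] for i, w in enumerate(KERNEL))
    # Vertical pass on the result, which also trims the row padding.
    return sum(w * horizontal[i:i + height, :] for i, w in enumerate(KERNEL))

def gaussian_pyramid(image, levels):
    """Return a list of images, each half the size of the previous one."""
    pyramid = [image.astype(float)]
    for _ in range(levels - 1):
        smoothed = blur(pyramid[-1])         # blur first to limit aliasing
        pyramid.append(smoothed[::2, ::2])   # then keep every second pixel
    return pyramid

# Example input (an assumption): a constant 64x48 image.
image = np.full((64, 48), 7.0)
pyramid = gaussian_pyramid(image, levels=4)
shapes = [level.shape for level in pyramid]
```

Each level halves the resolution, so an object that is too large for a fixed-size detector at the finest level may fit it at a coarser one. Blurring before subsampling matters: dropping pixels without it would alias fine detail into false coarse patterns.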

4. Color Spaces

Finally, we moved beyond the standard RGB model. We analyzed why RGB is often a poor fit for computer vision analysis and explored alternative color spaces.
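As a minimal illustration, the sketch below assumes HSV as one such alternative; these notes do not record which color spaces were covered, so treat that choice as an assumption. It uses Python's standard `colorsys` module, and the helper names, the tolerance, and the saturation threshold are hypothetical. The point is that a bright and a dark version of the same red differ in all three RGB channels but share the same hue.

```python
import colorsys

def hue_distance(h1, h2):
    """Distance between two hues in [0, 1), accounting for wrap-around."""
    d = abs(h1 - h2)
    return min(d, 1.0 - d)

def is_hue_match(rgb, target_hue, tolerance=0.05, min_saturation=0.3):
    """True if an RGB pixel (floats in [0, 1]) is close to the target hue.

    The saturation check rejects near-gray pixels, whose hue is unreliable.
    """
    h, s, _v = colorsys.rgb_to_hsv(*rgb)
    return s >= min_saturation and hue_distance(h, target_hue) <= tolerance

# Example pixels (assumed values): the same red under bright and dim light.
bright_red = (0.8, 0.1, 0.1)
dark_red = (0.4, 0.05, 0.05)
green = (0.1, 0.8, 0.1)

hsv_bright = colorsys.rgb_to_hsv(*bright_red)
hsv_dark = colorsys.rgb_to_hsv(*dark_red)

RED_HUE = 0.0
matches = [is_hue_match(p, RED_HUE) for p in (bright_red, dark_red, green)]
```

A threshold on hue selects both reds and rejects the green, while a threshold on the raw R channel would have to be retuned whenever the lighting changed.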


Interactive Demonstration

Below is the complete Jupyter Notebook used in class. It contains the Python implementations for Fourier Analysis, the Harris Corner Detector, BRIEF matching, and Color Space segmentation.

