Module 1: Introduction to Computer Vision

1. What is Computer Vision?

Definition: Computer Vision is the field of study focused on enabling machines to process and interpret visual information from the world, much as humans do with their visual system.

2. Key Concepts and Components

3. Applications of Computer Vision

4. Image Understanding vs Image Processing

Image Processing: Basic transformations to improve or analyze an image (e.g., filters, enhancement).

Image Understanding: Extracting semantic meaning from visual data (e.g., classifying an image as a cat).

Image Processing → Feature Extraction → Image Understanding
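The three stages in this chain can be illustrated with a toy example. The sketch below is not part of the course code: the function names (stretch_contrast, gradient_energy, describe_edge) and the synthetic test image are hypothetical, chosen only to show that processing returns another image, feature extraction returns numbers, and understanding returns a label.

```python
import numpy as np

def stretch_contrast(img):
    """Image processing: rescale intensities to the full [0, 1] range."""
    img = img.astype(np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:
        # A constant image has no contrast to stretch
        return np.zeros_like(img)
    return (img - lo) / (hi - lo)

def gradient_energy(img):
    """Feature extraction: total absolute intensity change along each axis."""
    dx = np.abs(np.diff(img, axis=1)).sum()  # change between neighbouring columns
    dy = np.abs(np.diff(img, axis=0)).sum()  # change between neighbouring rows
    return dx, dy

def describe_edge(img):
    """Image understanding (toy): map the features to a semantic label."""
    dx, dy = gradient_energy(stretch_contrast(img))
    if dx == dy:
        return "no dominant edge"
    return "vertical edge" if dx > dy else "horizontal edge"

# Synthetic 6x6 image (an assumption for illustration): dark left half, bright right half
image = np.zeros((6, 6), dtype=np.uint8)
image[:, 3:] = 200

label = describe_edge(image)
print(label)
```

A real image-understanding system would replace the hand-written rule in describe_edge with a trained model, but the flow of data through the stages is the same.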

5. Basic Equation of Image Formation

g(x, y) = f(x, y) * h(x, y) + n(x, y)

Where:
    - f(x, y): original image (ideal signal)
    - h(x, y): point spread function (camera optics)
    - n(x, y): additive noise
    - g(x, y): observed image captured by camera
    

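This equation is a basic model of how a digital image is formed in the presence of system imperfections. Here * denotes two-dimensional convolution, not pointwise multiplication: the ideal image is blurred by the point spread function, and noise is then added to the result.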
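To make the model concrete, the following is a minimal simulation sketch using NumPy only. It assumes a Gaussian point spread function and zero-mean Gaussian noise; neither choice comes from the equation itself, and the helper names (gaussian_psf, convolve2d_same, degrade) are hypothetical.

```python
import numpy as np

def gaussian_psf(size=5, sigma=1.0):
    """Assumed PSF h(x, y): a normalised Gaussian kernel (size must be odd)."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    h = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return h / h.sum()

def convolve2d_same(f, h):
    """2D convolution f * h with same-size output and edge-replicating padding."""
    kh, kw = h.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(f, ((ph, ph), (pw, pw)), mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (kh, kw))
    # Flip the kernel so this is a convolution rather than a correlation
    return np.einsum("ijkl,kl->ij", windows, h[::-1, ::-1])

def degrade(f, h, noise_sigma, rng):
    """Observed image g = f * h + n, with n drawn as zero-mean Gaussian noise."""
    n = rng.normal(0.0, noise_sigma, size=f.shape)
    return convolve2d_same(f, h) + n

rng = np.random.default_rng(0)

# Ideal image f: a single bright point in the centre of a 9x9 grid
f = np.zeros((9, 9))
f[4, 4] = 1.0

h = gaussian_psf(size=5, sigma=1.0)
g_clean = degrade(f, h, noise_sigma=0.0, rng=rng)   # blur only
g_noisy = degrade(f, h, noise_sigma=0.01, rng=rng)  # blur plus noise
```

With a single point as input and no noise, the observed image is a copy of the PSF centred on that point, which is where the name "point spread function" comes from.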

6. Visual Perception Pipeline

Diagram (conceptual):

Scene → Camera Sensor → Digital Image → Preprocessing → Feature Extraction → Inference → Output

7. Real-World Example: Autonomous Vehicles

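Self-driving cars use computer vision to interpret their surroundings, for example by detecting lane markings, other vehicles, pedestrians, and traffic signs in camera images.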

8. Timeline of Key Milestones

9. Hands-on Practice (Colab Compatible)

OpenCV Canny Edge Detection on a user-uploaded image:

!pip install opencv-python-headless matplotlib

import cv2
import numpy as np
import matplotlib.pyplot as plt
from google.colab.patches import cv2_imshow
from google.colab import files

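# Upload an image from the local machine (Colab-only API)
uploaded = files.upload()
img_path = next(iter(uploaded))  # name of the first uploaded file

# Load and show original
img = cv2.imread(img_path)  # OpenCV loads color images in BGR channel order
if img is None:
    raise ValueError(f"Could not read {img_path} as an image")
cv2_imshow(img)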

# Convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2_imshow(gray)

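# Gaussian blur with a 5x5 kernel to suppress noise before edge detection
# (sigma=0 lets OpenCV derive the standard deviation from the kernel size)
blur = cv2.GaussianBlur(gray, (5, 5), 0)

# Canny edge detection with hysteresis thresholds 100 (lower) and 200 (upper)
edges = cv2.Canny(blur, 100, 200)
cv2_imshow(edges)

# Save the edge map next to the notebook
cv2.imwrite("canny_edges.jpg", edges)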
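The thresholds 100 and 200 above are fixed values, which may not suit every uploaded image. One common rule of thumb, which is not part of OpenCV and not used in the code above, is to centre the two thresholds on the median intensity of the image. The sketch below assumes this heuristic; the function name median_canny_thresholds and the default sigma=0.33 are illustrative choices, not values prescribed by the course.

```python
import numpy as np

def median_canny_thresholds(gray, sigma=0.33):
    """Return (lower, upper) hysteresis thresholds centred on the median intensity.

    Heuristic only: sigma controls how wide the band around the median is.
    """
    v = float(np.median(gray))
    lower = int(round(max(0.0, (1.0 - sigma) * v)))
    upper = int(round(min(255.0, (1.0 + sigma) * v)))
    return lower, upper

# Stand-in for the blurred grayscale image from the code above (assumed 8-bit)
demo = np.full((4, 4), 100, dtype=np.uint8)
lower, upper = median_canny_thresholds(demo)
print(lower, upper)

# In the notebook, the result could replace the fixed thresholds:
# lower, upper = median_canny_thresholds(blur)
# edges = cv2.Canny(blur, lower, upper)
```

Comparing the edge map from the fixed thresholds with the one from the median-based thresholds is a quick way to see how sensitive Canny is to these two parameters.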
    

10. Assignment

Objective: Understand foundational CV concepts, run code, and reflect on vision applications.

Due: End of Week 1