Key Idea? Look for strong gradients, and then post-process.
Due to discontinuties in:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import datasets, misc
from PIL import Image
from scipy.ndimage import convolve, correlate, gaussian_filter, filters, laplace
import cv2 as cv
im = np.zeros((5,5))
for i in range(5):
im[i, 2] = 1
f,ax = plt.subplots(1, 2)
ax[0].imshow(im, cmap='gray')
ax[0].title.set_text("Image")
ax[1].plot(im[0])
ax[1].title.set_text("Intensity Function")
For discrete data, we can approximate using finite differences: $\frac{\partial{f(x,y)}}{\partial{x}} \approx \frac{f(x+1, y) - f(x, y)}{1}$
Example of such a filter: $\begin{bmatrix}-1 & 1\end{bmatrix}$, $\begin{bmatrix}-1 \\ 1\end{bmatrix}$, $\begin{bmatrix}1 \\ -1\end{bmatrix}$
derivative_filter = np.matrix([-1, 1])
derivative_of_image = convolve(im, derivative_filter)
plt.plot(derivative_of_image[0])
[<matplotlib.lines.Line2D at 0x751d345827d0>]
The Extrema correspond to the edges
ascent = datasets.ascent().astype('int32')
plt.imshow(ascent, cmap='gray')
<matplotlib.image.AxesImage at 0x751d384299f0>
Prewitt (Vertical Version): $\begin{bmatrix} -1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1 \end{bmatrix}$
Sobel (Vertical Version): $\begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}$
Roberts (Diagnol Edges Detection): $\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$
The above given version of Prewitt & Sobel detect edges going from black to white.
If the 1st and 3rd columns are interchanged the filters would detect edges going from white to black
Assumption: $0 \rightarrow black$, $255 \rightarrow white$
prewitt = np.matrix([[-1,0,1], [-1,0,1], [-1,0,1]])
sobel = np.matrix([[1,0,-1], [2,0,-2], [1, 0, -1]])
roberts = np.matrix([[0, 1], [-1, 0]])
sobel_v = convolve(ascent, sobel)
sobel_h = convolve(ascent, sobel.T)
magnitude = np.sqrt(sobel_v**2 + sobel_h**2)
magnitude *= 255 / np.max(magnitude) # To bring the max of magnitude to 255
f, ax = plt.subplots(2,2)
ax[0,0].imshow(ascent, cmap='gray')
ax[0,0].title.set_text("Original")
ax[0,1].imshow(sobel_h, cmap='gray')
ax[0,1].title.set_text("Horizontal")
ax[1,0].imshow(sobel_v, cmap='gray')
ax[1,0].title.set_text("Vertical")
ax[1,1].imshow(magnitude, cmap='gray')
ax[1,1].title.set_text("Magnitude")
x = np.linspace(-5, 5, 256)
denoised_y = 1/(1+np.exp(-x))
noise = np.random.normal(0,0.01,256)
y = denoised_y + noise
to_image = y.reshape(16, 16)
post_gaussian = gaussian_filter(to_image, sigma=2)
to_function = post_gaussian.reshape(256,)
f, ax = plt.subplots(2, 2)
ax[0,0].plot(x, y)
ax[0,0].title.set_text("Image function with Noise")
ax[0,1].imshow(to_image, cmap='gray')
ax[0,1].title.set_text("Image with Noise")
ax[1,0].imshow(post_gaussian, cmap='gray')
ax[1,0].title.set_text("Image De-noised")
ax[1,1].plot(x, to_function)
ax[1,1].title.set_text("Image function post-smoothing")
We can now take the derivative of the function to detect edges.
Differentiation is acheived through convolution, and convolution is associative. Since, we know that gradient calculation can also be written as a convoluton operation
$$ \frac{d(f*g)}{dx} = f * \frac{d(g)}{dx} = g * \frac{d(f)}{dx} $$This saves us an operation, because we can pre-calculate the derivative of the Gaussian.
f, ax = plt.subplots(1,2)
f.set_figwidth(15)
ax[0].plot(x, np.gradient(denoised_y))
ax[0].title.set_text("Image convolved by Gaussian")
ax[1].plot(x, np.gradient(np.gradient(denoised_y)))
ax[1].title.set_text("Image convolved by Laplacian")
The zero-crossing in the image convolved by the second-derivative indicates edge.
$\nabla^2{f} = \frac{\partial^2{f}}{\partial{x^2}} + \frac{\partial^2{f}}{\partial{y^2}}$
from scipy import signal
gs = signal.gaussian(49, 7)
dr = np.gradient(gs)
la = np.gradient(dr)
f, ax = plt.subplots(1, 3)
f.set_figheight(5)
f.set_figwidth(20)
ax[0].plot(gs)
ax[0].title.set_text("Gaussian")
ax[1].plot(dr)
ax[1].title.set_text("Derivative")
ax[2].plot(la)
ax[2].title.set_text("Laplacian")
Assume the orange curve is the edge of an image and is full of noise, we would supress that noise by:
A technique to handle discontinuities.
Check for well-connected egdes. How?
Use high-threshold to start edges curves and low threshold to continue them
canny_edges = cv.Canny(np.uint8(ascent),100,200, apertureSize=3)
f, ax = plt.subplots(1, 3)
f.set_figwidth(15)
ax[0].imshow(magnitude, cmap='gray')
ax[0].title.set_text("Sobel Filter")
ax[1].imshow(canny_edges, cmap='gray')
ax[1].title.set_text("Canny Edges")
ax[2].imshow(canny_edges)
<matplotlib.image.AxesImage at 0x751d1c744280>
Discrete approximation of Second-Derivative
$$
\frac{\partial^2{f}}{\partial{x^2}} = f(x+1, y) + f(x-1, y) - 2f(x, y) \\
\frac{\partial^2{f}}{\partial{y^2}} = f(x, y+1) + f(x, y-1) - 2f(x, y)
$$
Substituting in Laplacian Operator
$$
f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1) - 4f(x,y) \\
\begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0\end{bmatrix}
$$
def gaussian_kernel(size: int, sigma: float) -> np.ndarray:
"""Generates a (size x size) Gaussian kernel."""
kernel = np.fromfunction(
lambda x, y: (1/(2*np.pi*sigma**2)) * np.exp(-((x-(size-1)/2)**2 + (y-(size-1)/2)**2) / (2*sigma**2)),
(size, size)
)
return kernel / np.sum(kernel)
gaussian_kernel_31x31 = gaussian_kernel(31, 5)
log = laplace(gaussian_kernel_31x31)
plt.imshow(log, cmap='gray')
gaussian_kernel_17x17 = gaussian_kernel(17, 5)
log = laplace(gaussian_kernel_17x17)
Convolution with a filter can be viewed as comparing a little image of what you want to find against all the local regions of the image.
Since, LoG as an image looks like a Blob, it can be used for blob detection.
sunflower = cv2.imread('sunflower.jpg', cv2.IMREAD_GRAYSCALE)
plt.imshow(sunflower, cmap='gray')
<matplotlib.image.AxesImage at 0x751d1c432d10>
plt.imshow(convolve(sunflower, log), cmap='gray')
<matplotlib.image.AxesImage at 0x751d036d91e0>
Motivation: What are the interesting features that set apart an image from every other image?
Possible Solution:
A feature may appear as 2 different things in 2 different scales(??)
Correlation with itself
$
E_{AC}[\Delta{u}] = \Sigma_{x,y} W(P_i)[I(P_i + \partial{u}) - I(P_i)]^2
$, in other words: Sum of change in pixel intensities as we move the patch
$W(P_i)$ is the window function
Taylor Series Expansion of $I[P_i + \Delta{u}] = I(P_i) + \nabla{I}(P_i)\Delta{u}$, where $\Delta{u}$ indicates the shift.
Therefore, $$ E_{AC}[\Delta{u}] = \Sigma_{x,y} W(P_i)[I(P_i + \partial{u}) - I(P_i)]^2 \\ \approx \Sigma_{x,y} W(P_i)[I(P_i) + \nabla{I}(P_i)\Delta{u} - I(P_i)]^2 \\ \approx \Sigma_{x,y} W(P_i)[\nabla{I}(P_i)\Delta{u}]^2 \\ \approx \Delta{u^T} A \Delta{u} $$
A tells us how much did the gradients change as the position of the patch is changed and the eigenvalues of A reveal the amount of intensity change in the two principal orthogonal directions in the window.
*How do eigenvalues determine if an image point is a corner?
a. Compute gradients at each point in the image.
b. Compute A for each image window to get its cornerness score
c. Compute eigenvalues or Compute $M_c$, where $M_c = \lambda_1\lambda_2 - k(\lambda_1 + \lambda_2)^2$ = $det - k * trace^2$
d. Find points who have $M_c > threshold$
e. Perform non-maxima supression of the points
## Implementation
gray = ascent.astype("uint8")
dst = cv.cornerHarris(gray, 2, 3, 0.04)
dst = cv.dilate(dst, None)
gray[dst>0.01*dst.max()]=255
plt.imshow(gray, cmap='gray')
<matplotlib.image.AxesImage at 0x79cd26d955a0>
How can we select interest points in each image such that the detections are repeatable across different scales?
filename = 'harris.png'
img = cv.imread(filename)
img_ = cv.imread(filename)
gray = cv.cvtColor(img,cv.COLOR_BGR2GRAY)
gray = np.float32(gray)
dst = cv.cornerHarris(gray,2,3,0.04)
dst_ = cv.cornerHarris(gray,10,3,0.04)
#result is dilated for marking the corners, not important
dst = cv.dilate(dst, None)
dst_ = cv.dilate(dst_, None)
# Threshold for an optimal value, it may vary depending on the image.
img[dst>0.1*dst.max()]=[0,0,255]
img_[dst_>0.1*dst_.max()] = [0,0,255]
f, ax = plt.subplots(1,2)
ax[0].imshow(img)
ax[1].imshow(img_)
<matplotlib.image.AxesImage at 0x7d9d7c4bfa90>
Design a function that takes a region around the point and outputs its response as a scalar given the region.
We will observe that the function outputs different response values depends on the region size. And more importantly, the local maxima of this function is a strong clue indicating that the region size corresponding to the local maxima should be invariant to image scale.
Regular or stochastic patterns caused by bumps, grooves or markings.
Why care about textures? Conveys more information that can be exploited to match regions of interest in images.
How to represent textures? Compute responses of blobs and edges at various orientations and scales.
Examples: Gabor, Steerable, Isotropic Gaussian
A special class of band-pass filters. They mimic the human visual system
It can be viewed as a product of the convolution between gaussian wave and a sinusoidal signal with a certain frequency and modulation.
Gabor Filters at an orientation of 15 degrees
A class of oriented filters that can be steered or expressed as a linear combination of a set of basis filters.
What does it do? Extacts features and their descriptors that are invariant to translation, rotation, scale, and shear.
Steps:
What does it output? It outputs keypoints(Location, Scale and Orientation information), descriptors (A 128-D vector that captures gradient information.)
We are trying to nudge the detected extrema towards true extrema.
Solution: Taylor Series Expansion: $D(s_0 + \Delta s) = D(s_0) + \frac{\partial{D^T}}{\partial s} |_{s_0} \Delta s + \frac{1}{2}\Delta{s} \frac{\partial^2{D^T}}{\partial s} |_{s_0} \Delta{s}$, where $s_0 = (x_0, y_0, \sigma_0)^T$ and $\Delta{s} = (\delta{x}, \delta{y}, \delta{\sigma})^T$
The location of the extremum $\hat{s}$ is determined by taking the derivative of $D$ wrt to $\Delta{s}$ and setting it to $\hat{0}$
$$ \frac{\partial{D^T}}{\partial s} + \frac{\partial^2{D^T}}{\partial s} \Delta{s} = 0 \\ \hat{s} = (\frac{\partial^2{D^T}}{\partial s})^-1 \frac{\partial{D^T}}{\partial s}|_{s_0} $$Low contrast Point Elimination: Reject $D(\hat{s})$ if smaller than 0.03
Edge Elimination: Hessian(H) = \begin{bmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{bmatrix}
$\alpha = $ largest eigenvalue, and $\beta = $ smallest eigenvalue.
$Tr(H) = \alpha + \beta$
$Det(H) = \alpha\beta$
$\frac{Tr(H)^2}{Det(H)} = \frac{(\alpha + \beta)^2}{\alpha \beta} = \frac{(r\beta + \beta)^2}{r \beta^2} = \frac{(r+1)^2}{r}, r = \frac{\alpha}{\beta}$. Minimum value when r = 1.
Reject if $\frac{Tr(H)^2}{Det(H)}$ > Threshold
DoG: Think of Gaussian Blur as a socialist government. The DoG calculates who were the features that had rapid changhe in intensity by subtracting 'money' after a 'mild' socialist government and an 'extreme' socialist government. Rest of it is written above.
To reduce effects of contrast or gain, the 128-D vector is normalized to unit length and values are cropped to 0.2, and again normalized...
3X faster than SIFT. Uses box filters instead of Gaussians to approximate Laplacians and many more. Uses Haar wavelets to get keypoint orientations. Haar Wavelets are very fast to compute
Rotate patch according to its dominant graident direction.
## Sift Implementation
Co-ordinate system in 2 dimensions parametrized by logarithmic distance from origin and angle. It is location and scale invariant.
Two points are said to be in correspondance if they minimize value of $C_{ij} = \frac{1}{2} \Sigma_{k=1}^K \frac{(h_i(k) - h_j(k))^2}{h_i(k) + h_j(k)}$
Distance less than a threshold implies shape match
Method for Blob Detection. Identify regions that stay same despite fluctuations in gray-level threshold.
Methodology:
Regions at a particular 'g' level denoted as $R_1^g, R_2^g, ..., R_n^g$, where $|R_i^g|$ = total number of pixels in $R_i^g$
$
\psi(R_i^g) = \frac{|R_j^{g-\Delta}| - |R_k^{g+\Delta}|}P
$
## HoG
## Pyramidal HoG
## LBP