
Lecture 2C: Digital Halftoning and Visual Psychophysics

Digital halftoning algorithms, spatial frequency analysis, human visual system modeling, and psychophysical foundations of visual perception

DSP Image Processing Digital Halftoning Visual Psychophysics Human Vision


Digital Halftoning: Theory and Mathematical Framework

Introduction to Digital Halftoning

Digital halftoning is a spatial quantization process that converts continuous-tone grayscale images into binary (black and white) images while preserving the visual perception of intermediate gray levels. This technique is fundamental to printing technology and display systems with limited dynamic range.

Mathematical Definition

Given a continuous-tone input image $f(x,y)$ with pixel values in the range $[0, L-1]$, where $L$ is the number of gray levels, halftoning produces a binary output image $g(x,y)$ in which each pixel is either black (1) or white (0).

The fundamental challenge is to maintain the local average gray level while using only binary values:

$$\mathbb{E}[g(x,y)] \approx \frac{f(x,y)}{L-1}$$

Human Visual System and Spatial Filtering

Low-Pass Filtering Model

The human visual system acts as a low-pass spatial filter with approximate transfer function:

$$H_{\text{eye}}(f_x, f_y) = e^{-\alpha\sqrt{f_x^2 + f_y^2}}$$

where $f_x, f_y$ are spatial frequencies and $\alpha$ determines the cutoff characteristics.

Critical Insight: If the halftoning pattern has frequency content above the eye’s cutoff frequency, the binary pattern will be perceived as continuous gray levels.

Viewing Distance and Spatial Resolution

For a viewing distance $d$ and pixel size $\Delta x$, a pattern with period $\Delta x$ subtends $\Delta x / d$ radians per cycle, so the effective angular frequency is:

$$f_{\text{spatial}} = \frac{d}{\Delta x} \text{ cycles/radian}$$

Design Criterion: Halftone patterns should have dominant frequency components above the eye's resolution limit of about 30 cycles/degree; converting degrees to radians ($1 \text{ rad} = 180/\pi \approx 57.3°$) gives $f_{\text{threshold}} \approx 30 \text{ cycles/degree} \times \frac{180}{\pi} \approx 1700 \text{ cycles/radian}$.

Thresholding-Based Halftoning

Simple Thresholding

The most basic halftoning method applies a global threshold $T$:

$$g(x,y) = \begin{cases} 1 & \text{if } f(x,y) \geq T \\ 0 & \text{otherwise} \end{cases}$$

Problems with Simple Thresholding:

  1. Loss of spatial detail in regions near the threshold
  2. Contour artifacts at gray level boundaries
  3. Poor reproduction of intermediate gray levels
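As a minimal sketch (pure Python, hypothetical 8-bit values with $T = 128$; not code from the lecture), the thresholding rule can be written as:

```python
def threshold_halftone(image, T=128):
    """Global thresholding: g = 1 where f >= T, else 0 (per the rule above)."""
    return [[1 if pixel >= T else 0 for pixel in row] for row in image]

# A 2x4 test image with 8-bit gray levels.
img = [[0, 64, 128, 255],
       [200, 100, 50, 180]]
print(threshold_halftone(img))  # [[0, 0, 1, 1], [1, 0, 0, 1]]
```

Every pixel in a smooth gradient flips at the same value of $T$, which is exactly the source of the contour artifacts listed above.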

Adaptive Thresholding with Spatial Modulation

To address simple thresholding limitations, we introduce spatially varying thresholds:

$$T(x,y) = T_0 + A \cdot p(x,y)$$

where:

  • $T_0$: Base threshold level
  • $A$: Modulation amplitude
  • $p(x,y)$: Spatial pattern function

Common Pattern Functions:

  1. Periodic Screen Pattern: $p(x,y) = \cos\left(\frac{2\pi x}{P_x}\right) + \cos\left(\frac{2\pi y}{P_y}\right)$

  2. Random Noise Pattern: $p(x,y) \sim \mathcal{N}(0, \sigma^2)$ (Gaussian white noise)

  3. Blue Noise Pattern (optimal for human vision): $|P(f_x, f_y)|^2$ constant for $f > f_c$, zero for $f < f_c$
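A sketch of threshold modulation using the periodic screen pattern (pattern 1 above); the values $T_0 = 128$, $A = 64$, $P_x = P_y = 4$ are illustrative choices, not values from the lecture:

```python
import math

def modulated_threshold(x, y, T0=128, A=64, Px=4, Py=4):
    """T(x,y) = T0 + A * p(x,y) with the periodic screen pattern."""
    p = math.cos(2 * math.pi * x / Px) + math.cos(2 * math.pi * y / Py)
    return T0 + A * p

def screened_halftone(image):
    """Compare each pixel against its spatially varying threshold."""
    return [[1 if image[y][x] >= modulated_threshold(x, y) else 0
             for x in range(len(image[0]))]
            for y in range(len(image))]

# A constant mid-gray patch: the screen turns it into a periodic dot pattern.
flat = [[128] * 4 for _ in range(4)]
print(screened_halftone(flat))
```

On a constant input the spatially varying threshold produces a periodic binary pattern whose dot density tracks the gray level, unlike the all-or-nothing output of a global threshold.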

Mathematical Analysis of Halftoning Quality

Mean Squared Error (MSE) Criterion

For a halftoned image $g(x,y)$ and original image $f(x,y)$:

$$\text{MSE} = \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} [f(x,y) - g(x,y)]^2$$
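A direct transcription of the MSE formula (pure Python, images as lists of rows):

```python
def mse(f, g):
    """Mean squared error between an original image f and halftone g."""
    M, N = len(f), len(f[0])
    return sum((f[y][x] - g[y][x]) ** 2
               for y in range(M) for x in range(N)) / (M * N)

print(mse([[1.0, 0.0], [0.5, 0.5]], [[1, 0], [1, 0]]))  # 0.125
```

Note that pixelwise MSE weights all spatial frequencies equally, which is why the frequency-domain and CSF-weighted metrics discussed in this section are preferred for perceptual quality.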

Frequency Domain Analysis

The power spectral density of halftone error reveals perceptual quality:

$$E(f_x, f_y) = |F(f_x, f_y) - G(f_x, f_y)|^2$$

Quality Metrics:

  1. Low-frequency error (visible as intensity variations)
  2. High-frequency content (contributes to texture appearance)
  3. Spectral concentration around specific frequencies (causes visible patterns)

Noise Addition for Improved Halftoning

Dithering with Additive Noise

Pre-processing with noise before thresholding:

$$f'(x,y) = f(x,y) + n(x,y)$$

followed by binary quantization based on the modified values.

Mathematical Benefits of Noise Addition

Linearization Effect: For zero-mean Gaussian noise with standard deviation $\sigma_n$, the expected output becomes:

$$\mathbb{E}[g(x,y)] \approx \Phi\left(\frac{f(x,y) - T}{\sigma_n}\right)$$

where $\Phi(\cdot)$ is the standard normal cumulative distribution function.

Optimal Noise Characteristics:

  1. White noise: Uniform power distribution across all frequencies
  2. Blue noise: Power concentrated at high frequencies (less visible)
  3. Noise variance: $\sigma_n^2 = \frac{(L-1)^2}{12}$, matching the quantization-noise variance of a uniform quantizer with step size $L-1$
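A sketch of the linearization effect, using uniform rather than Gaussian noise so the expected output is exactly linear in the input over the noise range; the amplitude 128 and threshold 128 are illustrative:

```python
import random

def dithered_pixel(value, rng, T=128, A=128):
    """Add uniform noise in [-A, A) before thresholding at T."""
    return 1 if value + rng.uniform(-A, A) >= T else 0

rng = random.Random(0)  # fixed seed for reproducibility
n = 20000
mean = sum(dithered_pixel(64, rng) for _ in range(n)) / n
print(mean)  # close to 64/256 = 0.25: the average output tracks the gray level
```

Without the noise, a constant input of 64 would threshold to all zeros; with it, the local density of black pixels reproduces the intermediate gray level.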

Blue Noise Optimization

Blue noise patterns minimize low-frequency error while maintaining randomness:

Optimization Criterion: $\min_{p(x,y)} \iint_{\sqrt{f_x^2 + f_y^2} < f_c} |P(f_x, f_y)|^2 \, df_x \, df_y$

subject to the constraint that $p(x,y)$ produces the desired gray level distribution.

Advanced Halftoning Techniques

Error Diffusion Algorithm

Mathematical Formulation:

  1. Quantization: $g[i,j] = \text{round}(f'[i,j])$
  2. Error calculation: $e[i,j] = f'[i,j] - g[i,j]$
  3. Error diffusion: $f'[i+m, j+n] = f[i+m, j+n] + \sum_{k,l} w[k,l] \cdot e[i-k, j-l]$

Popular Error Diffusion Filters:

Floyd-Steinberg Filter: Distributes quantization error to neighboring pixels with weights designed to preserve local average intensity while minimizing visible artifacts.

Jarvis Filter: Uses a larger neighborhood for error distribution, providing smoother results at the cost of increased computational complexity.
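The standard Floyd-Steinberg weights (7/16, 3/16, 5/16, 1/16, scanning left to right) are not stated in the lecture but are well established; a compact sketch for 8-bit input:

```python
def floyd_steinberg(image):
    """Error-diffusion halftoning of an 8-bit grayscale image to {0, 1}."""
    h, w = len(image), len(image[0])
    f = [[float(p) for p in row] for row in image]   # working copy f'
    g = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            g[y][x] = 1 if f[y][x] >= 127.5 else 0   # quantize
            err = f[y][x] - 255 * g[y][x]            # quantization error
            # Diffuse the error to not-yet-processed neighbors.
            for dx, dy, wt in ((1, 0, 7/16), (-1, 1, 3/16),
                               (0, 1, 5/16), (1, 1, 1/16)):
                if 0 <= x + dx < w and 0 <= y + dy < h:
                    f[y + dy][x + dx] += err * wt
    return g

halftone = floyd_steinberg([[160] * 16 for _ in range(16)])
density = sum(map(sum, halftone)) / 256
print(density)  # close to 160/255: local average intensity is preserved
```

Pushing each pixel's quantization error onto its unprocessed neighbors is what forces the running local average of the binary output toward the input gray level.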

Clustered Dot Screening

Mathematical Model: Screen function $S(x,y)$ with period $(P_x, P_y)$:

$$S(x,y) = \cos\left(\frac{2\pi x}{P_x}\right) + \cos\left(\frac{2\pi y}{P_y}\right) + \cos\left(\frac{2\pi(x+y)}{P_x}\right)$$

Threshold Modulation: $T(x,y) = \frac{L-1}{2} + A \cdot S(x,y)$

Dot Size Control: The area of printed dots varies continuously with input gray level:

$$\text{Dot Area} = \frac{\pi r^2(g)}{P_x P_y}$$

where $r(g)$ is the dot radius as a function of gray level $g$.

Perceptual Optimization

Contrast Sensitivity Function (CSF)

The human visual system’s contrast sensitivity varies with spatial frequency:

$$\text{CSF}(f) = af \cdot e^{-bf} \cdot \sqrt{1 + cf^2}$$

Perceptually Weighted Error: $E_{\text{weighted}}(f_x, f_y) = |F(f_x, f_y) - G(f_x, f_y)|^2 \cdot \text{CSF}^2\left(\sqrt{f_x^2 + f_y^2}\right)$

Quality Assessment Metrics

  1. Weighted Signal-to-Noise Ratio (WSNR): $\text{WSNR} = 10\log_{10}\left(\frac{\sum_{f} |F(f)|^2 \cdot \text{CSF}^2(f)}{\sum_{f} |F(f) - G(f)|^2 \cdot \text{CSF}^2(f)}\right)$

  2. Delta-E Color Difference (for color halftoning): $\Delta E = \sqrt{(\Delta L^*)^2 + (\Delta a^*)^2 + (\Delta b^*)^2}$

Spatial Frequency Analysis: Advanced Mathematical Framework

Fundamental Concepts in Spatial Frequency

Spatial frequency represents the rate of change of image intensity across spatial dimensions. Unlike temporal frequency measured in Hertz (cycles per second), spatial frequency is measured in cycles per unit distance (e.g., cycles/mm, cycles/pixel, or cycles/degree of visual angle).

Mathematical Definition

For a 2D image $f(x,y)$, the spatial frequency content is revealed through the 2D Fourier Transform:

$$F(u,v) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x,y)\, e^{-j2\pi(ux + vy)} \, dx \, dy$$

where:

  • $(u,v)$: Spatial frequency coordinates (cycles per unit distance)
  • $F(u,v)$: Complex-valued frequency domain representation
  • Magnitude $|F(u,v)|$: Amplitude of frequency component
  • Phase $\arg[F(u,v)]$: Phase of frequency component

Discrete 2D Fourier Transform

For digitized images with $M \times N$ pixels:

$$F[k,l] = \frac{1}{MN} \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} f[m,n]\, e^{-j2\pi(km/M + ln/N)}$$

Frequency Mapping:

  • $u = \frac{k}{M \cdot \Delta x}$ where $\Delta x$ is pixel spacing in the x-direction
  • $v = \frac{l}{N \cdot \Delta y}$ where $\Delta y$ is pixel spacing in the y-direction
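The normalized DFT above can be checked with a brute-force implementation (fine for tiny images; real code would use an FFT):

```python
import cmath

def dft2(f):
    """Direct 2D DFT with the 1/(MN) normalization used above."""
    M, N = len(f), len(f[0])
    return [[sum(f[m][n] * cmath.exp(-2j * cmath.pi * (k * m / M + l * n / N))
                 for m in range(M) for n in range(N)) / (M * N)
             for l in range(N)]
            for k in range(M)]

# A constant image has all its energy at DC: F[0][0] is the mean, the rest ~0.
F = dft2([[3.0, 3.0], [3.0, 3.0]])
print(abs(F[0][0]), abs(F[1][1]))
```

With this normalization $F[0,0]$ is exactly the image mean, matching the DC-concentration argument made for low spatial frequencies below.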

Spatial Frequency Characteristics

Low Spatial Frequencies

Range: $0 \leq f \leq f_{\text{low}}$ (typically $< 1$ cycle/degree)

Characteristics:

  • Represent broad intensity variations and overall illumination
  • Control global contrast and brightness perception
  • Correspond to large-scale features in images

Mathematical Description: For slowly varying functions: $f(x,y) \approx f_0 + a_1 x + a_2 y + a_3 xy$

The Fourier transform concentrates energy near DC (zero frequency): $|F(0,0)| = \left|\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x,y) \, dx \, dy\right|$

Mid Spatial Frequencies

Range: $f_{\text{low}} < f \leq f_{\text{mid}}$ (typically 1-10 cycles/degree)

Characteristics:

  • Encode structural information and object boundaries
  • Critical for pattern recognition and scene understanding
  • Most perceptually significant for human vision

Mathematical Analysis: Edge content contributes significantly to mid-frequencies. For a step edge: $f(x) = A \cdot H(x - x_0)$

where $H(\cdot)$ is the Heaviside step function, the Fourier transform is: $F(u) = \frac{A}{j2\pi u}\, e^{-j2\pi u x_0}$

Energy decays as $|F(u)| \propto \frac{1}{|u|}$.

High Spatial Frequencies

Range: $f > f_{\text{mid}}$ (typically $> 10$ cycles/degree)

Characteristics:

  • Contain fine detail and texture information
  • Represent noise and small-scale variations
  • Often attenuated by human visual system

Mathematical Properties: White noise with variance $\sigma^2$ has a flat expected power spectrum: $\mathbb{E}[|F(u,v)|^2] = \sigma^2$ for all $(u,v)$; its energy is spread uniformly across all spatial frequencies.

Nyquist Frequency and Sampling Theory

Spatial Nyquist Frequency

For images sampled with pixel spacing $\Delta x$ and $\Delta y$:

Nyquist Frequencies:

  • $f_{N,x} = \frac{1}{2\Delta x}$ cycles per unit distance in the x-direction
  • $f_{N,y} = \frac{1}{2\Delta y}$ cycles per unit distance in the y-direction

Critical Insight: Spatial frequencies above the Nyquist frequency cause aliasing artifacts.

Aliasing in Spatial Domain

When the input contains frequencies $f > f_N$, they appear as false lower frequencies:

$$f_{\text{alias}} = |f_{\text{input}} - k \cdot 2f_N|$$

where the integer $k$ is chosen to minimize $f_{\text{alias}}$.

Mathematical Analysis: For a sinusoidal pattern with frequency $f > f_N$: $g(x) = \cos(2\pi f x)$

After sampling with spacing $\Delta x$: $g[n] = \cos(2\pi f n \Delta x)$

If $f = f_N + \Delta f$ where $\Delta f < f_N$, then using $2 f_N \Delta x = 1$: $g[n] = \cos(2\pi (f_N + \Delta f) n \Delta x) = \cos(\pi n + 2\pi \Delta f\, n \Delta x) = (-1)^n \cos(2\pi \Delta f\, n \Delta x)$

This is indistinguishable from a sampled cosine of frequency $f_N - \Delta f$, so the pattern appears at $f_N - \Delta f$ instead of the true frequency $f_N + \Delta f$.
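A small helper that folds any input frequency into the baseband $[0, f_N]$; frequencies are in cycles per unit distance, and the function is an illustration, not lecture code:

```python
def alias_frequency(f_input, f_nyquist):
    """Return the frequency observed after sampling: fold into [0, f_nyquist]."""
    fs = 2 * f_nyquist                 # sampling frequency
    f = f_input % fs                   # remove whole multiples of fs
    return fs - f if f > f_nyquist else f

print(alias_frequency(0.30, 0.25))  # ≈ 0.20: f_N + 0.05 folds to f_N - 0.05
print(alias_frequency(0.10, 0.25))  # 0.1: already below Nyquist, unchanged
```

The first example reproduces the derivation above: an input at $f_N + \Delta f$ is observed at $f_N - \Delta f$.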

Spatial Filtering and Convolution

Convolution in Spatial Domain

Mathematical Definition: $(f * h)(x,y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(\xi, \eta)\, h(x-\xi, y-\eta) \, d\xi \, d\eta$

For discrete images: $(f * h)[m,n] = \sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} f[k,l] \cdot h[m-k, n-l]$

Frequency Domain Filtering

Convolution-Multiplication Duality: $f(x,y) * h(x,y) \xleftrightarrow{\mathcal{F}} F(u,v) \cdot H(u,v)$

Practical Filtering Steps:

  1. Forward FFT: $F(u,v) = \mathcal{F}\{f(x,y)\}$
  2. Multiply by filter: $G(u,v) = F(u,v) \cdot H(u,v)$
  3. Inverse FFT: $g(x,y) = \mathcal{F}^{-1}\{G(u,v)\}$

Common Spatial Filters

Low-Pass Filter (removes high frequencies):

$$H_{\text{LP}}(u,v) = \begin{cases} 1 & \text{if } \sqrt{u^2 + v^2} \leq D_0 \\ 0 & \text{if } \sqrt{u^2 + v^2} > D_0 \end{cases}$$

High-Pass Filter (removes low frequencies): $H_{\text{HP}}(u,v) = 1 - H_{\text{LP}}(u,v)$

Butterworth Filter (smooth transition): $H_{\text{Butter}}(u,v) = \frac{1}{1 + \left[\frac{\sqrt{u^2 + v^2}}{D_0}\right]^{2n}}$

where $n$ controls the sharpness of the transition.
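The three transfer functions are easy to evaluate pointwise; a sketch:

```python
def ideal_lowpass(u, v, D0):
    """Ideal LP: pass frequencies with radius <= D0, block the rest."""
    return 1.0 if (u * u + v * v) ** 0.5 <= D0 else 0.0

def highpass(u, v, D0):
    """HP as the complement of the ideal LP."""
    return 1.0 - ideal_lowpass(u, v, D0)

def butterworth_lowpass(u, v, D0, n):
    """Smooth transition; equals exactly 1/2 at the cutoff radius D0."""
    D = (u * u + v * v) ** 0.5
    return 1.0 / (1.0 + (D / D0) ** (2 * n))

print(butterworth_lowpass(10, 0, 10, 2))  # 0.5 at the cutoff
```

In practice such a function would be sampled over the $(u, v)$ grid of an FFT and multiplied into $F(u,v)$, per the three filtering steps above.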

Applications in Image Processing

Edge Detection via High-Pass Filtering

Gradient-Based Edge Detection:

  • Sobel Operator: Approximates the gradient using discrete convolution masks
  • Prewitt Operator: Alternative gradient approximation with equal weights

Edge Magnitude: $|\nabla f| = \sqrt{(G_x * f)^2 + (G_y * f)^2}$

Edge Direction: $\theta = \arctan\left(\frac{G_y * f}{G_x * f}\right)$
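A sketch of Sobel gradient estimation at a single pixel, using the standard 3×3 masks as a cross-correlation with zero padding at the borders (the sign flip relative to true convolution does not affect the magnitude):

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient mask
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient mask

def apply_mask(img, mask, x, y):
    """3x3 cross-correlation at (x, y), zero padding outside the image."""
    h, w = len(img), len(img[0])
    total = 0.0
    for ky in range(3):
        for kx in range(3):
            iy, ix = y + ky - 1, x + kx - 1
            if 0 <= iy < h and 0 <= ix < w:
                total += img[iy][ix] * mask[ky][kx]
    return total

def sobel_magnitude(img, x, y):
    gx = apply_mask(img, SOBEL_X, x, y)
    gy = apply_mask(img, SOBEL_Y, x, y)
    return (gx * gx + gy * gy) ** 0.5

edge = [[0, 0, 1, 1]] * 3            # a vertical step edge
print(sobel_magnitude(edge, 1, 1))   # 4.0: strong response across the edge
```

On a flat region both mask responses cancel to zero, so the magnitude responds only to intensity change, as a high-pass operation should.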

Laplacian of Gaussian (LoG) Filter

Mathematical Form: $\nabla^2 G(x,y) = \frac{1}{\pi \sigma^4}\left[1 - \frac{x^2 + y^2}{2\sigma^2}\right] e^{-\frac{x^2 + y^2}{2\sigma^2}}$

Properties:

  • Zero-crossing detection identifies edges
  • Scale parameter $\sigma$ controls feature size sensitivity
  • Mexican hat appearance in spatial domain

Frequency Domain Analysis for Quality Assessment

Power Spectral Density (PSD)

Definition: $P(u,v) = |F(u,v)|^2$

Radial Power Spectrum: $P(f) = \int_0^{2\pi} P(f\cos\theta, f\sin\theta)\, f \, d\theta$

where $f = \sqrt{u^2 + v^2}$ is the radial frequency.

Image Quality Metrics

Total Variation (measures smoothness): $\text{TV} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \sqrt{\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2} \, dx \, dy$

Frequency Domain Equivalent: $\text{TV} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} 2\pi\sqrt{u^2 + v^2}\, |F(u,v)| \, du \, dv$

Spectral Entropy (measures frequency distribution): $H = -\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} p(u,v) \log p(u,v) \, du \, dv$

where $p(u,v) = \frac{|F(u,v)|^2}{\iint |F(u,v)|^2 \, du \, dv}$ is the normalized power spectrum.

Human Visual System as a Fourier Analyzer: Psychophysical Evidence

Campbell-Robson Paradigm: Mathematical Foundation

The groundbreaking work of Campbell and Robson (1968) provided compelling evidence that the human visual system performs spatial frequency analysis analogous to Fourier decomposition. Their psychophysical experiments revealed that visual perception operates through independent spatial frequency channels.

Experimental Setup and Mathematical Analysis

Sinusoidal Gratings: Test patterns with luminance modulation: $L(x) = L_0\left[1 + M \cos(2\pi f x + \phi)\right]$

where:

  • $L_0$: Mean luminance (background brightness)
  • $M$: Modulation depth (contrast)
  • $f$: Spatial frequency (cycles/degree)
  • $\phi$: Phase offset

Square Wave Gratings: Periodic step functions with Fourier expansion: $L_{\text{square}}(x) = L_0\left[1 + \frac{4M}{\pi} \sum_{n=1,3,5,\ldots} \frac{1}{n} \cos(2\pi n f x)\right]$

Key Mathematical Relationship: The fundamental frequency component has amplitude $\frac{4M}{\pi} \approx 1.27M$, higher than the sine-wave amplitude $M$.
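The $4M/\pi$ factor can be verified numerically by projecting a unit square wave onto its fundamental (midpoint-rule integration over one period, with $M = 1$):

```python
import math

def square_wave_fundamental(samples=100000):
    """Fourier coefficient a1 = 2 * integral over one period of sq(x) cos(2 pi x)."""
    total = 0.0
    for i in range(samples):
        x = (i + 0.5) / samples
        square = 1.0 if math.cos(2 * math.pi * x) >= 0 else -1.0
        total += square * math.cos(2 * math.pi * x)
    return 2 * total / samples

print(square_wave_fundamental(), 4 / math.pi)  # both ≈ 1.2732
```

The integrand reduces to $|\cos(2\pi x)|$, whose integral over one period is $2/\pi$, giving $a_1 = 4/\pi$ as stated above.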

Threshold Detection Experiments

Detection Threshold Measurement:

For sinusoidal gratings, the contrast sensitivity at threshold is: $S_{\sin}(f) = \frac{1}{M_{\text{threshold}}(f)}$

For square wave gratings: $S_{\text{square}}(f) = \frac{1}{M_{\text{threshold,square}}(f)}$

Campbell-Robson Discovery: $S_{\sin}(f) \approx S_{\text{square}}(f)$

Mathematical Interpretation: If the visual system detected integrated energy, square waves should be easier to detect due to higher harmonic content. The equal thresholds indicate frequency-selective detection.

Fourier Analysis of Visual Processing

Linear Systems Model: Visual system response to input $L(x,y)$: $R(u,v) = L(u,v) \cdot H_{\text{visual}}(u,v)$

where $H_{\text{visual}}(u,v)$ is the modulation transfer function of the visual system.

Channel-Based Processing: Multiple parallel channels with different frequency tuning: $R_i(x,y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} L(u,v)\, H_i(u,v)\, e^{j2\pi(ux + vy)} \, du \, dv$

where $H_i(u,v)$ represents the $i$-th spatial frequency channel.

Mathematical Models of Visual Spatial Frequency Channels

Gabor Filter Model

Visual neurons are modeled as Gabor filters - Gaussian-windowed sinusoids:

$$g(x,y) = \frac{1}{2\pi\sigma_x\sigma_y} \exp\left(-\frac{x^2}{2\sigma_x^2} - \frac{y^2}{2\sigma_y^2}\right) \cos(2\pi f_0 x + \phi)$$

Parameters:

  • $\sigma_x, \sigma_y$: Spatial extent (receptive field size)
  • $f_0$: Preferred spatial frequency
  • $\phi$: Phase preference

Frequency Response: $G(u,v) = \exp\left(-2\pi^2\sigma_x^2(u-f_0)^2 - 2\pi^2\sigma_y^2 v^2\right) + \exp\left(-2\pi^2\sigma_x^2(u+f_0)^2 - 2\pi^2\sigma_y^2 v^2\right)$

Difference of Gaussians (DoG) Model

Mathematical Form: $\text{DoG}(x,y) = \frac{1}{2\pi\sigma_1^2}\, e^{-\frac{x^2+y^2}{2\sigma_1^2}} - k\, \frac{1}{2\pi\sigma_2^2}\, e^{-\frac{x^2+y^2}{2\sigma_2^2}}$

where $\sigma_2 > \sigma_1$ and $k$ is the amplitude ratio.

Frequency Response: $\text{DoG}(u,v) = e^{-2\pi^2\sigma_1^2(u^2+v^2)} - k \cdot e^{-2\pi^2\sigma_2^2(u^2+v^2)}$

Biological Significance: Models center-surround organization of retinal ganglion cells and lateral geniculate nucleus (LGN) neurons.

Contrast Sensitivity Function: Quantitative Analysis

Mathematical Formulation

The Contrast Sensitivity Function (CSF) describes visual sensitivity across spatial frequencies:

$$\text{CSF}(f) = a \cdot f^b \cdot e^{-c f^d}$$

Typical Parameters (for photopic conditions):

  • $a = 540$: Scaling factor
  • $b = 0.2$: Low-frequency slope
  • $c = 0.0016$: Decay rate
  • $d = 1.1$: High-frequency rolloff

Daly’s CSF Model

More sophisticated model accounting for luminance adaptation:

$$\text{CSF}(f, L) = \frac{1.4\, f\, e^{-0.114 f}}{\sqrt{1 + 0.06\, e^{0.3 f}}} \cdot \left(\frac{L + 0.15}{0.15}\right)^{0.25}$$

where $L$ is the adaptation luminance in cd/m².

Peak Sensitivity Analysis

Peak Frequency: Occurs at $f_{\text{peak}} \approx 3$-$5$ cycles/degree for normal viewing conditions.

Mathematical Derivation: Setting $\frac{d}{df}\text{CSF}(f) = 0$:

For the simplified form $\text{CSF}(f) = af \cdot e^{-bf}$: $\frac{d}{df}(af \cdot e^{-bf}) = a(1 - bf)\, e^{-bf} = 0$

Solving: $f_{\text{peak}} = \frac{1}{b}$
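The $f_{\text{peak}} = 1/b$ result is easy to confirm by scanning the simplified CSF on a grid (the values $a = 1$, $b = 0.2$ are illustrative, putting the peak near 5 cycles/degree):

```python
import math

def csf_simple(f, a=1.0, b=0.2):
    """Simplified contrast sensitivity model CSF(f) = a f exp(-b f)."""
    return a * f * math.exp(-b * f)

# Scan a frequency grid and locate the maximum.
grid = [i * 0.01 for i in range(2001)]        # 0 to 20 cycles/degree
f_peak = max(grid, key=csf_simple)
print(f_peak)  # ≈ 5.0 = 1/b
```

The scaling factor $a$ moves the curve up and down but not the location of the peak, consistent with the closed-form result depending only on $b$.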

Spatial Frequency Adaptation and Masking

Selective Adaptation Experiments

Paradigm: Prolonged exposure to a specific spatial frequency reduces sensitivity to that frequency and nearby frequencies.

Mathematical Model: After adaptation to frequency $f_{\text{adapt}}$: $\text{CSF}_{\text{adapted}}(f) = \frac{\text{CSF}_{\text{baseline}}(f)}{1 + G(f, f_{\text{adapt}})}$

where $G(f, f_{\text{adapt}})$ is the adaptation gain function: $G(f, f_{\text{adapt}}) = A \cdot e^{-\left(\frac{\ln(f/f_{\text{adapt}})}{\sigma}\right)^2}$

Parameters:

  • $A$: Maximum adaptation effect
  • $\sigma$: Bandwidth of adaptation (typically $\sigma \approx 1.4$ octaves)

Spatial Frequency Masking

Simultaneous Masking: Presence of one frequency component affects detection of another.

Mathematical Formulation: For target frequency $f_t$ in the presence of mask frequency $f_m$: $\text{Threshold}_{\text{masked}}(f_t) = \text{Threshold}_{\text{unmasked}}(f_t) \cdot \left[1 + \left(\frac{C_m}{C_{m,\text{threshold}}}\right)^p \cdot W(f_t, f_m)\right]$

Masking Function: $W(f_t, f_m) = e^{-\left(\frac{\ln(f_t/f_m)}{\beta}\right)^2}$

where $\beta$ determines the masking bandwidth (typically $\beta \approx 1.5$ octaves).

Multichannel Visual Processing Theory

Wilson-Gelb Model

Channel Definition: $N$ overlapping bandpass channels with center frequencies: $f_i = f_0 \cdot 2^{i/2}, \quad i = 0, 1, 2, \ldots, N-1$

Channel Response: $R_i(f) = \frac{(f/f_i)^{n_1}}{(f/f_i)^{n_1} + 1} \cdot \frac{1}{(f/f_i)^{n_2} + 1}$

Parameters: Typically $n_1 = 2, n_2 = 3$ for asymmetric bandpass characteristics.

Detection Probability Theory

Multiple Channel Decision: Detection occurs when any channel exceeds its threshold: $P_{\text{detection}} = 1 - \prod_{i=1}^N (1 - P_i)$

where $P_i$ is the detection probability for channel $i$: $P_i = \frac{1}{2}\left[1 + \text{erf}\left(\frac{S_i - T_i}{\sqrt{2}\,\sigma_i}\right)\right]$

  • $S_i$: Signal strength in channel $i$
  • $T_i$: Detection threshold for channel $i$
  • $\sigma_i$: Internal noise in channel $i$
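A direct transcription of the two formulas (using `math.erf`); the channel parameters are illustrative:

```python
import math

def channel_probability(S, T, sigma):
    """P_i = 0.5 * (1 + erf((S_i - T_i) / (sqrt(2) * sigma_i)))."""
    return 0.5 * (1 + math.erf((S - T) / (math.sqrt(2) * sigma)))

def detection_probability(channels):
    """Probability that at least one independent channel detects the target."""
    p_miss = 1.0
    for S, T, sigma in channels:
        p_miss *= 1 - channel_probability(S, T, sigma)
    return 1 - p_miss

# Two channels, each with signal exactly at threshold (P_i = 0.5 each).
print(detection_probability([(1.0, 1.0, 0.3), (2.0, 2.0, 0.5)]))  # 0.75
```

Note how probability summation across channels raises the overall detection rate above any single channel's, which is the statistical core of the multichannel account.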

Applications to Image Processing and Display Technology

Perceptually-Based Image Compression

Quantization Matrix Design: Based on the CSF to minimize visible artifacts: $Q(u,v) = \frac{Q_0}{\text{CSF}(\sqrt{u^2 + v^2}) \cdot V(u,v)}$

where $V(u,v)$ accounts for viewing conditions and masking effects.

Display Calibration and Gamma Correction

Perceptual Uniformity: Ensure equal just-noticeable differences (JNDs) across gray levels: $\Delta L = k \cdot L^{\gamma}$

where $\gamma \approx 0.5$ for a Weber-Fechner law approximation.

Mathematical Implementation: $L_{\text{display}} = L_{\text{max}} \left(\frac{I_{\text{digital}}}{I_{\text{max}}}\right)^{2.2}$
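The display model is a one-liner; $L_{\max} = 100$ cd/m² and 8-bit $I_{\max} = 255$ are illustrative choices:

```python
def display_luminance(I_digital, I_max=255, L_max=100.0, gamma=2.2):
    """L = L_max * (I / I_max) ** gamma (gamma-2.2 display model above)."""
    return L_max * (I_digital / I_max) ** gamma

print(display_luminance(255))  # 100.0 (full white)
print(display_luminance(128))  # well under half luminance at the mid code
```

The mid-code value maps to far less than half the peak luminance, which is what makes the digital code values roughly perceptually uniform.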

This comprehensive mathematical framework demonstrates how psychophysical experiments revealed the Fourier-like processing capabilities of human vision, leading to quantitative models that inform modern image processing and display technologies.