Time, Frequency, and the DFT
Sound Is Frequencies
Most sounds have high periodicity
Fourier’s Theorem (FOO-ree-YAY or thereabouts) says that an infinitely repeating sound can be represented as a sum of sinusoids
The ear hears/decomposes a sum of sinusoids
Yet PCM is a sequence of samples over time: frequency is not represented
- The Nyquist limit is hard to think about purely in terms of how the signal changes over time
Time and Frequency
We have: a continuous waveform, a function f(t) representing sound pressure
We want: a continuous spectrum, a function \(\hat{f}(\omega)\) giving the amplitude and phase of the sinusoid at every frequency
Wait, amplitude and phase from a single function? Yes, representing a frequency as a complex number with the usual geometric interpretation
\[ f = a + b i, \quad |f| = \sqrt{a^2 + b^2}, \quad \theta(f) = \operatorname{atan2}(b, a) \]
Note: you will see both i and j for \(\sqrt{-1}\) in different contexts
Note: we freely mix between angular frequency \(\omega\) and “normal” frequency f (dammit) using
\[ \omega = 2 \pi f \]
because once around the circle is one cycle
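A minimal numeric sketch of the magnitude/phase view and the \(\omega = 2 \pi f\) conversion (NumPy assumed; the values are made up):

```python
import numpy as np

# A single complex value f = a + bi.
f = 3.0 + 4.0j

mag = np.abs(f)       # sqrt(a^2 + b^2) = 5.0
phase = np.angle(f)   # atan2(b, a), in radians

# Angular frequency vs. "normal" frequency: omega = 2*pi*f.
freq_hz = 440.0
omega = 2 * np.pi * freq_hz

print(mag, phase, omega)
```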
The Euler Formula
Euler’s Formula says a complex exponential is a sinusoid:
\[ e^{i (\omega t + \theta)} = \cos(\omega t + \theta) + i \sin(\omega t + \theta) \]
\[ = e^{i \omega t} e^{i \theta} \]
Starting point for “phasor analysis”
Now our sum of sinusoids can be represented as a sum of exponentials, making things easier (?)
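A quick numeric sanity check of Euler’s Formula (NumPy assumed); both sides come out as the same complex number:

```python
import numpy as np

omega, theta, t = 2 * np.pi * 440.0, 0.3, 0.001

lhs = np.exp(1j * (omega * t + theta))
rhs = np.cos(omega * t + theta) + 1j * np.sin(omega * t + theta)

assert np.isclose(lhs, rhs)   # equal to floating-point precision
```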
The Time Domain and the Frequency Domain
Reminder: Fourier claims that every \(f(t)\) can be represented as some \(\hat{f}(\omega)\) (more or less)
We think of the first kind of thing as “in the time domain”, the second as “in the frequency domain”
Converting from a single frequency to its time domain representation is “easy”:
\[ f(t) = e^{i \omega t} \]
Even for a single sinusoid, converting the other way isn’t immediately obvious
Continuous Fourier Decomposition: The Fourier Transform
Let’s just get the Fourier Transform out there:
\[ \hat{f}(\omega) = \int_{-\infty}^{\infty} f(t) e^{-i \omega t} dt \]
The minus sign in the exponent is fiddly
The infinite integral is alarming
Still, we now can math up what we wanted
What about going the other way? Turns out that
\[ f(t) = \frac{1}{2 \pi} \int_{-\infty}^{\infty} \hat{f}(\omega) e^{i \omega t} d\omega \]
So the transform is almost self-inverse
The Discrete Fourier Transform
Let’s “get rid of” the infinities by just taking limits
\[ \hat{f}(\omega) = \lim_{W \rightarrow \infty} \int_{-W}^{W} f(t) e^{-i \omega t} dt \]
A signal that is very long relative to its period should be well-approximated by a finite integral
Now a fiddly argument ensues. If the signal is periodic, we should be able to replace the limits by the period: after all, it doesn’t change anything
\[ \hat{f}(\omega) = \int_{0}^{P} f(t) e^{-i \omega t} dt \]
Let’s turn the integral into a sum by treating \(f(t)\) as a series of impulses: nonzero only at discrete timesteps
\[ X[k] = \sum_{n=0}^{N-1} x[n] e^{-2 \pi i k n / N} \]
The frequency-domain value at unit frequency k, namely \(X[k]\), has been written as a weighted sum of the discrete time-domain values \(x[n]\) over the period \(N\)
The inverse transformation works similarly
\[ x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k] e^{2 \pi i k n / N} \]
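Here is a direct translation of these two sums into code, checked against `numpy.fft` (a sketch only: the naive loops are \(O(N^2)\)):

```python
import numpy as np

def dft(x):
    """Naive forward DFT: X[k] = sum_n x[n] e^{-2 pi i k n / N}."""
    N = len(x)
    n = np.arange(N)
    return np.array([np.sum(x * np.exp(-2j * np.pi * k * n / N)) for k in range(N)])

def idft(X):
    """Naive inverse DFT: x[n] = (1/N) sum_k X[k] e^{2 pi i k n / N}."""
    N = len(X)
    k = np.arange(N)
    return np.array([np.sum(X * np.exp(2j * np.pi * k * n / N)) for n in range(N)]) / N

x = np.random.randn(64)
assert np.allclose(dft(x), np.fft.fft(x))   # matches the library transform
assert np.allclose(idft(dft(x)), x)         # round-trips back to the signal
```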
Notes On The DFT
The Xs are still complex (!)
\(x[n]\) is treated as cyclic
The inverse is computed only as the sum of discrete frequencies
The sampling rate is handled here by unit frequency: bin \(k\) corresponds to \(k \cdot (\textrm{sample rate}) / N\) Hz, so bin indices need to be scaled by the sample rate to be meaningful (see the sketch after this list)
This math is so full of fiddly details that it is really easy to get it screwed up
Frequency resolution is a function of sampled period
The transforms as given are \(O(N^2)\)
The DFT can be thought of as a bank of bandpass filters, one per output bin
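The sketch promised above, showing the bin-to-Hertz scaling (sample rate, DFT length, and bin index are made-up values):

```python
import numpy as np

sample_rate = 48_000   # samples per second (assumed)
N = 1024               # DFT length

# Bin k of an N-point DFT corresponds to k * sample_rate / N Hz.
k = 10
freq_hz = k * sample_rate / N   # 468.75 Hz

# np.fft.fftfreq computes the same mapping for every bin at once.
bins_hz = np.fft.fftfreq(N, d=1.0 / sample_rate)
assert np.isclose(bins_hz[k], freq_hz)
```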
The Fast Fourier Transform
A dynamic programming trick gets the time complexity of the DFT down to \(O(N \lg N)\)
No, I’m not going to explain it here: treat it as a black box
The FFT gives exactly the same answers as the DFT, just faster
Limitation: The classic radix-2 FFT requires a power-of-two number of samples
Limitation: The output frequency bins are linearly spaced, which may not be what you wished for
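A black-box usage sketch (NumPy's FFT assumed; NumPy actually handles non-power-of-two lengths too, so the zero-padding here is only needed for the classic radix-2 case):

```python
import numpy as np

sample_rate = 8000
x = np.random.randn(1000)            # 1000 samples: not a power of two

# Zero-pad up to the next power of two for a radix-2 FFT.
N = 1 << (len(x) - 1).bit_length()   # 1024
X = np.fft.rfft(x, n=N)              # rfft: real input, keep only non-negative bins

# Output bins are linearly spaced: 0, fs/N, 2*fs/N, ..., fs/2.
freqs = np.fft.rfftfreq(N, d=1.0 / sample_rate)
print(len(X), freqs[1])              # 513 bins, ~7.8 Hz apart
```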
The Discrete Cosine Transform
Usually you don’t care about phase information
The DCT gives you something similar to the DFT, but with real-valued coefficients: no phase
Still can be done \(O(N \lg N)\)
The DCT has a different set of boundary conditions
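A usage sketch assuming SciPy is available; `scipy.fft.dct` computes the real-valued DCT-II and `idct` inverts it:

```python
import numpy as np
from scipy.fft import dct, idct

x = np.cos(2 * np.pi * 3 * np.arange(32) / 32)

C = dct(x, type=2, norm='ortho')          # real coefficients, no phase
x_back = idct(C, type=2, norm='ortho')    # inverse of the DCT-II

assert np.allclose(x, x_back)
```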
Spectra of Common Signals
Triangle, square: Fundamental plus “odd harmonics”
Difference between triangle and square is decay rate, phase
Odd harmonics are a characteristic of many kinds of distortion
“White noise” (random samples): Flattish noise spectrum (“noise is a sum of random frequencies”)
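A sketch that checks the odd-harmonics claim for a square wave (the fundamental is chosen to land exactly on a DFT bin so the numbers come out clean):

```python
import numpy as np

sample_rate = 8000
N = 8000                      # one second of samples: bins are 1 Hz apart
t = np.arange(N) / sample_rate
f0 = 100                      # fundamental at 100 Hz

square = np.sign(np.sin(2 * np.pi * f0 * t))
mag = np.abs(np.fft.rfft(square))
freqs = np.fft.rfftfreq(N, d=1.0 / sample_rate)

# Energy appears at f0, 3*f0, 5*f0, ... and (almost) nothing at even multiples.
for h in range(1, 8):
    k = np.argmin(np.abs(freqs - h * f0))
    print(h * f0, round(float(mag[k]), 1))
```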
Frequency Localization In Time
Fundamentally ill-defined: FT only works right on an infinite periodic signal, throws away all information about changes in frequency
Heisenbergy: A smaller DFT localizes frequency information better in time, but at the cost of poorer frequency resolution. Low frequency signals, in particular, will not be captured
“Overlapping” heuristic: Do a small DFT, slide over a few samples, do it again
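A bare-bones sketch of the overlapping heuristic (window length and hop size are made-up parameters):

```python
import numpy as np

def sliding_dft(x, window_len=256, hop=64):
    """Magnitude spectra of overlapping chunks, one row per chunk position."""
    frames = []
    for start in range(0, len(x) - window_len + 1, hop):
        chunk = x[start:start + window_len]
        frames.append(np.abs(np.fft.rfft(chunk)))
    return np.array(frames)

# A tone that jumps from 500 Hz to 1500 Hz halfway through.
sample_rate = 8000
t = np.arange(8000) / sample_rate
x = np.where(t < 0.5, np.sin(2 * np.pi * 500 * t), np.sin(2 * np.pi * 1500 * t))

S = sliding_dft(x)
print(S.shape)   # (num_frames, window_len // 2 + 1): frequency content over time
```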
Windowing
DFT treats input as cyclic: if not analyzing over exactly an integer number of periods, trouble
In particular, a step discontinuity at the wrap-around point smears energy across the whole spectrum, so the endpoints must match (and ideally all derivatives too)
“Windowing” heuristic: Taper the time-domain signal off toward the edges
A window corresponds to a low-pass filter: more next lecture
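A sketch of the windowing heuristic: taper with a Hann window (`np.hanning`) and compare the leakage for a tone that does not fit an integer number of periods:

```python
import numpy as np

sample_rate = 8000
N = 1024
t = np.arange(N) / sample_rate
x = np.sin(2 * np.pi * 443.7 * t)   # not an integer number of periods in N samples

raw = np.abs(np.fft.rfft(x))
windowed = np.abs(np.fft.rfft(x * np.hanning(N)))   # taper toward zero at the edges

# Leakage far away from the tone is much lower once the edges are tapered.
far = np.fft.rfftfreq(N, d=1.0 / sample_rate) > 2000
print(raw[far].max(), windowed[far].max())
```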
DFT Application: Frequency Analysis
Pretty straightforward: do the DFT, look for peaks etc in the spectrum
Still have to do the analysis of the spectrum, which involves some kind of other math or heuristics
There are more sensitive frequency analysis methods that do not involve the DFT
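A sketch of straightforward peak-picking, plus the common parabolic-interpolation refinement (a heuristic layered on top of the DFT; it assumes the peak is not at the edge of the spectrum):

```python
import numpy as np

def estimate_frequency(x, sample_rate):
    """Find the strongest spectral peak and refine it by parabolic interpolation."""
    X = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    k = int(np.argmax(X))
    # Fit a parabola through the log-magnitudes at the peak and its neighbors.
    a, b, c = np.log(X[k - 1]), np.log(X[k]), np.log(X[k + 1])
    offset = 0.5 * (a - c) / (a - 2 * b + c)   # fractional bin, in (-0.5, 0.5)
    return (k + offset) * sample_rate / len(x)

sample_rate = 8000
t = np.arange(2048) / sample_rate
print(estimate_frequency(np.sin(2 * np.pi * 441.3 * t), sample_rate))   # ~441.3 Hz
```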
DFT Application: Filtering
Pretty straightforward: do the DFT, adjust frequencies as desired, do the inverse DFT
See frequency localization in time above: you will end up doing an “overlapping windowed” DFT to track changes in the input frequencies
Expensive compared to “direct” filtering we will discuss next time
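A sketch of the transform / adjust / inverse-transform recipe as a brute-force low-pass filter (one whole-signal block, so it ignores the time-localization caveat; the test tones are chosen to land exactly on bins):

```python
import numpy as np

def fft_lowpass(x, sample_rate, cutoff_hz):
    """Zero every frequency bin above cutoff_hz and invert the transform."""
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    X[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(X, n=len(x))

sample_rate = 8000
t = np.arange(4096) / sample_rate
x = np.sin(2 * np.pi * 250 * t) + np.sin(2 * np.pi * 2500 * t)   # low tone + high tone

y = fft_lowpass(x, sample_rate, cutoff_hz=1000)
assert np.allclose(y, np.sin(2 * np.pi * 250 * t))   # only the low tone survives
```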
DFT Application: Lossy Compression
Do the DFT, remove “unwanted” stuff
Depends pretty heavily on psychoacoustics
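A toy sketch of the idea: keep the few largest DFT coefficients, drop the rest, and reconstruct. Real codecs add psychoacoustic masking, quantization, and entropy coding on top of this; everything below is a made-up example:

```python
import numpy as np

def compress(x, keep):
    """Keep only the `keep` largest-magnitude DFT coefficients, zero the rest."""
    X = np.fft.rfft(x)
    smallest = np.argsort(np.abs(X))[:-keep]   # everything except the top `keep` bins
    X[smallest] = 0.0
    return np.fft.irfft(X, n=len(x))

# A mostly-tonal signal plus a little noise compresses well this way.
# (Tone frequencies chosen to land exactly on bins for a clean demo.)
rng = np.random.default_rng(0)
t = np.arange(4096) / 8000
x = (np.sin(2 * np.pi * 437.5 * t)
     + 0.5 * np.sin(2 * np.pi * 875 * t)
     + 0.01 * rng.standard_normal(4096))

y = compress(x, keep=8)                # 8 of 2049 coefficients retained
print(np.sqrt(np.mean((x - y) ** 2)))  # error is roughly the discarded noise floor (~0.01)
```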