# Compression Methods

## Stereo

Idea: Left and right stereo channels are highly correlated

Typical to take a stereo pair and turn it into a mono channel *(l + r) / 2* and a side channel *(l - r)*

The side channel is typically low amplitude, and so can be compressed easily

Side benefit: mono channel is easily extracted
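A minimal sketch of the mono/side transform on integer samples. The function names are made up for illustration. One assumption beyond the notes: to keep the round trip lossless, the bit dropped by the integer halving is recovered from the parity of the side channel, which always matches the parity of *l + r*.

```python
def ms_encode(left, right):
    """Turn a stereo pair into a mono (mid) channel and a side channel."""
    mid = [(l + r) >> 1 for l, r in zip(left, right)]   # floor((l + r) / 2)
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Recover left and right exactly from mid and side."""
    left, right = [], []
    for m, s in zip(mid, side):
        # l + r and l - r have the same parity, so the low bit of the side
        # sample restores the bit lost when the sum was halved.
        total = (m << 1) | (s & 1)
        left.append((total + s) >> 1)
        right.append((total - s) >> 1)
    return left, right


mid, side = ms_encode([100, -3, 7], [98, 4, -8])
# mid is [99, 0, -1] and side is [2, -7, 15]
```

The mono channel is just `mid`, which is why it is easy to extract without decoding anything else.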

## Downsampling

Idea: Most audio has low amplitudes at higher frequencies

Downsample the signal, transmit that

Loss is pretty noticeable at high compression ratios; may need some residue coding to restore the lost detail
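A rough sketch of the idea, not a production resampler. It assumes block averaging as a crude low-pass filter and sample-and-hold for reconstruction; the function names are hypothetical. The residue is what a residue coder would have to send to get the original back.

```python
def downsample(samples, factor):
    """Average each block of `factor` samples and keep one value per block."""
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, len(samples) - factor + 1, factor)
    ]


def upsample_hold(samples, factor):
    """Reconstruct by repeating each sample `factor` times."""
    return [s for s in samples for _ in range(factor)]


def residue(original, factor):
    """Difference between the original and its downsampled reconstruction."""
    approx = upsample_hold(downsample(original, factor), factor)
    return [o - a for o, a in zip(original, approx)]


signal = [1, 3, 5, 7]
low = downsample(signal, 2)   # [2.0, 6.0]
err = residue(signal, 2)      # [-1.0, 1.0, -1.0, 1.0]
```

A real system would use a proper anti-aliasing filter before discarding samples; block averaging is only the simplest stand-in.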

The signal path may be band-limited anyhow: embedded devices, guitar pedals, etc.

MP3 (discussed in a bit) is a surprisingly close cousin to this scheme

## Companding

Idea: Small differences in large amplitudes matter less. In particular, human hearing responds roughly to the log of amplitude

To best represent a signal in a fixed number of bits, squash the encoding so that there are fewer codes for larger amplitudes

µ-Law: 14 bits in, 8 bits out

Continuous

$$y(t) = \mathrm{sgn}(x(t)) \frac{\ln(1 + \mu |x(t)|)}{\ln(1 + \mu)}$$

where µ is 255 and *x(t)* is normalized to the range [-1, 1]
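The continuous formula and its inverse can be sketched directly. This is the continuous curve only, not the table-driven discrete encoding; the function names are illustrative.

```python
import math

MU = 255.0


def mu_compress(x, mu=MU):
    """Continuous mu-law: map x in [-1, 1] to y in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_expand(y, mu=MU):
    """Inverse of mu_compress: |x| = ((1 + mu)**|y| - 1) / mu."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


quiet = mu_compress(0.01)   # about 0.23: small amplitudes get stretched out
loud = mu_compress(1.0)     # exactly 1.0
```

An input at 1% of full scale lands at roughly 23% of the output range, so a uniform quantizer applied after compression spends most of its codes on quiet samples.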

Discrete version is given by a big approximation table

## POTS

US Plain Old Telephone Service compression is downsampling to 8000 sps and then µ-Law encoding to 8 bits per sample, so 8000 × 8 = 64000 bps

Lossy, but turns out to be good enough to sound OK for voice

Originally implemented entirely analog: the digital version is a replica of the analog scheme

Characteristic telephone sound is mostly this

## FLAC

Predict each sample in the time domain using a fixed polynomial model or linear predictive coding (LPC)
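A sketch of the polynomial-model case, assuming the standard fixed predictors of orders 0 to 3 (for example, order 2 predicts `2*x[n-1] - x[n-2]`). The function names are made up; the first `order` samples are kept verbatim as warm-up so the decoder has something to predict from.

```python
# Fixed polynomial predictor coefficients, applied to x[n-1], x[n-2], ...
FIXED_COEFFS = {0: [], 1: [1], 2: [2, -1], 3: [3, -3, 1]}


def predict(history, order):
    """Predict the next sample from the most recent `order` samples."""
    coeffs = FIXED_COEFFS[order]
    return sum(c * history[-1 - k] for k, c in enumerate(coeffs))


def fixed_residual(samples, order=2):
    """Return (warm-up samples, residue) for a fixed predictor."""
    warmup = list(samples[:order])
    res = [
        samples[n] - predict(samples[:n], order)
        for n in range(order, len(samples))
    ]
    return warmup, res


def fixed_restore(warmup, res, order=2):
    """Rebuild the original samples exactly from warm-up and residue."""
    out = list(warmup)
    for e in res:
        out.append(e + predict(out, order))
    return out


warmup, res = fixed_residual([10, 12, 14, 16, 19, 21], order=2)
# warmup is [10, 12] and res is [0, 0, 1, -1]
```

The residue values are much smaller than the samples themselves whenever the signal is smooth, which is what makes the next coding step effective.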

Encode residue using Rice codes (related to Huffman codes)
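A minimal Rice coder for signed residue values, written with bit strings for readability. Assumptions that go beyond the notes: signed values are folded to non-negative ones with a zigzag mapping, the quotient is sent in unary as ones terminated by a zero, and the parameter `k` is chosen by the caller. Real formats differ in these bit conventions.

```python
def zigzag(n):
    """Fold signed to unsigned: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4."""
    return 2 * n if n >= 0 else -2 * n - 1


def rice_encode(n, k):
    """Rice code: unary quotient, then a k-bit binary remainder."""
    u = zigzag(n)
    q, r = u >> k, u & ((1 << k) - 1)
    remainder = format(r, "b").zfill(k) if k else ""
    return "1" * q + "0" + remainder


def rice_decode(bits, k):
    """Decode one Rice codeword produced by rice_encode."""
    q = bits.index("0")                       # length of the unary run
    r = int(bits[q + 1:q + 1 + k], 2) if k else 0
    u = (q << k) | r
    return u >> 1 if u % 2 == 0 else -((u + 1) >> 1)


code = rice_encode(5, 2)   # "11010"
```

Small values get short codewords and there is no code table to transmit, only `k`, which is the practical advantage over a general Huffman code for residues that cluster around zero.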

Reliable compression of about 2×

Remember: the noise must be compressed and recreated also

## Lossy Compression à la MP3

Good Ars Technica MP3 tutorial

High-level view:

Split the input signal up into a bunch of frequency bands using a "polyphase filter"

In each band:

Use an FFT to figure out what's going on

Use a DCT (in MP3, a modified DCT) to get a power spectrum (noise subframes are speshul)

Quantize the spectrum to reduce the number of bits (introducing errors in the form of quantization noise)

Huffman-encode the quantized coefficients to get a compact representation

Combine all the compressed quantized coefficients to get a frame

The details are quite complex: see something like Ogg Vorbis for a cleaner version
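The transform-then-quantize step can be illustrated with a toy example. This is not MP3: it assumes a plain DCT-II on one short block, a single uniform step size, and no filter bank, psychoacoustic model, or Huffman stage. All names are hypothetical.

```python
import math


def dct(block):
    """Unnormalized DCT-II of a block of samples."""
    n = len(block)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(block))
        for k in range(n)
    ]


def idct(coeffs):
    """Inverse of dct() above (a scaled DCT-III)."""
    n = len(coeffs)
    return [
        coeffs[0] / n
        + (2 / n) * sum(
            coeffs[k] * math.cos(math.pi / n * (i + 0.5) * k) for k in range(1, n)
        )
        for i in range(n)
    ]


def quantize(coeffs, step):
    """Round each coefficient to the nearest multiple of `step`."""
    return [round(c / step) for c in coeffs]


def dequantize(levels, step):
    """Map integer levels back to coefficient values."""
    return [v * step for v in levels]


block = [math.sin(2 * math.pi * i / 16) for i in range(16)]
levels = quantize(dct(block), step=0.5)          # small integers to entropy-code
decoded = idct(dequantize(levels, step=0.5))     # close to block, not equal
```

A larger `step` gives smaller integers (fewer bits) and more reconstruction error; choosing the step per band so that the error stays inaudible is the job of the psychoacoustic model.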