A Look At A Sample

  • Let's consider a 16-bit audio sample:

    00 010100000101 01
    
  • Sample consists of

    • Headroom: Not recorded at maximum amplitude to avoid clipping

    • Signal: The actual audio

    • Noise: The low-order bits are typically garbage

  • The signal and noise blend together

  • As signals are manipulated, more noise creeps up into the signal bits because addition and multiplication

Clipping

  • Artifact of fixed-range representation of PCM sample: floating-point samples are basically unclippable

  • If the amplitude to be represented goes over the max possible value or under the min, what to do?

  • Not much except clamp (clip) the sample as close as you can get it

  • Net effect: tops clipped off waves

  • This happens in analog systems also because max/min voltages

  • Discontinuity introduces harmonics: bad distortion

  • This is why headroom

Noise

  • Several kinds of common audio noise

    • Uniform "white noise": easy to make with a computer

    • "Pink noise" that rolls off linearly with frequency ("1/f noise", "flicker noise")

    • "Brownian noise" ("1/f^2 noise") from random walk in time domain

    • Check out this Wikipedia article on noise colors

Audio Compression — Model and Residue

  • Idea: Build a simplified model of the audio signal — requires less space to represent.

  • Maybe the model is good enough for purpose: lossy compression

  • Otherwise send the model along with the residue — the difference between the model and the true signal: lossless compression

  • The residue might be a bit compressible too

Audio Compression — Model

  • Goal: build a simple parameterized approximate model of the audio signal

  • Time domain model

    • Audio signals tend to be "smooth" continuous and have continuous derivatives

    • Model the signal as lines or polynomials or splines

  • Frequency domain model

    • Audio signals tend to be periodic; frequencies tend to vary slowly

    • Model the spectrum with quantization in frequency and amplitude

    • Phase is confusing

Audio Compression — Residue

  • The residue can often look like noise

  • Still, the amplitude of the residue should be small relative to the signal amplitude

  • Common compression schemes — Huffman coding, Rice coding, arithmetic coding — take advantage of frequency distribution of residue

Lossless vs Lossy Compression

  • Lossless (e.g. FLAC) is going to be limited for a lot of kinds of sounds. The fancier the model, the more kinds of sounds that can be compressed well

  • Doing lossy well is harder, because mustn't throw away stuff that wrecks the sound. Psychoacoustics is needed. Tends to be done in frequency domain; models are generalized

Last modified: Monday, 20 April 2020, 12:32 PM