Parallel Programming

Why Parallel?

  • Hot topic. Meh

  • CPUs/memory aren't getting faster, just more parallel

  • Big Data / Machine Learning etc

  • Some things are naturally concurrent: user input, agent-based systems, etc

  • Note that parallel ⇒ concurrent but not vice-versa

Why Not Parallel?

  • Generally harder to get right

  • At best gives linear speedup: N cores give at most N× (usually less, since the serial parts don't scale)

Rust's Parallel Story

  • Eliminates shared-memory errors

  • Eliminates data races (in safe code, checked at compile time)

  • No significant overhead

  • Parallel systems are still hard: partitioning, deadlocks, termination, etc

Approaches To Parallelism

  • "Background threads"

  • Worker pool

  • Pipeline

  • Data parallel

  • "Sea of synchronized objects"

Thread

  • Thread of control, with own registers and stack

  • Shares its memory with all other threads in its process

  • Rust uses "OS threads"

  • (Other languages sometimes use "green threads")

Approaches To Parallel Comms / Sync

  • Channels

  • Shared memory with mutexes, semaphores, etc

  • Maybe CPU-atomic operations

Thread Fork/Join, Shared Memory
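
A minimal fork/join sketch using only std::thread. The function name fork_join_sum and the n_threads parameter are made up for illustration; each thread gets its own copy of a chunk so that nothing is shared yet.

```rust
use std::thread;

/// Fork one thread per chunk, then join them all and add up the partial sums.
/// `n_threads` is a hypothetical tuning parameter for this sketch.
fn fork_join_sum(data: Vec<u64>, n_threads: usize) -> u64 {
    let n_threads = n_threads.max(1);
    // Ceiling division so every element lands in some chunk; at least 1
    // because `chunks` panics on a zero chunk size.
    let chunk_len = ((data.len() + n_threads - 1) / n_threads).max(1);

    // Fork: each thread owns a private copy of its chunk.
    let handles: Vec<thread::JoinHandle<u64>> = data
        .chunks(chunk_len)
        .map(|chunk| {
            let chunk = chunk.to_vec();
            thread::spawn(move || chunk.iter().sum::<u64>())
        })
        .collect();

    // Join: `join` returns Err if the thread panicked.
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .sum::<u64>()
}
```

For example, fork_join_sum((1..=100).collect(), 4) should give 5050. Copying each chunk is wasteful; sharing the data instead is what the next section is about.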

Getting Memory To Threads

  • Can use move closure to move data into thread

  • Can use Arc to share data across threads
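
The two bullets above can be sketched roughly as follows. The function names and the striding scheme are assumptions for illustration only; the std calls (thread::spawn, Arc::new, Arc::clone) are the real API.

```rust
use std::sync::Arc;
use std::thread;

/// Move: the closure takes ownership of `words`, so the spawning thread
/// can no longer touch the vector after the spawn.
fn count_in_thread(words: Vec<String>) -> usize {
    let handle = thread::spawn(move || words.len());
    handle.join().expect("worker thread panicked")
}

/// Share: every thread gets its own `Arc` handle to the same read-only data.
fn total_len_shared(words: Vec<String>, n_threads: usize) -> usize {
    let shared = Arc::new(words);
    let handles: Vec<_> = (0..n_threads)
        .map(|i| {
            // Cloning the Arc only bumps a reference count; the strings
            // themselves are not copied.
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                // Thread i handles elements i, i + n_threads, i + 2 * n_threads, ...
                shared
                    .iter()
                    .skip(i)
                    .step_by(n_threads)
                    .map(|w| w.len())
                    .sum::<usize>()
            })
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .sum::<usize>()
}
```

Arc alone only allows shared read access; mutation needs one of the synchronization types covered later.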

Rayon

  • Uses "map-reduce" paradigm with worker pool

  • The demo here is not too good a match to Rayon's use case

Channels

  • Nice way to get values sent for long-running threads

  • mpsc (multi-producer, single-consumer): allows building DAGs of channels

  • Need mpmc (multi-producer, multi-consumer) to distribute messages among worker threads

  • Values are moved (or shared) through the channel rather than deep-copied; efficient
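
A small sketch of the mpsc pattern with std::sync::mpsc. The function gather and its parameters are hypothetical; the point is several cloned Senders feeding one Receiver.

```rust
use std::sync::mpsc;
use std::thread;

/// Spawn `n_producers` threads that each send `per_producer` values into
/// one channel, then collect everything on the calling thread.
fn gather(n_producers: u32, per_producer: u32) -> Vec<u32> {
    let (tx, rx) = mpsc::channel();
    for id in 0..n_producers {
        let tx = tx.clone(); // one Sender per producer: the "mp" in mpsc
        thread::spawn(move || {
            for k in 0..per_producer {
                // `send` moves the value into the channel; it fails only
                // if the receiver has been dropped.
                tx.send(id * per_producer + k).expect("receiver hung up");
            }
        });
    }
    // Drop the original sender so the channel closes once every
    // producer's clone is gone; otherwise `rx.iter()` never ends.
    drop(tx);

    let mut received: Vec<u32> = rx.iter().collect();
    received.sort_unstable(); // arrival order across producers is not deterministic
    received
}
```

Forgetting the drop(tx) is a classic way to hang: the receiver keeps waiting for a sender that will never send.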

Pipeline

  • Easy-to-get-right thing; used in CPUs a lot

  • A bit tricky to set up

  • Only gives parallelism with multiple pipeline operations
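
A rough three-stage pipeline built from threads and channels. The stages (feed, square, add one) and the name run_pipeline are invented for this sketch; a real pipeline would do heavier work per stage.

```rust
use std::sync::mpsc;
use std::thread;

/// Feed -> square -> add one, with each stage on its own thread.
fn run_pipeline(inputs: Vec<i64>) -> Vec<i64> {
    let (to_square, square_rx) = mpsc::channel::<i64>();
    let (to_offset, offset_rx) = mpsc::channel::<i64>();
    let (to_sink, sink_rx) = mpsc::channel::<i64>();

    // Stage 1: feed the inputs into the pipeline.
    let feeder = thread::spawn(move || {
        for x in inputs {
            if to_square.send(x).is_err() {
                break; // downstream stage is gone
            }
        }
    }); // `to_square` is dropped when the thread ends, closing the channel

    // Stage 2: square each value.
    let squarer = thread::spawn(move || {
        for x in square_rx {
            if to_offset.send(x * x).is_err() {
                break;
            }
        }
    });

    // Stage 3: add one.
    let offsetter = thread::spawn(move || {
        for x in offset_rx {
            if to_sink.send(x + 1).is_err() {
                break;
            }
        }
    });

    // The calling thread is the sink. Order is preserved because every
    // channel is FIFO and every stage is a single thread.
    let out: Vec<i64> = sink_rx.iter().collect();
    for h in vec![feeder, squarer, offsetter] {
        h.join().expect("pipeline stage panicked");
    }
    out
}
```

With these stages, run_pipeline(vec![1, 2, 3]) should give [2, 5, 10]. The stages only overlap when several items are in flight at once.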

Send + Sync

  • Marker traits for thread safety

  • Automatically maintained (for the most part) by the compiler

  • Send: Safe to move value to another thread

  • Sync: Safe to share a non-mut reference to value with another thread

  • Send and Sync are auto-derived for structs / enums whose fields are all Send / Sync
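
One way to see the marker traits at work is a compile-time check. The helpers is_send / is_sync and the Config struct below are hypothetical; they do nothing at run time, since the trait bound is the whole test.

```rust
use std::cell::Cell;
use std::sync::{Arc, Mutex};

// These only compile when the bound holds; the return value is incidental.
fn is_send<T: Send>() -> bool {
    true
}
fn is_sync<T: Sync>() -> bool {
    true
}

/// Hypothetical struct: Send and Sync automatically, because both of its
/// fields are.
#[allow(dead_code)]
struct Config {
    name: String,
    retries: u32,
}

fn check_markers() -> bool {
    is_send::<Config>()
        && is_sync::<Config>()
        && is_send::<Arc<Mutex<Vec<u8>>>>()
        && is_sync::<Arc<Mutex<Vec<u8>>>>()
        // Cell<u32> may be moved to another thread, but not shared by reference:
        && is_send::<Cell<u32>>()
    // is_sync::<Cell<u32>>() would be rejected by the compiler, as would
    // is_send::<std::rc::Rc<u32>>().
}
```

Uncommenting either of the rejected calls is a quick way to see the compiler's error message for a non-thread-safe type.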

A Pipeline Iterator

  • Nice book example of iterator construction for pipelining
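
These notes don't reproduce the book's code, so the following is only a sketch of the general shape such an iterator adapter might take: an extension trait that runs the upstream iterator on its own thread and hands items downstream over a bounded channel. The names OffThreadExt, off_thread and doubled_plus_one, and the buffer size 1024, are assumptions.

```rust
use std::sync::mpsc;
use std::thread;

/// Extension trait: run this iterator on a separate thread.
pub trait OffThreadExt: Iterator {
    fn off_thread(self) -> mpsc::IntoIter<Self::Item>;
}

impl<T> OffThreadExt for T
where
    T: Iterator + Send + 'static,
    T::Item: Send + 'static,
{
    fn off_thread(self) -> mpsc::IntoIter<Self::Item> {
        // Bounded channel so a fast producer cannot run arbitrarily far ahead.
        let (sender, receiver) = mpsc::sync_channel(1024);
        thread::spawn(move || {
            for item in self {
                if sender.send(item).is_err() {
                    break; // the consumer dropped its end
                }
            }
        });
        receiver.into_iter()
    }
}

/// Usage: each `off_thread` call marks a pipeline stage boundary.
fn doubled_plus_one() -> Vec<u32> {
    (1..=5u32)
        .map(|x| x * 2)
        .off_thread() // the doubling runs on its own thread
        .map(|x| x + 1)
        .off_thread() // so does the increment
        .collect()
}
```

The appeal is that the pipeline reads like an ordinary iterator chain, with the thread and channel plumbing hidden inside the adapter.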

Synchronization

  • Mutex<T>: Wraps a value in a mutex lock

    • For example mutex = Mutex::new(0_usize)
    • Only one thread may complete mutex.lock() at a time
    • That thread gets a mutable "guard" for the wrapped data
    • Other threads are blocked until the first releases the lock (when its guard is dropped)
    • lock() returns a Result because of mutex poisoning (a thread panicked while holding the lock)
    • For example *mutex.lock().unwrap() = 5

  • RwLock<T>: A lock that will give out either many shared immutable refs or one mutable ref at a time

  • Condvar: Just don't

  • AtomicUsize et al: Ensures that read-modify-write operations happen as an atomic thing

    • Provides methods like .fetch_add(1, Ordering::SeqCst)

    • The "memory ordering" argument should probably just always be set to Ordering::SeqCst

    • Not as great as they sound
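
A sketch of the same shared counter done two ways, with Mutex<usize> and with AtomicUsize. The function names and parameters are made up for illustration; both use Arc to get the counter to the threads.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

/// Each thread locks the mutex once per increment.
fn count_with_mutex(n_threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(Mutex::new(0_usize));
    let handles: Vec<_> = (0..n_threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // lock() blocks until the mutex is free; unwrap() panics
                    // if it was poisoned. The guard is dropped, and the lock
                    // released, at the end of this statement.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for h in handles {
        h.join().expect("worker thread panicked");
    }
    let total = *counter.lock().unwrap();
    total
}

/// Same job with an atomic read-modify-write instead of a lock.
fn count_with_atomic(n_threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..n_threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    counter.fetch_add(1, Ordering::SeqCst);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().expect("worker thread panicked");
    }
    counter.load(Ordering::SeqCst)
}
```

Both should return n_threads * per_thread. The atomic version avoids blocking, but it only works when the shared state fits in a single machine word and each update is a single operation.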

Book Misc

  • MPMC (with a funky implementation)

  • Mandelbrot

  • lazy_static!

Last modified: Tuesday, 18 May 2021, 2:03 PM