Collections
An Interesting Blog Post
A Critique of Rust's
std::collections
Vec, Array, Slice: Review
Arrays, Vecs and slices can only contain sized objects of a single type. They are accessed by computing a fixed offset using the usual index calculation
An "array" has a type that includes its size. It is an object, not a reference: stored on the stack sometimes, so watch out
A Vec is a fat pointer to a chunk of heap; the Vec includes the length of the pointed-to chunk. Vecs can be grown or shrunk
A slice is also a fat pointer with length, but points to an arbitrary chunk of memory. It can thus not be resized, and has a borrow of the memory it points to
You can make a slice from a Vec or array by
&
, because they implementDeref
. (Also&mut
, becauseDerefMut
.) Note that a cast may be necessary in some contexts, because a reference to an array or slice is also a legitimate thing
Vec, Array, Slice: Methods
Vec makes a good stack. Use the
push
,pop
andis_empty
methodsThe Vec
retain
anddedup
methods are pretty handy. The book describes the non-sorted dedup trickThe
split
andjoin
methods are used a lot in practice.chunks
andwindows
are really handy in a lot of kinds of analysisSorting and searching have straightforward methods. Having
binary_search
built-in is a fantastic idea, as this is a common source of bugs in other languagesLexicographic comparison of these is supported
VecDeque, LinkedList
VecDeque
is a circular queue stored in aVec
. Because it uses stop and start pointers, there will always be at least one dead cell. Normally insert withpush_back()
, remove withpop_front()
, but the queue is double-endedLinkedList
is a doubly-linked list. Don't use it unless you have to: its memory performance is terrible and it's pretty error-prone. As of Rust 1.26, its API is still sparse
BinaryHeap
A max-heap, which is annoying. Not a keyed heap: no way to mess with an element in the middle
I've used https://crates.io/crates/min-max-heap successfully, but it's still not a keyed heap
HashMap, HashSet, BTreeMap, BTreeSet
Sets are just maps with
()
for content. Since()
is a zero-sized type, this works fineHash
variants are open hash tables. Lots of extra storage (but arguably not enough). Cost of hashing, bad memory locality. Still, "constant-time" accessBTree
variants are, well, B-trees. Storage-efficient, better memory locality, but require a tree traversalBTree
variants can be nested because they areOrd
.Hash
variants cannotOwnership and mutability matter here. A map owns its keys and values: there is no way to mutate the key in-place
The
Entry
interface avoids some extra lookups by getting a key-value pair that can be modified or inserted. It's anenum
that depends on whether the entry currently is in the map. Normally, you will use the many methods provided for manipulating entries: this is classic combinator-chain stuffhttps://github.com/pdx-cs-rust/rust-misc/blob/master/histogram.rs
Sets include the usual set operators with infix equivalents.
is_subset()
is not defined as infix<=
Hashing
It's a complicated mess. The default hasher is a compromise between security and performance
You can specify your own hasher, but this is rarely desirable
Third-Party Containers
Some stuff on http://crates.io is actually a pretty accepted part of the Rust ecosystem
This is problematic: hard to tell the "really standard" stuff from the good stuff from the bad stuff
Worth looking at a few common crates
We have already seen
crossbeam
andcomplex
at the start of this course
rand
Crate for random number generation and use
Split into crates for generation and processing
Produces both pseudo-random and (if possible) hardware-random numbers
Key entry point is
rand::thread_rng()
, which gives a per-thread OK-secure PRNG seeded from hardwareYou want the
gen_range(*lower*, *upper*)
method normally, for uniform random numbers in a range. This requires therand::Rng
traithttps://play.rust-lang.org/?gist=19e5539fd0b8f5ce3c149d80340861b9
serde
Several crates for serializing and deserializing data: converting between Rust data structures and some kind of input/output format
Huge variety of serialization formats supported