Move the files to final resting location

2025-07-01 05:46:13 -04:00 · 2019-02-09 23:31:11 -05:00
parent 543e4253cc
commit e9099d191e
6 changed files with 12 additions and 9 deletions
--- a/_posts/2019-02-04-understanding-allocations-in-rust.md
+++ b/_posts/2019-02-04-understanding-allocations-in-rust.md
@@ -0,0 +1,108 @@
+---
+layout: post
+title: "Allocations in Rust"
+description: "An introduction to the memory model."
+category: 
+tags: [rust, understanding-allocations]
+---
+
+There's an alchemy of distilling complex technical topics into articles and videos
+that change the way programmers see the tools they interact with on a regular basis.
+I knew what a linker was, but there's a staggering amount of complexity in between
+[`main()` and your executable](https://www.youtube.com/watch?v=dOfucXtyEsU).
+Rust programmers use the [`Box`](https://doc.rust-lang.org/stable/std/boxed/struct.Box.html)
+type all the time, but there's a rich history of the Rust language itself wrapped up in
+[how special it is](https://manishearth.github.io/blog/2017/01/10/rust-tidbits-box-is-special/).
+
+In a similar vein, I want you to look at code and understand how memory is used;
+the complex choreography of operating system, compiler, and program that frees you
+to focus on functionality far-flung from frivolous book-keeping. The Rust compiler relieves
+a great deal of the cognitive burden associated with memory management, but we're going
+to step into its world for a while.
+
+Let's learn a bit about memory in Rust.
+
+# Table of Contents
+
+This post is intended as both guide and reference material; we'll work to establish
+an understanding of the different memory types Rust makes use of, then summarize each
+section at the end for easy future citation. To that end, a table of contents is in order:
+
+- Foreword
+- [Global Memory Usage: The Whole World](/2019/02/the-whole-world)
+- [Fixed Memory: Stacking Up](/2019/02/stacking-up)
+- [Dynamic Memory: A Heaping Helping](/2019/02/a-heaping-helping)
+- [Compiler Optimizations: What It's Done For You Lately](/2019/02/compiler-optimizations)
+- [Summary: What Are the Rules?](/2019/02/summary)
+
+# Foreword
+
+Rust's three defining features of [Performance, Reliability, and Productivity](https://www.rust-lang.org/)
+are all driven to a great degree by how the Rust compiler understands
+[memory ownership](https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html). Unlike managed memory
+languages (Java, Python), Rust [doesn't really](https://words.steveklabnik.com/borrow-checking-escape-analysis-and-the-generational-hypothesis)
+garbage collect, leading to fast code when [dynamic (heap) memory](https://en.wikipedia.org/wiki/Memory_management#Dynamic_memory_allocation)
+isn't necessary. When heap memory is necessary, Rust ensures you can't accidentally mis-manage it.
+And because the compiler handles memory "ownership" for you, developers never need to worry about
+accidentally deleting data that was needed somewhere else.
+
+That said, there are situations where you won't benefit from work the Rust compiler is doing.
+If you:
+
+1. Never use `unsafe`
+2. Never use `#![feature(alloc)]` or the [`alloc` crate](https://doc.rust-lang.org/alloc/index.html)
+
+...then it's not possible for you to use dynamic memory! 
+
+For some uses of Rust, typically embedded devices, these constraints make sense.
+They have very limited memory, and the program binary size itself may significantly
+affect what's available! There's no operating system able to manage
+this ["virtual memory"](https://en.wikipedia.org/wiki/Virtual_memory) junk, but that's
+not an issue because there's only one running application. The
+[embedonomicon](https://docs.rust-embedded.org/embedonomicon/preface.html) is ever in mind,
+and interacting with the "real world" through extra peripherals is accomplished by
+reading and writing to [specific memory addresses](https://bob.cs.sonoma.edu/IntroCompOrg-RPi/sec-gpio-mem.html).
+
+Most Rust programs find these requirements overly burdensome though. C++ developers
+would struggle without access to [`std::vector`](https://en.cppreference.com/w/cpp/container/vector)
+(except those hardcore no-STL people), and Rust developers would struggle without
+[`std::vec`](https://doc.rust-lang.org/std/vec/struct.Vec.html). But in this scenario,
+`std::vec` is actually aliased to a part of the
+[`alloc` crate](https://doc.rust-lang.org/alloc/vec/struct.Vec.html), and thus off-limits.
+`Box`, `Rc`, etc., are also unusable for the same reason.
+
+Whether writing code for embedded devices or not, the important thing in both situations
+is how much you know *before your application starts* about what its memory usage will look like.
+In embedded devices, there's a small, fixed amount of memory to use.
+In a browser, you have no idea how large [google.com](https://www.google.com)'s home page is until you start
+trying to download it. The compiler uses this information (or lack thereof) to optimize
+how memory is used; put simply, your code runs faster when the compiler can guarantee exactly
+how much memory your program needs while it's running. This post is all about understanding
+how the compiler reasons about your program, with an emphasis on how to design your programs
+for performance.
+
+Now let's address some conditions and caveats before going much further:
+
+- We'll focus on "safe" Rust only; `unsafe` lets you use platform-specific allocation APIs
+  ([`malloc`](https://www.tutorialspoint.com/c_standard_library/c_function_malloc.htm)) that we'll ignore.
+- We'll assume a "debug" build of Rust code (what you get with `cargo run` and `cargo test`)
+  and address (pun intended) release mode at the end (`cargo run --release` and `cargo test --release`).
+- All content will be run using Rust 1.32, as that's the highest currently supported in the
+  [Compiler Explorer](https://godbolt.org/). As such, we'll avoid upcoming innovations like
+  [compile-time evaluation of `static`](https://github.com/rust-lang/rfcs/blob/master/text/0911-const-fn.md)
+  that are available in nightly.
+- Because of the nature of the content, some (very simple) assembly-level code is involved.
+  We'll keep this simple, but I [found](https://stackoverflow.com/a/4584131/1454178)
+  a [refresher](https://stackoverflow.com/a/26026278/1454178) on the `push` and `pop`
+  [instructions](http://www.cs.virginia.edu/~evans/cs216/guides/x86.html)
+  was helpful while writing this post.
+- I've tried to be precise in saying only what I can prove using the tools (ASM, docs)
+  that are available. That said, if there's something said in error, please reach out
+  and let me know - [bradlee@speice.io](mailto:bradlee@speice.io)
+
+Finally, I'll do what I can to flag potential future changes but the Rust docs
+have a notice worth repeating:
+
+> Rust does not currently have a rigorously and formally defined memory model.
+>  
+> -- [the docs](https://doc.rust-lang.org/std/ptr/fn.read_volatile.html)
--- a/_posts/2019-02-05-the-whole-world.md
+++ b/_posts/2019-02-05-the-whole-world.md
@@ -0,0 +1,294 @@
+---
+layout: post
+title: "Global Memory Usage: The Whole World"
+description: "Static considered slightly less harmful."
+category: 
+tags: [rust, understanding-allocations]
+---
+
+The first memory type we'll look at is pretty special: memory that Rust can fully describe
+before the program ever runs. This happens when Rust can prove that a *value* is fixed for the
+life of a program (`const`), or that a *reference* is valid for the duration of the program
+(`static` as a declaration, not
+[`'static`](https://doc.rust-lang.org/book/ch10-03-lifetime-syntax.html#the-static-lifetime)
+as a lifetime).
+Understanding the distinction between value and reference is important for reasons
+we'll go into below. The
+[full specification](https://github.com/rust-lang/rfcs/blob/master/text/0246-const-vs-static.md)
+for these two memory types is available, but we'll take a hands-on approach to the topic.
+
+# **const**
+
+The quick summary is this: `const` declares a read-only block of memory that is loaded
+as part of your program binary (during the call to [exec(3)](https://linux.die.net/man/3/exec)).
+Any `const` value resulting from calling a `const fn` is guaranteed to be materialized
+at compile-time (meaning that access at runtime will not invoke the `const fn`),
+even though the `const fn` functions are available at run-time as well. The compiler
+can choose to copy the constant value wherever it is deemed practical. Getting the address
+of a `const` value is legal, but not guaranteed to be the same even when referring to the
+same named identifier.
+
+The first point is a bit strange - "read-only memory".
+[The Rust book](https://doc.rust-lang.org/book/ch03-01-variables-and-mutability.html#differences-between-variables-and-constants)
+mentions in a couple places that using `mut` with constants is illegal,
+but it's also important to demonstrate just how immutable they are. *Typically* in Rust
+you can use "interior mutability" to modify things that aren't declared `mut`.
+[`RefCell`](https://doc.rust-lang.org/std/cell/struct.RefCell.html) provides an API
+to guarantee at runtime that some consistency rules are enforced:
+
+```rust
+use std::cell::RefCell;
+
+fn my_mutator(cell: &RefCell<u8>) {
+    // Even though we're given an immutable reference,
+    // the `replace` method allows us to modify the inner value.
+    cell.replace(14);
+}
+
+fn main() {
+    let cell = RefCell::new(25);
+    // Prints out 25
+    println!("Cell: {:?}", cell);
+    my_mutator(&cell);
+    // Prints out 14
+    println!("Cell: {:?}", cell);
+}
+```
+-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8e4bea1a718edaff4507944e825a54b2)
+
+When `const` is involved though, modifications are silently ignored:
+
+```rust
+use std::cell::RefCell;
+
+const CELL: RefCell<u8> = RefCell::new(25);
+
+fn my_mutator(cell: &RefCell<u8>) {
+    cell.replace(14);
+}
+
+fn main() {
+    // First line prints 25 as expected
+    println!("Cell: {:?}", &CELL);
+    my_mutator(&CELL);
+    // Second line *still* prints 25
+    println!("Cell: {:?}", &CELL);
+}
+```
+-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=88fe98110c33c1b3a51e341f48b8ae00)
+
+And a second example using [`Once`](https://doc.rust-lang.org/std/sync/struct.Once.html):
+
+```rust
+use std::sync::Once;
+
+const SURPRISE: Once = Once::new();
+
+fn main() {
+    // This is how `Once` is supposed to be used
+    SURPRISE.call_once(|| println!("Initializing..."));
+    // Because `Once` is a `const` value, we never record it
+    // having been initialized the first time, and this closure
+    // will also execute.
+    SURPRISE.call_once(|| println!("Initializing again???"));
+}
+```
+-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=c3cc5979b5e5434eca0f9ec4a06ee0ed)
+
+When the [`const` specification](https://github.com/rust-lang/rfcs/blob/26197104b7bb9a5a35db243d639aee6e46d35d75/text/0246-const-vs-static.md)
+refers to ["rvalues"](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3055.pdf), this is
+what it means. [Clippy](https://github.com/rust-lang/rust-clippy) will treat this as an error,
+but it's still something to be aware of.
+
+The next thing to mention is that `const` values are loaded into memory *as part of your program binary*.
+Because of this, any `const` values declared in your program will be "realized" at compile-time;
+accessing them may trigger a main-memory lookup (with a fixed address, so your CPU may
+be able to prefetch the value), but that's it.
+
+```rust
+use std::cell::RefCell;
+
+const CELL: RefCell<u32> = RefCell::new(24);
+
+pub fn multiply(value: u32) -> u32 {
+    value * (*CELL.get_mut())
+}
+```
+-- [Compiler Explorer](https://godbolt.org/z/2KXUcN)
+
+The compiler only creates one `RefCell`, and uses it everywhere. However, that value
+is fully realized at compile time, and is fully stored in the `.L__unnamed_1` section.
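+
+The compile-time guarantee for `const fn` results can also be checked without reading assembly.
+The sketch below is illustrative only (`square`, `AREA`, and `Grid` are hypothetical names, not
+part of the original examples). It leans on the fact that array lengths must be known at
+compile time, so using a `const` as a length only compiles if its value was already computed:
+
+```rust
+// A `const fn` can run at compile time or at run time.
+const fn square(side: usize) -> usize {
+    side * side
+}
+
+// Evaluated by the compiler; reading `AREA` never calls `square`.
+const AREA: usize = square(12);
+
+// Array lengths must be compile-time constants, so this type is only
+// legal because `AREA` has already been computed.
+type Grid = [u8; AREA];
+
+fn main() {
+    let grid: Grid = [0; AREA];
+    // The same function remains callable with run-time values.
+    let runtime_side = grid.len() / 12;
+    println!("{} cells, run-time square: {}", grid.len(), square(runtime_side));
+}
+```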
+
+If it's helpful though, the compiler can choose to copy `const` values.
+
+```rust
+const FACTOR: u32 = 1000;
+
+pub fn multiply(value: u32) -> u32 {
+    value * FACTOR
+}
+
+pub fn multiply_twice(value: u32) -> u32 {
+    value * FACTOR * FACTOR
+}
+```
+-- [Compiler Explorer](https://godbolt.org/z/_JiT9O)
+
+In this example, the `FACTOR` value is turned into the `mov edi, 1000` instruction
+in both the `multiply` and `multiply_twice` functions; the "1000" value is never
+"stored" anywhere, as it's small enough to inline into the assembly instructions.
+
+Finally, getting the address of a `const` value is possible but not guaranteed
+to be unique (given that the compiler can choose to copy values). In my testing
+I was never able to get the compiler to copy a `const` value and get differing pointers,
+but the specifications are clear enough: *don't rely on pointers to `const`
+values being consistent*. To be frank, caring about locations for `const` values
+is almost certainly a code smell.
+
+# **static**
+
+Static variables are related to `const` variables, but take a slightly different approach.
+When the compiler can guarantee that a *reference* is fixed for the life of a program,
+you end up with a `static` variable (as opposed to *values* that are fixed for the
+duration a program is running). Because of this reference/value distinction, 
+static variables behave much more like what people expect from "global" variables.
+We'll look at regular static variables first, and then address the `lazy_static!()`
+and `thread_local!()` macros later.
+
+More generally, `static` variables are globally unique locations in memory,
+the contents of which are loaded as part of your program being read into main memory.
+They allow initialization with both raw values and `const fn` calls, and the initial
+value is loaded along with the program/library binary. All static variables must
+be of a type that implements the [`Sync`](https://doc.rust-lang.org/std/marker/trait.Sync.html)
+marker trait. And while `static mut` variables are allowed, mutating a static is considered
+an `unsafe` operation.
+
+The single biggest difference between `const` and `static` is the guarantees
+provided about uniqueness. Where `const` variables may or may not be copied
+in code, `static` variables are guaranteed to be unique. If we take a previous
+`const` example and change it to `static`, the difference should be clear:
+
+```rust
+static FACTOR: u32 = 1000;
+
+pub fn multiply(value: u32) -> u32 {
+    value * FACTOR
+}
+
+pub fn multiply_twice(value: u32) -> u32 {
+    value * FACTOR * FACTOR
+}
+```
+-- [Compiler Explorer](https://godbolt.org/z/bSfBxn)
+
+Where [previously](https://godbolt.org/z/_JiT90) there were plenty of
+references to multiplying by 1000, the new assembly refers to `FACTOR`
+as a named memory location instead. No initialization work needs to be done,
+but the compiler can no longer prove the value never changes during execution.
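+
+Uniqueness can also be observed from inside the program. A minimal sketch, assuming two
+hypothetical helper functions (`address_one` and `address_two` are made up for illustration)
+that each take the address of the same `static`:
+
+```rust
+static FACTOR: u32 = 1000;
+
+// Each function independently takes the address of `FACTOR`.
+fn address_one() -> *const u32 {
+    &FACTOR
+}
+
+fn address_two() -> *const u32 {
+    &FACTOR
+}
+
+fn main() {
+    // A `static` is a single location in memory, so both functions
+    // must report the same address.
+    assert_eq!(address_one(), address_two());
+    println!("FACTOR lives at {:?}", address_one());
+}
+```
+
+Swapping `static` for `const` would still compile, but the comparison would no longer be
+guaranteed to hold.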
+
+Next, let's talk about initialization. The simplest case is initializing
+static variables with either scalar or struct notation:
+
+```rust
+#[derive(Debug)]
+struct MyStruct {
+    x: u32
+}
+
+static MY_STRUCT: MyStruct = MyStruct {
+    // You can even reference other statics
+    // declared later
+    x: MY_VAL
+};
+
+static MY_VAL: u32 = 24;
+
+fn main() {
+    println!("Static MyStruct: {:?}", MY_STRUCT);
+}
+```
+-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=b538dbc46076f12db047af4f4403ee6e)
+
+Things get a bit weirder when using `const fn`. In most cases, things just work:
+
+```rust
+#[derive(Debug)]
+struct MyStruct {
+    x: u32
+}
+
+impl MyStruct {
+    const fn new() -> MyStruct {
+        MyStruct { x: 24 }
+    }
+}
+
+static MY_STRUCT: MyStruct = MyStruct::new();
+
+fn main() {
+    println!("const fn Static MyStruct: {:?}", MY_STRUCT);
+}
+```
+-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8c796a6e7fc273c12115091b707b0255)
+
+However, there's a caveat: you're currently not allowed to use `const fn` to initialize
+static variables of types that aren't marked `Sync`. As an example, even though
+[`RefCell::new()`](https://doc.rust-lang.org/std/cell/struct.RefCell.html#method.new)
+is `const fn`, because [`RefCell` isn't `Sync`](https://doc.rust-lang.org/std/cell/struct.RefCell.html#impl-Sync),
+you'll get an error at compile time:
+
+```rust
+use std::cell::RefCell;
+
+// error[E0277]: `std::cell::RefCell<u8>` cannot be shared between threads safely
+static MY_LOCK: RefCell<u8> = RefCell::new(0);
+```
+-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=c76ef86e473d07117a1700e21fd45560)
+
+It's likely that this will [change in the future](https://github.com/rust-lang/rfcs/blob/master/text/0911-const-fn.md) though.
+
+Which leads well to the next point: static variable types must implement the
+[`Sync` marker](https://doc.rust-lang.org/std/marker/trait.Sync.html).
+Because they're globally unique, it must be safe for you to access static variables
+from any thread at any time. Most `struct` definitions automatically implement the
+`Sync` trait because they contain only elements which themselves
+implement `Sync`. This is why earlier examples could get away with initializing
+statics, even though we never included an `impl Sync for MyStruct` in the code.
+For more on the `Sync` trait, the [Nomicon](https://doc.rust-lang.org/nomicon/send-and-sync.html)
+has a much more thorough treatment. But as an example, Rust refuses to compile
+our earlier example if we add a non-`Sync` element to the `struct` definition:
+
+```rust
+use std::cell::RefCell;
+
+struct MyStruct {
+    x: u32,
+    y: RefCell<u8>,
+}
+
+// error[E0277]: `std::cell::RefCell<u8>` cannot be shared between threads safely
+static MY_STRUCT: MyStruct = MyStruct {
+    x: 8,
+    y: RefCell::new(8)
+};
+```
+-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=40074d0248f056c296b662dbbff97cfc)
+
+Finally, while `static mut` variables are allowed, mutating them is an `unsafe` operation.
+Unlike `const` however, interior mutability is acceptable. To demonstrate:
+
+```rust
+use std::sync::Once;
+
+// This example adapted from https://doc.rust-lang.org/std/sync/struct.Once.html#method.call_once
+static INIT: Once = Once::new();
+
+fn main() {
+    // Note that while `INIT` is declared immutable, we're still allowed
+    // to mutate its interior
+    INIT.call_once(|| println!("Initializing..."));
+    // This code won't panic, as the interior of INIT was modified
+    // as part of the previous `call_once`
+    INIT.call_once(|| panic!("INIT was called twice!"));
+}
+```
+-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=3ba003a981a7ed7400240caadd384d59)
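+
+For contrast, here is a minimal sketch of what `static mut` looks like in practice, next to an
+atomic alternative that needs no `unsafe`. The counter names and helper functions are
+hypothetical, and the atomic version is one possible substitute rather than something the
+examples above depend on:
+
+```rust
+use std::sync::atomic::{AtomicUsize, Ordering};
+
+// Mutable static: every read or write must happen inside `unsafe`,
+// because the compiler can't rule out data races.
+static mut UNSAFE_COUNTER: usize = 0;
+
+// Immutable static with interior mutability: `AtomicUsize` is `Sync`,
+// so it can be updated from safe code.
+static SAFE_COUNTER: AtomicUsize = AtomicUsize::new(0);
+
+fn bump_unsafe() -> usize {
+    // Sound only as long as no other thread touches `UNSAFE_COUNTER`.
+    unsafe {
+        UNSAFE_COUNTER += 1;
+        UNSAFE_COUNTER
+    }
+}
+
+fn bump_safe() -> usize {
+    // `fetch_add` returns the previous value, so add one for the new value.
+    SAFE_COUNTER.fetch_add(1, Ordering::SeqCst) + 1
+}
+
+fn main() {
+    println!("unsafe counter: {}, then {}", bump_unsafe(), bump_unsafe());
+    println!("safe counter: {}, then {}", bump_safe(), bump_safe());
+}
+```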
--- a/_posts/2019-02-06-stacking-up.md
+++ b/_posts/2019-02-06-stacking-up.md
@@ -0,0 +1,558 @@
+---
+layout: post
+title: "Fixed Memory: Stacking Up"
+description: "We don't need no allocator."
+category: 
+tags: [rust, understanding-allocations]
+---
+
+`const` and `static` are perfectly fine, but it's very rare that we know
+at compile-time about either values or references that will be the same for the
+duration of our program. Put another way, it's not often the case that either you
+or your compiler knows how much memory your entire program will need.
+
+However, there are still some optimizations the compiler can do if it knows how much
+memory individual functions will need. Specifically, the compiler can make use of
+"stack" memory (as opposed to "heap" memory) which can be managed far faster in
+both the short- and long-term. When requesting memory, the
+[`push` instruction](http://www.cs.virginia.edu/~evans/cs216/guides/x86.html)
+can typically complete in [1 or 2 cycles](https://agner.org/optimize/instruction_tables.ods)
+(<1 nanosecond on modern CPUs). Contrast that to heap memory which requires an allocator
+(specialized software to track what memory is in use) to reserve space.
+And when you're finished with your memory, the `pop` instruction likewise runs in
+1-3 cycles, as opposed to an allocator needing to worry about memory fragmentation
+and other issues. All sorts of incredibly sophisticated techniques have been used
+to design allocators:
+
+- [Garbage Collection](https://en.wikipedia.org/wiki/Garbage_collection_(computer_science))
+  strategies like [Tracing](https://en.wikipedia.org/wiki/Tracing_garbage_collection)
+  (used in [Java](https://www.oracle.com/technetwork/java/javase/tech/g1-intro-jsp-135488.html))
+  and [Reference counting](https://en.wikipedia.org/wiki/Reference_counting)
+  (used in [Python](https://docs.python.org/3/extending/extending.html#reference-counts))
+- Thread-local structures to prevent locking the allocator in [tcmalloc](https://jamesgolick.com/2013/5/19/how-tcmalloc-works.html)
+- Arena structures used in [jemalloc](http://jemalloc.net/), which
+  [until recently](https://blog.rust-lang.org/2019/01/17/Rust-1.32.0.html#jemalloc-is-removed-by-default)
+  was the primary allocator for Rust programs!
+
+But no matter how fast your allocator is, the principle remains: the
+fastest allocator is the one you never use. As such, we're not going to discuss how exactly the
+[`push` and `pop` instructions work](http://www.cs.virginia.edu/~evans/cs216/guides/x86.html),
+but we'll focus instead on the conditions that enable the Rust compiler to use
+the faster stack-based allocation for variables.
+
+So, **how do we know when Rust will or will not use stack allocation for objects we create?**
+Looking at other languages, it's often easy to delineate
+between stack and heap. Managed memory languages (Python, Java,
+[C#](https://blogs.msdn.microsoft.com/ericlippert/2010/09/30/the-truth-about-value-types/))
+place everything on the heap. JIT compilers ([PyPy](https://www.pypy.org/),
+[HotSpot](https://www.oracle.com/technetwork/java/javase/tech/index-jsp-136373.html)) may
+optimize some heap allocations away, but you should never assume it will happen.
+C makes things clear with calls to special functions ([malloc(3)](https://linux.die.net/man/3/malloc)
+is one) being the way to use heap memory. Old C++ has the [`new`](https://stackoverflow.com/a/655086/1454178)
+keyword, though modern C++/C++11 is more complicated with [RAII](https://en.cppreference.com/w/cpp/language/raii).
+
+For Rust specifically, the principle is this: **stack allocation will be used for everything
+that doesn't involve "smart pointers" and collections.** We'll skip over a precise definition
+of the term "smart pointer" for now, and instead discuss what we should watch for when talking
+about the memory region used for allocation:
+
+1. Stack manipulation instructions (`push`, `pop`, and `add`/`sub` of the `rsp` register)
+   indicate allocation of stack memory:
+   ```rust
+   pub fn stack_alloc(x: u32) -> u32 {
+       // Space for `y` is allocated by subtracting from `rsp`,
+       // and then populated
+       let y = [1u8, 2, 3, 4];
+       // Space for `y` is deallocated by adding back to `rsp`
+       x
+   }
+   ```
+   -- [Compiler Explorer](https://godbolt.org/z/5WSgc9)
+
+2. Tracking when exactly heap allocation calls happen is difficult. It's typically easier to
+   watch for `call core::ptr::real_drop_in_place`, and infer that a heap allocation happened
+   in the recent past:
+   ```rust
+   pub fn heap_alloc(x: usize) -> usize {
+       // Space for elements in a vector has to be allocated
+       // on the heap, and is then de-allocated once the
+       // vector goes out of scope
+       let y: Vec<u8> = Vec::with_capacity(x);
+       x
+   }
+   ```
+   -- [Compiler Explorer](https://godbolt.org/z/epfgoQ) (`real_drop_in_place` happens on line 1317)
+   <span style="font-size: .8em">Note: While the [`Drop` trait](https://doc.rust-lang.org/std/ops/trait.Drop.html)
+   is [called for stack-allocated objects](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=87edf374d8983816eb3d8cfeac657b46),
+   the Rust standard library only defines `Drop` implementations for types that involve heap allocation.</span> 
+
+3. If you don't want to inspect the assembly, use a custom allocator that's able to track
+   and alert when heap allocations occur. Crates like [`alloc_counter`](https://crates.io/crates/alloc_counter)
+   are designed for exactly this purpose.
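+
+To make the third option concrete, here's a minimal sketch of a counting allocator. It is not
+how `alloc_counter` is implemented; `CountingAllocator`, `ALLOCATIONS`, and `allocations_during`
+are hypothetical names, and the sketch simply wraps the system allocator using the standard
+`GlobalAlloc` trait. Results assume a debug build, since an optimizer may remove allocations
+that are never used:
+
+```rust
+use std::alloc::{GlobalAlloc, Layout, System};
+use std::sync::atomic::{AtomicUsize, Ordering};
+
+// Forwards every request to the system allocator, counting as it goes.
+struct CountingAllocator;
+
+static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);
+
+unsafe impl GlobalAlloc for CountingAllocator {
+    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
+        ALLOCATIONS.fetch_add(1, Ordering::SeqCst);
+        unsafe { System.alloc(layout) }
+    }
+
+    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
+        unsafe { System.dealloc(ptr, layout) }
+    }
+}
+
+#[global_allocator]
+static GLOBAL: CountingAllocator = CountingAllocator;
+
+// Runs `work` and reports how many heap allocations happened meanwhile.
+fn allocations_during<F: FnOnce()>(work: F) -> usize {
+    let before = ALLOCATIONS.load(Ordering::SeqCst);
+    work();
+    ALLOCATIONS.load(Ordering::SeqCst) - before
+}
+
+fn main() {
+    let stack_only = allocations_during(|| {
+        let y = [1u8, 2, 3, 4];
+        let _sum: u8 = y.iter().sum();
+    });
+    let with_vec = allocations_during(|| {
+        let v: Vec<u8> = Vec::with_capacity(16);
+        drop(v);
+    });
+    println!("array: {} allocations, Vec: {} allocations", stack_only, with_vec);
+}
+```
+
+The counter is process-wide, so allocations made by other threads during `work` would be
+included in the result.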
+
+With all that in mind, let's talk about situations in which we're guaranteed to use stack memory:
+
+- Structs are created on the stack.
+- Function arguments are passed on the stack or in CPU registers, meaning the
+  [`#[inline]` attribute](https://doc.rust-lang.org/reference/attributes.html#inline-attribute)
+  will not change the memory region used.
+- Enums and unions are stack-allocated.
+- [Arrays](https://doc.rust-lang.org/std/primitive.array.html) are always stack-allocated.
+- Closures capture their arguments on the stack.
+- Generics will use stack allocation, even with dynamic dispatch.
+- [`Copy`](https://doc.rust-lang.org/std/marker/trait.Copy.html) types are guaranteed to be
+  stack-allocated, and copying them will be done in stack memory.
+- [`Iterator`s](https://doc.rust-lang.org/std/iter/trait.Iterator.html) in the standard library
+  are stack-allocated even when iterating over heap-based collections.
+
+# Structs
+
+The simplest case comes first. When creating vanilla `struct` objects, we use stack memory
+to hold their contents:
+
+```rust
+struct Point {
+    x: u64,
+    y: u64,
+}
+
+struct Line {
+    a: Point,
+    b: Point,
+}
+
+pub fn make_line() {
+    // `origin` is stored in the first 16 bytes of memory
+    // starting at location `rsp`
+    let origin = Point { x: 0, y: 0 };
+    // `point` makes up the next 16 bytes of memory
+    let point = Point { x: 1, y: 2 };
+
+    // When creating `ray`, we just move the content out of
+    // `origin` and `point` into the next 32 bytes of memory
+    let ray = Line { a: origin, b: point };
+}
+```
+-- [Compiler Explorer](https://godbolt.org/z/vri9BE)
+
+Note that while some extra-fancy instructions are used for memory manipulation in the assembly,
+the `sub rsp, 64` instruction indicates we're still working with the stack.
+
+# Function arguments
+
+Have you ever wondered how functions communicate with each other? Like, once the variables are
+given to you, everything's fine. But how do you "give" those variables to another function?
+How do you get the results back afterward? The answer: the compiler arranges memory and
+assembly instructions using a pre-determined
+[calling convention](http://llvm.org/docs/LangRef.html#calling-conventions).
+This convention governs the rules around where arguments needed by a function will be located
+(either in memory offsets relative to the stack pointer `rsp`, or in other registers), and
+where the results can be found once the function has finished. And when multiple languages
+agree on what the calling conventions are, you can do things like having
+[Go call Rust code](https://blog.filippo.io/rustgo/)!
+
+Put simply: it's the compiler's job to figure out how to call other functions, and you can assume
+that the compiler is good at its job.
+
+We can see this in action using a simple example:
+
+```rust
+struct Point {
+    x: i64,
+    y: i64,
+}
+
+// We use integer division operations to keep
+// the assembly clean, understanding the result
+// isn't accurate.
+fn distance(a: &Point, b: &Point) -> i64 {
+    // Immediately subtract from `rsp` the bytes needed
+    // to hold all the intermediate results - this is
+    // the stack allocation step
+
+    // The compiler used the `rdi` and `rsi` registers
+    // to pass our arguments, so read them in
+    let x1 = a.x;
+    let x2 = b.x;
+    let y1 = a.y;
+    let y2 = b.y;
+
+    // Do the actual math work
+    let x_pow = (x1 - x2) * (x1 - x2);
+    let y_pow = (y1 - y2) * (y1 - y2);
+    let squared = x_pow + y_pow;
+    squared / squared
+    
+    // Our final result will be stored in the `rax` register
+    // so that our caller knows where to retrieve it.
+    // Finally, add back to `rsp` the stack memory that is
+    // now ready to be used by other functions.
+}
+
+pub fn total_distance() {
+    let start = Point { x: 1, y: 2 };
+    let middle = Point { x: 3, y: 4 };
+    let end = Point { x: 5, y: 6 };
+
+    let _dist_1 = distance(&start, &middle);
+    let _dist_2 = distance(&middle, &end);
+}
+```
+-- [Compiler Explorer](https://godbolt.org/z/Qmx4ST)
+
+As a consequence of function arguments never using heap memory, we can also
+infer that functions using the `#[inline]` attribute also do not heap-allocate.
+But better than inferring, we can look at the assembly to prove it:
+
+```rust
+struct Point {
+    x: i64,
+    y: i64,
+}
+
+// Note that there is no `distance` function in the assembly output,
+// and the total line count goes from 229 with inlining off
+// to 306 with inline on. Even still, no heap allocations occur.
+#[inline(always)]
+fn distance(a: &Point, b: &Point) -> i64 {
+    let x1 = a.x;
+    let x2 = b.x;
+    let y1 = a.y;
+    let y2 = b.y;
+
+    let x_pow = (a.x - b.x) * (a.x - b.x);
+    let y_pow = (a.y - b.y) * (a.y - b.y);
+    let squared = x_pow + y_pow;
+    squared / squared
+}
+
+pub fn total_distance() {
+    let start = Point { x: 1, y: 2 };
+    let middle = Point { x: 3, y: 4 };
+    let end = Point { x: 5, y: 6 };
+
+    let _dist_1 = distance(&start, &middle);
+    let _dist_2 = distance(&middle, &end);
+}
+```
+-- [Compiler Explorer](https://godbolt.org/z/30Sh66)
+
+Finally, passing by value (arguments with type
+[`Copy`](https://doc.rust-lang.org/std/marker/trait.Copy.html))
+and passing by reference (either moving ownership or passing a pointer) may have
+[slightly different layouts in assembly](https://godbolt.org/z/sKi_kl), but will
+still use either stack memory or CPU registers.
+
+# Enums
+
+If you've ever worried that wrapping your types in
+[`Option`](https://doc.rust-lang.org/stable/core/option/enum.Option.html) or
+[`Result`](https://doc.rust-lang.org/stable/core/result/enum.Result.html) would
+finally make them large enough that Rust decides to use heap allocation instead,
+fear no longer: `enum` and union types don't use heap allocation:
+
+```rust
+enum MyEnum {
+    Small(u8),
+    Large(u64)
+}
+
+struct MyStruct {
+    x: MyEnum,
+    y: MyEnum,
+}
+
+pub fn enum_compare() {
+    let x = MyEnum::Small(0);
+    let y = MyEnum::Large(0);
+
+    let z = MyStruct { x, y };
+
+    let opt = Option::Some(z);
+}
+```
+-- [Compiler Explorer](https://godbolt.org/z/HK7zBx)
+
+Because the size of an `enum` is the size of its largest element plus a flag,
+the compiler can predict how much memory is used no matter which variant
+of an enum is currently stored in a variable. Thus, enums and unions have no
+need of heap allocation. There's unfortunately not a great way to show this
+in assembly, so I'll instead point you to the
+[`core::mem::size_of`](https://doc.rust-lang.org/stable/core/mem/fn.size_of.html#size-of-enums)
+documentation. 
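+
+A quick way to see the fixed size in practice is to ask for it directly. This is a small sketch
+reusing `MyEnum` from above; the exact number of bytes is a layout detail of the compiler and
+platform, so the code prints it rather than assuming a value:
+
+```rust
+use std::mem::{size_of, size_of_val};
+
+enum MyEnum {
+    Small(u8),
+    Large(u64),
+}
+
+fn main() {
+    let small = MyEnum::Small(0);
+    let large = MyEnum::Large(0);
+
+    // Both variants occupy the same number of bytes...
+    assert_eq!(size_of_val(&small), size_of_val(&large));
+    // ...which is at least as large as the biggest payload.
+    assert!(size_of::<MyEnum>() >= size_of::<u64>());
+
+    println!("MyEnum: {} bytes", size_of::<MyEnum>());
+    println!("Option<MyEnum>: {} bytes", size_of::<Option<MyEnum>>());
+}
+```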
+
+# Arrays
+
+The array type is guaranteed to be stack allocated, which is why the array size must
+be declared. Interestingly enough, this can be used to cause safe Rust programs to crash:
+
+```rust
+// 256 bytes
+#[derive(Default)]
+struct TwoFiftySix {
+    _a: [u64; 32]
+}
+
+// 8 kilobytes
+#[derive(Default)]
+struct EightK {
+    _a: [TwoFiftySix; 32]
+}
+
+// 256 kilobytes
+#[derive(Default)]
+struct TwoFiftySixK {
+    _a: [EightK; 32]
+}
+
+// 8 megabytes - exceeds space typically provided for the stack,
+// though the kernel can be instructed to allocate more.
+// On Linux, you can check stack size using `ulimit -s`
+#[derive(Default)]
+struct EightM {
+    _a: [TwoFiftySixK; 32]
+}
+
+fn main() {
+    // Because we already have things in stack memory
+    // (like the current function call stack), allocating another
+    // eight megabytes of stack memory crashes the program
+    let _x = EightM::default();
+}
+```
+-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=587a6380a4914bcbcef4192c90c01dc4)
+
+There aren't any security implications of this (no memory corruption occurs),
+but it's good to note that the Rust compiler won't move arrays into heap memory
+even if they can be reasonably expected to overflow the stack.
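+
+If an array really is too large for the stack, the heap has to be requested
+explicitly. A small sketch, assuming a helper named `big_buffer` (invented here
+for illustration): building the data through `vec!` and converting it to a boxed
+slice places the elements in heap memory from the start, leaving only a pointer
+and a length on the stack:
+
+```rust
+// 1Mi elements * 8 bytes = 8 MiB, the same amount of memory as `EightM`.
+const LEN: usize = 1024 * 1024;
+
+// Hypothetical helper: `vec!` allocates its zeroed buffer on the heap,
+// and `into_boxed_slice` converts the vector into a `Box<[u64]>`.
+fn big_buffer() -> Box<[u64]> {
+    vec![0u64; LEN].into_boxed_slice()
+}
+
+fn main() {
+    let buf = big_buffer();
+    println!("{} elements, {} bytes", buf.len(), std::mem::size_of_val(&*buf));
+}
+```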
+
+# Closures
+
+Rules for how anonymous functions capture variables from the enclosing scope are typically language-specific.
+In Java, [Lambda Expressions](https://docs.oracle.com/javase/tutorial/java/javaOO/lambdaexpressions.html)
+are actually objects created on the heap that capture local primitives by copying, and capture
+local non-primitives as (`final`) references.
+[Python](https://docs.python.org/3.7/reference/expressions.html#lambda) and
+[JavaScript](https://javascriptweblog.wordpress.com/2010/10/25/understanding-javascript-closures/)
+both bind *everything* by reference normally, but Python can also
+[capture values](https://stackoverflow.com/a/235764/1454178) and JavaScript has
+[Arrow functions](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/Arrow_functions).
+
+In Rust, arguments to closures are the same as arguments to other functions;
+closures are simply functions that don't have a declared name. Some weird ordering
+of the stack may be required to handle them, but it's the compiler's responsibility
+to figure it out.
+
+Each example below has the same effect, but compiles to a very different program.
+In the simplest case, we immediately run a closure returned by another function.
+Because we don't store a reference to the closure, the stack memory needed to
+store the captured values is contiguous:
+
+```rust
+fn my_func() -> impl FnOnce() {
+    let x = 24;
+    // Note that this closure in assembly looks exactly like
+    // any other function; you even use the `call` instruction
+    // to start running it.
+    move || { x; }
+}
+
+pub fn immediate() {
+    my_func()();
+    my_func()();
+}
+```
+-- [Compiler Explorer](https://godbolt.org/z/mgJ2zl), 25 total assembly instructions
+
+If we store a reference to the closure, the Rust compiler keeps values it needs
+in the stack memory of the original function. Getting the details right is a bit harder,
+so the instruction count goes up even though this code is functionally equivalent
+to our original example:
+
+```rust
+pub fn simple_reference() {
+    let x = my_func();
+    let y = my_func();
+    y();
+    x();
+}
+```
+-- [Compiler Explorer](https://godbolt.org/z/K_dj5n), 55 total assembly instructions
+
+Even things like variable order can make a difference in instruction count:
+
+```rust
+pub fn complex() {
+    let x = my_func();
+    let y = my_func();
+    x();
+    y();
+}
+```
+-- [Compiler Explorer](https://godbolt.org/z/p37qFl), 70 total assembly instructions
+
+In every circumstance though, the compiler ensured that no heap allocations were necessary.
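+
+One way to see why the heap isn't needed, sketched here with a made-up helper
+(`make_adder`): a closure is an anonymous value that carries only what it
+captures, and its size can be inspected with `std::mem::size_of_val`. The exact
+layout of a closure isn't guaranteed by the language, so treat the printed
+numbers as what your compiler happens to do:
+
+```rust
+use std::mem::size_of_val;
+
+// Hypothetical helper: builds a closure that owns a copy of `x`.
+fn make_adder(x: u64) -> impl Fn(u64) -> u64 {
+    move |y| x + y
+}
+
+fn main() {
+    // Captures nothing, so there is nothing to store.
+    let no_capture = || 42u64;
+    // Carries its captured `u64` around inside the closure value itself.
+    let add_24 = make_adder(24);
+
+    println!("no captures:    {} bytes", size_of_val(&no_capture));
+    println!("captures a u64: {} bytes", size_of_val(&add_24));
+
+    // Calling them is an ordinary function call on stack data.
+    assert_eq!(no_capture(), 42);
+    assert_eq!(add_24(1), 25);
+}
+```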
+
+# Generics
+
+Traits in Rust come in two broad forms: static dispatch (monomorphization, `impl Trait`)
+and dynamic dispatch (trait objects, `dyn Trait`). While dynamic dispatch is often
+*associated* with trait objects being stored in the heap, dynamic dispatch can be used
+with stack allocated objects as well:
+
+```rust
+trait GetInt {
+    fn get_int(&self) -> u64;
+}
+
+// vtable stored at section L__unnamed_1
+struct WhyNotU8 {
+    x: u8
+}
+impl GetInt for WhyNotU8 {
+    fn get_int(&self) -> u64 {
+        self.x as u64
+    } 
+}
+
+// vtable stored at section L__unnamed_2
+struct ActualU64 {
+    x: u64
+}
+impl GetInt for ActualU64 {
+    fn get_int(&self) -> u64 {
+        self.x
+    }
+}
+
+// `&dyn` declares that we want to use dynamic dispatch
+// rather than monomorphization, so there is only one
+// `retrieve_int` function that shows up in the final assembly.
+// If we used generics, there would be one implementation of
+// `retrieve_int` for each type that implements `GetInt`.
+pub fn retrieve_int(u: &dyn GetInt) {
+    // In the assembly, we just call an address given to us
+    // in the `rsi` register and hope that it was set up
+    // correctly when this function was invoked.
+    let x = u.get_int();
+}
+
+pub fn do_call() {
+    // Note that even though the vtable for `WhyNotU8` and
+    // `ActualU64` includes a pointer to
+    // `core::ptr::real_drop_in_place`, it is never invoked.
+    let a = WhyNotU8 { x: 0 };
+    let b = ActualU64 { x: 0 };
+
+    retrieve_int(&a);
+    retrieve_int(&b);
+}
+```
+-- [Compiler Explorer](https://godbolt.org/z/u_yguS)
+
+It's hard to imagine practical situations where dynamic dispatch would be
+used for objects that aren't heap allocated, but it technically can be done.
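+
+As an illustration (not taken from the Compiler Explorer example), here is a
+sketch of what that could look like: an array of `&dyn GetInt` references to
+values of different concrete types, all living on the stack. The `sum_all`
+function is a name invented for this example:
+
+```rust
+trait GetInt {
+    fn get_int(&self) -> u64;
+}
+
+struct WhyNotU8 {
+    x: u8,
+}
+impl GetInt for WhyNotU8 {
+    fn get_int(&self) -> u64 {
+        self.x as u64
+    }
+}
+
+struct ActualU64 {
+    x: u64,
+}
+impl GetInt for ActualU64 {
+    fn get_int(&self) -> u64 {
+        self.x
+    }
+}
+
+// Hypothetical helper: one function body handles every implementor,
+// looking up `get_int` through each reference's vtable at runtime.
+fn sum_all(items: &[&dyn GetInt]) -> u64 {
+    items.iter().map(|item| item.get_int()).sum()
+}
+
+fn main() {
+    let a = WhyNotU8 { x: 3 };
+    let b = ActualU64 { x: 4 };
+    // Both values live on the stack; the array holds only references.
+    let items: [&dyn GetInt; 2] = [&a, &b];
+    println!("sum = {}", sum_all(&items));
+}
+```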
+
+# Copy types
+
+Understanding move semantics and copy semantics in Rust is weird at first. The Rust docs
+[go into detail](https://doc.rust-lang.org/stable/core/marker/trait.Copy.html)
+far better than can be addressed here, so I'll leave them to do the job.
+Even from a memory perspective though, their guideline is reasonable:
+[if your type can implement `Copy`, it should](https://doc.rust-lang.org/stable/core/marker/trait.Copy.html#when-should-my-type-be-copy).
+While there are potential speed tradeoffs to *benchmark* when discussing `Copy`
+(move semantics for stack objects vs. copying stack pointers vs. copying stack `struct`s), 
+*it's impossible for `Copy` to introduce a heap allocation*.
+
+But why is this the case? Fundamentally, it's because the language controls
+what `Copy` means -
+["the behavior of `Copy` is not overloadable"](https://doc.rust-lang.org/std/marker/trait.Copy.html#whats-the-difference-between-copy-and-clone)
+because it's a marker trait. From there we'll note that a type
+[can implement `Copy`](https://doc.rust-lang.org/std/marker/trait.Copy.html#when-can-my-type-be-copy)
+if (and only if) its components implement `Copy`, and that
+[no heap-allocated types implement `Copy`](https://doc.rust-lang.org/std/marker/trait.Copy.html#implementors).
+Thus, assignments involving heap types are always move semantics, and new heap
+allocations won't occur without explicit calls to
+[`clone()`](https://doc.rust-lang.org/std/clone/trait.Clone.html#tymethod.clone).
+
+```rust
+#[derive(Clone)]
+struct Cloneable {
+    x: Box<u64>
+}
+
+// error[E0204]: the trait `Copy` may not be implemented for this type
+#[derive(Copy, Clone)]
+struct NotCopyable {
+    x: Box<u64>
+}
+```
+-- [Compiler Explorer](https://godbolt.org/z/VToRuK)
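+
+To make the difference concrete, here's a small sketch (the `Point` and `Boxed`
+types are invented for this example). Assigning a `Copy` type duplicates plain
+bytes, assigning a `Box`-holding type moves it, and only the explicit `clone()`
+call produces a second heap allocation:
+
+```rust
+// Plain data: every field is `Copy`, so the struct can be too.
+#[derive(Copy, Clone, Debug, PartialEq)]
+struct Point {
+    x: u64,
+    y: u64,
+}
+
+// Owns heap memory through `Box`, so it can only be `Clone`.
+#[derive(Clone, Debug, PartialEq)]
+struct Boxed {
+    value: Box<u64>,
+}
+
+fn main() {
+    let p1 = Point { x: 1, y: 2 };
+    let p2 = p1; // bitwise copy; `p1` stays usable
+    assert_eq!(p1, p2);
+
+    let b1 = Boxed { value: Box::new(7) };
+    let b2 = b1.clone(); // explicit call: a second heap allocation
+    // Equal contents, but stored at two distinct heap addresses.
+    assert_eq!(b1, b2);
+    assert!(!std::ptr::eq(&*b1.value, &*b2.value));
+
+    let b3 = b1; // move: no new allocation, and `b1` can't be used afterwards
+    assert_eq!(*b3.value, 7);
+}
+```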
+
+# Iterators
+
+In [managed memory languages](https://www.youtube.com/watch?v=bSkpMdDe4g4&feature=youtu.be&t=357)
+(like Java), there's a subtle difference between these two code samples:
+
+```java
+public static long sum_for(List<Long> vals) {
+    long sum = 0;
+    // Regular for loop
+    for (int i = 0; i < vals.size(); i++) {
+        sum += vals.get(i);
+    }
+    return sum;
+}
+
+public static long sum_foreach(List<Long> vals) {
+    long sum = 0;
+    // "Foreach" loop - uses iteration
+    for (Long l : vals) {
+        sum += l;
+    }
+    return sum;
+}
+```
+
+In the `sum_for` function, nothing terribly interesting happens. In `sum_foreach`,
+an object of type [`Iterator`](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/Iterator.html)
+is allocated on the heap, and will eventually be garbage-collected. This isn't a great design;
+iterators are often transient objects that you need during a function and can discard
+once the function ends. Sounds exactly like the issue stack-allocated objects address, no?
+
+In Rust, iterators are allocated on the stack. The objects to iterate over are almost
+certainly in heap memory, but the iterator itself
+([`Iter`](https://doc.rust-lang.org/std/slice/struct.Iter.html)) doesn't need to use the heap.
+In each of the examples below we iterate over a collection, but never need to allocate
+an object on the heap that would later have to be cleaned up:
+
+```rust
+use std::collections::HashMap;
+// There's a lot of assembly generated, but if you search in the text,
+// there are no references to `real_drop_in_place` anywhere.
+
+pub fn sum_vec(x: &Vec<u32>) {
+    let mut s = 0;
+    // Basic iteration over vectors doesn't need allocation
+    for y in x {
+        s += y;
+    }
+}
+
+pub fn sum_enumerate(x: &Vec<u32>) {
+    let mut s = 0;
+    // More complex iterators are just fine too
+    for (_i, y) in x.iter().enumerate() {
+        s += y;
+    }
+}
+
+pub fn sum_hm(x: &HashMap<u32, u32>) {
+    let mut s = 0;
+    // And it's not just Vec, all types will allocate the iterator
+    // on stack memory
+    for y in x.values() {
+        s += y;
+    }
+}
+```
+-- [Compiler Explorer](https://godbolt.org/z/FTT3CT)
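+
+For a quick check that doesn't involve reading assembly, here's a sketch with a
+hypothetical helper (`sum_and_iter_size`). The iterator is an ordinary local
+value, so `size_of_val` can report how many bytes of stack it occupies. The exact
+number is an implementation detail of the standard library, which is why the
+code only prints it:
+
+```rust
+use std::mem::size_of_val;
+
+// Hypothetical helper: returns the sum of the slice together with
+// the size of the iterator value used to walk it.
+fn sum_and_iter_size(vals: &[u32]) -> (u32, usize) {
+    // `iter` is a plain local variable borrowing `vals`.
+    let iter = vals.iter();
+    let iter_size = size_of_val(&iter);
+    (iter.sum(), iter_size)
+}
+
+fn main() {
+    let vals = vec![1, 2, 3];
+    let (sum, iter_size) = sum_and_iter_size(&vals);
+    println!("sum = {}, iterator occupies {} bytes", sum, iter_size);
+}
+```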
--- a/_posts/2019-02-07-a-heaping-helping.md
+++ b/_posts/2019-02-07-a-heaping-helping.md
@ -0,0 +1,254 @@
+---
+layout: post
+title: "Dynamic Memory: A Heaping Helping"
+description: "The reason Rust exists."
+category: 
+tags: [rust, understanding-allocations]
+---
+
+Managing dynamic memory is hard. Some languages assume users will do it themselves (C, C++),
+and some languages go to extreme lengths to protect users from themselves (Java, Python). In Rust,
+dynamic memory (also referred to as the **heap**) is managed through a system called *ownership*.
+And as the docs mention, ownership
+[is Rust's most unique feature](https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html).
+
+The heap is used in two situations: when the compiler is unable to predict either the *total size
+of memory needed* or *how long the memory is needed for*, it allocates space in the heap.
+This happens pretty frequently; if you want to download the Google home page, you won't know
+how large it is until your program runs. And when you're finished with Google, whenever that
+happens to be, the memory is deallocated so it can be used to store other webpages. If you're
+interested in a slightly longer explanation of the heap, check out
+[The Stack and the Heap](https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html#the-stack-and-the-heap)
+in Rust's documentation.
+
+We won't go into detail on how the heap is managed; the
+[ownership documentation](https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html)
+does a phenomenal job explaining both the "why" and "how" of memory management. Instead,
+we're going to focus on understanding "when" heap allocations occur in Rust.
+
+To start off, take a guess for how many allocations happen in the program below:
+
+```rust
+fn main() {}
+```
+
+It's obviously a trick question; while no heap allocations happen as a result of
+the code listed above, the setup needed to call `main` does allocate on the heap.
+Here's a way to show it:
+
+```rust
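+// Required on nightly when this was written; `AtomicU64` has been stable
+// since Rust 1.34, so newer toolchains can drop this line.
+#![feature(integer_atomics)]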
+use std::alloc::{GlobalAlloc, Layout, System};
+use std::sync::atomic::{AtomicU64, Ordering};
+
+static ALLOCATION_COUNT: AtomicU64 = AtomicU64::new(0);
+
+struct CountingAllocator;
+
+unsafe impl GlobalAlloc for CountingAllocator {
+    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
+        ALLOCATION_COUNT.fetch_add(1, Ordering::SeqCst);
+        System.alloc(layout)
+    }
+    
+    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
+        System.dealloc(ptr, layout);
+    }
+}
+
+#[global_allocator]
+static A: CountingAllocator = CountingAllocator;
+
+fn main() {
+    let x = ALLOCATION_COUNT.load(Ordering::SeqCst);
+    println!("There were {} allocations before calling main!", x);
+}
+```
+-- [Rust Playground](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=fb5060025ba79fc0f906b65a4ef8eb8e)
+
+As of the time of writing, there are five allocations that happen before `main`
+is ever called.
+
+But when we want to understand more practically where heap allocation happens,
+we'll follow this guide:
+
+- Smart pointers hold their contents in the heap
+- Collections are smart pointers for many objects at a time, and reallocate
+  when they need to grow
+
+Finally, there are two "addendum" issues that are important to address when discussing
+Rust and the heap:
+- Stack-based alternatives to some standard library types are available
+- Special allocators to track memory behavior are available
+
+# Smart pointers
+
+The first things to note are the "smart pointer" types.
+When you have data that must outlive the scope in which it is declared,
+or your data is of unknown or dynamic size, you'll make use of these types.
+
+The term [smart pointer](https://en.wikipedia.org/wiki/Smart_pointer)
+comes from C++, and while it's closely linked to a general design pattern of
+["Resource Acquisition Is Initialization"](https://en.cppreference.com/w/cpp/language/raii),
+we'll use it here specifically to describe objects that are responsible for managing
+ownership of data allocated on the heap. The smart pointers available in the `alloc`
+crate should look mostly familiar:
+- [`Box`](https://doc.rust-lang.org/alloc/boxed/struct.Box.html)
+- [`Rc`](https://doc.rust-lang.org/alloc/rc/struct.Rc.html)
+- [`Arc`](https://doc.rust-lang.org/alloc/sync/struct.Arc.html)
+- [`Cow`](https://doc.rust-lang.org/alloc/borrow/enum.Cow.html)
+
+The [standard library](https://doc.rust-lang.org/std/) also defines some smart pointers
+to manage heap objects, though more than can be covered here. Some examples:
+- [`RwLock`](https://doc.rust-lang.org/std/sync/struct.RwLock.html)
+- [`Mutex`](https://doc.rust-lang.org/std/sync/struct.Mutex.html)
+
+Finally, there is one ["gotcha"](https://www.merriam-webster.com/dictionary/gotcha):
+**cell types** (like [`RefCell`](https://doc.rust-lang.org/stable/core/cell/struct.RefCell.html))
+look and behave similarly, but **don't involve heap allocation**. The
+[`core::cell` docs](https://doc.rust-lang.org/stable/core/cell/index.html)
+have more information.
+
+When a smart pointer is created, the data it is given is placed in heap memory and
+the location of that data is recorded in the smart pointer. Once the smart pointer
+has determined it's safe to deallocate that memory (when a `Box` has
+[gone out of scope](https://doc.rust-lang.org/stable/std/boxed/index.html) or when
+the reference count for an object [goes to zero](https://doc.rust-lang.org/alloc/rc/index.html)),
+the heap space is reclaimed. We can prove these types use heap memory by
+looking at code:
+
+```rust
+use std::rc::Rc;
+use std::sync::Arc;
+use std::borrow::Cow;
+
+pub fn my_box() {
+    // Drop at assembly line 1640
+    Box::new(0);
+}
+
+pub fn my_rc() {
+    // Drop at assembly line 1650
+    Rc::new(0);
+}
+
+pub fn my_arc() {
+    // Drop at assembly line 1660
+    Arc::new(0);
+}
+
+pub fn my_cow() {
+    // Drop at assembly line 1672
+    Cow::from("drop");
+}
+```
+-- [Compiler Explorer](https://godbolt.org/z/4AMQug)
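+
+The reference-counting half of that description can also be watched at runtime.
+A minimal sketch, assuming a throwaway `Noisy` type (invented here) whose `Drop`
+implementation announces when the shared value is finally released:
+
+```rust
+use std::rc::Rc;
+
+// Hypothetical type that prints a message when it is dropped.
+struct Noisy(&'static str);
+
+impl Drop for Noisy {
+    fn drop(&mut self) {
+        println!("dropping {}", self.0);
+    }
+}
+
+fn main() {
+    let first = Rc::new(Noisy("shared"));
+    assert_eq!(Rc::strong_count(&first), 1);
+
+    // Cloning an `Rc` bumps the count; it does not create a second `Noisy`.
+    let second = Rc::clone(&first);
+    assert_eq!(Rc::strong_count(&first), 2);
+    assert!(Rc::ptr_eq(&first, &second));
+
+    drop(first);
+    assert_eq!(Rc::strong_count(&second), 1);
+    println!("one owner left");
+    // `second` goes out of scope here, the count reaches zero,
+    // and only then does the `Drop` implementation run.
+}
+```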
+
+# Collections
+
+Collection types use heap memory because their contents have dynamic size; they will request
+more memory [when needed](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.reserve),
+and can [release memory](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.shrink_to_fit)
+when it's no longer necessary. This dynamic property forces Rust to heap allocate
+everything they contain. In a way, **collections are smart pointers for many objects at once.**
+Common types that fall under this umbrella are
+[`Vec`](https://doc.rust-lang.org/stable/alloc/vec/struct.Vec.html),
+[`HashMap`](https://doc.rust-lang.org/stable/std/collections/struct.HashMap.html), and
+[`String`](https://doc.rust-lang.org/stable/alloc/string/struct.String.html)
+(not [`&str`](https://doc.rust-lang.org/std/primitive.str.html)).
+
+But while collections store the objects they own in heap memory, *creating new collections
+will not allocate on the heap*. This is a bit weird; if we call `Vec::new()`, the
+assembly shows a corresponding call to `real_drop_in_place`:
+
+```rust
+pub fn my_vec() {
+    // Drop in place at line 481
+    Vec::<u8>::new();
+}
+```
+-- [Compiler Explorer](https://godbolt.org/z/1WkNtC)
+
+But because the vector has no elements it is managing, no calls to the allocator
+will ever be dispatched:
+
+```rust
+use std::alloc::{GlobalAlloc, Layout, System};
+use std::sync::atomic::{AtomicBool, Ordering};
+
+fn main() {
+    // Turn on panicking if we allocate on the heap
+    DO_PANIC.store(true, Ordering::SeqCst);
+    
+    // Interesting bit happens here
+    let x: Vec<u8> = Vec::new();
+    drop(x);
+    
+    // Turn panicking back off, some deallocations occur
+    // after main as well.
+    DO_PANIC.store(false, Ordering::SeqCst);
+}
+
+#[global_allocator]
+static A: PanicAllocator = PanicAllocator;
+static DO_PANIC: AtomicBool = AtomicBool::new(false);
+struct PanicAllocator;
+
+unsafe impl GlobalAlloc for PanicAllocator {
+    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
+        if DO_PANIC.load(Ordering::SeqCst) {
+            panic!("Unexpected allocation.");
+        }
+        System.alloc(layout)
+    }
+    
+    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
+        if DO_PANIC.load(Ordering::SeqCst) {
+            panic!("Unexpected deallocation.");
+        }
+        System.dealloc(ptr, layout);
+    }
+}
+```
+-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=831a297d176d015b1f9ace01ae416cc6)
+
+Other standard library types follow the same behavior; make sure to check out
+[`HashMap::new()`](https://doc.rust-lang.org/std/collections/hash_map/struct.HashMap.html#method.new)
+and [`String::new()`](https://doc.rust-lang.org/std/string/struct.String.html#method.new).
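+
+A lighter-weight way to see the same thing, sketched with a hypothetical helper
+(`fresh_capacities`): each of these constructors is documented as creating an
+empty collection without allocating, and the reported capacity reflects that.
+
+```rust
+use std::collections::HashMap;
+
+// Hypothetical helper: capacities of freshly created collections.
+fn fresh_capacities() -> (usize, usize, usize) {
+    let v: Vec<u8> = Vec::new();
+    let s = String::new();
+    let m: HashMap<u32, u32> = HashMap::new();
+    (v.capacity(), s.capacity(), m.capacity())
+}
+
+fn main() {
+    // Nothing has been reserved yet for any of the three.
+    assert_eq!(fresh_capacities(), (0, 0, 0));
+
+    // The first push is what sends a request to the allocator.
+    let mut v: Vec<u8> = Vec::new();
+    v.push(1);
+    assert!(v.capacity() >= 1);
+    println!("capacity after one push: {}", v.capacity());
+}
+```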
+
+# Heap Alternatives
+
+While it is a bit strange for us to talk of the stack after spending time with the heap,
+it's worth pointing out that some heap-allocated objects in Rust have stack-based counterparts
+provided by other crates. If you have need of the functionality, but want to avoid allocating,
+there are some great alternatives.
+
+When it comes to some of the standard library smart pointers
+([`RwLock`](https://doc.rust-lang.org/std/sync/struct.RwLock.html) and
+[`Mutex`](https://doc.rust-lang.org/std/sync/struct.Mutex.html)), stack-based alternatives
+are provided in crates like [parking_lot](https://crates.io/crates/parking_lot) and
+[spin](https://crates.io/crates/spin). You can check out
+[`lock_api::RwLock`](https://docs.rs/lock_api/0.1.5/lock_api/struct.RwLock.html),
+[`lock_api::Mutex`](https://docs.rs/lock_api/0.1.5/lock_api/struct.Mutex.html), and
+[`spin::Once`](https://mvdnes.github.io/rust-docs/spin-rs/spin/struct.Once.html)
+if you're in need of synchronization primitives.
+
+[thread_id](https://crates.io/crates/thread-id)
+may still be necessary if you're implementing an allocator (*cough cough* the author *cough cough*)
+because [`thread::current().id()`](https://doc.rust-lang.org/std/thread/struct.ThreadId.html)
+[uses a `thread_local!` structure](https://doc.rust-lang.org/stable/src/std/sys_common/thread_info.rs.html#22-40)
+that needs heap allocation.
+
+# Tracing Allocators
+
+When writing performance-sensitive code, there's no alternative to measuring your code.
+If you didn't write a benchmark,
+[you don't care about its performance](https://www.youtube.com/watch?v=2EWejmkKlxs&feature=youtu.be&t=263).
+You should never rely on your instincts when
+[a microsecond is an eternity](https://www.youtube.com/watch?v=NH1Tta7purM).
+
+Similarly, there's great work going on in Rust with allocators that keep track of what
+they're doing. [`alloc_counter`](https://crates.io/crates/alloc_counter) was designed
+for exactly this purpose. When it comes to tracking heap behavior, you shouldn't just
+rely on the language; please measure and make sure that you have tools in place to catch
+any issues that come up.
--- a/_posts/2019-02-08-compiler-optimizations.md
+++ b/_posts/2019-02-08-compiler-optimizations.md
@ -0,0 +1,187 @@
+---
+layout: post
+title: "Compiler Optimizations: What It's Done Lately"
+description: "A lot. The answer is a lot."
+category: 
+tags: [rust, understanding-allocations]
+---
+
+Up to this point, we've been discussing memory usage in the Rust language
+by focusing on simple rules that are mostly right for small chunks of code.
+We've spent time showing how those rules work themselves out in practice,
+and become familiar with reading the assembly code needed to see each memory
+type (global, stack, heap) in action.
+
+But throughout the content so far, we've put a handicap on the code.
+In the name of consistent and understandable results, we've asked the
+compiler to pretty please leave the training wheels on. Now is the time
+where we throw out all the rules and take the kid gloves off. As it turns out,
+both the Rust compiler and the LLVM optimizers are incredibly sophisticated,
+and we'll step back and let them do their job.
+
+Similar to ["What Has My Compiler Done For Me Lately?"](https://www.youtube.com/watch?v=bSkpMdDe4g4),
+we're focusing on interesting things the Rust language (and LLVM!) can do
+as regards memory management. We'll still be looking at assembly code to
+understand what's going on, but it's important to mention again:
+**please use automated tools like
+[alloc-counter](https://crates.io/crates/alloc_counter) to double-check
+memory behavior if it's something you care about**.
+It's far too easy to misread assembly in large code sections; you should
+always have an automated tool verify behavior if you care about memory usage.
+
+The guiding principle as we move forward is this: *optimizing compilers
+won't produce worse assembly than we started with.* There won't be any
+situations where stack allocations get moved to heap allocations.
+There will, however, be an opera of optimization.
+
+# The Case of the Disappearing Box
+
+Our first optimization comes when LLVM can reason that the lifetime of an object
+is sufficiently short that heap allocations aren't necessary. In these cases,
+LLVM will move the allocation to the stack instead! The way this interacts
+with `#[inline]` attributes is a bit opaque, but the important part is that LLVM
+can sometimes do better than the baseline Rust language.
+
+```rust
+use std::alloc::{GlobalAlloc, Layout, System};
+use std::sync::atomic::{AtomicBool, Ordering};
+
+pub fn main() {
+    // Turn on panicking if we allocate on the heap
+    DO_PANIC.store(true, Ordering::SeqCst);
+    
+    // This code will only run with the mode set to "Release".
+    // If you try running in "Debug", you'll get a panic.
+    let x = Box::new(0);
+    drop(x);
+    
+    // Turn off panicking, as there are some deallocations
+    // when we exit main.
+    DO_PANIC.store(false, Ordering::SeqCst);
+}
+
+#[global_allocator]
+static A: PanicAllocator = PanicAllocator;
+static DO_PANIC: AtomicBool = AtomicBool::new(false);
+struct PanicAllocator;
+
+unsafe impl GlobalAlloc for PanicAllocator {
+    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
+        if DO_PANIC.load(Ordering::SeqCst) {
+            panic!("Unexpected allocation.");
+        }
+        System.alloc(layout)
+    }
+    
+    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
+        if DO_PANIC.load(Ordering::SeqCst) {
+            panic!("Unexpected deallocation.");
+        }
+        System.dealloc(ptr, layout);
+    }
+}
+```
+-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=614994a20e362bf04de868b19daf5ca4)
+
+# Vectors of Usual Size
+
+With some collections, LLVM can predict how large they will become
+and allocate the entire size on the stack instead of the heap.
+This works with both the pre-allocation (`Vec::with_capacity`)
+*and re-allocation* (`Vec::push`) methods for collection types.
+Not only can LLVM predict sizing if you reserve the full size up front,
+it can see through the resizing operations and find the total size.
+While this specific optimization is unlikely to come up in production
+usage, it's cool to note that LLVM does a considerable amount of work
+to understand what code actually does.
+
+```rust
+use std::alloc::{GlobalAlloc, Layout, System};
+use std::sync::atomic::{AtomicBool, Ordering};
+
+fn main() {
+    // Turn on panicking if we allocate on the heap
+    DO_PANIC.store(true, Ordering::SeqCst);
+    
+    // If the compiler can predict how large a vector will be,
+    // it can optimize out the heap storage needed. This also
+    // works with `Vec::with_capacity()`, but the push case
+    // is a bit more interesting.
+    let mut x: Vec<u64> = Vec::new();
+    x.push(12);
+    assert_eq!(x[0], 12);
+    drop(x);
+    
+    // Turn off panicking, as there are some deallocations
+    // when we exit main.
+    DO_PANIC.store(false, Ordering::SeqCst);
+}
+
+#[global_allocator]
+static A: PanicAllocator = PanicAllocator;
+static DO_PANIC: AtomicBool = AtomicBool::new(false);
+struct PanicAllocator;
+
+unsafe impl GlobalAlloc for PanicAllocator {
+    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
+        if DO_PANIC.load(Ordering::SeqCst) {
+            panic!("Unexpected allocation.");
+        }
+        System.alloc(layout)
+    }
+    
+    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
+        if DO_PANIC.load(Ordering::SeqCst) {
+            panic!("Unexpected deallocation.");
+        }
+        System.dealloc(ptr, layout);
+    }
+}
+```
+-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=1dfccfcf63d8800e644a3b948f1eeb7b)
+
+# Dr. Array or: How I Learned to Love the Optimizer
+
+Finally, this isn't so much about LLVM figuring out different memory behavior,
+but LLVM totally stripping out code that has no side effects. Optimizations of
+this type have a lot of nuance to them; if you're not careful, they can
+make your benchmarks look
+[impossibly good](https://www.youtube.com/watch?v=nXaxk27zwlk&feature=youtu.be&t=1199).
+In Rust, the `black_box` function (in both
+[`libtest`](https://doc.rust-lang.org/1.1.0/test/fn.black_box.html) and
+[`criterion`](https://docs.rs/criterion/0.2.10/criterion/fn.black_box.html))
+will tell the compiler to disable this kind of optimization. But if you let
+LLVM remove unnecessary code, you can end up with programs that
+would have previously caused errors running just fine:
+
+```rust
+#[derive(Default)]
+struct TwoFiftySix {
+    _a: [u64; 32]
+}
+
+#[derive(Default)]
+struct EightK {
+    _a: [TwoFiftySix; 32]
+}
+
+#[derive(Default)]
+struct TwoFiftySixK {
+    _a: [EightK; 32]
+}
+
+#[derive(Default)]
+struct EightM {
+    _a: [TwoFiftySixK; 32]
+}
+
+pub fn main() {
+    // Normally this blows up because we can't reserve size on stack
+    // for the `EightM` struct. But because the compiler notices we
+    // never do anything with `_x`, it optimizes out the stack storage
+    // and the program completes successfully.
+    let _x = EightM::default();
+}
+```
+-- [Compiler Explorer](https://godbolt.org/z/daHn7P)
+-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=4c253bf26072119896ab93c6ef064dc0)
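+
+For the opposite situation, where you want the work to survive optimization,
+here's a sketch using `std::hint::black_box`. That function is an assumption
+about your toolchain (it's in the standard library on stable from Rust 1.66) and
+isn't one of the two `black_box` functions linked above, though it plays the
+same role. It's a best-effort hint rather than a hard guarantee. The
+`sum_below` workload is made up for illustration:
+
+```rust
+use std::hint::black_box;
+
+// Hypothetical workload: sums the integers below `n`.
+fn sum_below(n: u64) -> u64 {
+    let mut total = 0;
+    for i in 0..n {
+        // `black_box` returns its argument unchanged, but asks the
+        // optimizer to treat the value as opaque, which discourages it
+        // from folding the whole loop into a constant.
+        total += black_box(i);
+    }
+    total
+}
+
+fn main() {
+    // Hiding the input as well keeps the call from being precomputed.
+    let result = sum_below(black_box(1_000));
+    println!("{}", result);
+}
+```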
--- a/_posts/2019-02-09-summary.md
+++ b/_posts/2019-02-09-summary.md
@ -0,0 +1,45 @@
+---
+layout: post
+title: "Summary: What Are the Rules?"
+description: "A synopsis and reference."
+category: 
+tags: [rust, understanding-allocations]
+---
+
+While there's a lot of interesting detail captured in this series, it's often helpful
+to have a document that answers some "yes/no" questions. You may not care about
+what an `Iterator` looks like in assembly, you just need to know whether it allocates
+an object on the heap or not.
+
+To that end, it should be said once again: if you care about memory behavior,
+use an allocator to verify the correct behavior. Tools like
+[`alloc_counter`](https://crates.io/crates/alloc_counter) are designed to make
+testing this behavior simple and easy.
+
+Finally, a summary of the content that's been covered. Rust will prioritize
+the fastest behavior it can, but here are the ground rules for understanding
+the memory model in Rust:
+
+**Heap Allocation**:
+- Smart pointers (`Box`, `Rc`, `Mutex`, etc.) allocate their contents in heap memory.
+- Collections (`HashMap`, `Vec`, `String`, etc.) allocate their contents in heap memory.
+- Some smart pointers in the standard library have counterparts in other crates that
+  don't need heap memory. If possible, use those.
+
+**Stack Allocation**:
+- Everything not using a smart pointer or collection type will be allocated on the stack.
+- Structs, enums, iterators, arrays, and closures are all stack allocated.
+- Cell types (`RefCell`) behave like smart pointers, but are stack-allocated.
+- Inlining (`#[inline]`) will not affect allocation behavior for better or worse.
+- Types that are marked `Copy` are guaranteed to have their contents stack-allocated.
+
+**Global Allocation**:
+- `const` is a fixed value; the compiler is allowed to copy it wherever useful.
+- `static` is a fixed reference; the compiler will guarantee it is unique.
+
+---
+
+And if you've read through this series: thanks. I've enjoyed the process that went
+into writing this, both in building new tools and forcing myself to understand
+the content well enough to explain it. I hope this is valuable as a reference to you
+as well.