mirror of https://github.com/bspeice/speice.io (synced 2024-11-14 22:18:10 -05:00)

Allocations in Rust series

parent 7426890685
commit 97f997dc99
@@ -12,7 +12,7 @@ bit over a month ago, I was dispensing sage wisdom for the ages:
 > I had a really great idea: build a custom allocator that allows you to track your own allocations.
 > I gave it a shot, but learned very quickly: **never write your own allocator.**
 >
-> -- [me](../2018-10-08-case-study-optimization)
+> -- [me](/2018/10/case-study-optimization)
 
 I proceeded to ignore it, because we never really learn from our mistakes.
blog/2019-02-04-understanding-allocations-in-rust/_article.md (new file, 113 lines)
@@ -0,0 +1,113 @@
---
layout: post
title: "Allocations in Rust"
description: "An introduction to the memory model."
category:
tags: [rust, understanding-allocations]
---

There's an alchemy of distilling complex technical topics into articles and videos that change the
way programmers see the tools they interact with on a regular basis. I knew what a linker was, but
there's a staggering amount of complexity in between
[the OS and `main()`](https://www.youtube.com/watch?v=dOfucXtyEsU). Rust programmers use the
[`Box`](https://doc.rust-lang.org/stable/std/boxed/struct.Box.html) type all the time, but there's a
rich history of the Rust language itself wrapped up in
[how special it is](https://manishearth.github.io/blog/2017/01/10/rust-tidbits-box-is-special/).

In a similar vein, this series attempts to look at code and understand how memory is used; the
complex choreography of operating system, compiler, and program that frees you to focus on
functionality far-flung from frivolous book-keeping. The Rust compiler relieves a great deal of the
cognitive burden associated with memory management, but we're going to step into its world for a
while.

Let's learn a bit about memory in Rust.

# Table of Contents

This series is intended as both learning and reference material; we'll work through the different
memory types Rust uses, and explain the implications of each. Ultimately, a summary will be provided
as a cheat sheet for easy future reference. To that end, a table of contents is in order:

- Foreword
- [Global Memory Usage: The Whole World](/2019/02/the-whole-world.html)
- [Fixed Memory: Stacking Up](/2019/02/stacking-up.html)
- [Dynamic Memory: A Heaping Helping](/2019/02/a-heaping-helping.html)
- [Compiler Optimizations: What It's Done For You Lately](/2019/02/compiler-optimizations.html)
- [Summary: What Are the Rules?](/2019/02/summary.html)

# Foreword

Rust's three defining features of
[Performance, Reliability, and Productivity](https://www.rust-lang.org/) are all driven to a great
degree by how the Rust compiler understands memory usage. Unlike managed-memory languages (Java,
Python), Rust
[doesn't really](https://words.steveklabnik.com/borrow-checking-escape-analysis-and-the-generational-hypothesis)
garbage collect; instead, it uses an
[ownership](https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html) system to reason about
how long objects will last in your program. In some cases, if the life of an object is fairly
transient, Rust can make use of a very fast region called the "stack." When that's not possible,
Rust uses
[dynamic (heap) memory](https://en.wikipedia.org/wiki/Memory_management#Dynamic_memory_allocation)
and the ownership system to ensure you can't accidentally corrupt memory. It's not as fast, but it
is important to have available.

That said, there are specific situations in Rust where you'd never need to worry about the
stack/heap distinction! If you:

1. Never use `unsafe`
2. Never use `#![feature(alloc)]` or the [`alloc` crate](https://doc.rust-lang.org/alloc/index.html)

...then it's not possible for you to use dynamic memory!

For some uses of Rust, typically embedded devices, these constraints are OK. They have very limited
memory, and the program binary size itself may significantly affect what's available! There's no
operating system able to manage this
["virtual memory"](https://en.wikipedia.org/wiki/Virtual_memory) thing, but that's not an issue
because there's only one running application. The
[embedonomicon](https://docs.rust-embedded.org/embedonomicon/preface.html) is ever in mind, and
interacting with the "real world" through extra peripherals is accomplished by reading and writing
to [specific memory addresses](https://bob.cs.sonoma.edu/IntroCompOrg-RPi/sec-gpio-mem.html).

Most Rust programs find these requirements overly burdensome, though. C++ developers would struggle
without access to [`std::vector`](https://en.cppreference.com/w/cpp/container/vector) (except those
hardcore no-STL people), and Rust developers would struggle without
[`std::vec`](https://doc.rust-lang.org/std/vec/struct.Vec.html). But with the constraints above,
`std::vec` is actually a part of the
[`alloc` crate](https://doc.rust-lang.org/alloc/vec/struct.Vec.html), and thus off-limits. `Box`,
`Rc`, etc., are also unusable for the same reason.

Whether writing code for embedded devices or not, the important thing in both situations is how much
you know _before your application starts_ about what its memory usage will look like. In embedded
devices, there's a small, fixed amount of memory to use. In a browser, you have no idea how large
[google.com](https://www.google.com)'s home page is until you start trying to download it. The
compiler uses this knowledge (or lack thereof) to optimize how memory is used; put simply, your code
runs faster when the compiler can guarantee exactly how much memory your program needs while it's
running. This series is all about understanding how the compiler reasons about your program, with an
emphasis on the implications for performance.

Now let's address some conditions and caveats before going much further:

- We'll focus on "safe" Rust only; `unsafe` lets you use platform-specific allocation APIs
  ([`malloc`](https://www.tutorialspoint.com/c_standard_library/c_function_malloc.htm)) that we'll
  ignore.
- We'll assume a "debug" build of Rust code (what you get with `cargo run` and `cargo test`) and
  address (pun intended) release mode at the end (`cargo run --release` and `cargo test --release`).
- All content will be run using Rust 1.32, as that's the highest currently supported in the
  [Compiler Explorer](https://godbolt.org/). As such, we'll avoid upcoming innovations like
  [compile-time evaluation of `static`](https://github.com/rust-lang/rfcs/blob/master/text/0911-const-fn.md)
  that are available in nightly.
- Because of the nature of the content, being able to read assembly is helpful. We'll keep it
  simple, but I [found](https://stackoverflow.com/a/4584131/1454178) a
  [refresher](https://stackoverflow.com/a/26026278/1454178) on the `push` and `pop`
  [instructions](http://www.cs.virginia.edu/~evans/cs216/guides/x86.html) was helpful while writing
  this.
- I've tried to be precise in saying only what I can prove using the tools (ASM, docs) that are
  available, but if there's something said in error it will be corrected expeditiously. Please let
  me know at [bradlee@speice.io](mailto:bradlee@speice.io).

Finally, I'll do what I can to flag potential future changes, but the Rust docs have a notice worth
repeating:

> Rust does not currently have a rigorously and formally defined memory model.
>
> -- [the docs](https://doc.rust-lang.org/std/ptr/fn.read_volatile.html)
blog/2019-02-04-understanding-allocations-in-rust/index.mdx (new file, 102 lines)
@@ -0,0 +1,102 @@
---
slug: 2019/02/understanding-allocations-in-rust
title: "Allocations in Rust: Foreword"
date: 2019-02-04 12:00:00
authors: [bspeice]
tags: []
---

There's an alchemy of distilling complex technical topics into articles and videos that change the
way programmers see the tools they interact with on a regular basis. I knew what a linker was, but
there's a staggering amount of complexity in between
[the OS and `main()`](https://www.youtube.com/watch?v=dOfucXtyEsU). Rust programmers use the
[`Box`](https://doc.rust-lang.org/stable/std/boxed/struct.Box.html) type all the time, but there's a
rich history of the Rust language itself wrapped up in
[how special it is](https://manishearth.github.io/blog/2017/01/10/rust-tidbits-box-is-special/).

In a similar vein, this series attempts to look at code and understand how memory is used; the
complex choreography of operating system, compiler, and program that frees you to focus on
functionality far-flung from frivolous book-keeping. The Rust compiler relieves a great deal of the
cognitive burden associated with memory management, but we're going to step into its world for a
while.

Let's learn a bit about memory in Rust.

<!-- truncate -->

---

Rust's three defining features of
[Performance, Reliability, and Productivity](https://www.rust-lang.org/) are all driven to a great
degree by how the Rust compiler understands memory usage. Unlike managed-memory languages (Java,
Python), Rust
[doesn't really](https://words.steveklabnik.com/borrow-checking-escape-analysis-and-the-generational-hypothesis)
garbage collect; instead, it uses an
[ownership](https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html) system to reason about
how long objects will last in your program. In some cases, if the life of an object is fairly
transient, Rust can make use of a very fast region called the "stack." When that's not possible,
Rust uses
[dynamic (heap) memory](https://en.wikipedia.org/wiki/Memory_management#Dynamic_memory_allocation)
and the ownership system to ensure you can't accidentally corrupt memory. It's not as fast, but it
is important to have available.

That said, there are specific situations in Rust where you'd never need to worry about the
stack/heap distinction! If you:

1. Never use `unsafe`
2. Never use `#![feature(alloc)]` or the [`alloc` crate](https://doc.rust-lang.org/alloc/index.html)

...then it's not possible for you to use dynamic memory!
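
As a minimal sketch of what that world looks like (the crate layout and function below are
hypothetical, not from this series), a `#![no_std]` library that never pulls in `alloc` has no
`Vec`, `Box`, or `String` to reach for; everything it touches lives on the stack or in static
memory:

```rust
// lib.rs -- a sketch of a library crate built without the standard library.
// With no `std` and no `alloc` crate, heap-backed types like `Vec` and `Box`
// aren't even nameable here; this function works purely with stack memory.
#![no_std]

/// Sum a fixed-size buffer; its size is known at compile time,
/// so no allocator is ever involved.
pub fn checksum(data: &[u8; 16]) -> u32 {
    let mut total: u32 = 0;
    for byte in data.iter() {
        total = total.wrapping_add(*byte as u32);
    }
    total
}
```

Built as a library crate, this compiles without any allocator being linked in at all.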

For some uses of Rust, typically embedded devices, these constraints are OK. They have very limited
memory, and the program binary size itself may significantly affect what's available! There's no
operating system able to manage this
["virtual memory"](https://en.wikipedia.org/wiki/Virtual_memory) thing, but that's not an issue
because there's only one running application. The
[embedonomicon](https://docs.rust-embedded.org/embedonomicon/preface.html) is ever in mind, and
interacting with the "real world" through extra peripherals is accomplished by reading and writing
to [specific memory addresses](https://bob.cs.sonoma.edu/IntroCompOrg-RPi/sec-gpio-mem.html).

Most Rust programs find these requirements overly burdensome, though. C++ developers would struggle
without access to [`std::vector`](https://en.cppreference.com/w/cpp/container/vector) (except those
hardcore no-STL people), and Rust developers would struggle without
[`std::vec`](https://doc.rust-lang.org/std/vec/struct.Vec.html). But with the constraints above,
`std::vec` is actually a part of the
[`alloc` crate](https://doc.rust-lang.org/alloc/vec/struct.Vec.html), and thus off-limits. `Box`,
`Rc`, etc., are also unusable for the same reason.

Whether writing code for embedded devices or not, the important thing in both situations is how much
you know _before your application starts_ about what its memory usage will look like. In embedded
devices, there's a small, fixed amount of memory to use. In a browser, you have no idea how large
[google.com](https://www.google.com)'s home page is until you start trying to download it. The
compiler uses this knowledge (or lack thereof) to optimize how memory is used; put simply, your code
runs faster when the compiler can guarantee exactly how much memory your program needs while it's
running. This series is all about understanding how the compiler reasons about your program, with an
emphasis on the implications for performance.

Now let's address some conditions and caveats before going much further:

- We'll focus on "safe" Rust only; `unsafe` lets you use platform-specific allocation APIs
  ([`malloc`](https://www.tutorialspoint.com/c_standard_library/c_function_malloc.htm)) that we'll
  ignore.
- We'll assume a "debug" build of Rust code (what you get with `cargo run` and `cargo test`) and
  address (pun intended) release mode at the end (`cargo run --release` and `cargo test --release`).
- All content will be run using Rust 1.32, as that's the highest currently supported in the
  [Compiler Explorer](https://godbolt.org/). As such, we'll avoid upcoming innovations like
  [compile-time evaluation of `static`](https://github.com/rust-lang/rfcs/blob/master/text/0911-const-fn.md)
  that are available in nightly.
- Because of the nature of the content, being able to read assembly is helpful. We'll keep it
  simple, but I [found](https://stackoverflow.com/a/4584131/1454178) a
  [refresher](https://stackoverflow.com/a/26026278/1454178) on the `push` and `pop`
  [instructions](http://www.cs.virginia.edu/~evans/cs216/guides/x86.html) was helpful while writing
  this.
- I've tried to be precise in saying only what I can prove using the tools (ASM, docs) that are
  available, but if there's something said in error it will be corrected expeditiously. Please let
  me know at [bradlee@speice.io](mailto:bradlee@speice.io).

Finally, I'll do what I can to flag potential future changes, but the Rust docs have a notice worth
repeating:

> Rust does not currently have a rigorously and formally defined memory model.
>
> -- [the docs](https://doc.rust-lang.org/std/ptr/fn.read_volatile.html)
blog/2019-02-05-the-whole-world/_article.md (new file, 337 lines)
@@ -0,0 +1,337 @@
---
layout: post
title: "Global Memory Usage: The Whole World"
description: "Static considered slightly less harmful."
category:
tags: [rust, understanding-allocations]
---

The first memory type we'll look at is pretty special: when Rust can prove that a _value_ is fixed
for the life of a program (`const`), and when a _reference_ is unique for the life of a program
(`static` as a declaration, not
[`'static`](https://doc.rust-lang.org/book/ch10-03-lifetime-syntax.html#the-static-lifetime) as a
lifetime), we can make use of global memory. This special section of data is embedded directly in
the program binary so that variables are ready to go once the program loads; no additional
computation is necessary.

Understanding the value/reference distinction is important for reasons we'll go into below, and
while the
[full specification](https://github.com/rust-lang/rfcs/blob/master/text/0246-const-vs-static.md) for
these two keywords is available, we'll take a hands-on approach to the topic.
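
As a quick sketch of that distinction before we dig in (the identifiers here are placeholders
rather than examples used later): a `const` names a _value_ the compiler may freely copy into each
use site, while a `static` names a single _location_ that every use refers to.

```rust
// A value fixed for the whole program; the compiler may inline or
// copy it freely, so it has no guaranteed unique address.
const MAX_RETRIES: u32 = 3;

// A single, globally unique location in memory; every use of
// `GREETING` refers to this one spot.
static GREETING: &str = "Hello";

fn main() {
    println!("{} (retries: {})", GREETING, MAX_RETRIES);
}
```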

# **const**

When a _value_ is guaranteed to be unchanging in your program (where "value" may be scalars,
`struct`s, etc.), you can declare it `const`. This tells the compiler that it's safe to treat the
value as never changing, and enables some interesting optimizations; not only is there no
initialization cost to creating the value (it is loaded at the same time as the executable parts of
your program), but the compiler can also copy the value around if it speeds up the code.

The points we need to address when talking about `const` are:

- `const` values are stored in read-only memory - it's impossible to modify them.
- Values resulting from calling a `const fn` are materialized at compile-time.
- The compiler may (or may not) copy `const` values wherever it chooses.

## Read-Only

The first point is a bit strange - "read-only memory."
[The Rust book](https://doc.rust-lang.org/book/ch03-01-variables-and-mutability.html#differences-between-variables-and-constants)
mentions in a couple places that using `mut` with constants is illegal, but it's also important to
demonstrate just how immutable they are. _Typically_ in Rust you can use
[interior mutability](https://doc.rust-lang.org/book/ch15-05-interior-mutability.html) to modify
things that aren't declared `mut`.
[`RefCell`](https://doc.rust-lang.org/std/cell/struct.RefCell.html) provides an example of this
pattern in action:

```rust
use std::cell::RefCell;

fn my_mutator(cell: &RefCell<u8>) {
    // Even though we're given an immutable reference,
    // the `replace` method allows us to modify the inner value.
    cell.replace(14);
}

fn main() {
    let cell = RefCell::new(25);
    // Prints out 25
    println!("Cell: {:?}", cell);
    my_mutator(&cell);
    // Prints out 14
    println!("Cell: {:?}", cell);
}
```

--
[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8e4bea1a718edaff4507944e825a54b2)

When `const` is involved though, interior mutability is impossible:

```rust
use std::cell::RefCell;

const CELL: RefCell<u8> = RefCell::new(25);

fn my_mutator(cell: &RefCell<u8>) {
    cell.replace(14);
}

fn main() {
    // First line prints 25 as expected
    println!("Cell: {:?}", &CELL);
    my_mutator(&CELL);
    // Second line *still* prints 25
    println!("Cell: {:?}", &CELL);
}
```

--
[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=88fe98110c33c1b3a51e341f48b8ae00)

And a second example using [`Once`](https://doc.rust-lang.org/std/sync/struct.Once.html):

```rust
use std::sync::Once;

const SURPRISE: Once = Once::new();

fn main() {
    // This is how `Once` is supposed to be used
    SURPRISE.call_once(|| println!("Initializing..."));
    // Because `Once` is a `const` value, we never record it
    // having been initialized the first time, and this closure
    // will also execute.
    SURPRISE.call_once(|| println!("Initializing again???"));
}
```

--
[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=c3cc5979b5e5434eca0f9ec4a06ee0ed)

When the
[`const` specification](https://github.com/rust-lang/rfcs/blob/26197104b7bb9a5a35db243d639aee6e46d35d75/text/0246-const-vs-static.md)
refers to ["rvalues"](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3055.pdf), this
behavior is what it's referring to. [Clippy](https://github.com/rust-lang/rust-clippy) will treat
this as an error, but it's still something to be aware of.

## Initialization == Compilation

The next thing to mention is that `const` values are loaded into memory _as part of your program
binary_. Because of this, any `const` values declared in your program will be "realized" at
compile-time; accessing them may trigger a main-memory lookup (with a fixed address, so your CPU may
be able to prefetch the value), but that's it.

```rust
use std::cell::RefCell;

const CELL: RefCell<u32> = RefCell::new(24);

pub fn multiply(value: u32) -> u32 {
    // CELL is stored at `.L__unnamed_1`
    value * (*CELL.get_mut())
}
```

-- [Compiler Explorer](https://godbolt.org/z/Th8boO)

The compiler creates one `RefCell`, uses it everywhere, and never needs to call the `RefCell::new`
function.

## Copying

If it's helpful though, the compiler can choose to copy `const` values.

```rust
const FACTOR: u32 = 1000;

pub fn multiply(value: u32) -> u32 {
    // See assembly line 4 for the `mov edi, 1000` instruction
    value * FACTOR
}

pub fn multiply_twice(value: u32) -> u32 {
    // See assembly lines 22 and 29 for `mov edi, 1000` instructions
    value * FACTOR * FACTOR
}
```

-- [Compiler Explorer](https://godbolt.org/z/ZtS54X)

In this example, the `FACTOR` value is turned into the `mov edi, 1000` instruction in both the
`multiply` and `multiply_twice` functions; the "1000" value is never "stored" anywhere, as it's
small enough to inline into the assembly instructions.

Finally, getting the address of a `const` value is possible, but not guaranteed to be unique
(because the compiler can choose to copy values). I was unable to get non-unique pointers in my
testing (even using different crates), but the specifications are clear enough: _don't rely on
pointers to `const` values being consistent_. To be frank, caring about locations for `const` values
is almost certainly a code smell.

# **static**

Static variables are related to `const` variables, but take a slightly different approach. When we
declare that a _reference_ is unique for the life of a program, we have a `static` variable
(unrelated to the `'static` lifetime). Because of the reference/value distinction with
`const`/`static`, static variables behave much more like typical "global" variables.

But to understand `static`, here's what we'll look at:

- `static` variables are globally unique locations in memory.
- Like `const`, `static` variables are loaded at the same time as your program is read into
  memory.
- All `static` variables must implement the
  [`Sync`](https://doc.rust-lang.org/std/marker/trait.Sync.html) marker trait.
- Interior mutability is safe and acceptable when using `static` variables.

## Memory Uniqueness

The single biggest difference between `const` and `static` is the guarantees provided about
uniqueness. Where `const` variables may or may not be copied in code, `static` variables are
guaranteed to be unique. If we take a previous `const` example and change it to `static`, the
difference should be clear:

```rust
static FACTOR: u32 = 1000;

pub fn multiply(value: u32) -> u32 {
    // The assembly to `mul dword ptr [rip + example::FACTOR]` is how FACTOR gets used
    value * FACTOR
}

pub fn multiply_twice(value: u32) -> u32 {
    // The assembly to `mul dword ptr [rip + example::FACTOR]` is how FACTOR gets used
    value * FACTOR * FACTOR
}
```

-- [Compiler Explorer](https://godbolt.org/z/uxmiRQ)

Where [previously](#copying) there were plenty of references to multiplying by 1000, the new
assembly refers to `FACTOR` as a named memory location instead. No initialization work needs to be
done, but the compiler can no longer prove the value never changes during execution.

## Initialization == Compilation

Next, let's talk about initialization. The simplest case is initializing static variables with
either scalar or struct notation:

```rust
#[derive(Debug)]
struct MyStruct {
    x: u32
}

static MY_STRUCT: MyStruct = MyStruct {
    // You can even reference other statics
    // declared later
    x: MY_VAL
};

static MY_VAL: u32 = 24;

fn main() {
    println!("Static MyStruct: {:?}", MY_STRUCT);
}
```

--
[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=b538dbc46076f12db047af4f4403ee6e)

Things can get a bit weirder when using `const fn` though. In most cases, it just works:

```rust
#[derive(Debug)]
struct MyStruct {
    x: u32
}

impl MyStruct {
    const fn new() -> MyStruct {
        MyStruct { x: 24 }
    }
}

static MY_STRUCT: MyStruct = MyStruct::new();

fn main() {
    println!("const fn Static MyStruct: {:?}", MY_STRUCT);
}
```

--
[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8c796a6e7fc273c12115091b707b0255)

However, there's a caveat: you're currently not allowed to use `const fn` to initialize static
variables of types that aren't marked `Sync`. For example,
[`RefCell::new()`](https://doc.rust-lang.org/std/cell/struct.RefCell.html#method.new) is a
`const fn`, but because
[`RefCell` isn't `Sync`](https://doc.rust-lang.org/std/cell/struct.RefCell.html#impl-Sync), you'll
get an error at compile time:

```rust
use std::cell::RefCell;

// error[E0277]: `std::cell::RefCell<u8>` cannot be shared between threads safely
static MY_LOCK: RefCell<u8> = RefCell::new(0);
```

--
[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=c76ef86e473d07117a1700e21fd45560)

It's likely that this will
[change in the future](https://github.com/rust-lang/rfcs/blob/master/text/0911-const-fn.md) though.

## **Sync**

Which leads well to the next point: static variable types must implement the
[`Sync` marker](https://doc.rust-lang.org/std/marker/trait.Sync.html). Because they're globally
unique, it must be safe for you to access static variables from any thread at any time. Most
`struct` definitions automatically implement the `Sync` trait because they contain only elements
which themselves implement `Sync` (read more in the
[Nomicon](https://doc.rust-lang.org/nomicon/send-and-sync.html)). This is why earlier examples could
get away with initializing statics, even though we never included an `impl Sync for MyStruct` in the
code. To demonstrate this property, Rust refuses to compile our earlier example if we add a
non-`Sync` element to the `struct` definition:

```rust
use std::cell::RefCell;

struct MyStruct {
    x: u32,
    y: RefCell<u8>,
}

// error[E0277]: `std::cell::RefCell<u8>` cannot be shared between threads safely
static MY_STRUCT: MyStruct = MyStruct {
    x: 8,
    y: RefCell::new(8)
};
```

--
[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=40074d0248f056c296b662dbbff97cfc)

## Interior Mutability

Finally, while `static mut` variables are allowed, mutating them is an `unsafe` operation. If we
want to stay in `safe` Rust, we can use interior mutability to accomplish similar goals:

```rust
use std::sync::Once;

// This example adapted from https://doc.rust-lang.org/std/sync/struct.Once.html#method.call_once
static INIT: Once = Once::new();

fn main() {
    // Note that while `INIT` is declared immutable, we're still allowed
    // to mutate its interior
    INIT.call_once(|| println!("Initializing..."));
    // This code won't panic, as the interior of INIT was modified
    // as part of the previous `call_once`
    INIT.call_once(|| panic!("INIT was called twice!"));
}
```

--
[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=3ba003a981a7ed7400240caadd384d59)
blog/2019-02-05-the-whole-world/index.mdx (new file, 339 lines)
@@ -0,0 +1,339 @@
---
slug: 2019/02/the-whole-world
title: "Allocations in Rust: Global memory"
date: 2019-02-05 12:00:00
authors: [bspeice]
tags: []
---

The first memory type we'll look at is pretty special: when Rust can prove that a _value_ is fixed
for the life of a program (`const`), and when a _reference_ is unique for the life of a program
(`static` as a declaration, not
[`'static`](https://doc.rust-lang.org/book/ch10-03-lifetime-syntax.html#the-static-lifetime) as a
lifetime), we can make use of global memory. This special section of data is embedded directly in
the program binary so that variables are ready to go once the program loads; no additional
computation is necessary.

Understanding the value/reference distinction is important for reasons we'll go into below, and
while the
[full specification](https://github.com/rust-lang/rfcs/blob/master/text/0246-const-vs-static.md) for
these two keywords is available, we'll take a hands-on approach to the topic.

<!-- truncate -->

## `const` values

When a _value_ is guaranteed to be unchanging in your program (where "value" may be scalars,
`struct`s, etc.), you can declare it `const`. This tells the compiler that it's safe to treat the
value as never changing, and enables some interesting optimizations; not only is there no
initialization cost to creating the value (it is loaded at the same time as the executable parts of
your program), but the compiler can also copy the value around if it speeds up the code.

The points we need to address when talking about `const` are:

- `const` values are stored in read-only memory - it's impossible to modify them.
- Values resulting from calling a `const fn` are materialized at compile-time.
- The compiler may (or may not) copy `const` values wherever it chooses.

### Read-Only

The first point is a bit strange - "read-only memory."
[The Rust book](https://doc.rust-lang.org/book/ch03-01-variables-and-mutability.html#differences-between-variables-and-constants)
mentions in a couple places that using `mut` with constants is illegal, but it's also important to
demonstrate just how immutable they are. _Typically_ in Rust you can use
[interior mutability](https://doc.rust-lang.org/book/ch15-05-interior-mutability.html) to modify
things that aren't declared `mut`.
[`RefCell`](https://doc.rust-lang.org/std/cell/struct.RefCell.html) provides an example of this
pattern in action:

```rust
use std::cell::RefCell;

fn my_mutator(cell: &RefCell<u8>) {
    // Even though we're given an immutable reference,
    // the `replace` method allows us to modify the inner value.
    cell.replace(14);
}

fn main() {
    let cell = RefCell::new(25);
    // Prints out 25
    println!("Cell: {:?}", cell);
    my_mutator(&cell);
    // Prints out 14
    println!("Cell: {:?}", cell);
}
```

--
[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8e4bea1a718edaff4507944e825a54b2)

When `const` is involved though, interior mutability is impossible:

```rust
use std::cell::RefCell;

const CELL: RefCell<u8> = RefCell::new(25);

fn my_mutator(cell: &RefCell<u8>) {
    cell.replace(14);
}

fn main() {
    // First line prints 25 as expected
    println!("Cell: {:?}", &CELL);
    my_mutator(&CELL);
    // Second line *still* prints 25
    println!("Cell: {:?}", &CELL);
}
```

--
[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=88fe98110c33c1b3a51e341f48b8ae00)

And a second example using [`Once`](https://doc.rust-lang.org/std/sync/struct.Once.html):

```rust
use std::sync::Once;

const SURPRISE: Once = Once::new();

fn main() {
    // This is how `Once` is supposed to be used
    SURPRISE.call_once(|| println!("Initializing..."));
    // Because `Once` is a `const` value, we never record it
    // having been initialized the first time, and this closure
    // will also execute.
    SURPRISE.call_once(|| println!("Initializing again???"));
}
```

--
[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=c3cc5979b5e5434eca0f9ec4a06ee0ed)

When the
[`const` specification](https://github.com/rust-lang/rfcs/blob/26197104b7bb9a5a35db243d639aee6e46d35d75/text/0246-const-vs-static.md)
refers to ["rvalues"](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3055.pdf), this
behavior is what it's referring to. [Clippy](https://github.com/rust-lang/rust-clippy) will treat
this as an error, but it's still something to be aware of.

### Initialization

The next thing to mention is that `const` values are loaded into memory _as part of your program
binary_. Because of this, any `const` values declared in your program will be "realized" at
compile-time; accessing them may trigger a main-memory lookup (with a fixed address, so your CPU may
be able to prefetch the value), but that's it.

```rust
use std::cell::RefCell;

const CELL: RefCell<u32> = RefCell::new(24);

pub fn multiply(value: u32) -> u32 {
    // CELL is stored at `.L__unnamed_1`
    value * (*CELL.get_mut())
}
```

-- [Compiler Explorer](https://godbolt.org/z/Th8boO)

The compiler creates one `RefCell`, uses it everywhere, and never needs to call the `RefCell::new`
function.

### Copying

If it's helpful though, the compiler can choose to copy `const` values.

```rust
const FACTOR: u32 = 1000;

pub fn multiply(value: u32) -> u32 {
    // See assembly line 4 for the `mov edi, 1000` instruction
    value * FACTOR
}

pub fn multiply_twice(value: u32) -> u32 {
    // See assembly lines 22 and 29 for `mov edi, 1000` instructions
    value * FACTOR * FACTOR
}
```

-- [Compiler Explorer](https://godbolt.org/z/ZtS54X)

In this example, the `FACTOR` value is turned into the `mov edi, 1000` instruction in both the
`multiply` and `multiply_twice` functions; the "1000" value is never "stored" anywhere, as it's
small enough to inline into the assembly instructions.

Finally, getting the address of a `const` value is possible, but not guaranteed to be unique
(because the compiler can choose to copy values). I was unable to get non-unique pointers in my
testing (even using different crates), but the specifications are clear enough: _don't rely on
pointers to `const` values being consistent_. To be frank, caring about locations for `const` values
is almost certainly a code smell.
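
To see why, here's a small sketch (not from the original post) that takes the address of the same
`const` twice; whether the two pointers compare equal is entirely up to the compiler:

```rust
const FACTOR: u32 = 1000;

fn main() {
    // Each use of `FACTOR` may reference a freshly materialized copy,
    // so these two addresses may or may not be the same.
    let first: *const u32 = &FACTOR;
    let second: *const u32 = &FACTOR;
    println!("same address: {}", first == second);
}
```

The printed result isn't something to depend on either way; the point is simply that no guarantee
exists.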

## `static` values

Static variables are related to `const` variables, but take a slightly different approach. When we
declare that a _reference_ is unique for the life of a program, we have a `static` variable
(unrelated to the `'static` lifetime). Because of the reference/value distinction with
`const`/`static`, static variables behave much more like typical "global" variables.

But to understand `static`, here's what we'll look at:

- `static` variables are globally unique locations in memory.
- Like `const`, `static` variables are loaded at the same time as your program is read into
  memory.
- All `static` variables must implement the
  [`Sync`](https://doc.rust-lang.org/std/marker/trait.Sync.html) marker trait.
- Interior mutability is safe and acceptable when using `static` variables.

### Memory Uniqueness

The single biggest difference between `const` and `static` is the guarantees provided about
uniqueness. Where `const` variables may or may not be copied in code, `static` variables are
guaranteed to be unique. If we take a previous `const` example and change it to `static`, the
difference should be clear:

```rust
static FACTOR: u32 = 1000;

pub fn multiply(value: u32) -> u32 {
    // The assembly to `mul dword ptr [rip + example::FACTOR]` is how FACTOR gets used
    value * FACTOR
}

pub fn multiply_twice(value: u32) -> u32 {
    // The assembly to `mul dword ptr [rip + example::FACTOR]` is how FACTOR gets used
    value * FACTOR * FACTOR
}
```

-- [Compiler Explorer](https://godbolt.org/z/uxmiRQ)

Where [previously](#copying) there were plenty of references to multiplying by 1000, the new
assembly refers to `FACTOR` as a named memory location instead. No initialization work needs to be
done, but the compiler can no longer prove the value never changes during execution.

### Initialization

Next, let's talk about initialization. The simplest case is initializing static variables with
either scalar or struct notation:

```rust
#[derive(Debug)]
struct MyStruct {
    x: u32
}

static MY_STRUCT: MyStruct = MyStruct {
    // You can even reference other statics
    // declared later
    x: MY_VAL
};

static MY_VAL: u32 = 24;

fn main() {
    println!("Static MyStruct: {:?}", MY_STRUCT);
}
```

--
[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=b538dbc46076f12db047af4f4403ee6e)

Things can get a bit weirder when using `const fn` though. In most cases, it just works:

```rust
#[derive(Debug)]
struct MyStruct {
    x: u32
}

impl MyStruct {
    const fn new() -> MyStruct {
        MyStruct { x: 24 }
    }
}

static MY_STRUCT: MyStruct = MyStruct::new();

fn main() {
    println!("const fn Static MyStruct: {:?}", MY_STRUCT);
}
```

--
[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8c796a6e7fc273c12115091b707b0255)

However, there's a caveat: you're currently not allowed to use `const fn` to initialize static
variables of types that aren't marked `Sync`. For example,
[`RefCell::new()`](https://doc.rust-lang.org/std/cell/struct.RefCell.html#method.new) is a
`const fn`, but because
[`RefCell` isn't `Sync`](https://doc.rust-lang.org/std/cell/struct.RefCell.html#impl-Sync), you'll
get an error at compile time:

```rust
use std::cell::RefCell;

// error[E0277]: `std::cell::RefCell<u8>` cannot be shared between threads safely
static MY_LOCK: RefCell<u8> = RefCell::new(0);
```

--
[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=c76ef86e473d07117a1700e21fd45560)

It's likely that this will
[change in the future](https://github.com/rust-lang/rfcs/blob/master/text/0911-const-fn.md) though.

### The `Sync` marker

Which leads well to the next point: static variable types must implement the
[`Sync` marker](https://doc.rust-lang.org/std/marker/trait.Sync.html). Because they're globally
unique, it must be safe for you to access static variables from any thread at any time. Most
`struct` definitions automatically implement the `Sync` trait because they contain only elements
which themselves implement `Sync` (read more in the
[Nomicon](https://doc.rust-lang.org/nomicon/send-and-sync.html)). This is why earlier examples could
get away with initializing statics, even though we never included an `impl Sync for MyStruct` in the
code. To demonstrate this property, Rust refuses to compile our earlier example if we add a
non-`Sync` element to the `struct` definition:

```rust
use std::cell::RefCell;

struct MyStruct {
    x: u32,
    y: RefCell<u8>,
}

// error[E0277]: `std::cell::RefCell<u8>` cannot be shared between threads safely
static MY_STRUCT: MyStruct = MyStruct {
    x: 8,
    y: RefCell::new(8)
};
```

--
[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=40074d0248f056c296b662dbbff97cfc)

### Interior mutability

Finally, while `static mut` variables are allowed, mutating them is an `unsafe` operation. If we
want to stay in `safe` Rust, we can use interior mutability to accomplish similar goals:

```rust
use std::sync::Once;

// This example adapted from https://doc.rust-lang.org/std/sync/struct.Once.html#method.call_once
static INIT: Once = Once::new();

fn main() {
    // Note that while `INIT` is declared immutable, we're still allowed
    // to mutate its interior
    INIT.call_once(|| println!("Initializing..."));
    // This code won't panic, as the interior of INIT was modified
    // as part of the previous `call_once`
    INIT.call_once(|| panic!("INIT was called twice!"));
}
```

--
[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=3ba003a981a7ed7400240caadd384d59)
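
`Once` isn't the only tool here; as another sketch of the same pattern (not part of the original
post), the atomic types in `std::sync::atomic` also allow safe mutation of a `static`:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// `COUNTER` is never declared `mut`, but atomics provide
// interior mutability that's safe to use across threads.
static COUNTER: AtomicUsize = AtomicUsize::new(0);

fn main() {
    for _ in 0..3 {
        COUNTER.fetch_add(1, Ordering::SeqCst);
    }
    // Prints "Counter: 3"
    println!("Counter: {}", COUNTER.load(Ordering::SeqCst));
}
```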

blog/2019-02-06-stacking-up/_article.md (new file, 601 lines)
@@ -0,0 +1,601 @@
---
layout: post
title: "Fixed Memory: Stacking Up"
description: "We don't need no allocator."
category:
tags: [rust, understanding-allocations]
---

`const` and `static` are perfectly fine, but it's relatively rare that we know at compile-time about
either values or references that will be the same for the duration of our program. Put another way,
it's not often the case that either you or your compiler knows how much memory your entire program
will ever need.

However, there are still some optimizations the compiler can do if it knows how much memory
individual functions will need. Specifically, the compiler can make use of "stack" memory (as
opposed to "heap" memory) which can be managed far faster in both the short- and long-term. When
requesting memory, the [`push` instruction](http://www.cs.virginia.edu/~evans/cs216/guides/x86.html)
can typically complete in [1 or 2 cycles](https://agner.org/optimize/instruction_tables.ods) (<1
nanosecond on modern CPUs). Contrast that to heap memory, which requires an allocator (specialized
software to track what memory is in use) to reserve space. When you're finished with stack memory,
the `pop` instruction runs in 1-3 cycles, as opposed to an allocator needing to worry about memory
fragmentation and other issues with the heap. All sorts of incredibly sophisticated techniques have
been used to design allocators:

- [Garbage Collection](<https://en.wikipedia.org/wiki/Garbage_collection_(computer_science)>)
  strategies like [Tracing](https://en.wikipedia.org/wiki/Tracing_garbage_collection) (used in
  [Java](https://www.oracle.com/technetwork/java/javase/tech/g1-intro-jsp-135488.html)) and
  [Reference counting](https://en.wikipedia.org/wiki/Reference_counting) (used in
  [Python](https://docs.python.org/3/extending/extending.html#reference-counts))
- Thread-local structures to prevent locking the allocator in
  [tcmalloc](https://jamesgolick.com/2013/5/19/how-tcmalloc-works.html)
- Arena structures used in [jemalloc](http://jemalloc.net/), which
  [until recently](https://blog.rust-lang.org/2019/01/17/Rust-1.32.0.html#jemalloc-is-removed-by-default)
  was the primary allocator for Rust programs!

But no matter how fast your allocator is, the principle remains: the fastest allocator is the one
you never use. As such, we're not going to discuss how exactly the
[`push` and `pop` instructions work](http://www.cs.virginia.edu/~evans/cs216/guides/x86.html), but
we'll focus instead on the conditions that enable the Rust compiler to use faster stack-based
allocation for variables.

So, **how do we know when Rust will or will not use stack allocation for objects we create?**
Looking at other languages, it's often easy to delineate between stack and heap. Managed memory
languages (Python, Java,
[C#](https://blogs.msdn.microsoft.com/ericlippert/2010/09/30/the-truth-about-value-types/)) place
everything on the heap. JIT compilers ([PyPy](https://www.pypy.org/),
[HotSpot](https://www.oracle.com/technetwork/java/javase/tech/index-jsp-136373.html)) may optimize
some heap allocations away, but you should never assume it will happen. C makes things clear with
calls to special functions (like [malloc(3)](https://linux.die.net/man/3/malloc)) needed to access
heap memory. Old C++ has the [`new`](https://stackoverflow.com/a/655086/1454178) keyword, though
modern C++/C++11 is more complicated with [RAII](https://en.cppreference.com/w/cpp/language/raii).

For Rust, we can summarize as follows: **stack allocation will be used for everything that doesn't
involve "smart pointers" and collections**. We'll skip over a precise definition of the term "smart
pointer" for now, and instead discuss what we should watch for to understand when stack and heap
memory regions are used:

1. Stack manipulation instructions (`push`, `pop`, and `add`/`sub` of the `rsp` register) indicate
   allocation of stack memory:

   ```rust
   pub fn stack_alloc(x: u32) -> u32 {
       // Space for `y` is allocated by subtracting from `rsp`,
       // and then populated
       let y = [1u8, 2, 3, 4];
       // Space for `y` is deallocated by adding back to `rsp`
       x
   }
   ```

   -- [Compiler Explorer](https://godbolt.org/z/5WSgc9)

2. Tracking when exactly heap allocation calls occur is difficult. It's typically easier to watch
   for `call core::ptr::real_drop_in_place`, and infer that a heap allocation happened in the recent
   past:

   ```rust
   pub fn heap_alloc(x: usize) -> usize {
       // Space for elements in a vector has to be allocated
       // on the heap, and is then de-allocated once the
       // vector goes out of scope
       let y: Vec<u8> = Vec::with_capacity(x);
       x
   }
   ```

   -- [Compiler Explorer](https://godbolt.org/z/epfgoQ) (`real_drop_in_place` happens on line 1317)

   <span style="font-size: .8em">Note: While the
   [`Drop` trait](https://doc.rust-lang.org/std/ops/trait.Drop.html) is
   [called for stack-allocated objects](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=87edf374d8983816eb3d8cfeac657b46),
   the Rust standard library only defines `Drop` implementations for types that involve heap
   allocation.</span>

3. If you don't want to inspect the assembly, use a custom allocator that's able to track and alert
   when heap allocations occur. Crates like
   [`alloc_counter`](https://crates.io/crates/alloc_counter) are designed for exactly this purpose;
   a hand-rolled sketch of the same idea follows below.
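
To make that concrete, here's a rough, hand-rolled sketch of the idea (not the `alloc_counter` API
itself, and it does lean on `unsafe` trait impls that the rest of this series avoids): a
`#[global_allocator]` that forwards to the system allocator while counting every allocation call.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// A minimal counting allocator: forward to the system allocator,
// but keep a tally of how many times `alloc` is called.
struct CountingAllocator;

static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::SeqCst);
        System.alloc(layout)
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        System.dealloc(ptr, layout)
    }
}

#[global_allocator]
static GLOBAL: CountingAllocator = CountingAllocator;

fn main() {
    let before = ALLOCATIONS.load(Ordering::SeqCst);
    let v: Vec<u8> = Vec::with_capacity(16);
    let after = ALLOCATIONS.load(Ordering::SeqCst);
    // The `Vec` above triggers exactly one heap allocation.
    println!("heap allocations for the Vec: {}", after - before);
    drop(v);
}
```

Wrapping a suspect code path between two reads of the counter is usually enough to confirm whether
it touched the heap.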
|
||||||
|
|
||||||
|
With all that in mind, let's talk about situations in which we're guaranteed to use stack memory:
|
||||||
|
|
||||||
|
- Structs are created on the stack.
|
||||||
|
- Function arguments are passed on the stack, meaning the
|
||||||
|
[`#[inline]` attribute](https://doc.rust-lang.org/reference/attributes.html#inline-attribute) will
|
||||||
|
not change the memory region used.
|
||||||
|
- Enums and unions are stack-allocated.
|
||||||
|
- [Arrays](https://doc.rust-lang.org/std/primitive.array.html) are always stack-allocated.
|
||||||
|
- Closures capture their arguments on the stack.
|
||||||
|
- Generics will use stack allocation, even with dynamic dispatch.
|
||||||
|
- [`Copy`](https://doc.rust-lang.org/std/marker/trait.Copy.html) types are guaranteed to be
|
||||||
|
stack-allocated, and copying them will be done in stack memory.
|
||||||
|
- [`Iterator`s](https://doc.rust-lang.org/std/iter/trait.Iterator.html) in the standard library are
|
||||||
|
stack-allocated even when iterating over heap-based collections.
|
||||||
|
|
||||||
|
# Structs
|
||||||
|
|
||||||
|
The simplest case comes first. When creating vanilla `struct` objects, we use stack memory to hold
|
||||||
|
their contents:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
struct Point {
|
||||||
|
x: u64,
|
||||||
|
y: u64,
|
||||||
|
}
|
||||||
|
|
||||||
|
struct Line {
|
||||||
|
a: Point,
|
||||||
|
b: Point,
|
||||||
|
}
|
||||||
|
|
||||||
|
pub fn make_line() {
|
||||||
|
// `origin` is stored in the first 16 bytes of memory
|
||||||
|
// starting at location `rsp`
|
||||||
|
let origin = Point { x: 0, y: 0 };
|
||||||
|
// `point` makes up the next 16 bytes of memory
|
||||||
|
let point = Point { x: 1, y: 2 };
|
||||||
|
|
||||||
|
// When creating `ray`, we just move the content out of
|
||||||
|
// `origin` and `point` into the next 32 bytes of memory
|
||||||
|
let ray = Line { a: origin, b: point };
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
-- [Compiler Explorer](https://godbolt.org/z/vri9BE)
|
||||||
|
|
||||||
|
Note that while some extra-fancy instructions are used for memory manipulation in the assembly, the
|
||||||
|
`sub rsp, 64` instruction indicates we're still working with the stack.
|
||||||
|
|
||||||
|
# Function arguments

Have you ever wondered how functions communicate with each other? Like, once the variables are given
to you, everything's fine. But how do you "give" those variables to another function? How do you get
the results back afterward? The answer: the compiler arranges memory and assembly instructions using
a pre-determined [calling convention](http://llvm.org/docs/LangRef.html#calling-conventions). This
convention governs the rules around where arguments needed by a function will be located (either in
memory offsets relative to the stack pointer `rsp`, or in other registers), and where the results
can be found once the function has finished. And when multiple languages agree on what the calling
conventions are, you can do things like having [Go call Rust code](https://blog.filippo.io/rustgo/)!

Put simply: it's the compiler's job to figure out how to call other functions, and you can assume
that the compiler is good at its job.

We can see this in action using a simple example:

```rust
struct Point {
    x: i64,
    y: i64,
}

// We use integer division operations to keep
// the assembly clean, understanding the result
// isn't accurate.
fn distance(a: &Point, b: &Point) -> i64 {
    // Immediately subtract from `rsp` the bytes needed
    // to hold all the intermediate results - this is
    // the stack allocation step

    // The compiler used the `rdi` and `rsi` registers
    // to pass our arguments, so read them in
    let x1 = a.x;
    let x2 = b.x;
    let y1 = a.y;
    let y2 = b.y;

    // Do the actual math work
    let x_pow = (x1 - x2) * (x1 - x2);
    let y_pow = (y1 - y2) * (y1 - y2);
    let squared = x_pow + y_pow;
    squared / squared

    // Our final result will be stored in the `rax` register
    // so that our caller knows where to retrieve it.
    // Finally, add back to `rsp` the stack memory that is
    // now ready to be used by other functions.
}

pub fn total_distance() {
    let start = Point { x: 1, y: 2 };
    let middle = Point { x: 3, y: 4 };
    let end = Point { x: 5, y: 6 };

    let _dist_1 = distance(&start, &middle);
    let _dist_2 = distance(&middle, &end);
}
```

-- [Compiler Explorer](https://godbolt.org/z/Qmx4ST)

As a consequence of function arguments never using heap memory, we can also infer that functions
using the `#[inline]` attributes also do not heap allocate. But better than inferring, we can look
at the assembly to prove it:

```rust
struct Point {
    x: i64,
    y: i64,
}

// Note that there is no `distance` function in the assembly output,
// and the total line count goes from 229 with inlining off
// to 306 with inline on. Even still, no heap allocations occur.
#[inline(always)]
fn distance(a: &Point, b: &Point) -> i64 {
    let x1 = a.x;
    let x2 = b.x;
    let y1 = a.y;
    let y2 = b.y;

    let x_pow = (a.x - b.x) * (a.x - b.x);
    let y_pow = (a.y - b.y) * (a.y - b.y);
    let squared = x_pow + y_pow;
    squared / squared
}

pub fn total_distance() {
    let start = Point { x: 1, y: 2 };
    let middle = Point { x: 3, y: 4 };
    let end = Point { x: 5, y: 6 };

    let _dist_1 = distance(&start, &middle);
    let _dist_2 = distance(&middle, &end);
}
```

-- [Compiler Explorer](https://godbolt.org/z/30Sh66)

Finally, passing arguments by value (either moving ownership or copying a
[`Copy`](https://doc.rust-lang.org/std/marker/trait.Copy.html) type) and passing them by reference
(handing the function a pointer) may have slightly different layouts in assembly, but both will
still use either stack memory or CPU registers:

```rust
pub struct Point {
    x: i64,
    y: i64,
}

// Moving values
pub fn distance_moved(a: Point, b: Point) -> i64 {
    let x1 = a.x;
    let x2 = b.x;
    let y1 = a.y;
    let y2 = b.y;

    let x_pow = (x1 - x2) * (x1 - x2);
    let y_pow = (y1 - y2) * (y1 - y2);
    let squared = x_pow + y_pow;
    squared / squared
}

// Borrowing values has two extra `mov` instructions on lines 21 and 22
pub fn distance_borrowed(a: &Point, b: &Point) -> i64 {
    let x1 = a.x;
    let x2 = b.x;
    let y1 = a.y;
    let y2 = b.y;

    let x_pow = (x1 - x2) * (x1 - x2);
    let y_pow = (y1 - y2) * (y1 - y2);
    let squared = x_pow + y_pow;
    squared / squared
}
```

-- [Compiler Explorer](https://godbolt.org/z/06hGiv)

# Enums

If you've ever worried that wrapping your types in
[`Option`](https://doc.rust-lang.org/stable/core/option/enum.Option.html) or
[`Result`](https://doc.rust-lang.org/stable/core/result/enum.Result.html) would finally make them
large enough that Rust decides to use heap allocation instead, fear no longer: `enum` and union
types don't use heap allocation:

```rust
enum MyEnum {
    Small(u8),
    Large(u64)
}

struct MyStruct {
    x: MyEnum,
    y: MyEnum,
}

pub fn enum_compare() {
    let x = MyEnum::Small(0);
    let y = MyEnum::Large(0);

    let z = MyStruct { x, y };

    let opt = Option::Some(z);
}
```

-- [Compiler Explorer](https://godbolt.org/z/HK7zBx)

Because the size of an `enum` is the size of its largest variant plus a discriminant flag, the
compiler can predict how much memory is used no matter which variant of an enum is currently stored
in a variable. Thus, enums and unions have no need of heap allocation. There's unfortunately not a
great way to show this in assembly, so I'll instead point you to the
[`core::mem::size_of`](https://doc.rust-lang.org/stable/core/mem/fn.size_of.html#size-of-enums)
documentation.

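As a small illustration (my own sketch, not from the Compiler Explorer link above), `std::mem::size_of` reports these statically-known sizes directly, which is exactly why the compiler never needs the heap here:

```rust
use std::mem::size_of;

#[allow(dead_code)]
enum MyEnum {
    Small(u8),
    Large(u64),
}

fn main() {
    // The compiler knows the size of every variant at compile time, so the
    // enum's size is fixed: large enough for the biggest variant plus space
    // for the discriminant.
    println!("u8:             {}", size_of::<u8>());
    println!("u64:            {}", size_of::<u64>());
    println!("MyEnum:         {}", size_of::<MyEnum>());
    // Wrapping in Option adds at most another discriminant's worth of space,
    // still known at compile time, still on the stack.
    println!("Option<MyEnum>: {}", size_of::<Option<MyEnum>>());
}
```
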
# Arrays

The array type is guaranteed to be stack allocated, which is why the array size must be declared.
Interestingly enough, this can be used to cause safe Rust programs to crash:

```rust
// 256 bytes
#[derive(Default)]
struct TwoFiftySix {
    _a: [u64; 32]
}

// 8 kilobytes
#[derive(Default)]
struct EightK {
    _a: [TwoFiftySix; 32]
}

// 256 kilobytes
#[derive(Default)]
struct TwoFiftySixK {
    _a: [EightK; 32]
}

// 8 megabytes - exceeds space typically provided for the stack,
// though the kernel can be instructed to allocate more.
// On Linux, you can check stack size using `ulimit -s`
#[derive(Default)]
struct EightM {
    _a: [TwoFiftySixK; 32]
}

fn main() {
    // Because we already have things in stack memory
    // (like the current function call stack), allocating another
    // eight megabytes of stack memory crashes the program
    let _x = EightM::default();
}
```

--
[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=587a6380a4914bcbcef4192c90c01dc4)

There aren't any security implications of this (no memory corruption occurs), but it's good to note
that the Rust compiler won't move arrays into heap memory even if they can be reasonably expected to
overflow the stack.

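If you genuinely need a buffer that large, the fix is to ask for heap memory explicitly. A quick sketch (my addition, not part of the playground link above):

```rust
fn main() {
    // A `Vec` places its contents on the heap, so an 8 MB buffer is fine
    // even though the same size as a stack array would overflow the stack.
    let big: Vec<u8> = vec![0; 8 * 1024 * 1024];

    // `into_boxed_slice` keeps the data on the heap while dropping the extra
    // capacity field, if a plain fixed-size buffer is all you need.
    let boxed: Box<[u8]> = big.into_boxed_slice();
    println!("{} bytes live on the heap", boxed.len());
}
```
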
# Closures

Rules for how anonymous functions capture their arguments are typically language-specific. In Java,
[Lambda Expressions](https://docs.oracle.com/javase/tutorial/java/javaOO/lambdaexpressions.html) are
actually objects created on the heap that capture local primitives by copying, and capture local
non-primitives as (`final`) references.
[Python](https://docs.python.org/3.7/reference/expressions.html#lambda) and
[JavaScript](https://javascriptweblog.wordpress.com/2010/10/25/understanding-javascript-closures/)
both bind _everything_ by reference normally, but Python can also
[capture values](https://stackoverflow.com/a/235764/1454178) and JavaScript has
[Arrow functions](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/Arrow_functions).

In Rust, arguments to closures are the same as arguments to other functions; closures are simply
functions that don't have a declared name. Some weird ordering of the stack may be required to
handle them, but it's the compiler's responsibility to figure that out.

Each example below has the same effect, but a different assembly implementation. In the simplest
case, we immediately run a closure returned by another function. Because we don't store a reference
to the closure, the stack memory needed to store the captured values is contiguous:

```rust
fn my_func() -> impl FnOnce() {
    let x = 24;
    // Note that this closure in assembly looks exactly like
    // any other function; you even use the `call` instruction
    // to start running it.
    move || { x; }
}

pub fn immediate() {
    my_func()();
    my_func()();
}
```

-- [Compiler Explorer](https://godbolt.org/z/mgJ2zl), 25 total assembly instructions

If we store a reference to the closure, the Rust compiler keeps values it needs in the stack memory
of the original function. Getting the details right is a bit harder, so the instruction count goes
up even though this code is functionally equivalent to our original example:

```rust
pub fn simple_reference() {
    let x = my_func();
    let y = my_func();
    y();
    x();
}
```

-- [Compiler Explorer](https://godbolt.org/z/K_dj5n), 55 total assembly instructions

Even things like variable order can make a difference in instruction count:

```rust
pub fn complex() {
    let x = my_func();
    let y = my_func();
    x();
    y();
}
```

-- [Compiler Explorer](https://godbolt.org/z/p37qFl), 70 total assembly instructions

In every circumstance though, the compiler ensured that no heap allocations were necessary.

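The flip side (my note, not part of the original examples) is that a closure only stays on the stack while the compiler can see its concrete type; the moment you erase that type behind a `Box<dyn Fn()>`, the captured state moves behind a heap pointer:

```rust
fn on_the_stack() -> impl Fn() -> u64 {
    let x: u64 = 24;
    // The captured `x` travels inside the closure value itself,
    // which lives on the caller's stack.
    move || x
}

fn on_the_heap() -> Box<dyn Fn() -> u64> {
    let x: u64 = 24;
    // Boxing erases the closure's concrete type, so its captured
    // state is placed behind a pointer in heap memory.
    Box::new(move || x)
}

fn main() {
    let stack_closure = on_the_stack();
    let heap_closure = on_the_heap();
    assert_eq!(stack_closure(), heap_closure());
}
```
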
# Generics

Traits in Rust come in two broad forms: static dispatch (monomorphization, `impl Trait`) and dynamic
dispatch (trait objects, `dyn Trait`). While dynamic dispatch is often _associated_ with trait
objects being stored in the heap, dynamic dispatch can be used with stack allocated objects as well:

```rust
trait GetInt {
    fn get_int(&self) -> u64;
}

// vtable stored at section L__unnamed_1
struct WhyNotU8 {
    x: u8
}
impl GetInt for WhyNotU8 {
    fn get_int(&self) -> u64 {
        self.x as u64
    }
}

// vtable stored at section L__unnamed_2
struct ActualU64 {
    x: u64
}
impl GetInt for ActualU64 {
    fn get_int(&self) -> u64 {
        self.x
    }
}

// `&dyn` declares that we want to use dynamic dispatch
// rather than monomorphization, so there is only one
// `retrieve_int` function that shows up in the final assembly.
// If we used generics, there would be one implementation of
// `retrieve_int` for each type that implements `GetInt`.
pub fn retrieve_int(u: &dyn GetInt) {
    // In the assembly, we just call an address given to us
    // in the `rsi` register and hope that it was set up
    // correctly when this function was invoked.
    let x = u.get_int();
}

pub fn do_call() {
    // Note that even though the vtable for `WhyNotU8` and
    // `ActualU64` includes a pointer to
    // `core::ptr::real_drop_in_place`, it is never invoked.
    let a = WhyNotU8 { x: 0 };
    let b = ActualU64 { x: 0 };

    retrieve_int(&a);
    retrieve_int(&b);
}
```

-- [Compiler Explorer](https://godbolt.org/z/u_yguS)

It's hard to imagine practical situations where dynamic dispatch would be used for objects that
aren't heap allocated, but it technically can be done.

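For comparison, here is a sketch of the generic (statically dispatched) alternative mentioned in the comments above; it is self-contained, so the trait and one implementation are repeated to let the snippet compile on its own:

```rust
trait GetInt {
    fn get_int(&self) -> u64;
}

struct ActualU64 {
    x: u64,
}
impl GetInt for ActualU64 {
    fn get_int(&self) -> u64 {
        self.x
    }
}

// Monomorphization: `retrieve_int_generic::<ActualU64>` becomes its own
// function in the final binary, with the `get_int` call resolved statically
// (and usually inlined) - no vtable lookup, and still no heap allocation.
pub fn retrieve_int_generic<T: GetInt>(u: &T) -> u64 {
    u.get_int()
}

fn main() {
    let b = ActualU64 { x: 42 };
    println!("{}", retrieve_int_generic(&b));
}
```
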
# Copy types

Understanding move semantics and copy semantics in Rust is weird at first. The Rust docs
[go into detail](https://doc.rust-lang.org/stable/core/marker/trait.Copy.html) far better than can
be addressed here, so I'll leave them to do the job. From a memory perspective though, their
guideline is reasonable:
[if your type can implement `Copy`, it should](https://doc.rust-lang.org/stable/core/marker/trait.Copy.html#when-should-my-type-be-copy).
While there are potential speed tradeoffs to _benchmark_ when discussing `Copy` (move semantics for
stack objects vs. copying stack pointers vs. copying stack `struct`s), _it's impossible for `Copy`
to introduce a heap allocation_.

But why is this the case? Fundamentally, it's because the language controls what `Copy` means -
["the behavior of `Copy` is not overloadable"](https://doc.rust-lang.org/std/marker/trait.Copy.html#whats-the-difference-between-copy-and-clone)
because it's a marker trait. From there we'll note that a type
[can implement `Copy`](https://doc.rust-lang.org/std/marker/trait.Copy.html#when-can-my-type-be-copy)
if (and only if) its components implement `Copy`, and that
[no heap-allocated types implement `Copy`](https://doc.rust-lang.org/std/marker/trait.Copy.html#implementors).
Thus, assignments involving heap types are always move semantics, and new heap allocations won't
occur because of implicit operator behavior.

```rust
#[derive(Clone)]
struct Cloneable {
    x: Box<u64>
}

// error[E0204]: the trait `Copy` may not be implemented for this type
#[derive(Copy, Clone)]
struct NotCopyable {
    x: Box<u64>
}
```

-- [Compiler Explorer](https://godbolt.org/z/VToRuK)

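To make the contrast concrete (my example, not from the Compiler Explorer link above): cloning the `Box`-holding struct does allocate, but only at the explicit `.clone()` call. `Copy` has no such hook, so it can never allocate behind your back:

```rust
#[derive(Clone)]
struct Cloneable {
    x: Box<u64>,
}

fn main() {
    let a = Cloneable { x: Box::new(42) };
    // Explicit: `clone` runs user-visible code and performs a new
    // heap allocation for the cloned Box.
    let b = a.clone();
    // A `Copy` assignment, by contrast, is a plain bitwise copy of the
    // value itself - the language gives you no place to hide an allocation.
    let c: u64 = *b.x;
    let d = c; // copy, no allocation
    println!("{} {} {}", a.x, b.x, d);
}
```
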
# Iterators

In managed memory languages (like
[Java](https://www.youtube.com/watch?v=bSkpMdDe4g4&feature=youtu.be&t=357)), there's a subtle
difference between these two code samples:

```java
public static long sum_for(List<Long> vals) {
    long sum = 0;
    // Regular for loop
    for (int i = 0; i < vals.size(); i++) {
        sum += vals.get(i);
    }
    return sum;
}

public static long sum_foreach(List<Long> vals) {
    long sum = 0;
    // "Foreach" loop - uses iteration
    for (Long l : vals) {
        sum += l;
    }
    return sum;
}
```

In the `sum_for` function, nothing terribly interesting happens. In `sum_foreach`, an object of type
[`Iterator`](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/Iterator.html)
is allocated on the heap, and will eventually be garbage-collected. This isn't a great design;
iterators are often transient objects that you need during a function and can discard once the
function ends. Sounds exactly like the issue stack-allocated objects address, no?

In Rust, iterators are allocated on the stack. The objects to iterate over are almost certainly in
heap memory, but the iterator itself
([`Iter`](https://doc.rust-lang.org/std/slice/struct.Iter.html)) doesn't need to use the heap. In
each of the examples below we iterate over a collection, but never use heap allocation:

```rust
use std::collections::HashMap;
// There's a lot of assembly generated, but if you search in the text,
// there are no references to `real_drop_in_place` anywhere.

pub fn sum_vec(x: &Vec<u32>) {
    let mut s = 0;
    // Basic iteration over vectors doesn't need allocation
    for y in x {
        s += y;
    }
}

pub fn sum_enumerate(x: &Vec<u32>) {
    let mut s = 0;
    // More complex iterators are just fine too
    for (_i, y) in x.iter().enumerate() {
        s += y;
    }
}

pub fn sum_hm(x: &HashMap<u32, u32>) {
    let mut s = 0;
    // And it's not just Vec, all types will allocate the iterator
    // on stack memory
    for y in x.values() {
        s += y;
    }
}
```

-- [Compiler Explorer](https://godbolt.org/z/FTT3CT)

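One caveat worth adding (my note): the iterator itself stays on the stack, but adapters that build a new collection, like `collect`, obviously go back to the allocator:

```rust
pub fn sum_squares(x: &[u32]) -> u32 {
    // Iterator adapters alone: everything stays in registers/stack memory.
    x.iter().map(|y| y * y).sum()
}

pub fn squares(x: &[u32]) -> Vec<u32> {
    // `collect` materializes a new Vec, so this function does heap allocate.
    x.iter().map(|y| y * y).collect()
}

fn main() {
    let data = vec![1, 2, 3, 4];
    assert_eq!(sum_squares(&data), 30);
    assert_eq!(squares(&data), vec![1, 4, 9, 16]);
}
```
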
254
blog/2019-02-07-a-heaping-helping/_article.md
Normal file
254
blog/2019-02-07-a-heaping-helping/_article.md
Normal file
@ -0,0 +1,254 @@
|
|||||||
|
---
|
||||||
|
layout: post
|
||||||
|
title: "Dynamic Memory: A Heaping Helping"
|
||||||
|
description: "The reason Rust exists."
|
||||||
|
category:
|
||||||
|
tags: [rust, understanding-allocations]
|
||||||
|
---
|
||||||
|
|
||||||
|
Managing dynamic memory is hard. Some languages assume users will do it themselves (C, C++), and
|
||||||
|
some languages go to extreme lengths to protect users from themselves (Java, Python). In Rust, how
|
||||||
|
the language uses dynamic memory (also referred to as the **heap**) is a system called _ownership_.
|
||||||
|
And as the docs mention, ownership
|
||||||
|
[is Rust's most unique feature](https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html).
|
||||||
|
|
||||||
|
The heap is used in two situations; when the compiler is unable to predict either the _total size of
|
||||||
|
memory needed_, or _how long the memory is needed for_, it allocates space in the heap. This happens
|
||||||
|
pretty frequently; if you want to download the Google home page, you won't know how large it is
|
||||||
|
until your program runs. And when you're finished with Google, we deallocate the memory so it can be
|
||||||
|
used to store other webpages. If you're interested in a slightly longer explanation of the heap,
|
||||||
|
check out
|
||||||
|
[The Stack and the Heap](https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html#the-stack-and-the-heap)
|
||||||
|
in Rust's documentation.
|
||||||
|
|
||||||
|
We won't go into detail on how the heap is managed; the
|
||||||
|
[ownership documentation](https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html) does a
|
||||||
|
phenomenal job explaining both the "why" and "how" of memory management. Instead, we're going to
|
||||||
|
focus on understanding "when" heap allocations occur in Rust.
|
||||||
|
|
||||||
|
To start off, take a guess for how many allocations happen in the program below:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
fn main() {}
|
||||||
|
```
|
||||||
|
|
||||||
|
It's obviously a trick question; while no heap allocations occur as a result of that code, the setup
|
||||||
|
needed to call `main` does allocate on the heap. Here's a way to show it:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
#![feature(integer_atomics)]
|
||||||
|
use std::alloc::{GlobalAlloc, Layout, System};
|
||||||
|
use std::sync::atomic::{AtomicU64, Ordering};
|
||||||
|
|
||||||
|
static ALLOCATION_COUNT: AtomicU64 = AtomicU64::new(0);
|
||||||
|
|
||||||
|
struct CountingAllocator;
|
||||||
|
|
||||||
|
unsafe impl GlobalAlloc for CountingAllocator {
|
||||||
|
unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
|
||||||
|
ALLOCATION_COUNT.fetch_add(1, Ordering::SeqCst);
|
||||||
|
System.alloc(layout)
|
||||||
|
}
|
||||||
|
|
||||||
|
unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
|
||||||
|
System.dealloc(ptr, layout);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
#[global_allocator]
|
||||||
|
static A: CountingAllocator = CountingAllocator;
|
||||||
|
|
||||||
|
fn main() {
|
||||||
|
let x = ALLOCATION_COUNT.fetch_add(0, Ordering::SeqCst);
|
||||||
|
println!("There were {} allocations before calling main!", x);
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
--
|
||||||
|
[Rust Playground](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=fb5060025ba79fc0f906b65a4ef8eb8e)
|
||||||
|
|
||||||
|
As of the time of writing, there are five allocations that happen before `main` is ever called.
|
||||||
|
|
||||||
|
But when we want to understand more practically where heap allocation happens, we'll follow this
|
||||||
|
guide:
|
||||||
|
|
||||||
|
- Smart pointers hold their contents in the heap
|
||||||
|
- Collections are smart pointers for many objects at a time, and reallocate when they need to grow
|
||||||
|
|
||||||
|
Finally, there are two "addendum" issues that are important to address when discussing Rust and the
|
||||||
|
heap:
|
||||||
|
|
||||||
|
- Non-heap alternatives to many standard library types are available.
|
||||||
|
- Special allocators to track memory behavior should be used to benchmark code.
|
||||||
|
|
||||||
|
# Smart pointers
|
||||||
|
|
||||||
|
The first thing to note are the "smart pointer" types. When you have data that must outlive the
|
||||||
|
scope in which it is declared, or your data is of unknown or dynamic size, you'll make use of these
|
||||||
|
types.
|
||||||
|
|
||||||
|
The term [smart pointer](https://en.wikipedia.org/wiki/Smart_pointer) comes from C++, and while it's
|
||||||
|
closely linked to a general design pattern of
|
||||||
|
["Resource Acquisition Is Initialization"](https://en.cppreference.com/w/cpp/language/raii), we'll
|
||||||
|
use it here specifically to describe objects that are responsible for managing ownership of data
|
||||||
|
allocated on the heap. The smart pointers available in the `alloc` crate should look mostly
|
||||||
|
familiar:
|
||||||
|
|
||||||
|
- [`Box`](https://doc.rust-lang.org/alloc/boxed/struct.Box.html)
|
||||||
|
- [`Rc`](https://doc.rust-lang.org/alloc/rc/struct.Rc.html)
|
||||||
|
- [`Arc`](https://doc.rust-lang.org/alloc/sync/struct.Arc.html)
|
||||||
|
- [`Cow`](https://doc.rust-lang.org/alloc/borrow/enum.Cow.html)
|
||||||
|
|
||||||
|
The [standard library](https://doc.rust-lang.org/std/) also defines some smart pointers to manage
|
||||||
|
heap objects, though more than can be covered here. Some examples are:
|
||||||
|
|
||||||
|
- [`RwLock`](https://doc.rust-lang.org/std/sync/struct.RwLock.html)
|
||||||
|
- [`Mutex`](https://doc.rust-lang.org/std/sync/struct.Mutex.html)
|
||||||
|
|
||||||
|
Finally, there is one ["gotcha"](https://www.merriam-webster.com/dictionary/gotcha): **cell types**
|
||||||
|
(like [`RefCell`](https://doc.rust-lang.org/stable/core/cell/struct.RefCell.html)) look and behave
|
||||||
|
similarly, but **don't involve heap allocation**. The
|
||||||
|
[`core::cell` docs](https://doc.rust-lang.org/stable/core/cell/index.html) have more information.
|
||||||
|
|
||||||
|
When a smart pointer is created, the data it is given is placed in heap memory and the location of
|
||||||
|
that data is recorded in the smart pointer. Once the smart pointer has determined it's safe to
|
||||||
|
deallocate that memory (when a `Box` has
|
||||||
|
[gone out of scope](https://doc.rust-lang.org/stable/std/boxed/index.html) or a reference count
|
||||||
|
[goes to zero](https://doc.rust-lang.org/alloc/rc/index.html)), the heap space is reclaimed. We can
|
||||||
|
prove these types use heap memory by looking at code:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
use std::rc::Rc;
|
||||||
|
use std::sync::Arc;
|
||||||
|
use std::borrow::Cow;
|
||||||
|
|
||||||
|
pub fn my_box() {
|
||||||
|
// Drop at assembly line 1640
|
||||||
|
Box::new(0);
|
||||||
|
}
|
||||||
|
|
||||||
|
pub fn my_rc() {
|
||||||
|
// Drop at assembly line 1650
|
||||||
|
Rc::new(0);
|
||||||
|
}
|
||||||
|
|
||||||
|
pub fn my_arc() {
|
||||||
|
// Drop at assembly line 1660
|
||||||
|
Arc::new(0);
|
||||||
|
}
|
||||||
|
|
||||||
|
pub fn my_cow() {
|
||||||
|
// Drop at assembly line 1672
|
||||||
|
Cow::from("drop");
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
-- [Compiler Explorer](https://godbolt.org/z/4AMQug)
|
||||||
|
|
||||||
|
# Collections
|
||||||
|
|
||||||
|
Collection types use heap memory because their contents have dynamic size; they will request more
|
||||||
|
memory [when needed](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.reserve), and can
|
||||||
|
[release memory](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.shrink_to_fit) when it's
|
||||||
|
no longer necessary. This dynamic property forces Rust to heap allocate everything they contain. In
|
||||||
|
a way, **collections are smart pointers for many objects at a time**. Common types that fall under
|
||||||
|
this umbrella are [`Vec`](https://doc.rust-lang.org/stable/alloc/vec/struct.Vec.html),
|
||||||
|
[`HashMap`](https://doc.rust-lang.org/stable/std/collections/struct.HashMap.html), and
|
||||||
|
[`String`](https://doc.rust-lang.org/stable/alloc/string/struct.String.html) (not
|
||||||
|
[`str`](https://doc.rust-lang.org/std/primitive.str.html)).
|
||||||
|
|
||||||
|
While collections store the objects they own in heap memory, _creating new collections will not
|
||||||
|
allocate on the heap_. This is a bit weird; if we call `Vec::new()`, the assembly shows a
|
||||||
|
corresponding call to `real_drop_in_place`:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
pub fn my_vec() {
|
||||||
|
// Drop in place at line 481
|
||||||
|
Vec::<u8>::new();
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
-- [Compiler Explorer](https://godbolt.org/z/1WkNtC)
|
||||||
|
|
||||||
|
But because the vector has no elements to manage, no calls to the allocator will ever be dispatched:
|
||||||
|
|
||||||
|
```rust
|
||||||
|
use std::alloc::{GlobalAlloc, Layout, System};
|
||||||
|
use std::sync::atomic::{AtomicBool, Ordering};
|
||||||
|
|
||||||
|
fn main() {
|
||||||
|
// Turn on panicking if we allocate on the heap
|
||||||
|
DO_PANIC.store(true, Ordering::SeqCst);
|
||||||
|
|
||||||
|
// Interesting bit happens here
|
||||||
|
let x: Vec<u8> = Vec::new();
|
||||||
|
drop(x);
|
||||||
|
|
||||||
|
// Turn panicking back off, some deallocations occur
|
||||||
|
// after main as well.
|
||||||
|
DO_PANIC.store(false, Ordering::SeqCst);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[global_allocator]
|
||||||
|
static A: PanicAllocator = PanicAllocator;
|
||||||
|
static DO_PANIC: AtomicBool = AtomicBool::new(false);
|
||||||
|
struct PanicAllocator;
|
||||||
|
|
||||||
|
unsafe impl GlobalAlloc for PanicAllocator {
|
||||||
|
unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
|
||||||
|
if DO_PANIC.load(Ordering::SeqCst) {
|
||||||
|
panic!("Unexpected allocation.");
|
||||||
|
}
|
||||||
|
System.alloc(layout)
|
||||||
|
}
|
||||||
|
|
||||||
|
unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
|
||||||
|
if DO_PANIC.load(Ordering::SeqCst) {
|
||||||
|
panic!("Unexpected deallocation.");
|
||||||
|
}
|
||||||
|
System.dealloc(ptr, layout);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
--
|
||||||
|
[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=831a297d176d015b1f9ace01ae416cc6)
|
||||||
|
|
||||||
|
Other standard library types follow the same behavior; make sure to check out
|
||||||
|
[`HashMap::new()`](https://doc.rust-lang.org/std/collections/hash_map/struct.HashMap.html#method.new),
|
||||||
|
and [`String::new()`](https://doc.rust-lang.org/std/string/struct.String.html#method.new).

# Heap Alternatives

While it is a bit strange to speak of the stack after spending time with the heap, it's worth
pointing out that some heap-allocated objects in Rust have stack-based counterparts provided by
other crates. If you need the functionality but want to avoid allocating, there are typically
alternatives available.

When it comes to some standard library smart pointers
([`RwLock`](https://doc.rust-lang.org/std/sync/struct.RwLock.html) and
[`Mutex`](https://doc.rust-lang.org/std/sync/struct.Mutex.html)), stack-based alternatives are
provided in crates like [parking_lot](https://crates.io/crates/parking_lot) and
[spin](https://crates.io/crates/spin). You can check out
[`lock_api::RwLock`](https://docs.rs/lock_api/0.1.5/lock_api/struct.RwLock.html),
[`lock_api::Mutex`](https://docs.rs/lock_api/0.1.5/lock_api/struct.Mutex.html), and
[`spin::Once`](https://mvdnes.github.io/rust-docs/spin-rs/spin/struct.Once.html) if you're in need
of synchronization primitives.
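
As a rough sketch of what that looks like in practice (this assumes the external `spin` crate as a
dependency; check its docs for the exact API of the version you use):

```rust
fn main() {
    // `spin::Mutex` lives inline in whatever owns it (here, the stack frame
    // of `main`), so creating the lock itself requires no heap allocation.
    let counter = spin::Mutex::new(0u32);

    {
        // Locking busy-waits instead of asking the OS for a heap-backed primitive.
        let mut guard = counter.lock();
        *guard += 1;
    } // guard dropped here, releasing the lock

    assert_eq!(*counter.lock(), 1);
}
```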

[thread_id](https://crates.io/crates/thread-id) may be necessary if you're implementing an allocator
because [`thread::current().id()`](https://doc.rust-lang.org/std/thread/struct.ThreadId.html) uses a
[`thread_local!` structure](https://doc.rust-lang.org/stable/src/std/sys_common/thread_info.rs.html#17-36)
that needs heap allocation.

# Tracing Allocators

When writing performance-sensitive code, there's no alternative to measuring your code. If you
didn't write a benchmark,
[you don't care about its performance](https://www.youtube.com/watch?v=2EWejmkKlxs&feature=youtu.be&t=263).
You should never rely on your instincts when
[a microsecond is an eternity](https://www.youtube.com/watch?v=NH1Tta7purM).

Similarly, there's great work going on in Rust with allocators that keep track of what they're doing
(like [`alloc_counter`](https://crates.io/crates/alloc_counter)). When it comes to tracking heap
behavior, it's easy to make mistakes; please write tests and make sure you have tools to guard
against future issues.
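
For example, here's a minimal sketch of what such a guard might look like, reusing the
`PanicAllocator` and `DO_PANIC` flag from the example earlier in this post (those names come from
this post's examples, not a published crate):

```rust
#[test]
fn empty_vec_does_not_allocate() {
    // Any heap activity between these two stores will panic and fail the test.
    DO_PANIC.store(true, Ordering::SeqCst);

    // The code under test: building an empty Vec must not touch the heap.
    let x: Vec<u8> = Vec::new();
    drop(x);

    DO_PANIC.store(false, Ordering::SeqCst);
}
```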
258
blog/2019-02-07-a-heaping-helping/index.mdx
Normal file
@ -0,0 +1,258 @@
---
slug: 2019/02/a-heaping-helping
title: "Allocations in Rust: Dynamic memory"
date: 2019-02-07 12:00:00
authors: [bspeice]
tags: []
---

Managing dynamic memory is hard. Some languages assume users will do it themselves (C, C++), and
some languages go to extreme lengths to protect users from themselves (Java, Python). In Rust,
dynamic memory (also referred to as the **heap**) is managed through a system called _ownership_.
And as the docs mention, ownership
[is Rust's most unique feature](https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html).

The heap is used in two situations: when the compiler is unable to predict either the _total size of
memory needed_ or _how long the memory is needed for_, it allocates space in the heap.

<!-- truncate -->

This happens pretty frequently; if you want to download the Google home page, you won't know how
large it is until your program runs. And when you're finished with the page, the memory is
deallocated so it can be used to store other webpages. If you're interested in a slightly longer
explanation of the heap, check out
[The Stack and the Heap](https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html#the-stack-and-the-heap)
in Rust's documentation.
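
As a concrete illustration, here's a minimal sketch where the amount of memory can't be known until
runtime, so the bytes end up on the heap inside a `String`:

```rust
use std::io::{self, Read};

fn main() -> io::Result<()> {
    // We have no idea how much text will arrive on stdin...
    let mut contents = String::new();

    // ...so the String grows on the heap as bytes come in.
    io::stdin().read_to_string(&mut contents)?;

    println!("read {} bytes", contents.len());
    Ok(())
}
```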

We won't go into detail on how the heap is managed; the
[ownership documentation](https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html) does a
phenomenal job explaining both the "why" and "how" of memory management. Instead, we're going to
focus on understanding "when" heap allocations occur in Rust.

To start off, take a guess for how many allocations happen in the program below:

```rust
fn main() {}
```

It's obviously a trick question; while no heap allocations occur as a result of that code, the setup
needed to call `main` does allocate on the heap. Here's a way to show it:

```rust
#![feature(integer_atomics)]
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicU64, Ordering};

static ALLOCATION_COUNT: AtomicU64 = AtomicU64::new(0);

struct CountingAllocator;

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATION_COUNT.fetch_add(1, Ordering::SeqCst);
        System.alloc(layout)
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        System.dealloc(ptr, layout);
    }
}

#[global_allocator]
static A: CountingAllocator = CountingAllocator;

fn main() {
    let x = ALLOCATION_COUNT.fetch_add(0, Ordering::SeqCst);
    println!("There were {} allocations before calling main!", x);
}
```

-- [Rust Playground](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=fb5060025ba79fc0f906b65a4ef8eb8e)

As of the time of writing, there are five allocations that happen before `main` is ever called.

But when we want to understand more practically where heap allocation happens, we'll follow this
guide:

- Smart pointers hold their contents in the heap
- Collections are smart pointers for many objects at a time, and reallocate when they need to grow

Finally, there are two "addendum" issues that are important to address when discussing Rust and the
heap:

- Non-heap alternatives to many standard library types are available.
- Special allocators to track memory behavior should be used to benchmark code.

## Smart pointers

The first thing to note is the family of "smart pointer" types. When you have data that must outlive
the scope in which it is declared, or your data is of unknown or dynamic size, you'll make use of
these types.

The term [smart pointer](https://en.wikipedia.org/wiki/Smart_pointer) comes from C++, and while it's
closely linked to a general design pattern of
["Resource Acquisition Is Initialization"](https://en.cppreference.com/w/cpp/language/raii), we'll
use it here specifically to describe objects that are responsible for managing ownership of data
allocated on the heap. The smart pointers available in the `alloc` crate should look mostly
familiar:

- [`Box`](https://doc.rust-lang.org/alloc/boxed/struct.Box.html)
- [`Rc`](https://doc.rust-lang.org/alloc/rc/struct.Rc.html)
- [`Arc`](https://doc.rust-lang.org/alloc/sync/struct.Arc.html)
- [`Cow`](https://doc.rust-lang.org/alloc/borrow/enum.Cow.html)

The [standard library](https://doc.rust-lang.org/std/) also defines some smart pointers to manage
heap objects, though there are more than can be covered here. Some examples are:

- [`RwLock`](https://doc.rust-lang.org/std/sync/struct.RwLock.html)
- [`Mutex`](https://doc.rust-lang.org/std/sync/struct.Mutex.html)

Finally, there is one ["gotcha"](https://www.merriam-webster.com/dictionary/gotcha): **cell types**
(like [`RefCell`](https://doc.rust-lang.org/stable/core/cell/struct.RefCell.html)) look and behave
similarly, but **don't involve heap allocation**. The
[`core::cell` docs](https://doc.rust-lang.org/stable/core/cell/index.html) have more information.
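
To make that distinction concrete, here's a minimal sketch contrasting a plain `RefCell` (no
allocation) with the common `Rc<RefCell<...>>` pairing (which does allocate):

```rust
use std::cell::RefCell;
use std::rc::Rc;

fn main() {
    // Interior mutability only: the array lives wherever `on_stack` does,
    // so nothing here touches the heap.
    let on_stack = RefCell::new([0u8; 16]);
    on_stack.borrow_mut()[0] = 1;

    // It's the `Rc` (and the `Vec` inside it) that allocate, not the `RefCell`.
    let shared = Rc::new(RefCell::new(vec![0u8; 16]));
    shared.borrow_mut().push(1);
}
```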

When a smart pointer is created, the data it is given is placed in heap memory and the location of
that data is recorded in the smart pointer. Once the smart pointer has determined it's safe to
deallocate that memory (when a `Box` has
[gone out of scope](https://doc.rust-lang.org/stable/std/boxed/index.html) or a reference count
[goes to zero](https://doc.rust-lang.org/alloc/rc/index.html)), the heap space is reclaimed. We can
prove these types use heap memory by looking at code:

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::borrow::Cow;

pub fn my_box() {
    // Drop at assembly line 1640
    Box::new(0);
}

pub fn my_rc() {
    // Drop at assembly line 1650
    Rc::new(0);
}

pub fn my_arc() {
    // Drop at assembly line 1660
    Arc::new(0);
}

pub fn my_cow() {
    // Drop at assembly line 1672
    Cow::from("drop");
}
```

-- [Compiler Explorer](https://godbolt.org/z/4AMQug)
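
The "reference count goes to zero" case is easy to watch directly; a small sketch using only the
standard library:

```rust
use std::rc::Rc;

fn main() {
    // One heap allocation holds the value and its reference counts.
    let first = Rc::new(5);

    // Cloning an Rc bumps the count; no new heap memory is requested.
    let second = Rc::clone(&first);
    assert_eq!(Rc::strong_count(&first), 2);

    drop(second);
    assert_eq!(Rc::strong_count(&first), 1);

    // When `first` is dropped at the end of main, the count hits zero
    // and the heap space is reclaimed.
}
```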

## Collections

Collection types use heap memory because their contents have dynamic size; they will request more
memory [when needed](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.reserve), and can
[release memory](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.shrink_to_fit) when it's
no longer necessary. This dynamic property forces Rust to heap allocate everything they contain. In
a way, **collections are smart pointers for many objects at a time**. Common types that fall under
this umbrella are [`Vec`](https://doc.rust-lang.org/stable/alloc/vec/struct.Vec.html),
[`HashMap`](https://doc.rust-lang.org/stable/std/collections/struct.HashMap.html), and
[`String`](https://doc.rust-lang.org/stable/alloc/string/struct.String.html) (not
[`str`](https://doc.rust-lang.org/std/primitive.str.html)).

While collections store the objects they own in heap memory, _creating new collections will not
allocate on the heap_. This is a bit weird; if we call `Vec::new()`, the assembly shows a
corresponding call to `real_drop_in_place`:

```rust
pub fn my_vec() {
    // Drop in place at line 481
    Vec::<u8>::new();
}
```

-- [Compiler Explorer](https://godbolt.org/z/1WkNtC)

But because the vector has no elements to manage, no calls to the allocator will ever be dispatched:

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicBool, Ordering};

fn main() {
    // Turn on panicking if we allocate on the heap
    DO_PANIC.store(true, Ordering::SeqCst);

    // Interesting bit happens here
    let x: Vec<u8> = Vec::new();
    drop(x);

    // Turn panicking back off, some deallocations occur
    // after main as well.
    DO_PANIC.store(false, Ordering::SeqCst);
}

#[global_allocator]
static A: PanicAllocator = PanicAllocator;
static DO_PANIC: AtomicBool = AtomicBool::new(false);
struct PanicAllocator;

unsafe impl GlobalAlloc for PanicAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        if DO_PANIC.load(Ordering::SeqCst) {
            panic!("Unexpected allocation.");
        }
        System.alloc(layout)
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        if DO_PANIC.load(Ordering::SeqCst) {
            panic!("Unexpected deallocation.");
        }
        System.dealloc(ptr, layout);
    }
}
```

-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=831a297d176d015b1f9ace01ae416cc6)

Other standard library types follow the same behavior; make sure to check out
[`HashMap::new()`](https://doc.rust-lang.org/std/collections/hash_map/struct.HashMap.html#method.new),
and [`String::new()`](https://doc.rust-lang.org/std/string/struct.String.html#method.new).

## Heap Alternatives

While it is a bit strange to speak of the stack after spending time with the heap, it's worth
pointing out that some heap-allocated objects in Rust have stack-based counterparts provided by
other crates. If you need the functionality but want to avoid allocating, there are typically
alternatives available.

When it comes to some standard library smart pointers
([`RwLock`](https://doc.rust-lang.org/std/sync/struct.RwLock.html) and
[`Mutex`](https://doc.rust-lang.org/std/sync/struct.Mutex.html)), stack-based alternatives are
provided in crates like [parking_lot](https://crates.io/crates/parking_lot) and
[spin](https://crates.io/crates/spin). You can check out
[`lock_api::RwLock`](https://docs.rs/lock_api/0.1.5/lock_api/struct.RwLock.html),
[`lock_api::Mutex`](https://docs.rs/lock_api/0.1.5/lock_api/struct.Mutex.html), and
[`spin::Once`](https://mvdnes.github.io/rust-docs/spin-rs/spin/struct.Once.html) if you're in need
of synchronization primitives.

[thread_id](https://crates.io/crates/thread-id) may be necessary if you're implementing an allocator
because [`thread::current().id()`](https://doc.rust-lang.org/std/thread/struct.ThreadId.html) uses a
[`thread_local!` structure](https://doc.rust-lang.org/stable/src/std/sys_common/thread_info.rs.html#17-36)
that needs heap allocation.

## Tracing Allocators

When writing performance-sensitive code, there's no alternative to measuring your code. If you
didn't write a benchmark,
[you don't care about its performance](https://www.youtube.com/watch?v=2EWejmkKlxs&feature=youtu.be&t=263).
You should never rely on your instincts when
[a microsecond is an eternity](https://www.youtube.com/watch?v=NH1Tta7purM).

Similarly, there's great work going on in Rust with allocators that keep track of what they're doing
(like [`alloc_counter`](https://crates.io/crates/alloc_counter)). When it comes to tracking heap
behavior, it's easy to make mistakes; please write tests and make sure you have tools to guard
against future issues.
148
blog/2019-02-08-compiler-optimizations/_article.md
Normal file
@ -0,0 +1,148 @@
---
layout: post
title: "Compiler Optimizations: What It's Done Lately"
description: "A lot. The answer is a lot."
category:
tags: [rust, understanding-allocations]
---

**Update 2019-02-10**: When debugging a
[related issue](https://gitlab.com/sio4/code/alloc-counter/issues/1), it was discovered that the
original code worked because LLVM optimized out the entire function, rather than just the allocation
segments. The code has been updated with proper use of
[`read_volatile`](https://doc.rust-lang.org/std/ptr/fn.read_volatile.html), and a previous section
on vector capacity has been removed.

---

Up to this point, we've been discussing memory usage in the Rust language by focusing on simple
rules that are mostly right for small chunks of code. We've spent time showing how those rules work
themselves out in practice, and become familiar with reading the assembly code needed to see each
memory type (global, stack, heap) in action.

Throughout the series so far, we've put a handicap on the code. In the name of consistent and
understandable results, we've asked the compiler to pretty please leave the training wheels on. Now
is the time where we throw out all the rules and take off the kid gloves. As it turns out, both the
Rust compiler and the LLVM optimizers are incredibly sophisticated, and we'll step back and let them
do their job.

Similar to
["What Has My Compiler Done For Me Lately?"](https://www.youtube.com/watch?v=bSkpMdDe4g4), we're
focusing on interesting things the Rust language (and LLVM!) can do with memory management. We'll
still be looking at assembly code to understand what's going on, but it's important to mention
again: **please use automated tools like [alloc-counter](https://crates.io/crates/alloc_counter) to
double-check memory behavior if it's something you care about**. It's far too easy to misread
assembly in large code sections; always verify behavior if you care about memory usage.

The guiding principle as we move forward is this: _optimizing compilers won't produce worse programs
than we started with._ There won't be any situations where stack allocations get moved to heap
allocations. There will, however, be an opera of optimization.

# The Case of the Disappearing Box

Our first optimization comes when LLVM can reason that the lifetime of an object is sufficiently
short that heap allocations aren't necessary. In these cases, LLVM will move the allocation to the
stack instead! The way this interacts with `#[inline]` attributes is a bit opaque, but the important
part is that LLVM can sometimes do better than the baseline Rust language:

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicBool, Ordering};

pub fn cmp(x: u32) {
    // Turn on panicking if we allocate on the heap
    DO_PANIC.store(true, Ordering::SeqCst);

    // The compiler is able to see through the constant `Box`
    // and directly compare `x` to 24 - assembly line 73
    let y = Box::new(24);
    let equals = x == *y;

    // This call to drop is eliminated
    drop(y);

    // Need to mark the comparison result as volatile so that
    // LLVM doesn't strip out all the code. If `y` is marked
    // volatile instead, allocation will be forced.
    unsafe { std::ptr::read_volatile(&equals) };

    // Turn off panicking, as there are some deallocations
    // when we exit main.
    DO_PANIC.store(false, Ordering::SeqCst);
}

fn main() {
    cmp(12)
}

#[global_allocator]
static A: PanicAllocator = PanicAllocator;
static DO_PANIC: AtomicBool = AtomicBool::new(false);
struct PanicAllocator;

unsafe impl GlobalAlloc for PanicAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        if DO_PANIC.load(Ordering::SeqCst) {
            panic!("Unexpected allocation.");
        }
        System.alloc(layout)
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        if DO_PANIC.load(Ordering::SeqCst) {
            panic!("Unexpected deallocation.");
        }
        System.dealloc(ptr, layout);
    }
}
```

-- [Compiler Explorer](https://godbolt.org/z/BZ_Yp3)

-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=4a765f753183d5b919f62c71d2109d5d)

# Dr. Array or: How I Learned to Love the Optimizer

Finally, this isn't so much about LLVM figuring out different memory behavior, but LLVM stripping
out code that doesn't do anything. Optimizations of this type have a lot of nuance to them; if
you're not careful, they can make your benchmarks look
[impossibly good](https://www.youtube.com/watch?v=nXaxk27zwlk&feature=youtu.be&t=1199). In Rust, the
`black_box` function (implemented in both
[`libtest`](https://doc.rust-lang.org/1.1.0/test/fn.black_box.html) and
[`criterion`](https://docs.rs/criterion/0.2.10/criterion/fn.black_box.html)) will tell the compiler
to disable this kind of optimization. But if you let LLVM remove unnecessary code, you can end up
running programs that previously caused errors:

```rust
#[derive(Default)]
struct TwoFiftySix {
    _a: [u64; 32]
}

#[derive(Default)]
struct EightK {
    _a: [TwoFiftySix; 32]
}

#[derive(Default)]
struct TwoFiftySixK {
    _a: [EightK; 32]
}

#[derive(Default)]
struct EightM {
    _a: [TwoFiftySixK; 32]
}

pub fn main() {
    // Normally this blows up because we can't reserve size on the stack
    // for the `EightM` struct. But because the compiler notices we
    // never do anything with `_x`, it optimizes out the stack storage
    // and the program completes successfully.
    let _x = EightM::default();
}
```

-- [Compiler Explorer](https://godbolt.org/z/daHn7P)

-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=4c253bf26072119896ab93c6ef064dc0)
149
blog/2019-02-08-compiler-optimizations/index.mdx
Normal file
@ -0,0 +1,149 @@
---
title: "Allocations in Rust: Compiler optimizations"
description: "A lot. The answer is a lot."
date: 2019-02-08 12:00:00
last_updated:
  date: 2019-02-10 12:00:00
tags: []
---

Up to this point, we've been discussing memory usage in the Rust language by focusing on simple
rules that are mostly right for small chunks of code. We've spent time showing how those rules work
themselves out in practice, and become familiar with reading the assembly code needed to see each
memory type (global, stack, heap) in action.

Throughout the series so far, we've put a handicap on the code. In the name of consistent and
understandable results, we've asked the compiler to pretty please leave the training wheels on. Now
is the time where we throw out all the rules and take off the kid gloves. As it turns out, both the
Rust compiler and the LLVM optimizers are incredibly sophisticated, and we'll step back and let them
do their job.

<!-- truncate -->

Similar to
["What Has My Compiler Done For Me Lately?"](https://www.youtube.com/watch?v=bSkpMdDe4g4), we're
focusing on interesting things the Rust language (and LLVM!) can do with memory management. We'll
still be looking at assembly code to understand what's going on, but it's important to mention
again: **please use automated tools like [alloc-counter](https://crates.io/crates/alloc_counter) to
double-check memory behavior if it's something you care about**. It's far too easy to misread
assembly in large code sections; always verify behavior if you care about memory usage.

The guiding principle as we move forward is this: _optimizing compilers won't produce worse programs
than we started with._ There won't be any situations where stack allocations get moved to heap
allocations. There will, however, be an opera of optimization.

**Update 2019-02-10**: When debugging a
[related issue](https://gitlab.com/sio4/code/alloc-counter/issues/1), it was discovered that the
original code worked because LLVM optimized out the entire function, rather than just the allocation
segments. The code has been updated with proper use of
[`read_volatile`](https://doc.rust-lang.org/std/ptr/fn.read_volatile.html), and a previous section
on vector capacity has been removed.

## The Case of the Disappearing Box

Our first optimization comes when LLVM can reason that the lifetime of an object is sufficiently
short that heap allocations aren't necessary. In these cases, LLVM will move the allocation to the
stack instead! The way this interacts with `#[inline]` attributes is a bit opaque, but the important
part is that LLVM can sometimes do better than the baseline Rust language:

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicBool, Ordering};

pub fn cmp(x: u32) {
    // Turn on panicking if we allocate on the heap
    DO_PANIC.store(true, Ordering::SeqCst);

    // The compiler is able to see through the constant `Box`
    // and directly compare `x` to 24 - assembly line 73
    let y = Box::new(24);
    let equals = x == *y;

    // This call to drop is eliminated
    drop(y);

    // Need to mark the comparison result as volatile so that
    // LLVM doesn't strip out all the code. If `y` is marked
    // volatile instead, allocation will be forced.
    unsafe { std::ptr::read_volatile(&equals) };

    // Turn off panicking, as there are some deallocations
    // when we exit main.
    DO_PANIC.store(false, Ordering::SeqCst);
}

fn main() {
    cmp(12)
}

#[global_allocator]
static A: PanicAllocator = PanicAllocator;
static DO_PANIC: AtomicBool = AtomicBool::new(false);
struct PanicAllocator;

unsafe impl GlobalAlloc for PanicAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        if DO_PANIC.load(Ordering::SeqCst) {
            panic!("Unexpected allocation.");
        }
        System.alloc(layout)
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        if DO_PANIC.load(Ordering::SeqCst) {
            panic!("Unexpected deallocation.");
        }
        System.dealloc(ptr, layout);
    }
}
```

-- [Compiler Explorer](https://godbolt.org/z/BZ_Yp3)

-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=4a765f753183d5b919f62c71d2109d5d)

## Dr. Array or: how I learned to love the optimizer

Finally, this isn't so much about LLVM figuring out different memory behavior, but LLVM stripping
out code that doesn't do anything. Optimizations of this type have a lot of nuance to them; if
you're not careful, they can make your benchmarks look
[impossibly good](https://www.youtube.com/watch?v=nXaxk27zwlk&feature=youtu.be&t=1199). In Rust, the
`black_box` function (implemented in both
[`libtest`](https://doc.rust-lang.org/1.1.0/test/fn.black_box.html) and
[`criterion`](https://docs.rs/criterion/0.2.10/criterion/fn.black_box.html)) will tell the compiler
to disable this kind of optimization. But if you let LLVM remove unnecessary code, you can end up
running programs that previously caused errors:

```rust
#[derive(Default)]
struct TwoFiftySix {
    _a: [u64; 32]
}

#[derive(Default)]
struct EightK {
    _a: [TwoFiftySix; 32]
}

#[derive(Default)]
struct TwoFiftySixK {
    _a: [EightK; 32]
}

#[derive(Default)]
struct EightM {
    _a: [TwoFiftySixK; 32]
}

pub fn main() {
    // Normally this blows up because we can't reserve size on the stack
    // for the `EightM` struct. But because the compiler notices we
    // never do anything with `_x`, it optimizes out the stack storage
    // and the program completes successfully.
    let _x = EightM::default();
}
```

-- [Compiler Explorer](https://godbolt.org/z/daHn7P)

-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=4c253bf26072119896ab93c6ef064dc0)
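
If you want the opposite behavior in a benchmark (keeping the computation alive so it can actually
be measured), `black_box` is the tool. Here's a minimal sketch assuming the `criterion` crate as a
dev-dependency; `sum_to` is just a hypothetical function to measure:

```rust
use criterion::{black_box, criterion_group, criterion_main, Criterion};

fn sum_to(n: u64) -> u64 {
    (0..=n).sum()
}

fn bench_sum(c: &mut Criterion) {
    c.bench_function("sum_to 1000", |b| {
        // `black_box` hides the input from the optimizer, so the loop isn't
        // folded into a constant and silently skipped.
        b.iter(|| sum_to(black_box(1000)))
    });
}

criterion_group!(benches, bench_sum);
criterion_main!(benches);
```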
35
blog/2019-02-09-summary/_article.md
Normal file
@ -0,0 +1,35 @@
---
layout: post
title: "Summary: What are the Allocation Rules?"
description: "A synopsis and reference."
category:
tags: [rust, understanding-allocations]
---

While there's a lot of interesting detail captured in this series, it's often helpful to have a
document that answers some "yes/no" questions. You may not care about what an `Iterator` looks like
in assembly; you just need to know whether it allocates an object on the heap or not. And while Rust
will prioritize the fastest behavior it can, here are the rules for each memory type:

**Heap Allocation**:

- Smart pointers (`Box`, `Rc`, `Mutex`, etc.) allocate their contents in heap memory.
- Collections (`HashMap`, `Vec`, `String`, etc.) allocate their contents in heap memory.
- Some smart pointers in the standard library have counterparts in other crates that don't need heap
  memory. If possible, use those.

**Stack Allocation**:

- Everything not using a smart pointer will be allocated on the stack.
- Structs, enums, iterators, arrays, and closures are all stack allocated.
- Cell types (`RefCell`) behave like smart pointers, but are stack-allocated.
- Inlining (`#[inline]`) will not affect allocation behavior for better or worse.
- Types that are marked `Copy` are guaranteed to have their contents stack-allocated.

**Global Allocation**:

- `const` is a fixed value; the compiler is allowed to copy it wherever useful.
- `static` is a fixed reference; the compiler will guarantee it is unique.

![Container Sizes in Rust](/assets/images/2019-02-04-container-size.svg) --
[Raph Levien](https://docs.google.com/presentation/d/1q-c7UAyrUlM-eZyTo1pd8SZ0qwA_wYxmPZVOQkoDmH4/edit?usp=sharing)
1
blog/2019-02-09-summary/container-size.svg
Normal file
File diff suppressed because one or more lines are too long
After Width: | Height: | Size: 426 KiB
39
blog/2019-02-09-summary/index.mdx
Normal file
@ -0,0 +1,39 @@
---
slug: 2019/02/summary
title: "Allocations in Rust: Summary"
date: 2019-02-09 12:00:00
authors: [bspeice]
tags: []
---

While there's a lot of interesting detail captured in this series, it's often helpful to have a
document that answers some "yes/no" questions. You may not care about what an `Iterator` looks like
in assembly; you just need to know whether it allocates an object on the heap or not. And while Rust
will prioritize the fastest behavior it can, here are the rules for each memory type:

<!-- truncate -->

**Global Allocation**:

- `const` is a fixed value; the compiler is allowed to copy it wherever useful.
- `static` is a fixed reference; the compiler will guarantee it is unique (see the sketch below).
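
A minimal sketch of that difference; the names are hypothetical, chosen only for illustration:

```rust
// `LIMIT` may be copied into every use site; `GREETING` has exactly one
// address for the whole program.
const LIMIT: usize = 1024;
static GREETING: &str = "hello";

fn main() {
    println!("{} (limit {})", GREETING, LIMIT);

    // Every reference to a `static` observes the same location.
    let a: *const &str = &GREETING;
    let b: *const &str = &GREETING;
    assert_eq!(a, b);
}
```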

**Stack Allocation**:

- Everything not using a smart pointer will be allocated on the stack.
- Structs, enums, iterators, arrays, and closures are all stack allocated.
- Cell types (`RefCell`) behave like smart pointers, but are stack-allocated.
- Inlining (`#[inline]`) will not affect allocation behavior for better or worse.
- Types that are marked `Copy` are guaranteed to have their contents stack-allocated.

**Heap Allocation**:

- Smart pointers (`Box`, `Rc`, `Mutex`, etc.) allocate their contents in heap memory.
- Collections (`HashMap`, `Vec`, `String`, etc.) allocate their contents in heap memory.
- Some smart pointers in the standard library have counterparts in other crates that don't need heap
  memory. If possible, use those.

![Container Sizes in Rust](./container-size.svg)

-- [Raph Levien](https://docs.google.com/presentation/d/1q-c7UAyrUlM-eZyTo1pd8SZ0qwA_wYxmPZVOQkoDmH4/edit?usp=sharing)