From a7811fa9b5609a1613d6b3687b6448073ceb2e70 Mon Sep 17 00:00:00 2001 From: Bradlee Speice Date: Sat, 2 Feb 2019 20:34:35 -0500 Subject: [PATCH] Split into sections Get heap allocation before starting main --- _drafts/a-heaping-helping.md | 167 +++++ _drafts/stacking-up.md | 252 ++++++++ _drafts/the-whole-world.md | 295 +++++++++ _drafts/understanding-allocations-in-rust.md | 638 +------------------ 4 files changed, 720 insertions(+), 632 deletions(-) create mode 100644 _drafts/a-heaping-helping.md create mode 100644 _drafts/stacking-up.md create mode 100644 _drafts/the-whole-world.md diff --git a/_drafts/a-heaping-helping.md b/_drafts/a-heaping-helping.md new file mode 100644 index 0000000..1af7401 --- /dev/null +++ b/_drafts/a-heaping-helping.md @@ -0,0 +1,167 @@ +--- +layout: post +title: "A Heaping Helping: Dynamic Memory" +description: "The reason Rust exists" +category: +tags: [rust, understanding-allocations] +--- + +Managing dynamic memory is hard. Some languages assume users will do it themselves (C, C++), +and some languages go to extreme lengths to protect users from themselves (Java, Python). In Rust, +how the language uses dynamic memory (also referred to as the **heap**) is a system called *ownership*. +And as the docs mention, ownership +[is Rust's most unique feature](https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html). + +The heap is used in two situations; when the compiler is unable to predict either the *total size +of memory needed*, or *how long the memory is needed for*, it will allocate space in the heap. +This happens pretty frequently; if you want to download the Google home page, you won't know +how large it is until your program runs. And when you're finished with Google, whenever that might be, +we deallocate the memory so it can be used to store other webpages. 
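As a rough illustration of those two situations (a sketch only; `fake_download` is a hypothetical stand-in for a real network request and is not part of the original program), the buffer size below isn't known until run time, and the `Rc` keeps the buffer alive until its last owner goes away:

```rust
use std::rc::Rc;

// Hypothetical stand-in for a download: the size is only known at run time,
// so the bytes have to live on the heap.
fn fake_download(len: usize) -> Vec<u8> {
    vec![0u8; len]
}

fn main() {
    // Situation 1: the total size of the memory isn't known at compile time.
    let page = Rc::new(fake_download(1024));

    // Situation 2: how long the memory is needed isn't known at compile time
    // either; the buffer lives until the last `Rc` owner is dropped.
    let other_owner = Rc::clone(&page);
    drop(page);

    println!("Still holding {} bytes", other_owner.len());
}
```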
+ +We won't go into detail on how the heap is managed; the +[ownership documentation](https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html) +does a phenomenal job explaining both the "why" and "how" of memory management. Instead, +we're going to focus on understanding "when" heap allocations occur in Rust. + +To start off: take a guess for how many allocations happen in the program below: + +```rust +fn main() {} +``` + +It's obviously a trick question; while no heap allocations happen as a result of +the code listed above, the setup needed to call `main` does allocate on the heap. +Here's a way to show it: + +```rust +#![feature(integer_atomics)] +use std::alloc::{GlobalAlloc, Layout, System}; +use std::sync::atomic::{AtomicU64, Ordering}; + +static ALLOCATION_COUNT: AtomicU64 = AtomicU64::new(0); + +struct CountingAllocator; + +unsafe impl GlobalAlloc for CountingAllocator { + unsafe fn alloc(&self, layout: Layout) -> *mut u8 { + ALLOCATION_COUNT.fetch_add(1, Ordering::SeqCst); + System.alloc(layout) + } + + unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) { + System.dealloc(ptr, layout); + } +} + +#[global_allocator] +static A: CountingAllocator = CountingAllocator; + +fn main() { + let x = ALLOCATION_COUNT.fetch_add(0, Ordering::SeqCst); + println!("There were {} allocations before calling main!", x); +} +``` +-- [Rust Playground](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=fb5060025ba79fc0f906b65a4ef8eb8e) + +As of the time of writing, there are five allocations that happen before `main` +is ever called. + +But when we want to understand more practical situations where heap allocation +happens, we'll follow this guide: + +- Smart pointers hold their contents in the heap +- Collections are smart pointers for many objects at a time, and reallocate + when they need to grow +- Boxed closures (FnBox, others?) 
are heap allocated +- "Move" semantics don't trigger new allocation; just a change of ownership, + so are incredibly fast +- Stack-based alternatives to standard library types should be preferred (spin, parking_lot) + +## Smart pointers + +The first thing to note are the "smart pointer" types. +When you have data that must outlive the scope in which it is declared, +or your data is of unknown or dynamic size, you'll make use of these types. + +The term [smart pointer](https://en.wikipedia.org/wiki/Smart_pointer) +comes from C++, and is used to describe objects that are responsible for managing +ownership of data allocated on the heap. The smart pointers available in the `alloc` +crate should look mostly familiar: +- [`Box`](https://doc.rust-lang.org/alloc/boxed/struct.Box.html) +- [`Rc`](https://doc.rust-lang.org/alloc/rc/struct.Rc.html) +- [`Arc`](https://doc.rust-lang.org/alloc/sync/struct.Arc.html) +- [`Cow`](https://doc.rust-lang.org/alloc/borrow/enum.Cow.html) + +The [standard library](https://doc.rust-lang.org/std/) also defines some smart pointers, +though more than can be covered in this article. Some examples: +- [`RwLock`](https://doc.rust-lang.org/std/sync/struct.RwLock.html) +- [`Mutex`](https://doc.rust-lang.org/std/sync/struct.Mutex.html) + +Finally, there is one [gotcha](https://www.merriam-webster.com/dictionary/gotcha): +cell types (like [`RefCell`](https://doc.rust-lang.org/stable/core/cell/struct.RefCell.html)) +look and behave like smart pointers, but don't actually require heap allocation. +Check out the [`core::cell` docs](https://doc.rust-lang.org/stable/core/cell/index.html) +for more information. + +When a smart pointer is created, the data it is given is placed in heap memory and +the location of that data is recorded in the smart pointer. 
Once the smart pointer +has determined it's safe to deallocate that memory (when a `Box` has +[gone out of scope](https://doc.rust-lang.org/stable/std/boxed/index.html) or when the +reference count for an object [goes to zero](https://doc.rust-lang.org/alloc/rc/index.html)), +the heap space is reclaimed. We can prove these types use heap memory by +looking at code: + +```rust +use std::rc::Rc; +use std::sync::Arc; +use std::borrow::Cow; + +pub fn my_box() { + // Drop at line 1640 + Box::new(0); +} + +pub fn my_rc() { + // Drop at line 1650 + Rc::new(0); +} + +pub fn my_arc() { + // Drop at line 1660 + Arc::new(0); +} + +pub fn my_cow() { + // Drop at line 1672 + Cow::from("drop"); +} +``` +-- [Compiler Explorer](https://godbolt.org/z/SaDpWg) + +## Collections + +Collection types use heap memory because they have dynamic size; they will request more memory +[when needed](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.reserve), +and can [release memory](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.shrink_to_fit) +when it's no longer necessary. This dynamic memory usage forces Rust to heap allocate +everything they contain. In a way, **collections are smart pointers for many objects at once.** +Common types that fall under this umbrella are `Vec`, `HashMap`, and `String` +(not [`&str`](https://doc.rust-lang.org/std/primitive.str.html)). + +But while collections store the objects they own in heap memory, *creating new collections +will not allocate on the heap*. This is a bit weird, because if we call `Vec::new()` the +assembly shows a corresponding call to `drop_in_place`: + +```rust +pub fn my_vec() { + // Drop in place at line 481 + Vec::<u8>::new(); +} +``` +-- [Compiler Explorer](https://godbolt.org/z/1WkNtC) + +But because the vector has no elements it is managing, no calls to the allocator +will ever be dispatched. 
A couple of places to look at for confirming this behavior: +[`Vec::new()`](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.new), +[`HashMap::new()`](https://doc.rust-lang.org/std/collections/hash_map/struct.HashMap.html#method.new), +and [`String::new()`](https://doc.rust-lang.org/std/string/struct.String.html#method.new). \ No newline at end of file diff --git a/_drafts/stacking-up.md b/_drafts/stacking-up.md new file mode 100644 index 0000000..cb81e75 --- /dev/null +++ b/_drafts/stacking-up.md @@ -0,0 +1,252 @@ +--- +layout: post +title: "Stacking Up: Fixed Memory" +description: "Going fast in Rust" +category: +tags: [rust, understanding-allocations] +--- + +`const` and `static` are perfectly fine, but it's very rare that we know +at compile-time about either values or references that will be the same for the entire +time our program is running. Put another way, it's not often the case that either you +or your compiler know how much memory your entire program will need. + +However, there are still some optimizations the compiler can do if it knows how much +memory individual functions will need. Specifically, the compiler can make use of +"stack" memory (as opposed to "heap" memory) which can be managed far faster in +both the short- and long-term. When requesting memory, the +[`push` instruction](http://www.cs.virginia.edu/~evans/cs216/guides/x86.html) +can typically complete in [1 or 2 cycles](https://agner.org/optimize/instruction_tables.ods) +(<1 nanosecond on modern CPUs). Heap memory instead requires using an allocator +(specialized software to track what memory is in use) to reserve space. +And when you're finished with your memory, the `pop` instruction likewise runs in +1-3 cycles, as opposed to an allocator needing to worry about memory fragmentation +and other issues. 
All sorts of incredibly sophisticated techniques have been used +to design allocators: +- [Garbage Collection](https://en.wikipedia.org/wiki/Garbage_collection_(computer_science)) + strategies like [Tracing](https://en.wikipedia.org/wiki/Tracing_garbage_collection) + (used in [Java](https://www.oracle.com/technetwork/java/javase/tech/g1-intro-jsp-135488.html)) + and [Reference counting](https://en.wikipedia.org/wiki/Reference_counting) + (used in [Python](https://docs.python.org/3/extending/extending.html#reference-counts)) +- Thread-local structures to prevent locking the allocator in [tcmalloc](https://jamesgolick.com/2013/5/19/how-tcmalloc-works.html) +- Arena structures used in [jemalloc](http://jemalloc.net/), which until recently + was the primary allocator for Rust programs! + +But no matter how fast your allocator is, the principle remains: the +fastest allocator is the one you never use. As such, we're not going to go +in detail on how exactly the +[`push` and `pop` instructions work](http://www.cs.virginia.edu/~evans/cs216/guides/x86.html), +and we'll focus instead on the conditions that enable the Rust compiler to use +the faster stack-based allocation for variables. + +With that in mind, let's get into the details. How do we know when Rust will or will not use +stack allocation for objects we create? Looking at other languages, it's often easy to delineate +between stack and heap. Managed memory languages (Python, Java, +[C#](https://blogs.msdn.microsoft.com/ericlippert/2010/09/30/the-truth-about-value-types/)) assume +everything is on the heap. JIT compilers ([PyPy](https://www.pypy.org/), +[HotSpot](https://www.oracle.com/technetwork/java/javase/tech/index-jsp-136373.html)) may +optimize some heap allocations away, but you should never assume it will happen. +C makes things clear with calls to special functions ([malloc(3)](https://linux.die.net/man/3/malloc) +is one) being the way to use heap memory. 
Old C++ has the [`new`](https://stackoverflow.com/a/655086/1454178) +keyword, though modern C++/C++11 is more complicated with [RAII](https://en.cppreference.com/w/cpp/language/raii). + +For Rust specifically, the principle is this: *stack allocation will be used for everything +that doesn't involve "smart pointers" and collections.* If we're interested in dissecting it though, +there are three things we pay attention to: + +1. Stack manipulation instructions (`push`, `pop`, and `add`/`sub` of the `rsp` register) + indicate allocation of stack memory: + ```rust + pub fn stack_alloc(x: u32) -> u32 { + // Space for `y` is allocated by subtracting from `rsp`, + // and then populated + let y = [1u8, 2, 3, 4]; + // Space for `y` is deallocated by adding back to `rsp` + x + } + ``` + -- [Compiler Explorer](https://godbolt.org/z/5WSgc9) + +2. Tracking when exactly heap allocation calls happen is difficult. It's typically easier to + watch for `call core::ptr::real_drop_in_place`, and infer that a heap allocation happened + in the recent past: + ```rust + pub fn heap_alloc(x: usize) -> usize { + // Space for elements in a vector has to be allocated + // on the heap, and is then de-allocated once the + // vector goes out of scope + let y: Vec<u8> = Vec::with_capacity(x); + x + } + ``` + -- [Compiler Explorer](https://godbolt.org/z/epfgoQ) (`real_drop_in_place` happens on line 1317) + Note: While the [`Drop` trait](https://doc.rust-lang.org/std/ops/trait.Drop.html) + is [called for stack-allocated objects](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=87edf374d8983816eb3d8cfeac657b46), + the Rust standard library only defines `Drop` implementations for types that involve heap allocation. + +3. If you don't want to inspect the assembly, use a custom allocator that's able to track + and alert when heap allocations occur. As an unashamed plug, [qadapt](https://crates.io/crates/qadapt) + was designed for exactly this purpose. 
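The third strategy can be sketched with nothing but the standard library. This is only an illustration of the idea, not how qadapt works internally; `CountingAllocator`, `ALLOCATIONS`, and `allocations_during` are names made up for this example, and the counter is global, so allocations made by other threads during the measurement would be counted too:

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::hint::black_box;
use std::sync::atomic::{AtomicUsize, Ordering};

// Running total of calls to `alloc` made through the global allocator.
static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

struct CountingAllocator;

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::SeqCst);
        // Delegate the real work to the system allocator.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAllocator = CountingAllocator;

// Hypothetical helper: how many heap allocations were made while `f` ran?
fn allocations_during<F: FnOnce()>(f: F) -> usize {
    let before = ALLOCATIONS.load(Ordering::SeqCst);
    f();
    ALLOCATIONS.load(Ordering::SeqCst) - before
}

fn main() {
    let on_stack = allocations_during(|| {
        // A fixed-size array only needs stack space.
        let y = [1u8, 2, 3, 4];
        black_box(y);
    });

    let on_heap = allocations_during(|| {
        // Reserving capacity forces a request to the allocator.
        let y: Vec<u8> = Vec::with_capacity(16);
        black_box(y);
    });

    println!("array: {} allocations, vector: {} allocations", on_stack, on_heap);
}
```

The `black_box` calls are there only to discourage the optimizer from removing the values entirely in release builds.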
+ +With all that in mind, let's talk about situations in which we're guaranteed to use stack memory: + +- Structs are created on the stack. +- Function arguments are passed on the stack. +- Enums and unions are stack-allocated. +- [Arrays](https://doc.rust-lang.org/std/primitive.array.html) are always stack-allocated. +- Using the [`#[inline]` attribute](https://doc.rust-lang.org/reference/attributes.html#inline-attribute) + will not change the memory region used. +- Closures capture their arguments on the stack +- Generics will use stack allocation, even with dynamic dispatch. + +## Structs + + + +## Enums + +It's been a worry of mine that I'd manage to trigger a heap allocation because +of wrapping an underlying type in +Given that you're not using smart pointers, `enum` and other wrapper types will never use +heap allocations. This shows up most often with +[`Option`](https://doc.rust-lang.org/stable/core/option/enum.Option.html) and +[`Result`](https://doc.rust-lang.org/stable/core/result/enum.Result.html) types, +but generalizes to any other types as well. + +Because the size of an `enum` is the size of its largest element plus the size of a +discriminator, the compiler can predict how much memory is used. If enums were +sized as tightly as possible, heap allocations would be needed to handle the fact +that enum variants were of dynamic size! + +## Arrays + +The array type is guaranteed to be stack allocated, which is why the array size must +be declared. Interestingly enough, this can be used to cause safe Rust programs to crash: + +```rust +// 256 bytes +#[derive(Default)] +struct TwoFiftySix { + _a: [u64; 32] +} + +// 8 kilobytes +#[derive(Default)] +struct EightK { + _a: [TwoFiftySix; 32] +} + +// 256 kilobytes +#[derive(Default)] +struct TwoFiftySixK { + _a: [EightK; 32] +} + +// 8 megabytes - exceeds space typically provided for the stack, +// though the kernel can be instructed to allocate more. 
+// On Linux, you can check stack size using `ulimit -s` +#[derive(Default)] +struct EightM { + _a: [TwoFiftySixK; 32] +} + +fn main() { + // Because we already have things in stack memory + // (like the current function), allocating another + // eight megabytes of stack memory crashes the program + let _x = EightM::default(); +} +``` +-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=137893e3ae05c2f32fe07d6f6f754709) + +There aren't any security implications of this (no memory corruption occurs, +just running out of memory), but it's good to note that the Rust compiler +won't move arrays into heap memory even if they can be reasonably expected +to overflow the stack. + +## **inline** attributes + +## Closures + +Rules for how anonymous functions capture their arguments are typically language-specific. +In Java, [Lambda Expressions](https://docs.oracle.com/javase/tutorial/java/javaOO/lambdaexpressions.html) +are actually objects created on the heap that capture local primitives by copying, and capture +local non-primitives as (`final`) references. +[Python](https://docs.python.org/3.7/reference/expressions.html#lambda) and +[JavaScript](https://javascriptweblog.wordpress.com/2010/10/25/understanding-javascript-closures/) +both bind *everything* by reference normally, but Python can also +[capture values](https://stackoverflow.com/a/235764/1454178) and JavaScript has +[Arrow functions](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/Arrow_functions). + +In Rust, arguments to closures are the same as arguments to other functions; +closures are simply functions that don't have a declared name. Some weird ordering +of the stack may be required to handle them, but it's the compiler's responsibility +to figure it out. + +Each example below has the same effect, but they compile to very different programs. +In the simplest case, we immediately run a closure returned by another function. 
+Because we don't store a reference to the closure, the stack memory needed to +store the captured values is contiguous: + +```rust +fn my_func() -> impl FnOnce() { + let x = 24; + // Note that this closure in assembly looks exactly like + // any other function; you even use the `call` instruction + // to start running it. + move || { x; } +} + +pub fn immediate() { + my_func()(); + my_func()(); +} +``` +-- [Compiler Explorer](https://godbolt.org/z/mgJ2zl), 25 total assembly instructions + +If we store a reference to the bound closure though, the Rust compiler has to +work a bit harder to make sure everything is correctly laid out in stack memory: + +```rust +pub fn simple_reference() { + let x = my_func(); + let y = my_func(); + y(); + x(); +} +``` +-- [Compiler Explorer](https://godbolt.org/z/K_dj5n), 55 total assembly instructions + +In more complex cases, even things like variable order matter: + +```rust +pub fn complex() { + let x = my_func(); + let y = my_func(); + x(); + y(); +} +``` +-- [Compiler Explorer](https://godbolt.org/z/p37qFl), 70 total assembly instructions + +In every circumstance though, the compiler ensured that no heap allocations were necessary. + +## Generics + +# A Heaping Helping: Rust and Dynamic Memory + +Opening question: How many allocations happen before `fn main()` is called? + +Now, one question I hope you're asking is "how do we distinguish stack- and +heap-based allocations in Rust code?" There are two strategies I'm going +to use for this: + +Summary section: + +- Smart pointers hold their contents in the heap +- Collections are smart pointers for many objects at a time, and reallocate + when they need to grow +- Boxed closures (FnBox, others?) 
are heap allocated +- "Move" semantics don't trigger new allocation; just a change of ownership, + so are incredibly fast +- Stack-based alternatives to standard library types should be preferred (spin, parking_lot) \ No newline at end of file diff --git a/_drafts/the-whole-world.md b/_drafts/the-whole-world.md new file mode 100644 index 0000000..1d38b8b --- /dev/null +++ b/_drafts/the-whole-world.md @@ -0,0 +1,295 @@ +--- +layout: post +title: "The Whole World: Global Memory Usage" +description: "const and static allocations" +category: +tags: [rust, understanding-allocations] +--- + +The first memory type we'll look at is pretty special: when Rust can prove that +a *value* is fixed for the life of a program (`const`), and when a *reference* is valid for +the duration of the program (`static` as a declaration, not +[`'static`](https://doc.rust-lang.org/book/ch10-03-lifetime-syntax.html#the-static-lifetime) +as a lifetime). +Understanding the distinction between value and reference is important for reasons +we'll go into below. The +[full specification](https://github.com/rust-lang/rfcs/blob/master/text/0246-const-vs-static.md) +for these two memory types is available, but we'll take a hands-on approach to the topic. + +## **const** + +The quick summary is this: `const` declares a read-only block of memory that is loaded +as part of your program binary (during the call to [exec(3)](https://linux.die.net/man/3/exec)). +Any `const` value resulting from calling a `const fn` is guaranteed to be materialized +at compile-time (meaning that access at runtime will not invoke the `const fn`), +even though the `const fn` functions are available at run-time as well. The compiler +can choose to copy the constant value wherever it is deemed practical. Getting the address +of a `const` value is legal, but not guaranteed to be the same even when referring to the +same named identifier. + +The first point is a bit strange - "read-only memory". 
+[The Rust book](https://doc.rust-lang.org/book/ch03-01-variables-and-mutability.html#differences-between-variables-and-constants) +mentions in a couple of places that using `mut` with constants is illegal, +but it's also important to demonstrate just how immutable they are. *Typically* in Rust +you can use "interior mutability" to modify things that aren't declared `mut`. +[`RefCell`](https://doc.rust-lang.org/std/cell/struct.RefCell.html) provides an API +to guarantee at runtime that some consistency rules are enforced: + +```rust +use std::cell::RefCell; + +fn my_mutator(cell: &RefCell<u32>) { + // Even though we're given an immutable reference, + // the `replace` method allows us to modify the inner value. + cell.replace(14); +} + +fn main() { + let cell = RefCell::new(25); + // Prints out 25 + println!("Cell: {:?}", cell); + my_mutator(&cell); + // Prints out 14 + println!("Cell: {:?}", cell); +} +``` +-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8e4bea1a718edaff4507944e825a54b2) + +When `const` is involved though, modifications are silently ignored: + +```rust +use std::cell::RefCell; + +const CELL: RefCell<u32> = RefCell::new(25); + +fn my_mutator(cell: &RefCell<u32>) { + cell.replace(14); +} + +fn main() { + // First line prints 25 as expected + println!("Cell: {:?}", &CELL); + my_mutator(&CELL); + // Second line *still* prints 25 + println!("Cell: {:?}", &CELL); +} +``` +-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=88fe98110c33c1b3a51e341f48b8ae00) + +And a second example using [`Once`](https://doc.rust-lang.org/std/sync/struct.Once.html): + +```rust +use std::sync::Once; + +const SURPRISE: Once = Once::new(); + +fn main() { + // This is how `Once` is supposed to be used + SURPRISE.call_once(|| println!("Initializing...")); + // Because `Once` is a `const` value, we never record it + // having been initialized the first time, and this closure + // will also execute. 
+ SURPRISE.call_once(|| println!("Initializing again???")); +} +``` +-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=c3cc5979b5e5434eca0f9ec4a06ee0ed) + +When the [`const` specification](https://github.com/rust-lang/rfcs/blob/26197104b7bb9a5a35db243d639aee6e46d35d75/text/0246-const-vs-static.md) +refers to ["rvalues"](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3055.pdf), this is +what it means. [Clippy](https://github.com/rust-lang/rust-clippy) will treat this as an error, +but it's still something to be aware of. + +The next thing to mention is that `const` values are loaded into memory *as part of your program binary*. +Because of this, any `const` values declared in your program will be "realized" at compile-time; +accessing them may trigger a main-memory lookup (with a fixed address, so your CPU may +be able to prefetch the value), but that's it. + +```rust +use std::cell::RefCell; + +const CELL: RefCell<u32> = RefCell::new(24); + +pub fn multiply(value: u32) -> u32 { + value * (*CELL.get_mut()) +} +``` +-- [Compiler Explorer](https://godbolt.org/z/2KXUcN) + +The compiler only creates one `RefCell`, and uses it everywhere. However, that value +is fully realized at compile time, and is fully stored in the `.L__unnamed_1` section. + +If it's helpful though, the compiler can choose to copy `const` values. + +```rust +const FACTOR: u32 = 1000; + +pub fn multiply(value: u32) -> u32 { + value * FACTOR +} + +pub fn multiply_twice(value: u32) -> u32 { + value * FACTOR * FACTOR +} +``` +-- [Compiler Explorer](https://godbolt.org/z/_JiT9O) + +In this example, the `FACTOR` value is turned into the `mov edi, 1000` instruction +in both the `multiply` and `multiply_twice` functions; the "1000" value is never +"stored" anywhere, as it's small enough to inline into the assembly instructions. 
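The copying behavior can also be observed from safe code without reading assembly. The sketch below is an illustration rather than part of the original examples; `COPIED`, `SHARED`, `bump_const`, and `bump_static` are hypothetical names, and the `static` is included only for contrast. Every mention of a `const` item behaves like a fresh copy of its value, so an update made through one mention is invisible to the next:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Each use of `COPIED` creates a new temporary initialized to 0.
const COPIED: AtomicUsize = AtomicUsize::new(0);
// `SHARED` is a single location in memory that every use refers to.
static SHARED: AtomicUsize = AtomicUsize::new(0);

fn bump_const() -> usize {
    // The increment lands on a temporary that is dropped right away...
    COPIED.fetch_add(1, Ordering::SeqCst);
    // ...so this load reads another fresh copy of the original value.
    COPIED.load(Ordering::SeqCst)
}

fn bump_static() -> usize {
    // Both operations touch the same memory, so the increment is visible.
    SHARED.fetch_add(1, Ordering::SeqCst);
    SHARED.load(Ordering::SeqCst)
}

fn main() {
    println!("const after increment: {}", bump_const());
    println!("static after increment: {}", bump_static());
}
```

Depending on the toolchain, the compiler or Clippy may warn about interior mutability in a `const`; that warning points at exactly this surprise.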
+ +Finally, getting the address of a `const` value is possible but not guaranteed +to be unique (given that the compiler can choose to copy values). In my testing +I was never able to get the compiler to copy a `const` value and get differing pointers, +but the specifications are clear enough: *don't rely on pointers to `const` +values being consistent*. To be frank, caring about locations for `const` values +is almost certainly a code smell. + +## **static** + +Static variables are related to `const` variables, but take a slightly different approach. +When the compiler can guarantee that a *reference* is fixed for the life of a program, +you end up with a `static` variable (as opposed to *values* that are fixed for the +duration a program is running). Because of this reference/value distinction, +static variables behave much more like what people expect from "global" variables. +We'll look at regular static variables first, and then address the `lazy_static!()` +and `thread_local!()` macros later. + +More generally, `static` variables are globally unique locations in memory, +the contents of which are loaded as part of your program being read into main memory. +They allow initialization with both raw values and `const fn` calls, and the initial +value is loaded along with the program/library binary. All static variables must +be of a type that implements the [`Sync`](https://doc.rust-lang.org/std/marker/trait.Sync.html) +marker trait. And while `static mut` variables are allowed, mutating a static is considered +an `unsafe` operation. + +The single biggest difference between `const` and `static` is the guarantee +provided about uniqueness. Where `const` variables may or may not be copied +in code, `static` variables are guaranteed to be unique. 
If we take a previous +`const` example and change it to `static`, the difference should be clear: + +```rust +static FACTOR: u32 = 1000; + +pub fn multiply(value: u32) -> u32 { + value * FACTOR +} + +pub fn multiply_twice(value: u32) -> u32 { + value * FACTOR * FACTOR +} +``` +-- [Compiler Explorer](https://godbolt.org/z/bSfBxn) + +Where [previously](https://godbolt.org/z/_JiT90) there were plenty of +references to multiplying by 1000, the new assembly refers to `FACTOR` +as a named memory location instead. No initialization work needs to be done, +but the compiler can no longer prove the value never changes during execution. + +Next, let's talk about initialization. The simplest case is initializing +static variables with either scalar or struct notation: + +```rust +#[derive(Debug)] +struct MyStruct { + x: u32 +} + +static MY_STRUCT: MyStruct = MyStruct { + // You can even reference other statics + // declared later + x: MY_VAL +}; + +static MY_VAL: u32 = 24; + +fn main() { + println!("Static MyStruct: {:?}", MY_STRUCT); +} +``` +-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=b538dbc46076f12db047af4f4403ee6e) + +Things get a bit weirder when using `const fn`. In most cases, things just work: + +```rust +#[derive(Debug)] +struct MyStruct { + x: u32 +} + +impl MyStruct { + const fn new() -> MyStruct { + MyStruct { x: 24 } + } +} + +static MY_STRUCT: MyStruct = MyStruct::new(); + +fn main() { + println!("const fn Static MyStruct: {:?}", MY_STRUCT); +} +``` +-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8c796a6e7fc273c12115091b707b0255) + +However, there's a caveat: you're currently not allowed to use `const fn` to initialize +static variables of types that aren't marked `Sync`. 
As an example, even though +[`RefCell::new()`](https://doc.rust-lang.org/std/cell/struct.RefCell.html#method.new) +is `const fn`, because [`RefCell` isn't `Sync`](https://doc.rust-lang.org/std/cell/struct.RefCell.html#impl-Sync), +you'll get an error at compile time: + +```rust +use std::cell::RefCell; + +// error[E0277]: `std::cell::RefCell` cannot be shared between threads safely +static MY_LOCK: RefCell = RefCell::new(0); +``` +-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=c76ef86e473d07117a1700e21fd45560) + +It's likely that this will [change in the future](https://github.com/rust-lang/rfcs/blob/master/text/0911-const-fn.md) though. + +Which leads well to the next point: static variable types must implement the +[`Sync` marker](https://doc.rust-lang.org/std/marker/trait.Sync.html). +Because they're globally unique, it must be safe for you to access static variables +from any thread at any time. Most `struct` definitions automatically implement the +`Sync` trait because they contain only elements which themselves +implement `Sync`. This is why earlier examples could get away with initializing +statics, even though we never included an `impl Sync for MyStruct` in the code. +For more on the `Sync` trait, the [Nomicon](https://doc.rust-lang.org/nomicon/send-and-sync.html) +has a much more thorough treatment. But as an example, Rust refuses to compile +our earlier example if we add a non-`Sync` element to the `struct` definition: + +```rust +use std::cell::RefCell; + +struct MyStruct { + x: u32, + y: RefCell, +} + +// error[E0277]: `std::cell::RefCell` cannot be shared between threads safely +static MY_STRUCT: MyStruct = MyStruct { + x: 8, + y: RefCell::new(8) +}; +``` +-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=40074d0248f056c296b662dbbff97cfc) + +Finally, while `static mut` variables are allowed, mutating them is an `unsafe` operation. 
+Unlike `const` however, interior mutability is acceptable. To demonstrate: + +```rust +use std::sync::Once; + +// This example adapted from https://doc.rust-lang.org/std/sync/struct.Once.html#method.call_once +static INIT: Once = Once::new(); + +fn main() { + // Note that while `INIT` is declared immutable, we're still allowed + // to mutate its interior + INIT.call_once(|| println!("Initializing...")); + // This code won't panic, as the interior of INIT was modified + // as part of the previous `call_once` + INIT.call_once(|| panic!("INIT was called twice!")); +} +``` +-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=3ba003a981a7ed7400240caadd384d59) + diff --git a/_drafts/understanding-allocations-in-rust.md b/_drafts/understanding-allocations-in-rust.md index 788b27d..2fe832a 100644 --- a/_drafts/understanding-allocations-in-rust.md +++ b/_drafts/understanding-allocations-in-rust.md @@ -3,7 +3,7 @@ layout: post title: "Allocations in Rust" description: "An introduction to the memory model" category: -tags: [rust] +tags: [rust, understanding-allocations] --- There's an alchemy of distilling complex technical topics into articles and videos @@ -26,13 +26,12 @@ Let's learn a bit about memory in Rust. This post is intended as both guide and reference material; we'll work to establish an understanding of the different memory types Rust makes use of, then summarize each -section for easy citation in the future. To that end, a table of contents is provided -to assist in easy navigation: +section at the end for easy future citation. 
To that end, a table of contents is in order: -- [Foreword](#foreword) -- [The Whole World: Global Memory Usage](#the-whole-world-global-memory-usage) -- [Stacking Up: Non-Heap Memory](#stacking-up-non-heap-memory) -- [A Heaping Helping: Rust and Dynamic Memory](#a-heaping-helping-rust-and-dynamic-memory) +- Foreword +- [The Whole World: Global Memory Usage](/2019/02/the-whole-world) +- [Stacking Up: Fixed Memory](/2019/02/stacking-up) +- [A Heaping Helping: Dynamic Memory](/2019/02/a-heaping-helping) - [Compiler Optimizations: What It's Done For You Lately](#compiler-optimizations-what-its-done-for-you-lately) - Summary: When Does Rust Allocate? @@ -105,631 +104,6 @@ have a notice worth repeating: > > -- [the docs](https://doc.rust-lang.org/std/ptr/fn.read_volatile.html) -# The Whole World: Global Memory Usage - -The first memory type we'll look at is pretty special: when Rust can prove that -a *value* is fixed for the life of a program (`const`), and when a *reference* is valid for -the duration of the program (`static` as a declaration, not -[`'static`](https://doc.rust-lang.org/book/ch10-03-lifetime-syntax.html#the-static-lifetime) -as a lifetime). -Understanding the distinction between value and reference is important for reasons -we'll go into below. The -[full specification](https://github.com/rust-lang/rfcs/blob/master/text/0246-const-vs-static.md) -for these two memory types is available, but we'll take a hands-on approach to the topic. - -## **const** - -The quick summary is this: `const` declares a read-only block of memory that is loaded -as part of your program binary (during the call to [exec(3)](https://linux.die.net/man/3/exec)). -Any `const` value resulting from calling a `const fn` is guaranteed to be materialized -at compile-time (meaning that access at runtime will not invoke the `const fn`), -even though the `const fn` functions are available at run-time as well. 
The compiler -can choose to copy the constant value wherever it is deemed practical. Getting the address -of a `const` value is legal, but not guaranteed to be the same even when referring to the -same named identifier. - -The first point is a bit strange - "read-only memory". -[The Rust book](https://doc.rust-lang.org/book/ch03-01-variables-and-mutability.html#differences-between-variables-and-constants) -mentions in a couple places that using `mut` with constants is illegal, -but it's also important to demonstrate just how immutable they are. *Typically* in Rust -you can use "inner mutability" to modify things that aren't declared `mut`. -[`RefCell`](https://doc.rust-lang.org/std/cell/struct.RefCell.html) provides an API -to guarantee at runtime that some consistency rules are enforced: - -```rust -use std::cell::RefCell; - -fn my_mutator(cell: &RefCell) { - // Even though we're given an immutable reference, - // the `replace` method allows us to modify the inner value. - cell.replace(14); -} - -fn main() { - let cell = RefCell::new(25); - // Prints out 25 - println!("Cell: {:?}", cell); - my_mutator(&cell); - // Prints out 14 - println!("Cell: {:?}", cell); -} -``` --- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8e4bea1a718edaff4507944e825a54b2) - -When `const` is involved though, modifications are silently ignored: - -```rust -use std::cell::RefCell; - -const CELL: RefCell = RefCell::new(25); - -fn my_mutator(cell: &RefCell) { - cell.replace(14); -} - -fn main() { - // First line prints 25 as expected - println!("Cell: {:?}", &CELL); - my_mutator(&CELL); - // Second line *still* prints 25 - println!("Cell: {:?}", &CELL); -} -``` --- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=88fe98110c33c1b3a51e341f48b8ae00) - -And a second example using [`Once`](https://doc.rust-lang.org/std/sync/struct.Once.html): - -```rust -use std::sync::Once; - -const SURPRISE: Once = Once::new(); 
- -fn main() { - // This is how `Once` is supposed to be used - SURPRISE.call_once(|| println!("Initializing...")); - // Because `Once` is a `const` value, we never record it - // having been initialized the first time, and this closure - // will also execute. - SURPRISE.call_once(|| println!("Initializing again???")); -} -``` --- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=c3cc5979b5e5434eca0f9ec4a06ee0ed) - -When the [`const` specification](https://github.com/rust-lang/rfcs/blob/26197104b7bb9a5a35db243d639aee6e46d35d75/text/0246-const-vs-static.md) -refers to ["rvalues"](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3055.pdf), this is -what they mean. [Clippy](https://github.com/rust-lang/rust-clippy) will treat this as an error, -but it's still something to be aware of. - -The next thing to mention is that `const` values are loaded into memory *as part of your program binary*. -Because of this, any `const` values declared in your program will be "realized" at compile-time; -accessing them may trigger a main-memory lookup (with a fixed address, so your CPU may -be able to prefetch the value), but that's it. - -```rust -use std::cell::RefCell; - -const CELL: RefCell<u32> = RefCell::new(24); - -pub fn multiply(value: u32) -> u32 { - value * (*CELL.get_mut()) -} -``` --- [Compiler Explorer](https://godbolt.org/z/2KXUcN) - -The compiler only creates one `RefCell`, and uses it everywhere. However, that value -is fully realized at compile time, and is fully stored in the `.L__unnamed_1` section. - -If it's helpful though, the compiler can choose to copy `const` values.
- -```rust -const FACTOR: u32 = 1000; - -pub fn multiply(value: u32) -> u32 { - value * FACTOR -} - -pub fn multiply_twice(value: u32) -> u32 { - value * FACTOR * FACTOR -} -``` --- [Compiler Explorer](https://godbolt.org/z/_JiT9O) - -In this example, the `FACTOR` value is turned into the `mov edi, 1000` instruction -in both the `multiply` and `multiply_twice` functions; the "1000" value is never -"stored" anywhere, as it's small enough to inline into the assembly instructions. - -Finally, getting the address of a `const` value is possible but not guaranteed -to be unique (given that the compiler can choose to copy values). In my testing -I was never able to get the compiler to copy a `const` value and get differing pointers, -but the specifications are clear enough: *don't rely on pointers to `const` -values being consistent*. To be frank, caring about locations for `const` values -is almost certainly a code smell. - -## **static** - -Static variables are related to `const` variables, but take a slightly different approach. -When the compiler can guarantee that a *reference* is fixed for the life of a program, -you end up with a `static` variable (as opposed to *values* that are fixed for the -duration a program is running). Because of this reference/value distinction, -static variables behave much more like what people expect from "global" variables. -We'll look at regular static variables first, and then address the `lazy_static!()` -and `thread_local!()` macros later. - -More generally, `static` variables are globally unique locations in memory, -the contents of which are loaded as part of your program being read into main memory. -They allow initialization with both raw values and `const fn` calls, and the initial -value is loaded along with the program/library binary. All static variables must -be of a type that implements the [`Sync`](https://doc.rust-lang.org/std/marker/trait.Sync.html) -marker trait. 
And while `static mut` variables are allowed, mutating a static is considered -an `unsafe` operation. - -The single biggest difference between `const` and `static` is the guarantees -provided about uniqueness. Where `const` variables may or may not be copied -in code, `static` variables are guaranteed to be unique. If we take a previous -`const` example and change it to `static`, the difference should be clear: - -```rust -static FACTOR: u32 = 1000; - -pub fn multiply(value: u32) -> u32 { - value * FACTOR -} - -pub fn multiply_twice(value: u32) -> u32 { - value * FACTOR * FACTOR -} -``` --- [Compiler Explorer](https://godbolt.org/z/bSfBxn) - -Where [previously](https://godbolt.org/z/_JiT90) there were plenty of -references to multiplying by 1000, the new assembly refers to `FACTOR` -as a named memory location instead. No initialization work needs to be done, -but the compiler can no longer prove the value never changes during execution. - -Next, let's talk about initialization. The simplest case is initializing -static variables with either scalar or struct notation: - -```rust -#[derive(Debug)] -struct MyStruct { - x: u32 -} - -static MY_STRUCT: MyStruct = MyStruct { - // You can even reference other statics - // declared later - x: MY_VAL -}; - -static MY_VAL: u32 = 24; - -fn main() { - println!("Static MyStruct: {:?}", MY_STRUCT); -} -``` --- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=b538dbc46076f12db047af4f4403ee6e) - -Things get a bit weirder when using `const fn`.
In most cases, things just work: - -```rust -#[derive(Debug)] -struct MyStruct { - x: u32 -} - -impl MyStruct { - const fn new() -> MyStruct { - MyStruct { x: 24 } - } -} - -static MY_STRUCT: MyStruct = MyStruct::new(); - -fn main() { - println!("const fn Static MyStruct: {:?}", MY_STRUCT); -} -``` --- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8c796a6e7fc273c12115091b707b0255) - -However, there's a caveat: you're currently not allowed to use `const fn` to initialize -static variables of types that aren't marked `Sync`. As an example, even though -[`RefCell::new()`](https://doc.rust-lang.org/std/cell/struct.RefCell.html#method.new) -is `const fn`, because [`RefCell` isn't `Sync`](https://doc.rust-lang.org/std/cell/struct.RefCell.html#impl-Sync), -you'll get an error at compile time: - -```rust -use std::cell::RefCell; - -// error[E0277]: `std::cell::RefCell` cannot be shared between threads safely -static MY_LOCK: RefCell = RefCell::new(0); -``` --- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=c76ef86e473d07117a1700e21fd45560) - -It's likely that this will [change in the future](https://github.com/rust-lang/rfcs/blob/master/text/0911-const-fn.md) though. - -Which leads well to the next point: static variable types must implement the -[`Sync` marker](https://doc.rust-lang.org/std/marker/trait.Sync.html). -Because they're globally unique, it must be safe for you to access static variables -from any thread at any time. Most `struct` definitions automatically implement the -`Sync` trait because they contain only elements which themselves -implement `Sync`. This is why earlier examples could get away with initializing -statics, even though we never included an `impl Sync for MyStruct` in the code. -For more on the `Sync` trait, the [Nomicon](https://doc.rust-lang.org/nomicon/send-and-sync.html) -has a much more thorough treatment. 
But as an example, Rust refuses to compile -our earlier example if we add a non-`Sync` element to the `struct` definition: - -```rust -use std::cell::RefCell; - -struct MyStruct { - x: u32, - y: RefCell, -} - -// error[E0277]: `std::cell::RefCell` cannot be shared between threads safely -static MY_STRUCT: MyStruct = MyStruct { - x: 8, - y: RefCell::new(8) -}; -``` --- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=40074d0248f056c296b662dbbff97cfc) - -Finally, while `static mut` variables are allowed, mutating them is an `unsafe` operation. -Unlike `const` however, interior mutability is acceptable. To demonstrate: - -```rust -use std::sync::Once; - -// This example adapted from https://doc.rust-lang.org/std/sync/struct.Once.html#method.call_once -static INIT: Once = Once::new(); - -fn main() { - // Note that while `INIT` is declared immutable, we're still allowed - // to mutate its interior - INIT.call_once(|| println!("Initializing...")); - // This code won't panic, as the interior of INIT was modified - // as part of the previous `call_once` - INIT.call_once(|| panic!("INIT was called twice!")); -} -``` --- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=3ba003a981a7ed7400240caadd384d59) - -# Stacking Up: Non-Heap Memory - -`const` and `static` are perfectly fine, but it's very rare that we know -at compile-time about either values or references that will be the same for the entire -time our program is running. Put another way, it's not often the case that either you -or your compiler know how much memory your entire program will need. - -However, there are still some optimizations the compiler can do if it knows how much -memory individual functions will need. Specifically, the compiler can make use of -"stack" memory (as opposed to "heap" memory) which can be managed far faster in -both the short- and long-term. 
When requesting memory, the -[`push` instruction](http://www.cs.virginia.edu/~evans/cs216/guides/x86.html) -can typically complete in [1 or 2 cycles](https://agner.org/optimize/instruction_tables.ods) -(<1 nanosecond on modern CPUs). Heap memory instead requires using an allocator -(specialized software to track what memory is in use) to reserve space. -And when you're finished with your memory, the `pop` instruction likewise runs in -1-3 cycles, as opposed to an allocator needing to worry about memory fragmentation -and other issues. All sorts of incredibly sophisticated techniques have been used -to design allocators: -- [Garbage Collection](https://en.wikipedia.org/wiki/Garbage_collection_(computer_science)) - strategies like [Tracing](https://en.wikipedia.org/wiki/Tracing_garbage_collection) - (used in [Java](https://www.oracle.com/technetwork/java/javase/tech/g1-intro-jsp-135488.html)) - and [Reference counting](https://en.wikipedia.org/wiki/Reference_counting) - (used in [Python](https://docs.python.org/3/extending/extending.html#reference-counts)) -- Thread-local structures to prevent locking the allocator in [tcmalloc](https://jamesgolick.com/2013/5/19/how-tcmalloc-works.html) -- Arena structures used in [jemalloc](http://jemalloc.net/), which until recently - was the primary allocator for Rust programs! - -But no matter how fast your allocator is, the principle remains: the -fastest allocator is the one you never use. As such, we're not going to go -in detail on how exactly the -[`push` and `pop` instructions work](http://www.cs.virginia.edu/~evans/cs216/guides/x86.html), -and we'll focus instead on the conditions that enable the Rust compiler to use -the faster stack-based allocation for variables. - -With that in mind, let's get into the details. How do we know when Rust will or will not use -stack allocation for objects we create? Looking at other languages, it's often easy to delineate -between stack and heap. 
Managed memory languages (Python, Java, -[C#](https://blogs.msdn.microsoft.com/ericlippert/2010/09/30/the-truth-about-value-types/)) assume -everything is on the heap. JIT compilers ([PyPy](https://www.pypy.org/), -[HotSpot](https://www.oracle.com/technetwork/java/javase/tech/index-jsp-136373.html)) may -optimize some heap allocations away, but you should never assume it will happen. -C makes things clear with calls to special functions ([malloc(3)](https://linux.die.net/man/3/malloc) -is one) being the way to use heap memory. Old C++ has the [`new`](https://stackoverflow.com/a/655086/1454178) -keyword, though modern C++/C++11 is more complicated with [RAII](https://en.cppreference.com/w/cpp/language/raii). - -For Rust specifically, the principle is this: *stack allocation will be used for everything -that doesn't involve "smart pointers" and collections.* If we're interested in dissecting it though, -there are three things we pay attention to: - -1. Stack manipulation instructions (`push`, `pop`, and `add`/`sub` of the `rsp` register) - indicate allocation of stack memory: - ```rust - pub fn stack_alloc(x: u32) -> u32 { - // Space for `y` is allocated by subtracting from `rsp`, - // and then populated - let y = [1u8, 2, 3, 4]; - // Space for `y` is deallocated by adding back to `rsp` - x - } - ``` - -- [Compiler Explorer](https://godbolt.org/z/5WSgc9) - -2. Tracking when exactly heap allocation calls happen is difficult. 
It's typically easier to - watch for `call core::ptr::real_drop_in_place`, and infer that a heap allocation happened - in the recent past: - ```rust - pub fn heap_alloc(x: usize) -> usize { - // Space for elements in a vector has to be allocated - // on the heap, and is then de-allocated once the - // vector goes out of scope - let y: Vec = Vec::with_capacity(x); - x - } - ``` - -- [Compiler Explorer](https://godbolt.org/z/epfgoQ) (`real_drop_in_place` happens on line 1317) - Note: While the [`Drop` trait](https://doc.rust-lang.org/std/ops/trait.Drop.html) - is [called for stack-allocated objects](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=87edf374d8983816eb3d8cfeac657b46), - the Rust standard library only defines `Drop` implementations for types that involve heap allocation. - -3. If you don't want to inspect the assembly, use a custom allocator that's able to track - and alert when heap allocations occur. As an unashamed plug, [qadapt](https://crates.io/crates/qadapt) - was designed for exactly this purpose. - -With all that in mind, let's talk about situations in which we're guaranteed to use stack memory: - -- Structs are created on the stack. -- Function arguments are passed on the stack. -- Enums and unions are stack-allocated. -- [Arrays](https://doc.rust-lang.org/std/primitive.array.html) are always stack-allocated. -- Using the [`#[inline]` attribute](https://doc.rust-lang.org/reference/attributes.html#inline-attribute) - will not change the memory region used. -- Closures capture their arguments on the stack -- Generics will use stack allocation, even with dynamic dispatch. - -## Structs - - - -## Enums - -It's been a worry of mine that I'd manage to trigger a heap allocation because -of wrapping an underlying type in -Given that you're not using smart pointers, `enum` and other wrapper types will never use -heap allocations. 
This shows up most often with -[`Option`](https://doc.rust-lang.org/stable/core/option/enum.Option.html) and -[`Result`](https://doc.rust-lang.org/stable/core/result/enum.Result.html) types, -but generalizes to any other types as well. - -Because the size of an `enum` is the size of its largest element plus the size of a -discriminator, the compiler can predict how much memory is used. If enums were -sized as tightly as possible, heap allocations would be needed to handle the fact -that enum variants were of dynamic size! - -## Arrays - -The array type is guaranteed to be stack allocated, which is why the array size must -be declared. Interestingly enough, this can be used to cause safe Rust programs to crash: - -```rust -// 256 bytes -#[derive(Default)] -struct TwoFiftySix { - _a: [u64; 32] -} - -// 8 kilobytes -#[derive(Default)] -struct EightK { - _a: [TwoFiftySix; 32] -} - -// 256 kilobytes -#[derive(Default)] -struct TwoFiftySixK { - _a: [EightK; 32] -} - -// 8 megabytes - exceeds space typically provided for the stack, -// though the kernel can be instructed to allocate more. -// On Linux, you can check stack size using `ulimit -s` -#[derive(Default)] -struct EightM { - _a: [TwoFiftySixK; 32] -} - -fn main() { - // Because we already have things in stack memory - // (like the current function), allocating another - // eight megabytes of stack memory crashes the program - let _x = EightM::default(); -} -``` --- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=137893e3ae05c2f32fe07d6f6f754709) - -There aren't any security implications of this (no memory corruption occurs, -just running out of memory), but it's good to note that the Rust compiler -won't move arrays into heap memory even if they can be reasonably expected -to overflow the stack. - -## **inline** attributes - -## Closures - -Rules for how anonymous functions capture their arguments are typically language-specific. 
-In Java, [Lambda Expressions](https://docs.oracle.com/javase/tutorial/java/javaOO/lambdaexpressions.html) -are actually objects created on the heap that capture local primitives by copying, and capture -local non-primitives as (`final`) references. -[Python](https://docs.python.org/3.7/reference/expressions.html#lambda) and -[JavaScript](https://javascriptweblog.wordpress.com/2010/10/25/understanding-javascript-closures/) -both bind *everything* by reference normally, but Python can also -[capture values](https://stackoverflow.com/a/235764/1454178) and JavaScript has -[Arrow functions](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/Arrow_functions). - -In Rust, arguments to closures are the same as arguments to other functions; -closures are simply functions that don't have a declared name. Some weird ordering -of the stack may be required to handle them, but it's the compiler's responsibility -to figure it out. - -Each example below has the same effect, but compiles to a very different program. -In the simplest case, we immediately run a closure returned by another function. -Because we don't store a reference to the closure, the stack memory needed to -store the captured values is contiguous: - -```rust -fn my_func() -> impl FnOnce() { - let x = 24; - // Note that this closure in assembly looks exactly like - // any other function; you even use the `call` instruction - // to start running it.
- move || { x; } -} - -pub fn immediate() { - my_func()(); - my_func()(); -} -``` --- [Compiler Explorer](https://godbolt.org/z/mgJ2zl), 25 total assembly instructions - -If we store a reference to the bound closure though, the Rust compiler has to -work a bit harder to make sure everything is correctly laid out in stack memory: - -```rust -pub fn simple_reference() { - let x = my_func(); - let y = my_func(); - y(); - x(); -} -``` --- [Compiler Explorer](https://godbolt.org/z/K_dj5n), 55 total assembly instructions - -In more complex cases, even things like variable order matter: - -```rust -pub fn complex() { - let x = my_func(); - let y = my_func(); - x(); - y(); -} -``` --- [Compiler Explorer](https://godbolt.org/z/p37qFl), 70 total assembly instructions - -In every circumstance though, the compiler ensured that no heap allocations were necessary. - -## Generics - -# A Heaping Helping: Rust and Dynamic Memory - -Opening question: How many allocations happen before `fn main()` is called? - -Now, one question I hope you're asking is "how do we distinguish stack- and -heap-based allocations in Rust code?" There are two strategies I'm going -to use for this: - -Summary section: - -- Smart pointers hold their contents in the heap -- Collections are smart pointers for many objects at a time, and reallocate - when they need to grow -- Boxed closures (FnBox, others?) are heap allocated -- "Move" semantics don't trigger new allocation; just a change of ownership, - so are incredibly fast -- Stack-based alternatives to standard library types should be preferred (spin, parking_lot) - -## Smart pointers - -The first thing to note are the "smart pointer" types. -When you have data that must outlive the scope in which it is declared, -or your data is of unknown or dynamic size, you'll make use of these types. 
- -The term [smart pointer](https://en.wikipedia.org/wiki/Smart_pointer) -comes from C++, and is used to describe objects that are responsible for managing -ownership of data allocated on the heap. The smart pointers available in the `alloc` -crate should look mostly familiar: -- [`Box`](https://doc.rust-lang.org/alloc/boxed/struct.Box.html) -- [`Rc`](https://doc.rust-lang.org/alloc/rc/struct.Rc.html) -- [`Arc`](https://doc.rust-lang.org/alloc/sync/struct.Arc.html) -- [`Cow`](https://doc.rust-lang.org/alloc/borrow/enum.Cow.html) - -The [standard library](https://doc.rust-lang.org/std/) also defines some smart pointers, -though more than can be covered in this article. Some examples: -- [`RwLock`](https://doc.rust-lang.org/std/sync/struct.RwLock.html) -- [`Mutex`](https://doc.rust-lang.org/std/sync/struct.Mutex.html) - -Finally, there is one [gotcha](https://www.merriam-webster.com/dictionary/gotcha): -cell types (like [`RefCell`](https://doc.rust-lang.org/stable/core/cell/struct.RefCell.html)) -look and behave like smart pointers, but don't actually require heap allocation. -Check out the [`core::cell` docs](https://doc.rust-lang.org/stable/core/cell/index.html) -for more information. - -When a smart pointer is created, the data it is given is placed in heap memory and -the location of that data is recorded in the smart pointer. Once the smart pointer -has determined it's safe to deallocate that memory (when a `Box` has -[gone out of scope](https://doc.rust-lang.org/stable/std/boxed/index.html) or when -reference count for an object [goes to zero](https://doc.rust-lang.org/alloc/rc/index.html)), -the heap space is reclaimed. 
We can prove these types use heap memory by -looking at code: - -```rust -use std::rc::Rc; -use std::sync::Arc; -use std::borrow::Cow; - -pub fn my_box() { - // Drop at line 1640 - Box::new(0); -} - -pub fn my_rc() { - // Drop at line 1650 - Rc::new(0); -} - -pub fn my_arc() { - // Drop at line 1660 - Arc::new(0); -} - -pub fn my_cow() { - // Drop at line 1672 - Cow::from("drop"); -} -``` --- [Compiler Explorer](https://godbolt.org/z/SaDpWg) - -## Collections - -Collection types use heap memory because they have dynamic size; they will request more memory -[when needed](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.reserve), -and can [release memory](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.shrink_to_fit) -when it's no longer necessary. This dynamic memory usage forces Rust to heap allocate -everything they contain. In a way, **collections are smart pointers for many objects at once.** -Common types that fall under this umbrella are `Vec`, `HashMap`, and `String` -(not [`&str`](https://doc.rust-lang.org/std/primitive.str.html)). - -But while collections store the objects they own in heap memory, *creating new collections -will not allocate on the heap*. This is a bit weird, because if we call `Vec::new()` the -assembly shows a corresponding call to `drop_in_place`: - -```rust -pub fn my_vec() { - // Drop in place at line 481 - Vec::<u8>::new(); -} -``` --- [Compiler Explorer](https://godbolt.org/z/1WkNtC) - -But because the vector has no elements it is managing, no calls to the allocator -will ever be dispatched. A couple of places to look at for confirming this behavior: -[`Vec::new()`](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.new), -[`HashMap::new()`](https://doc.rust-lang.org/std/collections/hash_map/struct.HashMap.html#method.new), -and [`String::new()`](https://doc.rust-lang.org/std/string/struct.String.html#method.new). - # Compiler Optimizations: What It's Done For You Lately 1.
Box<> getting inlined into stack allocations