Final draft!

I think.
Bradlee Speice 2019-02-10 22:44:40 -05:00
parent f3dad2a34d
commit d1153d07f6
5 changed files with 182 additions and 142 deletions


@ -7,33 +7,42 @@ tags: [rust, understanding-allocations]
---

The first memory type we'll look at is pretty special: when Rust can prove that
a *value* is fixed for the life of a program (`const`), and when a *reference* is unique for
the life of a program (`static` as a declaration, not
[`'static`](https://doc.rust-lang.org/book/ch10-03-lifetime-syntax.html#the-static-lifetime)
as a lifetime), we can make use of global memory. This special section of data is embedded
directly in the program binary so that variables are ready to go once the program loads;
no additional computation is necessary.

Understanding the value/reference distinction is important for reasons we'll go into below,
and while the
[full specification](https://github.com/rust-lang/rfcs/blob/master/text/0246-const-vs-static.md)
for these two keywords is available, we'll take a hands-on approach to the topic.
# **const**

When a *value* is guaranteed to be unchanging in your program (where "value" may be scalars,
`struct`s, etc.), you can declare it `const`.
This tells the compiler that it's safe to treat the value as never changing, and enables
some interesting optimizations; not only is there no initialization cost to
creating the value (it is loaded at the same time as the executable parts of your program),
but the compiler can also copy the value around if it speeds up the code.
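As a quick sketch of what that looks like (the names here are illustrative, not from the rest of this post):

```rust
// The value is baked into the program binary; nothing runs at startup to create it.
const SECONDS_PER_MINUTE: u64 = 60;

// A `const fn` used in a `const` initializer is evaluated during compilation...
const fn minutes_to_seconds(minutes: u64) -> u64 {
    minutes * SECONDS_PER_MINUTE
}

// ...so by the time the program runs, `FIVE_MINUTES` is just the number 300.
const FIVE_MINUTES: u64 = minutes_to_seconds(5);
```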
The points we need to address when talking about `const` are:
- `const` values are stored in read-only memory; modifying them is impossible.
- Values resulting from calling a `const fn` are materialized at compile-time.
- The compiler may (or may not) copy `const` values wherever it chooses.

## Read-Only

The first point is a bit strange - "read-only memory."
[The Rust book](https://doc.rust-lang.org/book/ch03-01-variables-and-mutability.html#differences-between-variables-and-constants)
mentions in a couple places that using `mut` with constants is illegal,
but it's also important to demonstrate just how immutable they are. *Typically* in Rust
you can use [interior mutability](https://doc.rust-lang.org/book/ch15-05-interior-mutability.html)
to modify things that aren't declared `mut`.
[`RefCell`](https://doc.rust-lang.org/std/cell/struct.RefCell.html) provides an
example of this pattern in action:

```rust
use std::cell::RefCell;
@ -55,7 +64,7 @@ fn main() {
```
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8e4bea1a718edaff4507944e825a54b2)

When `const` is involved though, interior mutability is impossible:

```rust
use std::cell::RefCell;
@ -95,10 +104,12 @@ fn main() {
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=c3cc5979b5e5434eca0f9ec4a06ee0ed)

When the [`const` specification](https://github.com/rust-lang/rfcs/blob/26197104b7bb9a5a35db243d639aee6e46d35d75/text/0246-const-vs-static.md)
refers to ["rvalues"](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3055.pdf), this behavior
is what it refers to. [Clippy](https://github.com/rust-lang/rust-clippy) will treat this as an error,
but it's still something to be aware of.

## Initialization == Compilation

The next thing to mention is that `const` values are loaded into memory *as part of your program binary*.
Because of this, any `const` values declared in your program will be "realized" at compile-time;
accessing them may trigger a main-memory lookup (with a fixed address, so your CPU may
@ -110,13 +121,16 @@ use std::cell::RefCell;
const CELL: RefCell<u32> = RefCell::new(24);

pub fn multiply(value: u32) -> u32 {
    // CELL is stored at `.L__unnamed_1`
    value * (*CELL.get_mut())
}
```
-- [Compiler Explorer](https://godbolt.org/z/Th8boO)

The compiler creates one `RefCell`, uses it everywhere, and never
needs to call the `RefCell::new` function.

## Copying

If it's helpful though, the compiler can choose to copy `const` values.
@ -124,22 +138,24 @@ If it's helpful though, the compiler can choose to copy `const` values.
const FACTOR: u32 = 1000;

pub fn multiply(value: u32) -> u32 {
    // See assembly line 4 for the `mov edi, 1000` instruction
    value * FACTOR
}

pub fn multiply_twice(value: u32) -> u32 {
    // See assembly lines 22 and 29 for `mov edi, 1000` instructions
    value * FACTOR * FACTOR
}
```
-- [Compiler Explorer](https://godbolt.org/z/ZtS54X)
In this example, the `FACTOR` value is turned into the `mov edi, 1000` instruction
in both the `multiply` and `multiply_twice` functions; the "1000" value is never
"stored" anywhere, as it's small enough to inline into the assembly instructions.

Finally, getting the address of a `const` value is possible, but not guaranteed
to be unique (because the compiler can choose to copy values). I was unable to
get non-unique pointers in my testing (even using different crates),
but the specifications are clear enough: *don't rely on pointers to `const`
values being consistent*. To be frank, caring about locations for `const` values
is almost certainly a code smell.
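A minimal sketch of why (the comparison below may well print `true`, but nothing guarantees it):

```rust
const FACTOR: u32 = 1000;

fn main() {
    // Taking the address of a `const` is legal...
    let first: *const u32 = &FACTOR;
    let second: *const u32 = &FACTOR;
    // ...but because the compiler is free to copy the value, the two
    // pointers aren't guaranteed to be equal. Don't rely on it either way.
    println!("{:p} == {:p}? {}", first, second, first == second);
}
```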
@ -147,20 +163,19 @@ is almost certainly a code smell.
# **static**

Static variables are related to `const` variables, but take a slightly different approach.
When we declare that a *reference* is unique for the life of a program,
we have a `static` variable (unrelated to the `'static` lifetime). Because of the
reference/value distinction between `const` and `static`,
static variables behave much more like typical "global" variables.

But to understand `static`, here's what we'll look at:
- `static` variables are globally unique locations in memory.
- Like `const`, `static` variables are loaded into memory at the same time as your program.
- All `static` variables must implement the
  [`Sync`](https://doc.rust-lang.org/std/marker/trait.Sync.html) marker trait.
- Interior mutability is safe and acceptable when using `static` variables.
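A small sketch of those points in action (the names are illustrative): an atomic counter is `Sync`, lives at a single fixed location, and can be modified without `static mut`:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Exactly one `CALL_COUNT` exists for the entire program.
static CALL_COUNT: AtomicUsize = AtomicUsize::new(0);

pub fn record_call() -> usize {
    // `AtomicUsize` is `Sync` and uses interior mutability, so this is
    // safe to call from any thread without declaring the static `mut`.
    CALL_COUNT.fetch_add(1, Ordering::SeqCst)
}
```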
## Memory Uniqueness
The single biggest difference between `const` and `static` is the guarantees
provided about uniqueness. Where `const` variables may or may not be copied
@ -171,20 +186,24 @@ in code, `static` variables are guaranteed to be unique. If we take a previous
static FACTOR: u32 = 1000;

pub fn multiply(value: u32) -> u32 {
    // The assembly to `mul dword ptr [rip + example::FACTOR]` is how FACTOR gets used
    value * FACTOR
}

pub fn multiply_twice(value: u32) -> u32 {
    // The assembly to `mul dword ptr [rip + example::FACTOR]` is how FACTOR gets used
    value * FACTOR * FACTOR
}
```
-- [Compiler Explorer](https://godbolt.org/z/uxmiRQ)
Where [previously](#copying) there were plenty of
references to multiplying by 1000, the new assembly refers to `FACTOR`
as a named memory location instead. No initialization work needs to be done,
but the compiler can no longer prove the value never changes during execution.

## Initialization == Compilation

Next, let's talk about initialization. The simplest case is initializing
static variables with either scalar or struct notation:
@ -208,7 +227,7 @@ fn main() {
```
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=b538dbc46076f12db047af4f4403ee6e)

Things can get a bit weirder when using `const fn` though. In most cases, it just works:

```rust
#[derive(Debug)]
@ -231,9 +250,9 @@ fn main() {
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8c796a6e7fc273c12115091b707b0255)

However, there's a caveat: you're currently not allowed to use `const fn` to initialize
static variables of types that aren't marked `Sync`. For example,
[`RefCell::new()`](https://doc.rust-lang.org/std/cell/struct.RefCell.html#method.new)
is a `const fn`, but because [`RefCell` isn't `Sync`](https://doc.rust-lang.org/std/cell/struct.RefCell.html#impl-Sync),
you'll get an error at compile time:

```rust
@ -246,16 +265,18 @@ static MY_LOCK: RefCell<u8> = RefCell::new(0);
It's likely that this will [change in the future](https://github.com/rust-lang/rfcs/blob/master/text/0911-const-fn.md) though.

## **Sync**

Which leads well to the next point: static variable types must implement the
[`Sync` marker](https://doc.rust-lang.org/std/marker/trait.Sync.html).
Because they're globally unique, it must be safe for you to access static variables
from any thread at any time. Most `struct` definitions automatically implement the
`Sync` trait because they contain only elements which themselves
implement `Sync` (read more in the [Nomicon](https://doc.rust-lang.org/nomicon/send-and-sync.html)).
This is why earlier examples could get away with initializing
statics, even though we never included an `impl Sync for MyStruct` in the code.

To demonstrate this property, Rust refuses to compile our earlier
example if we add a non-`Sync` element to the `struct` definition:

```rust
use std::cell::RefCell;
@ -273,8 +294,11 @@ static MY_STRUCT: MyStruct = MyStruct {
```
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=40074d0248f056c296b662dbbff97cfc)

## Interior Mutability

Finally, while `static mut` variables are allowed, mutating them is an `unsafe` operation.
If we want to stay in safe Rust, we can use interior mutability to accomplish
similar goals:

```rust
use std::sync::Once;


@ -6,10 +6,10 @@ category:
tags: [rust, understanding-allocations]
---

`const` and `static` are perfectly fine, but it's relatively rare that we know
at compile-time about either values or references that will be the same for the
duration of our program. Put another way, it's not often the case that either you
or your compiler knows how much memory your entire program will ever need.

However, there are still some optimizations the compiler can do if it knows how much
memory individual functions will need. Specifically, the compiler can make use of
@ -19,9 +19,9 @@ both the short- and long-term. When requesting memory, the
can typically complete in [1 or 2 cycles](https://agner.org/optimize/instruction_tables.ods)
(<1 nanosecond on modern CPUs). Contrast that to heap memory, which requires an allocator
(specialized software to track what memory is in use) to reserve space.
When you're finished with stack memory, the `pop` instruction runs in
1-3 cycles, as opposed to an allocator needing to worry about memory fragmentation
and other issues with the heap. All sorts of incredibly sophisticated techniques have been used
to design allocators:
- [Garbage Collection](https://en.wikipedia.org/wiki/Garbage_collection_(computer_science))
  strategies like [Tracing](https://en.wikipedia.org/wiki/Tracing_garbage_collection)
@ -37,7 +37,7 @@ But no matter how fast your allocator is, the principle remains: the
fastest allocator is the one you never use. As such, we're not going to discuss how exactly the
[`push` and `pop` instructions work](http://www.cs.virginia.edu/~evans/cs216/guides/x86.html),
but we'll focus instead on the conditions that enable the Rust compiler to use
faster stack-based allocation for variables.

So, **how do we know when Rust will or will not use stack allocation for objects we create?**
Looking at other languages, it's often easy to delineate
@ -46,14 +46,14 @@ between stack and heap. Managed memory languages (Python, Java,
place everything on the heap. JIT compilers ([PyPy](https://www.pypy.org/),
[HotSpot](https://www.oracle.com/technetwork/java/javase/tech/index-jsp-136373.html)) may
optimize some heap allocations away, but you should never assume it will happen.
C makes things clear with calls to special functions (like [malloc(3)](https://linux.die.net/man/3/malloc))
needed to access heap memory. Old C++ has the [`new`](https://stackoverflow.com/a/655086/1454178)
keyword, though modern C++/C++11 is more complicated with [RAII](https://en.cppreference.com/w/cpp/language/raii).

For Rust, we can summarize as follows: **stack allocation will be used for everything
that doesn't involve "smart pointers" and collections**. We'll skip over a precise definition
of the term "smart pointer" for now, and instead discuss what we should watch for to understand
when stack and heap memory regions are used:

1. Stack manipulation instructions (`push`, `pop`, and `add`/`sub` of the `rsp` register)
   indicate allocation of stack memory:
@ -68,7 +68,7 @@ about the memory region used for allocation:
```
-- [Compiler Explorer](https://godbolt.org/z/5WSgc9)

2. Tracking when exactly heap allocation calls occur is difficult. It's typically easier to
   watch for `call core::ptr::real_drop_in_place`, and infer that a heap allocation happened
   in the recent past:
   ```rust
@ -200,7 +200,7 @@ pub fn total_distance() {
-- [Compiler Explorer](https://godbolt.org/z/Qmx4ST)

As a consequence of function arguments never using heap memory, we can also
infer that functions using the `#[inline]` attribute also do not heap allocate.
But better than inferring, we can look at the assembly to prove it:

```rust
@ -239,8 +239,42 @@ pub fn total_distance() {
Finally, passing by value (arguments with type
[`Copy`](https://doc.rust-lang.org/std/marker/trait.Copy.html))
and passing by reference (either moving ownership or passing a pointer) may have
slightly different layouts in assembly, but will still use either stack memory
or CPU registers:
```rust
pub struct Point {
    x: i64,
    y: i64,
}

// Moving values
pub fn distance_moved(a: Point, b: Point) -> i64 {
    let x1 = a.x;
    let x2 = b.x;
    let y1 = a.y;
    let y2 = b.y;
    let x_pow = (x1 - x2) * (x1 - x2);
    let y_pow = (y1 - y2) * (y1 - y2);
    let squared = x_pow + y_pow;
    squared / squared
}

// Borrowing values has two extra `mov` instructions on lines 21 and 22
pub fn distance_borrowed(a: &Point, b: &Point) -> i64 {
    let x1 = a.x;
    let x2 = b.x;
    let y1 = a.y;
    let y2 = b.y;
    let x_pow = (x1 - x2) * (x1 - x2);
    let y_pow = (y1 - y2) * (y1 - y2);
    let squared = x_pow + y_pow;
    squared / squared
}
```
-- [Compiler Explorer](https://godbolt.org/z/06hGiv)
# Enums
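As a rough sketch of why enums fit the stack story (the type here is illustrative): an enum is an ordinary value whose size is fixed at compile time, large enough for its biggest variant plus a discriminant.

```rust
pub enum Message {
    Quit,
    Move { x: i64, y: i64 },
    Write(String),
}

pub fn message_size() -> usize {
    // The size is known at compile time, so a `Message` itself lives on the
    // stack; only the `String` variant's *contents* would use the heap.
    std::mem::size_of::<Message>()
}
```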
@ -340,9 +374,9 @@ both bind *everything* by reference normally, but Python can also
In Rust, arguments to closures are the same as arguments to other functions;
closures are simply functions that don't have a declared name. Some weird ordering
of the stack may be required to handle them, but it's the compiler's responsibility
to figure that out.

Each example below has the same effect, but a different assembly implementation.
In the simplest case, we immediately run a closure returned by another function.
Because we don't store a reference to the closure, the stack memory needed to
store the captured values is contiguous:
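A minimal sketch of that pattern (function names are illustrative, not the post's exact example):

```rust
fn make_adder(x: u32) -> impl Fn(u32) -> u32 {
    move |y| x + y
}

pub fn immediate_call(value: u32) -> u32 {
    // The closure captures `x`, but since we call it immediately and never
    // store it, the captured value sits in ordinary, contiguous stack memory.
    make_adder(1)(value)
}
```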
@ -457,7 +491,7 @@ used for objects that aren't heap allocated, but it technically can be done.
Understanding move semantics and copy semantics in Rust is weird at first. The Rust docs
[go into detail](https://doc.rust-lang.org/stable/core/marker/trait.Copy.html)
far better than can be addressed here, so I'll leave them to do the job.
From a memory perspective though, their guideline is reasonable:
[if your type can implement `Copy`, it should](https://doc.rust-lang.org/stable/core/marker/trait.Copy.html#when-should-my-type-be-copy).
While there are potential speed tradeoffs to *benchmark* when discussing `Copy`
(move semantics for stack objects vs. copying stack pointers vs. copying stack `struct`s),
@ -471,8 +505,7 @@ because it's a marker trait. From there we'll note that a type
if (and only if) its components implement `Copy`, and that
[no heap-allocated types implement `Copy`](https://doc.rust-lang.org/std/marker/trait.Copy.html#implementors).
Thus, assignments involving heap types are always move semantics, and new heap
allocations won't occur because of implicit operator behavior.

```rust
#[derive(Clone)]
@ -490,8 +523,8 @@ struct NotCopyable {
# Iterators

In managed memory languages (like [Java](https://www.youtube.com/watch?v=bSkpMdDe4g4&feature=youtu.be&t=357)),
there's a subtle difference between these two code samples:

```java
public static int sum_for(List<Long> vals) {
@ -522,8 +555,7 @@ once the function ends. Sounds exactly like the issue stack-allocated objects ad
In Rust, iterators are allocated on the stack. The objects to iterate over are almost
certainly in heap memory, but the iterator itself
([`Iter`](https://doc.rust-lang.org/std/slice/struct.Iter.html)) doesn't need to use the heap.
In each of the examples below we iterate over a collection, but never use heap allocation:

```rust
use std::collections::HashMap;


@ -12,11 +12,11 @@ how the language uses dynamic memory (also referred to as the **heap**) is a sys
And as the docs mention, ownership
[is Rust's most unique feature](https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html).

The heap is used in two situations: when the compiler is unable to predict either the *total size
of memory needed*, or *how long the memory is needed for*, it allocates space in the heap.
This happens pretty frequently; if you want to download the Google home page, you won't know
how large it is until your program runs. And when you're finished with Google, the memory is
deallocated so it can be used to store other webpages. If you're
interested in a slightly longer explanation of the heap, check out
[The Stack and the Heap](https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html#the-stack-and-the-heap)
in Rust's documentation.
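A minimal sketch of that first situation (the function is illustrative): the number of bytes isn't known until runtime, so the buffer has to live on the heap.

```rust
use std::io::Read;

fn read_page(reader: &mut impl Read) -> std::io::Result<String> {
    // We can't know how large the response is until the program runs,
    // so `String` grows its heap-backed buffer as data arrives.
    let mut body = String::new();
    reader.read_to_string(&mut body)?;
    Ok(body)
}
```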
@ -32,8 +32,8 @@ To start off, take a guess for how many allocations happen in the program below:
fn main() {}
```

It's obviously a trick question; while no heap allocations occur as a result of
that code, the setup needed to call `main` does allocate on the heap.
Here's a way to show it:

```rust
@ -78,8 +78,8 @@ we'll follow this guide:
Finally, there are two "addendum" issues that are important to address when discussing
Rust and the heap:

- Non-heap alternatives to many standard library types are available.
- Special allocators to track memory behavior should be used to benchmark code.
# Smart pointers
@ -99,7 +99,7 @@ crate should look mostly familiar:
- [`Cow`](https://doc.rust-lang.org/alloc/borrow/enum.Cow.html)

The [standard library](https://doc.rust-lang.org/std/) also defines some smart pointers
to manage heap objects, though more than can be covered here. Some examples are:

- [`RwLock`](https://doc.rust-lang.org/std/sync/struct.RwLock.html)
- [`Mutex`](https://doc.rust-lang.org/std/sync/struct.Mutex.html)
@ -112,8 +112,8 @@ have more information.
When a smart pointer is created, the data it is given is placed in heap memory and
the location of that data is recorded in the smart pointer. Once the smart pointer
has determined it's safe to deallocate that memory (when a `Box` has
[gone out of scope](https://doc.rust-lang.org/stable/std/boxed/index.html) or a
reference count [goes to zero](https://doc.rust-lang.org/alloc/rc/index.html)),
the heap space is reclaimed. We can prove these types use heap memory by
looking at code:
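A minimal sketch of the kind of code being inspected (function names are illustrative); compiled in debug mode, the assembly for each of these includes a call into the allocator:

```rust
use std::rc::Rc;

pub fn my_box() {
    // The 0 is placed in heap memory; the `Box` on the stack just stores
    // the address of that heap allocation.
    let _x = Box::new(0);
}

pub fn my_rc() {
    // `Rc` also heap-allocates, adding a reference count alongside the data.
    let _x = Rc::new(0);
}
```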
@ -146,18 +146,18 @@ pub fn my_cow() {
# Collections

Collection types use heap memory because their contents have dynamic size; they will request
more memory [when needed](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.reserve),
and can [release memory](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.shrink_to_fit)
when it's no longer necessary. This dynamic property forces Rust to heap allocate
everything they contain. In a way, **collections are smart pointers for many objects at a time**.
Common types that fall under this umbrella are
[`Vec`](https://doc.rust-lang.org/stable/alloc/vec/struct.Vec.html),
[`HashMap`](https://doc.rust-lang.org/stable/std/collections/struct.HashMap.html), and
[`String`](https://doc.rust-lang.org/stable/alloc/string/struct.String.html)
(not [`str`](https://doc.rust-lang.org/std/primitive.str.html)).

While collections store the objects they own in heap memory, *creating new collections
will not allocate on the heap*. This is a bit weird; if we call `Vec::new()`, the
assembly shows a corresponding call to `real_drop_in_place`:
@ -169,7 +169,7 @@ pub fn my_vec() {
```
-- [Compiler Explorer](https://godbolt.org/z/1WkNtC)

But because the vector has no elements to manage, no calls to the allocator
will ever be dispatched:

```rust
@ -218,12 +218,12 @@ and [`String::new()`](https://doc.rust-lang.org/std/string/struct.String.html#me
# Heap Alternatives

While it is a bit strange to speak of the stack after spending time with the heap,
it's worth pointing out that some heap-allocated objects in Rust have stack-based counterparts
provided by other crates. If you have need of the functionality, but want to avoid allocating,
there are typically alternatives available.

When it comes to some standard library smart pointers
([`RwLock`](https://doc.rust-lang.org/std/sync/struct.RwLock.html) and
[`Mutex`](https://doc.rust-lang.org/std/sync/struct.Mutex.html)), stack-based alternatives
are provided in crates like [parking_lot](https://crates.io/crates/parking_lot) and
@ -233,10 +233,9 @@ are provided in crates like [parking_lot](https://crates.io/crates/parking_lot)
[`spin::Once`](https://mvdnes.github.io/rust-docs/spin-rs/spin/struct.Once.html)
if you're in need of synchronization primitives.

[thread_id](https://crates.io/crates/thread-id) may be necessary if you're implementing an allocator
(*cough cough* the author *cough cough*)
because [`thread::current().id()`](https://doc.rust-lang.org/std/thread/struct.ThreadId.html)
uses a [`thread_local!` structure](https://doc.rust-lang.org/stable/src/std/sys_common/thread_info.rs.html#17-36)
that needs heap allocation.
# Tracing Allocators
@ -248,7 +247,6 @@ You should never rely on your instincts when
[a microsecond is an eternity](https://www.youtube.com/watch?v=NH1Tta7purM).

Similarly, there's great work going on in Rust with allocators that keep track of what
they're doing (like [`alloc_counter`](https://crates.io/crates/alloc_counter)).
When it comes to tracking heap behavior, it's easy to make mistakes;
please write tests and make sure you have tools to guard against future issues.
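As a hand-rolled sketch of the idea (this is not the `alloc_counter` API, just an illustration built on `GlobalAlloc`), a counting allocator makes it possible to assert on allocation behavior in tests:

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

struct CountingAllocator;

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // Record every allocation request before deferring to the system allocator.
        ALLOCATIONS.fetch_add(1, Ordering::SeqCst);
        System.alloc(layout)
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        System.dealloc(ptr, layout)
    }
}

#[global_allocator]
static GLOBAL: CountingAllocator = CountingAllocator;

fn main() {
    let before = ALLOCATIONS.load(Ordering::SeqCst);
    let x: Vec<u64> = Vec::with_capacity(4);
    let after = ALLOCATIONS.load(Ordering::SeqCst);
    println!("Allocations for the Vec: {}", after - before);
    drop(x);
}
```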


@ -12,25 +12,25 @@ We've spent time showing how those rules work themselves out in practice,
and become familiar with reading the assembly code needed to see each memory
type (global, stack, heap) in action.

But throughout the series so far, we've put a handicap on the code.
In the name of consistent and understandable results, we've asked the
compiler to pretty please leave the training wheels on. Now is the time
where we throw out all the rules and take off the kid gloves. As it turns out,
both the Rust compiler and the LLVM optimizers are incredibly sophisticated,
and we'll step back and let them do their job.

Similar to ["What Has My Compiler Done For Me Lately?"](https://www.youtube.com/watch?v=bSkpMdDe4g4),
we're focusing on interesting things the Rust language (and LLVM!) can do
with memory management. We'll still be looking at assembly code to
understand what's going on, but it's important to mention again:
**please use automated tools like
[alloc-counter](https://crates.io/crates/alloc_counter) to double-check
memory behavior if it's something you care about**.
It's far too easy to mis-read assembly in large code sections, so you should
always verify behavior if you care about memory usage.

The guiding principle as we move forward is this: *optimizing compilers
won't produce worse programs than we started with.* There won't be any
situations where stack allocations get moved to heap allocations.
There will, however, be an opera of optimization.
@ -40,7 +40,7 @@ Our first optimization comes when LLVM can reason that the lifetime of an object
is sufficiently short that heap allocations aren't necessary. In these cases,
LLVM will move the allocation to the stack instead! The way this interacts
with `#[inline]` attributes is a bit opaque, but the important part is that LLVM
can sometimes do better than the baseline Rust language:

```rust
use std::alloc::{GlobalAlloc, Layout, System};
@ -87,13 +87,13 @@ unsafe impl GlobalAlloc for PanicAllocator {
With some collections, LLVM can predict how large they will become
and allocate the entire size on the stack instead of the heap.
This works with both the pre-allocation (`Vec::with_capacity`)
*and re-allocation* (`Vec::push`) methods for collection types.
Not only can LLVM predict sizing if you reserve everything up front,
it can see through the resizing operations and find the total size.

While this specific optimization is unlikely to come up in production
usage, it's cool to note that LLVM does a considerable amount of work
to understand what the code will do:

```rust
use std::alloc::{GlobalAlloc, Layout, System};
@ -104,13 +104,16 @@ fn main() {
    DO_PANIC.store(true, Ordering::SeqCst);

    // If the compiler can predict how large a vector will be,
    // it can optimize out the heap storage needed.
    let mut x: Vec<u64> = Vec::new();
    x.push(12);
    assert_eq!(x[0], 12);

    let mut y: Vec<u64> = Vec::with_capacity(1);
    y.push(12);
    assert_eq!(x[0], y[0]);

    drop(x);
    drop(y);

    // Turn off panicking, as there are some deallocations
    // when we exit main.
@ -138,21 +141,21 @@ unsafe impl GlobalAlloc for PanicAllocator {
    }
}
```
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=af660a87b2cd94213afb906beeb32c15)
# Dr. Array or: How I Learned to Love the Optimizer

Finally, this isn't so much about LLVM figuring out different memory behavior,
but LLVM stripping out code that doesn't do anything. Optimizations of
this type have a lot of nuance to them; if you're not careful, they can
make your benchmarks look
[impossibly good](https://www.youtube.com/watch?v=nXaxk27zwlk&feature=youtu.be&t=1199).

In Rust, the `black_box` function (implemented in both
[`libtest`](https://doc.rust-lang.org/1.1.0/test/fn.black_box.html) and
[`criterion`](https://docs.rs/criterion/0.2.10/criterion/fn.black_box.html))
will tell the compiler to disable this kind of optimization. But if you let
LLVM remove unnecessary code, you can end up running programs that
previously caused errors:

```rust
#[derive(Default)]
@ -183,5 +186,5 @@ pub fn main() {
    let _x = EightM::default();
}
```
-- [Compiler Explorer](https://godbolt.org/z/daHn7P)
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=4c253bf26072119896ab93c6ef064dc0)


@ -9,16 +9,8 @@ tags: [rust, understanding-allocations]
While there's a lot of interesting detail captured in this series, it's often helpful
to have a document that answers some "yes/no" questions. You may not care about
what an `Iterator` looks like in assembly; you just need to know whether it allocates
an object on the heap or not. And while Rust will prioritize the fastest behavior it can,
here are the rules for each memory type:

**Heap Allocation**:
- Smart pointers (`Box`, `Rc`, `Mutex`, etc.) allocate their contents in heap memory.
@ -27,7 +19,7 @@ the memory model in Rust:
  don't need heap memory. If possible, use those.

**Stack Allocation**:
- Everything not using a smart pointer will be allocated on the stack.
- Structs, enums, iterators, arrays, and closures are all stack allocated.
- Cell types (`RefCell`) behave like smart pointers, but are stack-allocated.
- Inlining (`#[inline]`) will not affect allocation behavior for better or worse.
@ -37,14 +29,5 @@ the memory model in Rust:
- `const` is a fixed value; the compiler is allowed to copy it wherever useful.
- `static` is a fixed reference; the compiler will guarantee it is unique.
![Container Sizes in Rust](/assets/images/2019-02-04-container-size.svg)
-- [Raph Levien](https://docs.google.com/presentation/d/1q-c7UAyrUlM-eZyTo1pd8SZ0qwA_wYxmPZVOQkoDmH4/edit?usp=sharing)