Final draft!

I think.
Bradlee Speice 2019-02-10 22:44:40 -05:00
parent f3dad2a34d
commit d1153d07f6
GPG Key ID: 48BEA6257238E620
5 changed files with 182 additions and 142 deletions


@ -7,33 +7,42 @@ tags: [rust, understanding-allocations]
---
The first memory type we'll look at is pretty special: when Rust can prove that
a *value* is fixed for the life of a program (`const`), and when a *reference* is valid for
the duration of the program (`static` as a declaration, not
a *value* is fixed for the life of a program (`const`), and when a *reference* is unique for
the life of a program (`static` as a declaration, not
[`'static`](https://doc.rust-lang.org/book/ch10-03-lifetime-syntax.html#the-static-lifetime)
as a lifetime).
Understanding the distinction between value and reference is important for reasons
we'll go into below. The
as a lifetime), we can make use of global memory. This special section of data is embedded
directly in the program binary so that variables are ready to go once the program loads;
no additional computation is necessary.
Understanding the value/reference distinction is important for reasons we'll go into below,
and while the
[full specification](https://github.com/rust-lang/rfcs/blob/master/text/0246-const-vs-static.md)
for these two memory types is available, but we'll take a hands-on approach to the topic.
for these two keywords is available, we'll take a hands-on approach to the topic.
# **const**
The quick summary is this: `const` declares a read-only block of memory that is loaded
as part of your program binary (during the call to [exec(3)](https://linux.die.net/man/3/exec)).
Any `const` value resulting from calling a `const fn` is guaranteed to be materialized
at compile-time (meaning that access at runtime will not invoke the `const fn`),
even though the `const fn` functions are available at run-time as well. The compiler
can choose to copy the constant value wherever it is deemed practical. Getting the address
of a `const` value is legal, but not guaranteed to be the same even when referring to the
same named identifier.
When a *value* is guaranteed to be unchanging in your program (where "value" may be scalars,
`struct`s, etc.), you can declare it `const`.
This tells the compiler that it's safe to treat the value as never changing, and enables
some interesting optimizations; not only is there no initialization cost to
creating the value (it is loaded at the same time as the executable parts of your program),
but the compiler can also copy the value around if it speeds up the code.
The first point is a bit strange - "read-only memory".
The points we need to address when talking about `const` are:
- `const` values are stored in read-only memory and are impossible to modify.
- Values resulting from calling a `const fn` are materialized at compile-time.
- The compiler may (or may not) copy `const` values wherever it chooses.
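
As a minimal sketch of the second point (the names `square`, `AREA`, `BUFFER`, and `area_at_runtime` are made up for illustration), a `const` initialized through a `const fn` is already a finished value by the time the program is built. Using it as an array length only compiles because of that:

```rust
// `square`, `AREA`, `BUFFER`, and `area_at_runtime` are hypothetical names.
pub const fn square(x: u32) -> u32 {
    x * x
}

// The compiler evaluates this call; reading `AREA` never invokes `square`.
pub const AREA: u32 = square(12);

// Array lengths must be known at compile time, so this only builds
// because `AREA` has already been computed.
pub const BUFFER: [u8; AREA as usize] = [0; AREA as usize];

// `const fn` functions stay callable at run-time as well.
pub fn area_at_runtime(side: u32) -> u32 {
    square(side)
}
```

The same function serves both roles: it produces the compile-time constant and remains an ordinary function for values that are only known while the program runs.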
## Read-Only
The first point is a bit strange - "read-only memory."
[The Rust book](https://doc.rust-lang.org/book/ch03-01-variables-and-mutability.html#differences-between-variables-and-constants)
mentions in a couple places that using `mut` with constants is illegal,
but it's also important to demonstrate just how immutable they are. *Typically* in Rust
you can use "inner mutability" to modify things that aren't declared `mut`.
[`RefCell`](https://doc.rust-lang.org/std/cell/struct.RefCell.html) provides an API
to guarantee at runtime that some consistency rules are enforced:
you can use [interior mutability](https://doc.rust-lang.org/book/ch15-05-interior-mutability.html)
to modify things that aren't declared `mut`.
[`RefCell`](https://doc.rust-lang.org/std/cell/struct.RefCell.html) provides an
example of this pattern in action:
```rust
use std::cell::RefCell;
@ -55,7 +64,7 @@ fn main() {
```
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8e4bea1a718edaff4507944e825a54b2)
When `const` is involved though, modifications are silently ignored:
When `const` is involved though, interior mutability is impossible:
```rust
use std::cell::RefCell;
@ -95,10 +104,12 @@ fn main() {
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=c3cc5979b5e5434eca0f9ec4a06ee0ed)
When the [`const` specification](https://github.com/rust-lang/rfcs/blob/26197104b7bb9a5a35db243d639aee6e46d35d75/text/0246-const-vs-static.md)
refers to ["rvalues"](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3055.pdf), this is
what they mean. [Clippy](https://github.com/rust-lang/rust-clippy) will treat this as an error,
refers to ["rvalues"](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3055.pdf), this behavior
is what they refer to. [Clippy](https://github.com/rust-lang/rust-clippy) will treat this as an error,
but it's still something to be aware of.
## Initialization == Compilation
The next thing to mention is that `const` values are loaded into memory *as part of your program binary*.
Because of this, any `const` values declared in your program will be "realized" at compile-time;
accessing them may trigger a main-memory lookup (with a fixed address, so your CPU may
@ -110,13 +121,16 @@ use std::cell::RefCell;
const CELL: RefCell<u32> = RefCell::new(24);
pub fn multiply(value: u32) -> u32 {
// CELL is stored at `.L__unnamed_1`
value * (*CELL.get_mut())
}
```
-- [Compiler Explorer](https://godbolt.org/z/2KXUcN)
-- [Compiler Explorer](https://godbolt.org/z/Th8boO)
The compiler only creates one `RefCell`, and uses it everywhere. However, that value
is fully realized at compile time, and is fully stored in the `.L__unnamed_1` section.
The compiler creates one `RefCell`, uses it everywhere, and never
needs to call the `RefCell::new` function.
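
Because `CELL` is a finished value rather than a place in memory, every mention of it starts from the compile-time contents. Here is a small sketch of what that looks like in practice (the function name `try_to_modify` is an assumption for illustration, and Clippy will complain about this pattern):

```rust
use std::cell::RefCell;

pub const CELL: RefCell<u32> = RefCell::new(24);

// `try_to_modify` is a hypothetical helper, not part of the original examples.
pub fn try_to_modify() -> u32 {
    // Each mention of `CELL` produces a fresh temporary, so this write
    // lands on a value that is dropped at the end of the statement.
    CELL.replace(100);

    // Reading `CELL` again starts from the original compile-time value.
    let current = *CELL.borrow();
    current
}
```

The write compiles and runs, but nothing outside that one statement can ever observe it.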
## Copying
If it's helpful though, the compiler can choose to copy `const` values.
@ -124,22 +138,24 @@ If it's helpful though, the compiler can choose to copy `const` values.
const FACTOR: u32 = 1000;
pub fn multiply(value: u32) -> u32 {
// See assembly line 4 for the `mov edi, 1000` instruction
value * FACTOR
}
pub fn multiply_twice(value: u32) -> u32 {
// See assembly lines 22 and 29 for `mov edi, 1000` instructions
value * FACTOR * FACTOR
}
```
-- [Compiler Explorer](https://godbolt.org/z/_JiT9O)
-- [Compiler Explorer](https://godbolt.org/z/ZtS54X)
In this example, the `FACTOR` value is turned into the `mov edi, 1000` instruction
in both the `multiply` and `multiply_twice` functions; the "1000" value is never
"stored" anywhere, as it's small enough to inline into the assembly instructions.
Finally, getting the address of a `const` value is possible but not guaranteed
to be unique (given that the compiler can choose to copy values). In my testing
I was never able to get the compiler to copy a `const` value and get differing pointers,
Finally, getting the address of a `const` value is possible, but not guaranteed
to be unique (because the compiler can choose to copy values). I was unable to
get non-unique pointers in my testing (even using different crates),
but the specifications are clear enough: *don't rely on pointers to `const`
values being consistent*. To be frank, caring about locations for `const` values
is almost certainly a code smell.
@ -147,20 +163,19 @@ is almost certainly a code smell.
# **static**
Static variables are related to `const` variables, but take a slightly different approach.
When the compiler can guarantee that a *reference* is fixed for the life of a program,
you end up with a `static` variable (as opposed to *values* that are fixed for the
duration a program is running). Because of this reference/value distinction,
static variables behave much more like what people expect from "global" variables.
We'll look at regular static variables first, and then address the `lazy_static!()`
and `thread_local!()` macros later.
When we declare that a *reference* is unique for the life of a program,
we have a `static` variable (unrelated to the `'static` lifetime). Because of the
reference/value distinction with `const`/`static`,
static variables behave much more like typical "global" variables.
More generally, `static` variables are globally unique locations in memory,
the contents of which are loaded as part of your program being read into main memory.
They allow initialization with both raw values and `const fn` calls, and the initial
value is loaded along with the program/library binary. All static variables must
be of a type that implements the [`Sync`](https://doc.rust-lang.org/std/marker/trait.Sync.html)
marker trait. And while `static mut` variables are allowed, mutating a static is considered
an `unsafe` operation.
But to understand `static`, here's what we'll look at:
- `static` variables are globally unique locations in memory.
- Like `const`, `static` variables are loaded at the same time your program is read into memory.
- All `static` variables must implement the [`Sync`](https://doc.rust-lang.org/std/marker/trait.Sync.html)
marker trait.
- Interior mutability is safe and acceptable when using `static` variables.
## Memory Uniqueness
The single biggest difference between `const` and `static` is the guarantees
provided about uniqueness. Where `const` variables may or may not be copied
@ -171,20 +186,24 @@ in code, `static` variables are guarantee to be unique. If we take a previous
static FACTOR: u32 = 1000;
pub fn multiply(value: u32) -> u32 {
// The assembly to `mul dword ptr [rip + example::FACTOR]` is how FACTOR gets used
value * FACTOR
}
pub fn multiply_twice(value: u32) -> u32 {
// The assembly to `mul dword ptr [rip + example::FACTOR]` is how FACTOR gets used
value * FACTOR * FACTOR
}
```
-- [Compiler Explorer](https://godbolt.org/z/bSfBxn)
-- [Compiler Explorer](https://godbolt.org/z/uxmiRQ)
Where [previously](https://godbolt.org/z/_JiT90) there were plenty of
Where [previously](#copying) there were plenty of
references to multiplying by 1000, the new assembly refers to `FACTOR`
as a named memory location instead. No initialization work needs to be done,
but the compiler can no longer prove the value never changes during execution.
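
The uniqueness guarantee can also be checked without reading assembly. In this sketch (the two `address_from_*` helpers are hypothetical names), both functions hand back a pointer to `FACTOR`, and because a `static` is a single location those pointers must match:

```rust
pub static FACTOR: u32 = 1000;

pub fn multiply(value: u32) -> u32 {
    value * FACTOR
}

// Hypothetical helpers: each takes the address of `FACTOR` independently.
pub fn address_from_first_caller() -> *const u32 {
    &FACTOR
}

pub fn address_from_second_caller() -> *const u32 {
    &FACTOR
}
```

Comparing `address_from_first_caller()` with `address_from_second_caller()` always yields equal pointers, which is exactly the promise that `const` does not make.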
## Initialization == Compilation
Next, let's talk about initialization. The simplest case is initializing
static variables with either scalar or struct notation:
@ -208,7 +227,7 @@ fn main() {
```
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=b538dbc46076f12db047af4f4403ee6e)
Things get a bit weirder when using `const fn`. In most cases, things just work:
Things can get a bit weirder when using `const fn` though. In most cases, it just works:
```rust
#[derive(Debug)]
@ -231,9 +250,9 @@ fn main() {
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8c796a6e7fc273c12115091b707b0255)
However, there's a caveat: you're currently not allowed to use `const fn` to initialize
static variables of types that aren't marked `Sync`. As an example, even though
static variables of types that aren't marked `Sync`. For example,
[`RefCell::new()`](https://doc.rust-lang.org/std/cell/struct.RefCell.html#method.new)
is `const fn`, because [`RefCell` isn't `Sync`](https://doc.rust-lang.org/std/cell/struct.RefCell.html#impl-Sync),
is a `const fn`, but because [`RefCell` isn't `Sync`](https://doc.rust-lang.org/std/cell/struct.RefCell.html#impl-Sync),
you'll get an error at compile time:
```rust
@ -246,16 +265,18 @@ static MY_LOCK: RefCell<u8> = RefCell::new(0);
It's likely that this will [change in the future](https://github.com/rust-lang/rfcs/blob/master/text/0911-const-fn.md) though.
## **Sync**
Which leads well to the next point: static variable types must implement the
[`Sync` marker](https://doc.rust-lang.org/std/marker/trait.Sync.html).
Because they're globally unique, it must be safe for you to access static variables
from any thread at any time. Most `struct` definitions automatically implement the
`Sync` trait because they contain only elements which themselves
implement `Sync`. This is why earlier examples could get away with initializing
implement `Sync` (read more in the [Nomicon](https://doc.rust-lang.org/nomicon/send-and-sync.html)).
This is why earlier examples could get away with initializing
statics, even though we never included an `impl Sync for MyStruct` in the code.
For more on the `Sync` trait, the [Nomicon](https://doc.rust-lang.org/nomicon/send-and-sync.html)
has a much more thorough treatment. But as an example, Rust refuses to compile
our earlier example if we add a non-`Sync` element to the `struct` definition:
To demonstrate this property, Rust refuses to compile our earlier
example if we add a non-`Sync` element to the `struct` definition:
```rust
use std::cell::RefCell;
@ -273,8 +294,11 @@ static MY_STRUCT: MyStruct = MyStruct {
```
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=40074d0248f056c296b662dbbff97cfc)
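
As a rough sketch of the auto-`Sync` rule working in our favor (the names `Counter`, `COUNTER`, `assert_sync`, and `record_hit` are made up for illustration), swapping the `RefCell` for an atomic keeps every field `Sync`, so the struct is accepted as a `static` again:

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// `Counter` is a hypothetical struct; its only field is `Sync`,
// so the compiler derives `Sync` for the struct automatically.
pub struct Counter {
    hits: AtomicU32,
}

// Only compiles when `T: Sync`; it never needs to do anything at run-time.
pub fn assert_sync<T: Sync>() {}

pub static COUNTER: Counter = Counter {
    hits: AtomicU32::new(0),
};

// Returns the number of hits recorded so far, including this one.
pub fn record_hit() -> u32 {
    // `fetch_add` returns the previous value, so add one for the new total.
    COUNTER.hits.fetch_add(1, Ordering::SeqCst) + 1
}
```

Calling `assert_sync::<Counter>()` turns the "is this type `Sync`?" question into a compile-time check, without writing an `impl Sync` by hand.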
## Interior Mutability
Finally, while `static mut` variables are allowed, mutating them is an `unsafe` operation.
Unlike `const` however, interior mutability is acceptable. To demonstrate:
If we want to stay in `safe` Rust, we can use interior mutability to accomplish
similar goals:
```rust
use std::sync::Once;


@ -6,10 +6,10 @@ category:
tags: [rust, understanding-allocations]
---
`const` and `static` are perfectly fine, but it's very rare that we know
`const` and `static` are perfectly fine, but it's relatively rare that we know
at compile-time about either values or references that will be the same for the
duration of our program. Put another way, it's not often the case that either you
or your compiler knows how much memory your entire program will need.
or your compiler knows how much memory your entire program will ever need.
However, there are still some optimizations the compiler can do if it knows how much
memory individual functions will need. Specifically, the compiler can make use of
@ -19,9 +19,9 @@ both the short- and long-term. When requesting memory, the
can typically complete in [1 or 2 cycles](https://agner.org/optimize/instruction_tables.ods)
(<1 nanosecond on modern CPUs). Contrast that to heap memory which requires an allocator
(specialized software to track what memory is in use) to reserve space.
And when you're finished with your memory, the `pop` instruction likewise runs in
When you're finished with stack memory, the `pop` instruction runs in
1-3 cycles, as opposed to an allocator needing to worry about memory fragmentation
and other issues. All sorts of incredibly sophisticated techniques have been used
and other issues with the heap. All sorts of incredibly sophisticated techniques have been used
to design allocators:
- [Garbage Collection](https://en.wikipedia.org/wiki/Garbage_collection_(computer_science))
strategies like [Tracing](https://en.wikipedia.org/wiki/Tracing_garbage_collection)
@ -37,7 +37,7 @@ But no matter how fast your allocator is, the principle remains: the
fastest allocator is the one you never use. As such, we're not going to discuss how exactly the
[`push` and `pop` instructions work](http://www.cs.virginia.edu/~evans/cs216/guides/x86.html),
but we'll focus instead on the conditions that enable the Rust compiler to use
the faster stack-based allocation for variables.
faster stack-based allocation for variables.
So, **how do we know when Rust will or will not use stack allocation for objects we create?**
Looking at other languages, it's often easy to delineate
@ -46,14 +46,14 @@ between stack and heap. Managed memory languages (Python, Java,
place everything on the heap. JIT compilers ([PyPy](https://www.pypy.org/),
[HotSpot](https://www.oracle.com/technetwork/java/javase/tech/index-jsp-136373.html)) may
optimize some heap allocations away, but you should never assume it will happen.
C makes things clear with calls to special functions ([malloc(3)](https://linux.die.net/man/3/malloc)
is one) being the way to use heap memory. Old C++ has the [`new`](https://stackoverflow.com/a/655086/1454178)
C makes things clear with calls to special functions (like [malloc(3)](https://linux.die.net/man/3/malloc))
needed to access heap memory. Old C++ has the [`new`](https://stackoverflow.com/a/655086/1454178)
keyword, though modern C++/C++11 is more complicated with [RAII](https://en.cppreference.com/w/cpp/language/raii).
For Rust specifically, the principle is this: **stack allocation will be used for everything
that doesn't involve "smart pointers" and collections.** We'll skip over a precise definition
of the term "smart pointer" for now, and instead discuss what we should watch for when talking
about the memory region used for allocation:
For Rust, we can summarize as follows: **stack allocation will be used for everything
that doesn't involve "smart pointers" and collections**. We'll skip over a precise definition
of the term "smart pointer" for now, and instead discuss what we should watch for to understand
when stack and heap memory regions are used:
1. Stack manipulation instructions (`push`, `pop`, and `add`/`sub` of the `rsp` register)
indicate allocation of stack memory:
@ -68,7 +68,7 @@ about the memory region used for allocation:
```
-- [Compiler Explorer](https://godbolt.org/z/5WSgc9)
2. Tracking when exactly heap allocation calls happen is difficult. It's typically easier to
2. Tracking when exactly heap allocation calls occur is difficult. It's typically easier to
watch for `call core::ptr::real_drop_in_place`, and infer that a heap allocation happened
in the recent past:
```rust
@ -200,7 +200,7 @@ pub fn total_distance() {
-- [Compiler Explorer](https://godbolt.org/z/Qmx4ST)
As a consequence of function arguments never using heap memory, we can also
infer that functions using the `#[inline]` attributes also do not heap-allocate.
infer that functions using the `#[inline]` attributes also do not heap allocate.
But better than inferring, we can look at the assembly to prove it:
```rust
@ -239,8 +239,42 @@ pub fn total_distance() {
Finally, passing by value (arguments with type
[`Copy`](https://doc.rust-lang.org/std/marker/trait.Copy.html))
and passing by reference (either moving ownership or passing a pointer) may have
[slightly different layouts in assembly](https://godbolt.org/z/sKi_kl), but will
still use either stack memory or CPU registers.
slightly different layouts in assembly, but will still use either stack memory
or CPU registers:
```rust
pub struct Point {
x: i64,
y: i64,
}
// Moving values
pub fn distance_moved(a: Point, b: Point) -> i64 {
let x1 = a.x;
let x2 = b.x;
let y1 = a.y;
let y2 = b.y;
let x_pow = (x1 - x2) * (x1 - x2);
let y_pow = (y1 - y2) * (y1 - y2);
let squared = x_pow + y_pow;
squared / squared
}
// Borrowing values has two extra `mov` instructions on lines 21 and 22
pub fn distance_borrowed(a: &Point, b: &Point) -> i64 {
let x1 = a.x;
let x2 = b.x;
let y1 = a.y;
let y2 = b.y;
let x_pow = (x1 - x2) * (x1 - x2);
let y_pow = (y1 - y2) * (y1 - y2);
let squared = x_pow + y_pow;
squared / squared
}
```
-- [Compiler Explorer](https://godbolt.org/z/06hGiv)
# Enums
@ -340,9 +374,9 @@ both bind *everything* by reference normally, but Python can also
In Rust, arguments to closures are the same as arguments to other functions;
closures are simply functions that don't have a declared name. Some weird ordering
of the stack may be required to handle them, but it's the compiler's responsibility
to figure it out.
to figure that out.
Each example below has the same effect, but compile to very different programs.
Each example below has the same effect, but a different assembly implementation.
In the simplest case, we immediately run a closure returned by another function.
Because we don't store a reference to the closure, the stack memory needed to
store the captured values is contiguous:
@ -457,7 +491,7 @@ used for objects that aren't heap allocated, but it technically can be done.
Understanding move semantics and copy semantics in Rust is weird at first. The Rust docs
[go into detail](https://doc.rust-lang.org/stable/core/marker/trait.Copy.html)
far better than can be addressed here, so I'll leave them to do the job.
Even from a memory perspective though, their guideline is reasonable:
From a memory perspective though, their guideline is reasonable:
[if your type can implement `Copy`, it should](https://doc.rust-lang.org/stable/core/marker/trait.Copy.html#when-should-my-type-be-copy).
While there are potential speed tradeoffs to *benchmark* when discussing `Copy`
(move semantics for stack objects vs. copying stack pointers vs. copying stack `struct`s),
@ -471,8 +505,7 @@ because it's a marker trait. From there we'll note that a type
if (and only if) its components implement `Copy`, and that
[no heap-allocated types implement `Copy`](https://doc.rust-lang.org/std/marker/trait.Copy.html#implementors).
Thus, assignments involving heap types are always move semantics, and new heap
allocations won't occur without explicit calls to
[`clone()`](https://doc.rust-lang.org/std/clone/trait.Clone.html#tymethod.clone).
allocations won't occur because of implicit operator behavior.
```rust
#[derive(Clone)]
@ -490,8 +523,8 @@ struct NotCopyable {
# Iterators
In [managed memory languages](https://www.youtube.com/watch?v=bSkpMdDe4g4&feature=youtu.be&t=357)
(like Java), there's a subtle difference between these two code samples:
In managed memory languages (like [Java](https://www.youtube.com/watch?v=bSkpMdDe4g4&feature=youtu.be&t=357)),
there's a subtle difference between these two code samples:
```java
public static int sum_for(List<Long> vals) {
@ -522,8 +555,7 @@ once the function ends. Sounds exactly like the issue stack-allocated objects ad
In Rust, iterators are allocated on the stack. The objects to iterate over are almost
certainly in heap memory, but the iterator itself
([`Iter`](https://doc.rust-lang.org/std/slice/struct.Iter.html)) doesn't need to use the heap.
In each of the examples below we iterate over a collection, but will never need to allocate
a object on the heap to clean up:
In each of the examples below we iterate over a collection, but never use heap allocation:
```rust
use std::collections::HashMap;


@ -12,11 +12,11 @@ how the language uses dynamic memory (also referred to as the **heap**) is a sys
And as the docs mention, ownership
[is Rust's most unique feature](https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html).
The heap is used in two situations: when the compiler is unable to predict the *total size
of memory needed*, or *how long the memory is needed for*, it will allocate space in the heap.
The heap is used in two situations: when the compiler is unable to predict either the *total size
of memory needed*, or *how long the memory is needed for*, it allocates space in the heap.
This happens pretty frequently; if you want to download the Google home page, you won't know
how large it is until your program runs. And when you're finished with Google, whenever that
happens to be, we deallocate the memory so it can be used to store other webpages. If you're
how large it is until your program runs. And when you're finished with Google, we deallocate
the memory so it can be used to store other webpages. If you're
interested in a slightly longer explanation of the heap, check out
[The Stack and the Heap](https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html#the-stack-and-the-heap)
in Rust's documentation.
@ -32,8 +32,8 @@ To start off, take a guess for how many allocations happen in the program below:
fn main() {}
```
It's obviously a trick question; while no heap allocations happen as a result of
the code listed above, the setup needed to call `main` does allocate on the heap.
It's obviously a trick question; while no heap allocations occur as a result of
that code, the setup needed to call `main` does allocate on the heap.
Here's a way to show it:
```rust
@ -78,8 +78,8 @@ we'll follow this guide:
Finally, there are two "addendum" issues that are important to address when discussing
Rust and the heap:
- Stack-based alternatives to some standard library types are available
- Special allocators to track memory behavior are available
- Non-heap alternatives to many standard library types are available.
- Special allocators to track memory behavior should be used to benchmark code.
# Smart pointers
@ -99,7 +99,7 @@ crate should look mostly familiar:
- [`Cow`](https://doc.rust-lang.org/alloc/borrow/enum.Cow.html)
The [standard library](https://doc.rust-lang.org/std/) also defines some smart pointers
to manage heap objects, though more than can be covered here. Some examples:
to manage heap objects, though more than can be covered here. Some examples are:
- [`RwLock`](https://doc.rust-lang.org/std/sync/struct.RwLock.html)
- [`Mutex`](https://doc.rust-lang.org/std/sync/struct.Mutex.html)
@ -112,8 +112,8 @@ have more information.
When a smart pointer is created, the data it is given is placed in heap memory and
the location of that data is recorded in the smart pointer. Once the smart pointer
has determined it's safe to deallocate that memory (when a `Box` has
[gone out of scope](https://doc.rust-lang.org/stable/std/boxed/index.html) or when
reference count for an object [goes to zero](https://doc.rust-lang.org/alloc/rc/index.html)),
[gone out of scope](https://doc.rust-lang.org/stable/std/boxed/index.html) or a
reference count [goes to zero](https://doc.rust-lang.org/alloc/rc/index.html)),
the heap space is reclaimed. We can prove these types use heap memory by
looking at code:
@ -146,18 +146,18 @@ pub fn my_cow() {
# Collections
Collections types use heap memory because their contents have dynamic size; they will request
Collection types use heap memory because their contents have dynamic size; they will request
more memory [when needed](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.reserve),
and can [release memory](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.shrink_to_fit)
when it's no longer necessary. This dynamic property forces Rust to heap allocate
everything they contain. In a way, **collections are smart pointers for many objects at once.**
everything they contain. In a way, **collections are smart pointers for many objects at a time**.
Common types that fall under this umbrella are
[`Vec`](https://doc.rust-lang.org/stable/alloc/vec/struct.Vec.html),
[`HashMap`](https://doc.rust-lang.org/stable/std/collections/struct.HashMap.html), and
[`String`](https://doc.rust-lang.org/stable/alloc/string/struct.String.html)
(not [`&str`](https://doc.rust-lang.org/std/primitive.str.html)).
(not [`str`](https://doc.rust-lang.org/std/primitive.str.html)).
But while collections store the objects they own in heap memory, *creating new collections
While collections store the objects they own in heap memory, *creating new collections
will not allocate on the heap*. This is a bit weird; if we call `Vec::new()`, the
assembly shows a corresponding call to `real_drop_in_place`:
@ -169,7 +169,7 @@ pub fn my_vec() {
```
-- [Compiler Explorer](https://godbolt.org/z/1WkNtC)
But because the vector has no elements it is managing, no calls to the allocator
But because the vector has no elements to manage, no calls to the allocator
will ever be dispatched:
```rust
@ -218,12 +218,12 @@ and [`String::new()`](https://doc.rust-lang.org/std/string/struct.String.html#me
# Heap Alternatives
While it is a bit strange for us to talk of the stack after spending time with the heap,
While it is a bit strange to speak of the stack after spending time with the heap,
it's worth pointing out that some heap-allocated objects in Rust have stack-based counterparts
provided by other crates. If you have need of the functionality, but want to avoid allocating,
there are some great alternatives.
there are typically alternatives available.
When it comes to some of the standard library smart pointers
When it comes to some standard library smart pointers
([`RwLock`](https://doc.rust-lang.org/std/sync/struct.RwLock.html) and
[`Mutex`](https://doc.rust-lang.org/std/sync/struct.Mutex.html)), stack-based alternatives
are provided in crates like [parking_lot](https://crates.io/crates/parking_lot) and
@ -233,10 +233,9 @@ are provided in crates like [parking_lot](https://crates.io/crates/parking_lot)
[`spin::Once`](https://mvdnes.github.io/rust-docs/spin-rs/spin/struct.Once.html)
if you're in need of synchronization primitives.
[thread_id](https://crates.io/crates/thread-id)
may still be necessary if you're implementing an allocator (*cough cough* the author *cough cough*)
[thread_id](https://crates.io/crates/thread-id) may be necessary if you're implementing an allocator
because [`thread::current().id()`](https://doc.rust-lang.org/std/thread/struct.ThreadId.html)
[uses a `thread_local!` structure](https://doc.rust-lang.org/stable/src/std/sys_common/thread_info.rs.html#22-40)
uses a [`thread_local!` structure](https://doc.rust-lang.org/stable/src/std/sys_common/thread_info.rs.html#17-36)
that needs heap allocation.
# Tracing Allocators
@ -248,7 +247,6 @@ You should never rely on your instincts when
[a microsecond is an eternity](https://www.youtube.com/watch?v=NH1Tta7purM).
Similarly, there's great work going on in Rust with allocators that keep track of what
they're doing. [`alloc_counter`](https://crates.io/crates/alloc_counter) was designed
for exactly this purpose. When it comes to tracking heap behavior, you shouldn't just
rely on the language; please measure and make sure that you have tools in place to catch
any issues that come up.
they're doing (like [`alloc_counter`](https://crates.io/crates/alloc_counter)).
When it comes to tracking heap behavior, it's easy to make mistakes;
please write tests and make sure you have tools to guard against future issues.


@ -12,25 +12,25 @@ We've spent time showing how those rules work themselves out in practice,
and become familiar with reading the assembly code needed to see each memory
type (global, stack, heap) in action.
But throughout the content so far, we've put a handicap on the code.
But throughout the series so far, we've put a handicap on the code.
In the name of consistent and understandable results, we've asked the
compiler to pretty please leave the training wheels on. Now is the time
where we throw out all the rules and take the kid gloves off. As it turns out,
where we throw out all the rules and take off the kid gloves. As it turns out,
both the Rust compiler and the LLVM optimizers are incredibly sophisticated,
and we'll step back and let them do their job.
Similar to ["What Has My Compiler Done For Me Lately?"](https://www.youtube.com/watch?v=bSkpMdDe4g4),
we're focusing on interesting things the Rust language (and LLVM!) can do
as regards memory management. We'll still be looking at assembly code to
with memory management. We'll still be looking at assembly code to
understand what's going on, but it's important to mention again:
**please use automated tools like
[alloc-counter](https://crates.io/crates/alloc_counter) to double-check
memory behavior if it's something you care about**.
It's far too easy to misread assembly in large code sections; you should
always have an automated tool verify behavior if you care about memory usage.
always verify behavior if you care about memory usage.
The guiding principle as we move forward is this: *optimizing compilers
won't produce worse assembly than we started with.* There won't be any
won't produce worse programs than we started with.* There won't be any
situations where stack allocations get moved to heap allocations.
There will, however, be an opera of optimization.
@ -40,7 +40,7 @@ Our first optimization comes when LLVM can reason that the lifetime of an object
is sufficiently short that heap allocations aren't necessary. In these cases,
LLVM will move the allocation to the stack instead! The way this interacts
with `#[inline]` attributes is a bit opaque, but the important part is that LLVM
can sometimes do better than the baseline Rust language.
can sometimes do better than the baseline Rust language:
```rust
use std::alloc::{GlobalAlloc, Layout, System};
@ -87,13 +87,13 @@ unsafe impl GlobalAlloc for PanicAllocator {
With some collections, LLVM can predict how large they will become
and allocate the entire size on the stack instead of the heap.
This works whether with both the pre-allocation (`Vec::with_capacity`)
*and re-allocation* (`Vec::push`) methods for collections types.
Not only can LLVM predict sizing if you reserve the fully size up front,
This works with both the pre-allocation (`Vec::with_capacity`)
*and re-allocation* (`Vec::push`) methods for collection types.
Not only can LLVM predict sizing if you reserve everything up front,
it can see through the resizing operations and find the total size.
While this specific optimization is unlikely to come up in production
usage, it's cool to note that LLVM does a considerable amount of work
to understand what code actually does.
to understand what the code will do:
```rust
use std::alloc::{GlobalAlloc, Layout, System};
@ -104,13 +104,16 @@ fn main() {
DO_PANIC.store(true, Ordering::SeqCst);
// If the compiler can predict how large a vector will be,
// it can optimize out the heap storage needed. This also
// works with `Vec::with_capacity()`, but the push case
// is a bit more interesting.
// it can optimize out the heap storage needed.
let mut x: Vec<u64> = Vec::new();
x.push(12);
assert_eq!(x[0], 12);
let mut y: Vec<u64> = Vec::with_capacity(1);
y.push(12);
assert_eq!(x[0], y[0]);
drop(x);
drop(y);
// Turn off panicking, as there are some deallocations
// when we exit main.
@ -138,21 +141,21 @@ unsafe impl GlobalAlloc for PanicAllocator {
}
}
```
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=1dfccfcf63d8800e644a3b948f1eeb7b)
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=af660a87b2cd94213afb906beeb32c15)
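If panicking on allocation is too blunt, one alternative is to count allocations instead. What follows is a minimal sketch and not code from this post: `CountingAllocator`, `ALLOCATIONS`, and `count_allocations` are hypothetical names, and it assumes a toolchain recent enough to provide `std::hint::black_box`. Whether the first count drops to zero depends on the optimization level and compiler version, so treat the printed numbers as something to observe rather than a guarantee:

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::hint::black_box;
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical helper (not from the original post): forwards to the
// system allocator and counts every call to `alloc`.
struct CountingAllocator;

static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::SeqCst);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static ALLOCATOR: CountingAllocator = CountingAllocator;

/// Runs `f` and returns how many allocations were recorded while it ran.
/// The counter is process-wide, so other threads can inflate the result.
fn count_allocations<F: FnOnce()>(f: F) -> usize {
    let before = ALLOCATIONS.load(Ordering::SeqCst);
    f();
    ALLOCATIONS.load(Ordering::SeqCst) - before
}

fn main() {
    // The optimizer is free to remove this allocation in release mode.
    let plain = count_allocations(|| {
        let mut x: Vec<u64> = Vec::new();
        x.push(12);
        assert_eq!(x[0], 12);
    });

    // Letting the vector escape through `black_box` asks the compiler
    // to keep the heap storage around.
    let escaped = count_allocations(|| {
        let mut x: Vec<u64> = Vec::new();
        x.push(12);
        black_box(&x);
    });

    println!("plain push: {}, escaped push: {}", plain, escaped);
}
```

Comparing the two counts in debug and release builds is a quick way to see the optimization appear and disappear.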
# Dr. Array or: How I Learned to Love the Optimizer
Finally, this isn't so much about LLVM figuring out different memory behavior,
but LLVM totally stripping out code that has no side effects. Optimizations of
but LLVM stripping out code that doesn't do anything. Optimizations of
this type have a lot of nuance to them; if you're not careful, they can
make your benchmarks look
[impossibly good](https://www.youtube.com/watch?v=nXaxk27zwlk&feature=youtu.be&t=1199).
In Rust, the `black_box` function (in both
In Rust, the `black_box` function (implemented in both
[`libtest`](https://doc.rust-lang.org/1.1.0/test/fn.black_box.html) and
[`criterion`](https://docs.rs/criterion/0.2.10/criterion/fn.black_box.html))
will tell the compiler to disable this kind of optimization. But if you let
LLVM remove unnecessary code, you can end up with programs that
would have previously caused errors running just fine:
LLVM remove unnecessary code, you can end up running programs that
previously caused errors:
```rust
#[derive(Default)]
@ -183,5 +186,5 @@ pub fn main() {
let _x = EightM::default();
}
```
-- [Compiler Explorer](https://godbolt.org/z/daHn7P)
-- [Compiler Explorer](https://godbolt.org/z/daHn7P)
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=4c253bf26072119896ab93c6ef064dc0)
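
For the opposite situation, where you want LLVM to keep code it would otherwise remove, here is a small sketch of `black_box` in use. It assumes a toolchain that ships `std::hint::black_box` rather than the `libtest` or `criterion` versions linked above, and `sum_to` is a hypothetical function used only for illustration. The standard library describes `black_box` as best-effort, so this is a hint to the optimizer and not a hard guarantee:

```rust
use std::hint::black_box;

/// Hypothetical workload: sums the integers in `0..n`.
/// With a constant input and an unused result, an optimizer may fold
/// this into a constant or drop the call entirely.
fn sum_to(n: u64) -> u64 {
    (0..n).sum()
}

fn main() {
    // Hide the input so it can't be constant-folded, and mark the
    // result as used so the computation isn't treated as dead code.
    let total = black_box(sum_to(black_box(1_000)));
    println!("{}", total);
}
```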


@ -9,16 +9,8 @@ tags: [rust, understanding-allocations]
While there's a lot of interesting detail captured in this series, it's often helpful
to have a document that answers some "yes/no" questions. You may not care about
what an `Iterator` looks like in assembly; you just need to know whether it allocates
an object on the heap or not.
To that end, it should be said once again: if you care about memory behavior,
use an allocator to verify the correct behavior. Tools like
[`alloc_counter`](https://crates.io/crates/alloc_counter) are designed to make
testing this behavior simple easy.
Finally, a summary of the content that's been covered. Rust will prioritize
the fastest behavior it can, but here are the ground rules for understanding
the memory model in Rust:
an object on the heap or not. And while Rust will prioritize the fastest behavior it can,
here are the rules for each memory type:
**Heap Allocation**:
- Smart pointers (`Box`, `Rc`, `Mutex`, etc.) allocate their contents in heap memory.
@ -27,7 +19,7 @@ the memory model in Rust:
don't need heap memory. If possible, use those.
**Stack Allocation**:
- Everything not using a smart pointer type will be allocated on the stack.
- Everything not using a smart pointer will be allocated on the stack.
- Structs, enums, iterators, arrays, and closures are all stack allocated.
- Cell types (`RefCell`) behave like smart pointers, but are stack-allocated.
- Inlining (`#[inline]`) will not affect allocation behavior for better or worse.
@ -37,14 +29,5 @@ the memory model in Rust:
- `const` is a fixed value; the compiler is allowed to copy it wherever useful.
- `static` is a fixed reference; the compiler will guarantee it is unique.
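
The two global memory rules can be sketched in a few lines. The names here (`MAX_RETRIES`, `RETRY_LIMIT`, `first_reader`, `second_reader`) are hypothetical and chosen only for illustration. The sketch checks the `static` uniqueness rule and deliberately makes no claim about the address of the `const`, since the compiler may copy it wherever it likes:

```rust
// A `const` is a value: each use may be inlined or copied by the compiler.
const MAX_RETRIES: u32 = 3;

// A `static` is a single location in the binary with one unique address.
static RETRY_LIMIT: u32 = 3;

fn first_reader() -> &'static u32 {
    &RETRY_LIMIT
}

fn second_reader() -> &'static u32 {
    &RETRY_LIMIT
}

fn main() {
    // Both functions hand back a reference to the same memory.
    assert!(std::ptr::eq(first_reader(), second_reader()));

    // The values agree, but two references to `MAX_RETRIES` are not
    // required to share an address, so that is not asserted here.
    assert_eq!(*first_reader(), MAX_RETRIES);
}
```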
And a nice visualizaton of the rules, courtesy of
[Raph Levien](https://docs.google.com/presentation/d/1q-c7UAyrUlM-eZyTo1pd8SZ0qwA_wYxmPZVOQkoDmH4/edit?usp=sharing):
![Container Sizes in Rust](/assets/images/2019-02-04-container-size.svg)
---
If you've taken the time to read through this series: thanks. I've enjoyed the
process that went into writing this, both in building new tools and learning
the material well enough to explain it. I hope this is valuable as a reference
to you as well.
-- [Raph Levien](https://docs.google.com/presentation/d/1q-c7UAyrUlM-eZyTo1pd8SZ0qwA_wYxmPZVOQkoDmH4/edit?usp=sharing)