mirror of
https://github.com/bspeice/speice.io
synced 2024-11-05 01:28:09 -05:00
Final draft!
I think.
parent
f3dad2a34d
commit
d1153d07f6
@ -7,33 +7,42 @@ tags: [rust, understanding-allocations]
|
||||
---
|
||||
|
||||
The first memory type we'll look at is pretty special: when Rust can prove that
|
||||
a *value* is fixed for the life of a program (`const`), and when a *reference* is valid for
|
||||
the duration of the program (`static` as a declaration, not
|
||||
a *value* is fixed for the life of a program (`const`), and when a *reference* is unique for
|
||||
the life of a program (`static` as a declaration, not
|
||||
[`'static`](https://doc.rust-lang.org/book/ch10-03-lifetime-syntax.html#the-static-lifetime)
|
||||
as a lifetime).
|
||||
Understanding the distinction between value and reference is important for reasons
|
||||
we'll go into below. The
|
||||
as a lifetime), we can make use of global memory. This special section of data is embedded
|
||||
directly in the program binary so that variables are ready to go once the program loads;
|
||||
no additional computation is necessary.
|
||||
|
||||
Understanding the value/reference distinction is important for reasons we'll go into below,
|
||||
and while the
|
||||
[full specification](https://github.com/rust-lang/rfcs/blob/master/text/0246-const-vs-static.md)
|
||||
for these two memory types is available, but we'll take a hands-on approach to the topic.
|
||||
for these two keywords is available, we'll take a hands-on approach to the topic.
|
||||
|
||||
# **const**
|
||||
|
||||
The quick summary is this: `const` declares a read-only block of memory that is loaded
|
||||
as part of your program binary (during the call to [exec(3)](https://linux.die.net/man/3/exec)).
|
||||
Any `const` value resulting from calling a `const fn` is guaranteed to be materialized
|
||||
at compile-time (meaning that access at runtime will not invoke the `const fn`),
|
||||
even though the `const fn` functions are available at run-time as well. The compiler
|
||||
can choose to copy the constant value wherever it is deemed practical. Getting the address
|
||||
of a `const` value is legal, but not guaranteed to be the same even when referring to the
|
||||
same named identifier.
|
||||
When a *value* is guaranteed to be unchanging in your program (where "value" may be scalars,
|
||||
`struct`s, etc.), you can declare it `const`.
|
||||
This tells the compiler that it's safe to treat the value as never changing, and enables
|
||||
some interesting optimizations; not only is there no initialization cost to
|
||||
creating the value (it is loaded at the same time as the executable parts of your program),
|
||||
but the compiler can also copy the value around if it speeds up the code.
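
As a minimal sketch of what this looks like in practice, both scalars and `struct`s can be declared `const`. The names `Point`, `MAX_RETRIES`, and `ORIGIN` are hypothetical and chosen only for illustration:

```rust
// Hypothetical names, used only for illustration.
pub struct Point {
    pub x: i64,
    pub y: i64,
}

// A scalar constant: the value is fixed when the program is compiled.
pub const MAX_RETRIES: u32 = 3;

// A struct constant works the same way; nothing runs at startup to build it.
pub const ORIGIN: Point = Point { x: 0, y: 0 };

pub fn retries_remaining(used: u32) -> u32 {
    // The compiler is free to inline the value of MAX_RETRIES here.
    MAX_RETRIES.saturating_sub(used)
}

pub fn distance_from_origin_squared(p: &Point) -> i64 {
    let dx = p.x - ORIGIN.x;
    let dy = p.y - ORIGIN.y;
    dx * dx + dy * dy
}
```

Neither function needs any setup before it can read the constants, which is the "no initialization cost" property described above.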
|
||||
|
||||
The first point is a bit strange - "read-only memory".
|
||||
The points we need to address when talking about `const` are:
|
||||
- `const` values are stored in read-only memory; they are impossible to modify.
|
||||
- Values resulting from calling a `const fn` are materialized at compile-time.
|
||||
- The compiler may (or may not) copy `const` values wherever it chooses.
|
||||
|
||||
## Read-Only
|
||||
|
||||
The first point is a bit strange - "read-only memory."
|
||||
[The Rust book](https://doc.rust-lang.org/book/ch03-01-variables-and-mutability.html#differences-between-variables-and-constants)
|
||||
mentions in a couple of places that using `mut` with constants is illegal,
|
||||
but it's also important to demonstrate just how immutable they are. *Typically* in Rust
|
||||
you can use "inner mutability" to modify things that aren't declared `mut`.
|
||||
[`RefCell`](https://doc.rust-lang.org/std/cell/struct.RefCell.html) provides an API
|
||||
to guarantee at runtime that some consistency rules are enforced:
|
||||
you can use [interior mutability](https://doc.rust-lang.org/book/ch15-05-interior-mutability.html)
|
||||
to modify things that aren't declared `mut`.
|
||||
[`RefCell`](https://doc.rust-lang.org/std/cell/struct.RefCell.html) provides an
|
||||
example of this pattern in action:
|
||||
|
||||
```rust
|
||||
use std::cell::RefCell;
|
||||
@ -55,7 +64,7 @@ fn main() {
|
||||
```
|
||||
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8e4bea1a718edaff4507944e825a54b2)
|
||||
|
||||
When `const` is involved though, modifications are silently ignored:
|
||||
When `const` is involved though, interior mutability is impossible:
|
||||
|
||||
```rust
|
||||
use std::cell::RefCell;
|
||||
@ -95,10 +104,12 @@ fn main() {
|
||||
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=c3cc5979b5e5434eca0f9ec4a06ee0ed)
|
||||
|
||||
When the [`const` specification](https://github.com/rust-lang/rfcs/blob/26197104b7bb9a5a35db243d639aee6e46d35d75/text/0246-const-vs-static.md)
|
||||
refers to ["rvalues"](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3055.pdf), this is
|
||||
what they mean. [Clippy](https://github.com/rust-lang/rust-clippy) will treat this as an error,
|
||||
refers to ["rvalues"](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3055.pdf), this behavior
|
||||
is what they refer to. [Clippy](https://github.com/rust-lang/rust-clippy) will treat this as an error,
|
||||
but it's still something to be aware of.
|
||||
|
||||
## Initialization == Compilation
|
||||
|
||||
The next thing to mention is that `const` values are loaded into memory *as part of your program binary*.
|
||||
Because of this, any `const` values declared in your program will be "realized" at compile-time;
|
||||
accessing them may trigger a main-memory lookup (with a fixed address, so your CPU may
|
||||
@ -110,13 +121,16 @@ use std::cell::RefCell;
|
||||
const CELL: RefCell<u32> = RefCell::new(24);
|
||||
|
||||
pub fn multiply(value: u32) -> u32 {
|
||||
// CELL is stored at `.L__unnamed_1`
|
||||
value * (*CELL.get_mut())
|
||||
}
|
||||
```
|
||||
-- [Compiler Explorer](https://godbolt.org/z/2KXUcN)
|
||||
-- [Compiler Explorer](https://godbolt.org/z/Th8boO)
|
||||
|
||||
The compiler only creates one `RefCell`, and uses it everywhere. However, that value
|
||||
is fully realized at compile time, and is fully stored in the `.L__unnamed_1` section.
|
||||
The compiler creates one `RefCell`, uses it everywhere, and never
|
||||
needs to call the `RefCell::new` function.
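
One way to convince ourselves that a `const fn` result really is finished at compile time is to use it somewhere the language demands a compile-time value, such as an array length. This is a sketch under that assumption; `square` and `BUFFER_LEN` are hypothetical names:

```rust
// Hypothetical helper; any `const fn` would work here.
pub const fn square(n: usize) -> usize {
    n * n
}

// In a `const` context, `square` is evaluated by the compiler.
pub const BUFFER_LEN: usize = square(4);

pub fn buffer_len() -> usize {
    // Array lengths must be known at compile time, so this only
    // compiles because BUFFER_LEN is already a finished value.
    let buffer = [0u8; BUFFER_LEN];
    buffer.len()
}

pub fn square_at_runtime(n: usize) -> usize {
    // The same function is still callable with values only known at run time.
    square(n)
}
```

The function is available in both worlds, but only the `const` usage is guaranteed to be evaluated during compilation.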
|
||||
|
||||
## Copying
|
||||
|
||||
If it's helpful though, the compiler can choose to copy `const` values.
|
||||
|
||||
@ -124,22 +138,24 @@ If it's helpful though, the compiler can choose to copy `const` values.
|
||||
const FACTOR: u32 = 1000;
|
||||
|
||||
pub fn multiply(value: u32) -> u32 {
|
||||
// See assembly line 4 for the `mov edi, 1000` instruction
|
||||
value * FACTOR
|
||||
}
|
||||
|
||||
pub fn multiply_twice(value: u32) -> u32 {
|
||||
// See assembly lines 22 and 29 for `mov edi, 1000` instructions
|
||||
value * FACTOR * FACTOR
|
||||
}
|
||||
```
|
||||
-- [Compiler Explorer](https://godbolt.org/z/_JiT9O)
|
||||
-- [Compiler Explorer](https://godbolt.org/z/ZtS54X)
|
||||
|
||||
In this example, the `FACTOR` value is turned into the `mov edi, 1000` instruction
|
||||
in both the `multiply` and `multiply_twice` functions; the "1000" value is never
|
||||
"stored" anywhere, as it's small enough to inline into the assembly instructions.
|
||||
|
||||
Finally, getting the address of a `const` value is possible but not guaranteed
|
||||
to be unique (given that the compiler can choose to copy values). In my testing
|
||||
I was never able to get the compiler to copy a `const` value and get differing pointers,
|
||||
Finally, getting the address of a `const` value is possible, but not guaranteed
|
||||
to be unique (because the compiler can choose to copy values). I was unable to
|
||||
get non-unique pointers in my testing (even using different crates),
|
||||
but the specifications are clear enough: *don't rely on pointers to `const`
|
||||
values being consistent*. To be frank, caring about locations for `const` values
|
||||
is almost certainly a code smell.
|
||||
@ -147,20 +163,19 @@ is almost certainly a code smell.
|
||||
# **static**
|
||||
|
||||
Static variables are related to `const` variables, but take a slightly different approach.
|
||||
When the compiler can guarantee that a *reference* is fixed for the life of a program,
|
||||
you end up with a `static` variable (as opposed to *values* that are fixed for the
|
||||
duration a program is running). Because of this reference/value distinction,
|
||||
static variables behave much more like what people expect from "global" variables.
|
||||
We'll look at regular static variables first, and then address the `lazy_static!()`
|
||||
and `thread_local!()` macros later.
|
||||
When we declare that a *reference* is unique for the life of a program,
|
||||
we have a `static` variable (unrelated to the `'static` lifetime). Because of the
|
||||
reference/value distinction with `const`/`static`,
|
||||
static variables behave much more like typical "global" variables.
|
||||
|
||||
More generally, `static` variables are globally unique locations in memory,
|
||||
the contents of which are loaded as part of your program being read into main memory.
|
||||
They allow initialization with both raw values and `const fn` calls, and the initial
|
||||
value is loaded along with the program/library binary. All static variables must
|
||||
be of a type that implements the [`Sync`](https://doc.rust-lang.org/std/marker/trait.Sync.html)
|
||||
marker trait. And while `static mut` variables are allowed, mutating a static is considered
|
||||
an `unsafe` operation.
|
||||
But to understand `static`, here's what we'll look at:
|
||||
- `static` variables are globally unique locations in memory.
|
||||
- Like `const`, `static` variables are loaded at the same time your program is read into memory.
|
||||
- All `static` variables must be of a type that implements the [`Sync`](https://doc.rust-lang.org/std/marker/trait.Sync.html)
|
||||
marker trait.
|
||||
- Interior mutability is safe and acceptable when using `static` variables.
|
||||
|
||||
## Memory Uniqueness
|
||||
|
||||
The single biggest difference between `const` and `static` is the guarantees
|
||||
provided about uniqueness. Where `const` variables may or may not be copied
|
||||
@ -171,20 +186,24 @@ in code, `static` variables are guarantee to be unique. If we take a previous
|
||||
static FACTOR: u32 = 1000;
|
||||
|
||||
pub fn multiply(value: u32) -> u32 {
|
||||
// The `mul dword ptr [rip + example::FACTOR]` instruction is how FACTOR gets used
|
||||
value * FACTOR
|
||||
}
|
||||
|
||||
pub fn multiply_twice(value: u32) -> u32 {
|
||||
// The `mul dword ptr [rip + example::FACTOR]` instruction is how FACTOR gets used
|
||||
value * FACTOR * FACTOR
|
||||
}
|
||||
```
|
||||
-- [Compiler Explorer](https://godbolt.org/z/bSfBxn)
|
||||
-- [Compiler Explorer](https://godbolt.org/z/uxmiRQ)
|
||||
|
||||
Where [previously](https://godbolt.org/z/_JiT90) there were plenty of
|
||||
Where [previously](#copying) there were plenty of
|
||||
references to multiplying by 1000, the new assembly refers to `FACTOR`
|
||||
as a named memory location instead. No initialization work needs to be done,
|
||||
but the compiler can no longer prove the value never changes during execution.
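
The uniqueness guarantee can also be checked from Rust itself rather than from the assembly. A minimal sketch (the helper function names are hypothetical) compares the addresses handed out by two different functions:

```rust
pub static FACTOR: u32 = 1000;

// Two separate functions that both hand out a reference to the static.
pub fn factor_ref_a() -> &'static u32 {
    &FACTOR
}

pub fn factor_ref_b() -> &'static u32 {
    &FACTOR
}

pub fn same_location() -> bool {
    // `ptr::eq` compares addresses, not the values stored there.
    std::ptr::eq(factor_ref_a(), factor_ref_b())
}
```

Because a `static` names a single memory location, both references point at the same address. The same check against a `const` carries no such guarantee.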
|
||||
|
||||
## Initialization == Compilation
|
||||
|
||||
Next, let's talk about initialization. The simplest case is initializing
|
||||
static variables with either scalar or struct notation:
|
||||
|
||||
@ -208,7 +227,7 @@ fn main() {
|
||||
```
|
||||
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=b538dbc46076f12db047af4f4403ee6e)
|
||||
|
||||
Things get a bit weirder when using `const fn`. In most cases, things just work:
|
||||
Things can get a bit weirder when using `const fn` though. In most cases, it just works:
|
||||
|
||||
```rust
|
||||
#[derive(Debug)]
|
||||
@ -231,9 +250,9 @@ fn main() {
|
||||
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8c796a6e7fc273c12115091b707b0255)
|
||||
|
||||
However, there's a caveat: you're currently not allowed to use `const fn` to initialize
|
||||
static variables of types that aren't marked `Sync`. As an example, even though
|
||||
static variables of types that aren't marked `Sync`. For example,
|
||||
[`RefCell::new()`](https://doc.rust-lang.org/std/cell/struct.RefCell.html#method.new)
|
||||
is `const fn`, because [`RefCell` isn't `Sync`](https://doc.rust-lang.org/std/cell/struct.RefCell.html#impl-Sync),
|
||||
is a `const fn`, but because [`RefCell` isn't `Sync`](https://doc.rust-lang.org/std/cell/struct.RefCell.html#impl-Sync),
|
||||
you'll get an error at compile time:
|
||||
|
||||
```rust
|
||||
@ -246,16 +265,18 @@ static MY_LOCK: RefCell<u8> = RefCell::new(0);
|
||||
|
||||
It's likely that this will [change in the future](https://github.com/rust-lang/rfcs/blob/master/text/0911-const-fn.md) though.
|
||||
|
||||
## **Sync**
|
||||
|
||||
Which leads well to the next point: static variable types must implement the
|
||||
[`Sync` marker](https://doc.rust-lang.org/std/marker/trait.Sync.html).
|
||||
Because they're globally unique, it must be safe for you to access static variables
|
||||
from any thread at any time. Most `struct` definitions automatically implement the
|
||||
`Sync` trait because they contain only elements which themselves
|
||||
implement `Sync`. This is why earlier examples could get away with initializing
|
||||
implement `Sync` (read more in the [Nomicon](https://doc.rust-lang.org/nomicon/send-and-sync.html)).
|
||||
This is why earlier examples could get away with initializing
|
||||
statics, even though we never included an `impl Sync for MyStruct` in the code.
|
||||
For more on the `Sync` trait, the [Nomicon](https://doc.rust-lang.org/nomicon/send-and-sync.html)
|
||||
has a much more thorough treatment. But as an example, Rust refuses to compile
|
||||
our earlier example if we add a non-`Sync` element to the `struct` definition:
|
||||
To demonstrate this property, Rust refuses to compile our earlier
|
||||
example if we add a non-`Sync` element to the `struct` definition:
|
||||
|
||||
```rust
|
||||
use std::cell::RefCell;
|
||||
@ -273,8 +294,11 @@ static MY_STRUCT: MyStruct = MyStruct {
|
||||
```
|
||||
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=40074d0248f056c296b662dbbff97cfc)
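
For contrast, here is a sketch of a `struct` that does compile as a `static`, because every field is `Sync`. The names `Counter`, `COUNTER`, and `record_hit` are hypothetical, and swapping in an atomic type is my assumption about a reasonable replacement rather than something taken from the example above:

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Hypothetical struct; every field is `Sync`, so the struct is `Sync` too.
pub struct Counter {
    pub label: &'static str,
    pub hits: AtomicU32,
}

// `AtomicU32::new` is a `const fn`, so it can initialize a static.
pub static COUNTER: Counter = Counter {
    label: "requests",
    hits: AtomicU32::new(0),
};

pub fn record_hit() -> u32 {
    // `fetch_add` returns the previous value; add 1 to report the new total.
    COUNTER.hits.fetch_add(1, Ordering::SeqCst) + 1
}
```

No `impl Sync` is written anywhere; the compiler derives it from the field types, just as it refused to for the `RefCell` version.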
|
||||
|
||||
## Interior Mutability
|
||||
|
||||
Finally, while `static mut` variables are allowed, mutating them is an `unsafe` operation.
|
||||
Unlike `const` however, interior mutability is acceptable. To demonstrate:
|
||||
If we want to stay in `safe` Rust, we can use interior mutability to accomplish
|
||||
similar goals:
|
||||
|
||||
```rust
|
||||
use std::sync::Once;
|
||||
|
@ -6,10 +6,10 @@ category:
|
||||
tags: [rust, understanding-allocations]
|
||||
---
|
||||
|
||||
`const` and `static` are perfectly fine, but it's very rare that we know
|
||||
`const` and `static` are perfectly fine, but it's relatively rare that we know
|
||||
at compile-time about either values or references that will be the same for the
|
||||
duration of our program. Put another way, it's not often the case that either you
|
||||
or your compiler knows how much memory your entire program will need.
|
||||
or your compiler knows how much memory your entire program will ever need.
|
||||
|
||||
However, there are still some optimizations the compiler can do if it knows how much
|
||||
memory individual functions will need. Specifically, the compiler can make use of
|
||||
@ -19,9 +19,9 @@ both the short- and long-term. When requesting memory, the
|
||||
can typically complete in [1 or 2 cycles](https://agner.org/optimize/instruction_tables.ods)
|
||||
(<1 nanosecond on modern CPUs). Contrast that to heap memory which requires an allocator
|
||||
(specialized software to track what memory is in use) to reserve space.
|
||||
And when you're finished with your memory, the `pop` instruction likewise runs in
|
||||
When you're finished with stack memory, the `pop` instruction runs in
|
||||
1-3 cycles, as opposed to an allocator needing to worry about memory fragmentation
|
||||
and other issues. All sorts of incredibly sophisticated techniques have been used
|
||||
and other issues with the heap. All sorts of incredibly sophisticated techniques have been used
|
||||
to design allocators:
|
||||
- [Garbage Collection](https://en.wikipedia.org/wiki/Garbage_collection_(computer_science))
|
||||
strategies like [Tracing](https://en.wikipedia.org/wiki/Tracing_garbage_collection)
|
||||
@ -37,7 +37,7 @@ But no matter how fast your allocator is, the principle remains: the
|
||||
fastest allocator is the one you never use. As such, we're not going to discuss how exactly the
|
||||
[`push` and `pop` instructions work](http://www.cs.virginia.edu/~evans/cs216/guides/x86.html),
|
||||
but we'll focus instead on the conditions that enable the Rust compiler to use
|
||||
the faster stack-based allocation for variables.
|
||||
faster stack-based allocation for variables.
|
||||
|
||||
So, **how do we know when Rust will or will not use stack allocation for objects we create?**
|
||||
Looking at other languages, it's often easy to delineate
|
||||
@ -46,14 +46,14 @@ between stack and heap. Managed memory languages (Python, Java,
|
||||
place everything on the heap. JIT compilers ([PyPy](https://www.pypy.org/),
|
||||
[HotSpot](https://www.oracle.com/technetwork/java/javase/tech/index-jsp-136373.html)) may
|
||||
optimize some heap allocations away, but you should never assume it will happen.
|
||||
C makes things clear with calls to special functions ([malloc(3)](https://linux.die.net/man/3/malloc)
|
||||
is one) being the way to use heap memory. Old C++ has the [`new`](https://stackoverflow.com/a/655086/1454178)
|
||||
C makes things clear with calls to special functions (like [malloc(3)](https://linux.die.net/man/3/malloc))
|
||||
needed to access heap memory. Old C++ has the [`new`](https://stackoverflow.com/a/655086/1454178)
|
||||
keyword, though modern C++/C++11 is more complicated with [RAII](https://en.cppreference.com/w/cpp/language/raii).
|
||||
|
||||
For Rust specifically, the principle is this: **stack allocation will be used for everything
|
||||
that doesn't involve "smart pointers" and collections.** We'll skip over a precise definition
|
||||
of the term "smart pointer" for now, and instead discuss what we should watch for when talking
|
||||
about the memory region used for allocation:
|
||||
For Rust, we can summarize as follows: **stack allocation will be used for everything
|
||||
that doesn't involve "smart pointers" and collections**. We'll skip over a precise definition
|
||||
of the term "smart pointer" for now, and instead discuss what we should watch for to understand
|
||||
when stack and heap memory regions are used:
|
||||
|
||||
1. Stack manipulation instructions (`push`, `pop`, and `add`/`sub` of the `rsp` register)
|
||||
indicate allocation of stack memory:
|
||||
@ -68,7 +68,7 @@ about the memory region used for allocation:
|
||||
```
|
||||
-- [Compiler Explorer](https://godbolt.org/z/5WSgc9)
|
||||
|
||||
2. Tracking when exactly heap allocation calls happen is difficult. It's typically easier to
|
||||
2. Tracking when exactly heap allocation calls occur is difficult. It's typically easier to
|
||||
watch for `call core::ptr::real_drop_in_place`, and infer that a heap allocation happened
|
||||
in the recent past:
|
||||
```rust
|
||||
@ -200,7 +200,7 @@ pub fn total_distance() {
|
||||
-- [Compiler Explorer](https://godbolt.org/z/Qmx4ST)
|
||||
|
||||
As a consequence of function arguments never using heap memory, we can also
|
||||
infer that functions using the `#[inline]` attributes also do not heap-allocate.
|
||||
infer that functions using the `#[inline]` attribute also do not heap allocate.
|
||||
But better than inferring, we can look at the assembly to prove it:
|
||||
|
||||
```rust
|
||||
@ -239,8 +239,42 @@ pub fn total_distance() {
|
||||
Finally, passing by value (arguments with type
|
||||
[`Copy`](https://doc.rust-lang.org/std/marker/trait.Copy.html))
|
||||
and passing by reference (either moving ownership or passing a pointer) may have
|
||||
[slightly different layouts in assembly](https://godbolt.org/z/sKi_kl), but will
|
||||
still use either stack memory or CPU registers.
|
||||
slightly different layouts in assembly, but will still use either stack memory
|
||||
or CPU registers:
|
||||
|
||||
```rust
|
||||
pub struct Point {
|
||||
x: i64,
|
||||
y: i64,
|
||||
}
|
||||
|
||||
// Moving values
|
||||
pub fn distance_moved(a: Point, b: Point) -> i64 {
|
||||
let x1 = a.x;
|
||||
let x2 = b.x;
|
||||
let y1 = a.y;
|
||||
let y2 = b.y;
|
||||
|
||||
let x_pow = (x1 - x2) * (x1 - x2);
|
||||
let y_pow = (y1 - y2) * (y1 - y2);
|
||||
let squared = x_pow + y_pow;
|
||||
squared / squared
|
||||
}
|
||||
|
||||
// Borrowing values has two extra `mov` instructions on lines 21 and 22
|
||||
pub fn distance_borrowed(a: &Point, b: &Point) -> i64 {
|
||||
let x1 = a.x;
|
||||
let x2 = b.x;
|
||||
let y1 = a.y;
|
||||
let y2 = b.y;
|
||||
|
||||
let x_pow = (x1 - x2) * (x1 - x2);
|
||||
let y_pow = (y1 - y2) * (y1 - y2);
|
||||
let squared = x_pow + y_pow;
|
||||
squared / squared
|
||||
}
|
||||
```
|
||||
-- [Compiler Explorer](https://godbolt.org/z/06hGiv)
|
||||
|
||||
# Enums
|
||||
|
||||
@ -340,9 +374,9 @@ both bind *everything* by reference normally, but Python can also
|
||||
In Rust, arguments to closures are the same as arguments to other functions;
|
||||
closures are simply functions that don't have a declared name. Some weird ordering
|
||||
of the stack may be required to handle them, but it's the compiler's responsibility
|
||||
to figure it out.
|
||||
to figure that out.
|
||||
|
||||
Each example below has the same effect, but compile to very different programs.
|
||||
Each example below has the same effect, but a different assembly implementation.
|
||||
In the simplest case, we immediately run a closure returned by another function.
|
||||
Because we don't store a reference to the closure, the stack memory needed to
|
||||
store the captured values is contiguous:
|
||||
@ -457,7 +491,7 @@ used for objects that aren't heap allocated, but it technically can be done.
|
||||
Understanding move semantics and copy semantics in Rust is weird at first. The Rust docs
|
||||
[go into detail](https://doc.rust-lang.org/stable/core/marker/trait.Copy.html)
|
||||
far better than can be addressed here, so I'll leave them to do the job.
|
||||
Even from a memory perspective though, their guideline is reasonable:
|
||||
From a memory perspective though, their guideline is reasonable:
|
||||
[if your type can implement `Copy`, it should](https://doc.rust-lang.org/stable/core/marker/trait.Copy.html#when-should-my-type-be-copy).
|
||||
While there are potential speed tradeoffs to *benchmark* when discussing `Copy`
|
||||
(move semantics for stack objects vs. copying stack pointers vs. copying stack `struct`s),
|
||||
@ -471,8 +505,7 @@ because it's a marker trait. From there we'll note that a type
|
||||
if (and only if) its components implement `Copy`, and that
|
||||
[no heap-allocated types implement `Copy`](https://doc.rust-lang.org/std/marker/trait.Copy.html#implementors).
|
||||
Thus, assignments involving heap types are always move semantics, and new heap
|
||||
allocations won't occur without explicit calls to
|
||||
[`clone()`](https://doc.rust-lang.org/std/clone/trait.Clone.html#tymethod.clone).
|
||||
allocations won't occur because of implicit operator behavior.
|
||||
|
||||
```rust
|
||||
#[derive(Clone)]
|
||||
@ -490,8 +523,8 @@ struct NotCopyable {
|
||||
|
||||
# Iterators
|
||||
|
||||
In [managed memory languages](https://www.youtube.com/watch?v=bSkpMdDe4g4&feature=youtu.be&t=357)
|
||||
(like Java), there's a subtle difference between these two code samples:
|
||||
In managed memory languages (like [Java](https://www.youtube.com/watch?v=bSkpMdDe4g4&feature=youtu.be&t=357)),
|
||||
there's a subtle difference between these two code samples:
|
||||
|
||||
```java
|
||||
public static int sum_for(List<Long> vals) {
|
||||
@ -522,8 +555,7 @@ once the function ends. Sounds exactly like the issue stack-allocated objects ad
|
||||
In Rust, iterators are allocated on the stack. The objects to iterate over are almost
|
||||
certainly in heap memory, but the iterator itself
|
||||
([`Iter`](https://doc.rust-lang.org/std/slice/struct.Iter.html)) doesn't need to use the heap.
|
||||
In each of the examples below we iterate over a collection, but will never need to allocate
|
||||
a object on the heap to clean up:
|
||||
In each of the examples below we iterate over a collection, but never use heap allocation:
|
||||
|
||||
```rust
|
||||
use std::collections::HashMap;
|
||||
|
@ -12,11 +12,11 @@ how the language uses dynamic memory (also referred to as the **heap**) is a sys
|
||||
And as the docs mention, ownership
|
||||
[is Rust's most unique feature](https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html).
|
||||
|
||||
The heap is used in two situations: when the compiler is unable to predict the *total size
|
||||
of memory needed*, or *how long the memory is needed for*, it will allocate space in the heap.
|
||||
The heap is used in two situations: when the compiler is unable to predict either the *total size
|
||||
of memory needed*, or *how long the memory is needed for*, it allocates space in the heap.
|
||||
This happens pretty frequently; if you want to download the Google home page, you won't know
|
||||
how large it is until your program runs. And when you're finished with Google, whenever that
|
||||
happens to be, we deallocate the memory so it can be used to store other webpages. If you're
|
||||
how large it is until your program runs. And when you're finished with Google, we deallocate
|
||||
the memory so it can be used to store other webpages. If you're
|
||||
interested in a slightly longer explanation of the heap, check out
|
||||
[The Stack and the Heap](https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html#the-stack-and-the-heap)
|
||||
in Rust's documentation.
|
||||
@ -32,8 +32,8 @@ To start off, take a guess for how many allocations happen in the program below:
|
||||
fn main() {}
|
||||
```
|
||||
|
||||
It's obviously a trick question; while no heap allocations happen as a result of
|
||||
the code listed above, the setup needed to call `main` does allocate on the heap.
|
||||
It's obviously a trick question; while no heap allocations occur as a result of
|
||||
that code, the setup needed to call `main` does allocate on the heap.
|
||||
Here's a way to show it:
|
||||
|
||||
```rust
|
||||
@ -78,8 +78,8 @@ we'll follow this guide:
|
||||
|
||||
Finally, there are two "addendum" issues that are important to address when discussing
|
||||
Rust and the heap:
|
||||
- Stack-based alternatives to some standard library types are available
|
||||
- Special allocators to track memory behavior are available
|
||||
- Non-heap alternatives to many standard library types are available.
|
||||
- Special allocators to track memory behavior should be used to benchmark code.
|
||||
|
||||
# Smart pointers
|
||||
|
||||
@ -99,7 +99,7 @@ crate should look mostly familiar:
|
||||
- [`Cow`](https://doc.rust-lang.org/alloc/borrow/enum.Cow.html)
|
||||
|
||||
The [standard library](https://doc.rust-lang.org/std/) also defines some smart pointers
|
||||
to manage heap objects, though more than can be covered here. Some examples:
|
||||
to manage heap objects, though more than can be covered here. Some examples are:
|
||||
- [`RwLock`](https://doc.rust-lang.org/std/sync/struct.RwLock.html)
|
||||
- [`Mutex`](https://doc.rust-lang.org/std/sync/struct.Mutex.html)
|
||||
|
||||
@ -112,8 +112,8 @@ have more information.
|
||||
When a smart pointer is created, the data it is given is placed in heap memory and
|
||||
the location of that data is recorded in the smart pointer. Once the smart pointer
|
||||
has determined it's safe to deallocate that memory (when a `Box` has
|
||||
[gone out of scope](https://doc.rust-lang.org/stable/std/boxed/index.html) or when
|
||||
reference count for an object [goes to zero](https://doc.rust-lang.org/alloc/rc/index.html)),
|
||||
[gone out of scope](https://doc.rust-lang.org/stable/std/boxed/index.html) or a
|
||||
reference count [goes to zero](https://doc.rust-lang.org/alloc/rc/index.html)),
|
||||
the heap space is reclaimed. We can prove these types use heap memory by
|
||||
looking at code:
|
||||
|
||||
@ -146,18 +146,18 @@ pub fn my_cow() {
|
||||
|
||||
# Collections
|
||||
|
||||
Collections types use heap memory because their contents have dynamic size; they will request
|
||||
Collection types use heap memory because their contents have dynamic size; they will request
|
||||
more memory [when needed](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.reserve),
|
||||
and can [release memory](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.shrink_to_fit)
|
||||
when it's no longer necessary. This dynamic property forces Rust to heap allocate
|
||||
everything they contain. In a way, **collections are smart pointers for many objects at once.**
|
||||
everything they contain. In a way, **collections are smart pointers for many objects at a time**.
|
||||
Common types that fall under this umbrella are
|
||||
[`Vec`](https://doc.rust-lang.org/stable/alloc/vec/struct.Vec.html),
|
||||
[`HashMap`](https://doc.rust-lang.org/stable/std/collections/struct.HashMap.html), and
|
||||
[`String`](https://doc.rust-lang.org/stable/alloc/string/struct.String.html)
|
||||
(not [`&str`](https://doc.rust-lang.org/std/primitive.str.html)).
|
||||
(not [`str`](https://doc.rust-lang.org/std/primitive.str.html)).
|
||||
|
||||
But while collections store the objects they own in heap memory, *creating new collections
|
||||
While collections store the objects they own in heap memory, *creating new collections
|
||||
will not allocate on the heap*. This is a bit weird; if we call `Vec::new()`, the
|
||||
assembly shows a corresponding call to `real_drop_in_place`:
|
||||
|
||||
@ -169,7 +169,7 @@ pub fn my_vec() {
|
||||
```
|
||||
-- [Compiler Explorer](https://godbolt.org/z/1WkNtC)
|
||||
|
||||
But because the vector has no elements it is managing, no calls to the allocator
|
||||
But because the vector has no elements to manage, no calls to the allocator
|
||||
will ever be dispatched:
|
||||
|
||||
```rust
|
||||
@ -218,12 +218,12 @@ and [`String::new()`](https://doc.rust-lang.org/std/string/struct.String.html#me
|
||||
|
||||
# Heap Alternatives
|
||||
|
||||
While it is a bit strange for us to talk of the stack after spending time with the heap,
|
||||
While it is a bit strange to speak of the stack after spending time with the heap,
|
||||
it's worth pointing out that some heap-allocated objects in Rust have stack-based counterparts
|
||||
provided by other crates. If you have need of the functionality, but want to avoid allocating,
|
||||
there are some great alternatives.
|
||||
there are typically alternatives available.
|
||||
|
||||
When it comes to some of the standard library smart pointers
|
||||
When it comes to some standard library smart pointers
|
||||
([`RwLock`](https://doc.rust-lang.org/std/sync/struct.RwLock.html) and
|
||||
[`Mutex`](https://doc.rust-lang.org/std/sync/struct.Mutex.html)), stack-based alternatives
|
||||
are provided in crates like [parking_lot](https://crates.io/crates/parking_lot) and
|
||||
@ -233,10 +233,9 @@ are provided in crates like [parking_lot](https://crates.io/crates/parking_lot)
|
||||
[`spin::Once`](https://mvdnes.github.io/rust-docs/spin-rs/spin/struct.Once.html)
|
||||
if you're in need of synchronization primitives.
|
||||
|
||||
[thread_id](https://crates.io/crates/thread-id)
|
||||
may still be necessary if you're implementing an allocator (*cough cough* the author *cough cough*)
|
||||
[thread_id](https://crates.io/crates/thread-id) may be necessary if you're implementing an allocator
|
||||
because [`thread::current().id()`](https://doc.rust-lang.org/std/thread/struct.ThreadId.html)
|
||||
[uses a `thread_local!` structure](https://doc.rust-lang.org/stable/src/std/sys_common/thread_info.rs.html#22-40)
|
||||
uses a [`thread_local!` structure](https://doc.rust-lang.org/stable/src/std/sys_common/thread_info.rs.html#17-36)
|
||||
that needs heap allocation.
|
||||
|
||||
# Tracing Allocators
|
||||
@ -248,7 +247,6 @@ You should never rely on your instincts when
|
||||
[a microsecond is an eternity](https://www.youtube.com/watch?v=NH1Tta7purM).
|
||||
|
||||
Similarly, there's great work going on in Rust with allocators that keep track of what
|
||||
they're doing. [`alloc_counter`](https://crates.io/crates/alloc_counter) was designed
|
||||
for exactly this purpose. When it comes to tracking heap behavior, you shouldn't just
|
||||
rely on the language; please measure and make sure that you have tools in place to catch
|
||||
any issues that come up.
|
||||
they're doing (like [`alloc_counter`](https://crates.io/crates/alloc_counter)).
|
||||
When it comes to tracking heap behavior, it's easy to make mistakes;
|
||||
please write tests and make sure you have tools to guard against future issues.
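
As a minimal sketch of what such a guard might look like using only the standard library: the `CountingAllocator` type and `count_allocations` helper below are hypothetical names made up for illustration, not the API of `alloc_counter`. The idea is to wrap the system allocator, count every request, and assert on the difference around a block of code.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical allocator for illustration: forwards every request to the
/// system allocator and counts how many allocations were made.
pub struct CountingAllocator;

/// Total number of allocation requests seen so far.
static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::SeqCst);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAllocator = CountingAllocator;

/// Run `f` and report how many heap allocations happened while it ran.
/// The counter is process-wide, so other threads allocating at the same
/// time would be included in the result.
pub fn count_allocations<F: FnOnce()>(f: F) -> usize {
    let before = ALLOCATIONS.load(Ordering::SeqCst);
    f();
    ALLOCATIONS.load(Ordering::SeqCst) - before
}
```

A test can then wrap the code under scrutiny in `count_allocations(|| { ... })` and assert that the result is `0`. Because the counter is global, this sketch is only trustworthy when nothing else is allocating concurrently.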
|
||||
|
@ -12,25 +12,25 @@ We've spent time showing how those rules work themselves out in practice,
|
||||
and become familiar with reading the assembly code needed to see each memory
|
||||
type (global, stack, heap) in action.
|
||||
|
||||
But throughout the content so far, we've put a handicap on the code.
|
||||
But throughout the series so far, we've put a handicap on the code.
|
||||
In the name of consistent and understandable results, we've asked the
|
||||
compiler to pretty please leave the training wheels on. Now is the time
|
||||
where we throw out all the rules and take the kid gloves off. As it turns out,
|
||||
where we throw out all the rules and take off the kid gloves. As it turns out,
|
||||
both the Rust compiler and the LLVM optimizers are incredibly sophisticated,
|
||||
and we'll step back and let them do their job.
|
||||
|
||||
Similar to ["What Has My Compiler Done For Me Lately?"](https://www.youtube.com/watch?v=bSkpMdDe4g4),
|
||||
we're focusing on interesting things the Rust language (and LLVM!) can do
|
||||
as regards memory management. We'll still be looking at assembly code to
|
||||
with memory management. We'll still be looking at assembly code to
|
||||
understand what's going on, but it's important to mention again:
|
||||
**please use automated tools like
|
||||
[alloc-counter](https://crates.io/crates/alloc_counter) to double-check
|
||||
memory behavior if it's something you care about**.
|
||||
It's far too easy to misread assembly in large code sections; you should
|
||||
always have an automated tool verify behavior if you care about memory usage.
|
||||
always verify behavior if you care about memory usage.
|
||||
|
||||
The guiding principle as we move forward is this: *optimizing compilers
|
||||
won't produce worse assembly than we started with.* There won't be any
|
||||
won't produce worse programs than we started with.* There won't be any
|
||||
situations where stack allocations get moved to heap allocations.
|
||||
There will, however, be an opera of optimization.
|
||||
|
||||
@ -40,7 +40,7 @@ Our first optimization comes when LLVM can reason that the lifetime of an object
|
||||
is sufficiently short that heap allocations aren't necessary. In these cases,
|
||||
LLVM will move the allocation to the stack instead! The way this interacts
|
||||
with `#[inline]` attributes is a bit opaque, but the important part is that LLVM
|
||||
can sometimes do better than the baseline Rust language.
|
||||
can sometimes do better than the baseline Rust language:
|
||||
|
||||
```rust
|
||||
use std::alloc::{GlobalAlloc, Layout, System};
|
||||
@ -87,13 +87,13 @@ unsafe impl GlobalAlloc for PanicAllocator {
|
||||
|
||||
With some collections, LLVM can predict how large they will become
|
||||
and allocate the entire size on the stack instead of the heap.
|
||||
This works whether with both the pre-allocation (`Vec::with_capacity`)
|
||||
*and re-allocation* (`Vec::push`) methods for collections types.
|
||||
Not only can LLVM predict sizing if you reserve the fully size up front,
|
||||
This works with both the pre-allocation (`Vec::with_capacity`)
|
||||
*and re-allocation* (`Vec::push`) methods for collection types.
|
||||
Not only can LLVM predict sizing if you reserve everything up front,
|
||||
it can see through the resizing operations and find the total size.
|
||||
While this specific optimization is unlikely to come up in production
|
||||
usage, it's cool to note that LLVM does a considerable amount of work
|
||||
to understand what code actually does.
|
||||
to understand what the code will do:
|
||||
|
||||
```rust
|
||||
use std::alloc::{GlobalAlloc, Layout, System};
|
||||
@ -104,13 +104,16 @@ fn main() {
|
||||
DO_PANIC.store(true, Ordering::SeqCst);
|
||||
|
||||
// If the compiler can predict how large a vector will be,
|
||||
// it can optimize out the heap storage needed. This also
|
||||
// works with `Vec::with_capacity()`, but the push case
|
||||
// is a bit more interesting.
|
||||
// it can optimize out the heap storage needed.
|
||||
let mut x: Vec<u64> = Vec::new();
|
||||
x.push(12);
|
||||
assert_eq!(x[0], 12);
|
||||
|
||||
let mut y: Vec<u64> = Vec::with_capacity(1);
|
||||
y.push(12);
|
||||
|
||||
assert_eq!(x[0], y[0]);
|
||||
drop(x);
|
||||
drop(y);
|
||||
|
||||
// Turn off panicking, as there are some deallocations
|
||||
// when we exit main.
|
||||
@ -138,21 +141,21 @@ unsafe impl GlobalAlloc for PanicAllocator {
|
||||
}
|
||||
}
|
||||
```
|
||||
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=1dfccfcf63d8800e644a3b948f1eeb7b)
|
||||
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=af660a87b2cd94213afb906beeb32c15)
|
||||
|
||||
# Dr. Array or: How I Learned to Love the Optimizer
|
||||
|
||||
Finally, this isn't so much about LLVM figuring out different memory behavior,
|
||||
but LLVM totally stripping out code that has no side effects. Optimizations of
|
||||
but LLVM stripping out code that doesn't do anything. Optimizations of
|
||||
this type have a lot of nuance to them; if you're not careful, they can
|
||||
make your benchmarks look
|
||||
[impossibly good](https://www.youtube.com/watch?v=nXaxk27zwlk&feature=youtu.be&t=1199).
|
||||
In Rust, the `black_box` function (in both
|
||||
In Rust, the `black_box` function (implemented in both
|
||||
[`libtest`](https://doc.rust-lang.org/1.1.0/test/fn.black_box.html) and
|
||||
[`criterion`](https://docs.rs/criterion/0.2.10/criterion/fn.black_box.html))
|
||||
will tell the compiler to disable this kind of optimization. But if you let
|
||||
LLVM remove unnecessary code, you can end up with programs that
|
||||
would have previously caused errors running just fine:
|
||||
LLVM remove unnecessary code, you can end up running programs that
|
||||
previously caused errors:
|
||||
|
||||
```rust
|
||||
#[derive(Default)]
|
||||
@ -183,5 +186,5 @@ pub fn main() {
|
||||
let _x = EightM::default();
|
||||
}
|
||||
```
|
||||
-- [Compiler Explorer](https://godbolt.org/z/daHn7P)
|
||||
-- [Compiler Explorer](https://godbolt.org/z/daHn7P)
|
||||
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=4c253bf26072119896ab93c6ef064dc0)
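
For comparison, here is a small sketch of keeping a value alive with `black_box`. It assumes `std::hint::black_box` (stable in recent Rust releases) rather than the `libtest` or `criterion` versions linked above, and the `build_buffer` function is a made-up example:

```rust
use std::hint::black_box;

/// Hypothetical example: build a stack buffer and make the optimizer
/// treat it as used, so the array is less likely to be stripped out.
pub fn build_buffer() -> usize {
    let buffer = [0u8; 4096];
    // `black_box` is an identity function that asks the compiler to
    // assume the value may be used in arbitrary ways.
    let buffer = black_box(buffer);
    buffer.len()
}
```

Note that `black_box` is documented as a best-effort hint, so it's still worth checking the assembly if the difference matters.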
|
||||
|
@ -9,16 +9,8 @@ tags: [rust, understanding-allocations]
|
||||
While there's a lot of interesting detail captured in this series, it's often helpful
|
||||
to have a document that answers some "yes/no" questions. You may not care about
|
||||
what an `Iterator` looks like in assembly, you just need to know whether it allocates
|
||||
an object on the heap or not.
|
||||
|
||||
To that end, it should be said once again: if you care about memory behavior,
|
||||
use an allocator to verify the correct behavior. Tools like
|
||||
[`alloc_counter`](https://crates.io/crates/alloc_counter) are designed to make
|
||||
testing this behavior simple easy.
|
||||
|
||||
Finally, a summary of the content that's been covered. Rust will prioritize
|
||||
the fastest behavior it can, but here are the ground rules for understanding
|
||||
the memory model in Rust:
|
||||
an object on the heap or not. And while Rust will prioritize the fastest behavior it can,
|
||||
here are the rules for each memory type:
|
||||
|
||||
**Heap Allocation**:
|
||||
- Smart pointers (`Box`, `Rc`, `Mutex`, etc.) allocate their contents in heap memory.
|
||||
@ -27,7 +19,7 @@ the memory model in Rust:
|
||||
don't need heap memory. If possible, use those.
|
||||
|
||||
**Stack Allocation**:
|
||||
- Everything not using a smart pointer type will be allocated on the stack.
|
||||
- Everything not using a smart pointer will be allocated on the stack.
|
||||
- Structs, enums, iterators, arrays, and closures are all stack allocated.
|
||||
- Cell types (`RefCell`) behave like smart pointers, but are stack-allocated.
|
||||
- Inlining (`#[inline]`) will not affect allocation behavior for better or worse.
|
||||
@ -37,14 +29,5 @@ the memory model in Rust:
|
||||
- `const` is a fixed value; the compiler is allowed to copy it wherever useful.
|
||||
- `static` is a fixed reference; the compiler will guarantee it is unique.
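
To make the last two rules concrete, here's a small sketch; the names `MAX_RETRIES`, `ATTEMPTS`, `record_attempt`, and `attempts_address` are invented for illustration. The `const` can be copied into every place it's used, while the `static` has a single address that every caller observes.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// A fixed value: the compiler may copy it into each use site.
const MAX_RETRIES: u32 = 3;

// A fixed reference: exactly one instance exists in the program.
static ATTEMPTS: AtomicU32 = AtomicU32::new(0);

/// Record one attempt; returns `true` while we're within the retry limit.
pub fn record_attempt() -> bool {
    let made = ATTEMPTS.fetch_add(1, Ordering::SeqCst) + 1;
    made <= MAX_RETRIES
}

/// The address of the `static`; it is the same on every call.
pub fn attempts_address() -> *const AtomicU32 {
    &ATTEMPTS
}
```

Calling `attempts_address()` from anywhere in the program yields the same pointer, which is what makes a `static` suitable for shared state like this counter.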
|
||||
|
||||
And a nice visualizaton of the rules, courtesy of
|
||||
[Raph Levien](https://docs.google.com/presentation/d/1q-c7UAyrUlM-eZyTo1pd8SZ0qwA_wYxmPZVOQkoDmH4/edit?usp=sharing):
|
||||
|
||||
![Container Sizes in Rust](/assets/images/2019-02-04-container-size.svg)
|
||||
|
||||
---
|
||||
|
||||
If you've taken the time to read through this series: thanks. I've enjoyed the
|
||||
process that went into writing this, both in building new tools and learning
|
||||
the material well enough to explain it. I hope this is valuable as a reference
|
||||
to you as well.
|
||||
-- [Raph Levien](https://docs.google.com/presentation/d/1q-c7UAyrUlM-eZyTo1pd8SZ0qwA_wYxmPZVOQkoDmH4/edit?usp=sharing)
|
||||
|