Finish up all the code sections

Need to do some final writing, cleanup, and then publish on Monday?
Bradlee Speice 2019-02-09 22:11:53 -05:00
parent 05e0f68c23
commit 31af7290ba
3 changed files with 242 additions and 47 deletions

View File

@ -12,18 +12,21 @@ how the language uses dynamic memory (also referred to as the **heap**) is a sys
And as the docs mention, ownership
[is Rust's most unique feature](https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html).
The heap is used in two situations; when the compiler is unable to predict either the *total size
The heap is used in two situations: when the compiler is unable to predict the *total size
of memory needed*, or *how long the memory is needed for*, it will allocate space in the heap.
This happens pretty frequently; if you want to download the Google home page, you won't know
how large it is until your program runs. And when you're finished with Google, whenever that might be,
we deallocate the memory so it can be used to store other webpages.
how large it is until your program runs. And when you're finished with Google, whenever that
happens to be, we deallocate the memory so it can be used to store other webpages. If you're
interested in a slightly longer explanation of the heap, check out
[The Stack and the Heap](https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html#the-stack-and-the-heap)
in Rust's documentation.
We won't go into detail on how the heap is managed; the
[ownership documentation](https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html)
does a phenomenal job explaining both the "why" and "how" of memory management. Instead,
we're going to focus on understanding "when" heap allocations occur in Rust.
To start off: take a guess for how many allocations happen in the program below:
To start off, take a guess for how many allocations happen in the program below:
```rust
fn main() {}
@ -72,8 +75,11 @@ we'll follow this guide:
- Smart pointers hold their contents in the heap
- Collections are smart pointers for many objects at a time, and reallocate
when they need to grow
- `lazy_static!` and `thread_local!` force heap allocation for everything.
- Stack-based alternatives to standard library types should be preferred (spin, parking_lot)
Finally, there are two "addendum" issues that are important to address when discussing
Rust and the heap:
- Stack-based alternatives to some standard library types are available
- Special allocators to track memory behavior are available
# Smart pointers
@ -98,10 +104,10 @@ to manage heap objects, though more than can be covered here. Some examples:
- [`Mutex`](https://doc.rust-lang.org/std/sync/struct.Mutex.html)
Finally, there is one ["gotcha"](https://www.merriam-webster.com/dictionary/gotcha):
cell types (like [`RefCell`](https://doc.rust-lang.org/stable/core/cell/struct.RefCell.html))
follow the RAII pattern, but don't involve heap allocation. Check out the
**cell types** (like [`RefCell`](https://doc.rust-lang.org/stable/core/cell/struct.RefCell.html))
look and behave similarly, but **don't involve heap allocation**. The
[`core::cell` docs](https://doc.rust-lang.org/stable/core/cell/index.html)
for more information.
have more information.
When a smart pointer is created, the data it is given is placed in heap memory and
the location of that data is recorded in the smart pointer. Once the smart pointer
@ -117,40 +123,43 @@ use std::sync::Arc;
use std::borrow::Cow;
pub fn my_box() {
// Drop at line 1640
// Drop at assembly line 1640
Box::new(0);
}
pub fn my_rc() {
// Drop at line 1650
// Drop at assembly line 1650
Rc::new(0);
}
pub fn my_arc() {
// Drop at line 1660
// Drop at assembly line 1660
Arc::new(0);
}
pub fn my_cow() {
// Drop at line 1672
// Drop at assembly line 1672
Cow::from("drop");
}
```
-- [Compiler Explorer](https://godbolt.org/z/SaDpWg)
-- [Compiler Explorer](https://godbolt.org/z/4AMQug)
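One nuance worth checking for the `Cow` case: as far as the standard library documents it, `Cow::from` on a string literal produces the `Borrowed` variant, and a heap allocation only becomes necessary once the value is converted to `Owned` (for example through `to_mut`). A minimal sketch of that behavior, where the helper names `is_borrowed` and `cow_demo` are made up for illustration:

```rust
use std::borrow::Cow;

// Hypothetical helper (not part of std): reports whether a `Cow<str>`
// still borrows its data instead of owning a heap-allocated `String`.
pub fn is_borrowed(value: &Cow<str>) -> bool {
    matches!(value, Cow::Borrowed(_))
}

// Returns the borrowed/owned state before and after asking for mutable access.
pub fn cow_demo() -> (bool, bool) {
    // Built from a string literal: the `Cow` just borrows the `&'static str`
    let mut text: Cow<str> = Cow::from("drop");
    let before = is_borrowed(&text);

    // Asking for mutable access clones the data into an owned `String`
    text.to_mut().push('!');
    let after = is_borrowed(&text);

    (before, after)
}
```

Calling `cow_demo()` should report that the value starts out borrowed and ends up owned, so whether a `Cow` touches the heap depends on how it gets used.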
# Collections
Collections types use heap memory because they have dynamic size; they will request more memory
[when needed](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.reserve),
Collection types use heap memory because their contents have dynamic size; they will request
more memory [when needed](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.reserve),
and can [release memory](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.shrink_to_fit)
when it's no longer necessary. This dynamic memory usage forces Rust to heap allocate
when it's no longer necessary. This dynamic property forces Rust to heap allocate
everything they contain. In a way, **collections are smart pointers for many objects at once.**
Common types that fall under this umbrella are `Vec`, `HashMap`, and `String`
Common types that fall under this umbrella are
[`Vec`](https://doc.rust-lang.org/stable/alloc/vec/struct.Vec.html),
[`HashMap`](https://doc.rust-lang.org/stable/std/collections/struct.HashMap.html), and
[`String`](https://doc.rust-lang.org/stable/alloc/string/struct.String.html)
(not [`&str`](https://doc.rust-lang.org/std/primitive.str.html)).
But while collections store the objects they own in heap memory, *creating new collections
will not allocate on the heap*. This is a bit weird, because if we call `Vec::new()` the
assembly shows a corresponding call to `drop_in_place`:
will not allocate on the heap*. This is a bit weird; if we call `Vec::new()`, the
assembly shows a corresponding call to `real_drop_in_place`:
```rust
pub fn my_vec() {
@ -161,27 +170,58 @@ pub fn my_vec() {
-- [Compiler Explorer](https://godbolt.org/z/1WkNtC)
But because the vector has no elements it is managing, no calls to the allocator
will ever be dispatched. A couple of places to look at for confirming this behavior:
[`Vec::new()`](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.new),
will ever be dispatched:
```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicBool, Ordering};

fn main() {
    // Turn on panicking if we allocate on the heap
    DO_PANIC.store(true, Ordering::SeqCst);

    // Interesting bit happens here
    let x: Vec<u8> = Vec::new();
    drop(x);

    // Turn panicking back off, some deallocations occur
    // after main as well.
    DO_PANIC.store(false, Ordering::SeqCst);
}

#[global_allocator]
static A: PanicAllocator = PanicAllocator;
static DO_PANIC: AtomicBool = AtomicBool::new(false);
struct PanicAllocator;

unsafe impl GlobalAlloc for PanicAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        if DO_PANIC.load(Ordering::SeqCst) {
            panic!("Unexpected allocation.");
        }
        System.alloc(layout)
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        if DO_PANIC.load(Ordering::SeqCst) {
            panic!("Unexpected deallocation.");
        }
        System.dealloc(ptr, layout);
    }
}
```
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=831a297d176d015b1f9ace01ae416cc6)
Other standard library types follow the same behavior; make sure to check out
[`HashMap::new()`](https://doc.rust-lang.org/std/collections/hash_map/struct.HashMap.html#method.new),
and [`String::new()`](https://doc.rust-lang.org/std/string/struct.String.html#method.new).
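A quicker (if less rigorous) way to see the same thing is to ask a fresh collection for its capacity. The sketch below assumes nothing beyond the standard library; the function names `empty_capacities` and `capacity_after_push` are invented for this example:

```rust
use std::collections::HashMap;

// Capacities of freshly created, empty collections. A capacity of zero
// means no backing storage has been requested from the allocator yet.
pub fn empty_capacities() -> (usize, usize, usize) {
    let v: Vec<u8> = Vec::new();
    let s = String::new();
    let m: HashMap<u32, u32> = HashMap::new();
    (v.capacity(), s.capacity(), m.capacity())
}

// Pushing a single element forces the vector to reserve heap memory,
// so the capacity becomes non-zero.
pub fn capacity_after_push() -> usize {
    let mut v: Vec<u8> = Vec::new();
    v.push(1);
    v.capacity()
}
```

This only shows what the collections report about themselves; an allocator that panics or counts is still the stronger check.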
# **lazy_static!** and **thread_local!**
There are two macros worth addressing in a conversation about heap memory. The first isn't part
of the standard library, but it's the [5th most downloaded crate](https://crates.io/crates/lazy_static)
in Rust. The second
TODO: Not so sure about lazy_static anymore. Is thread_local possibly heap-allocated too?
- Think it may actually be that lazy_static has a no_std mode that uses `spin`, std-mode uses std::Once.
- Reasonably confident thread_local always allocates
# Heap Alternatives
While it is a bit strange for us to talk of the stack after spending so much time with the heap,
While it is a bit strange for us to talk of the stack after spending time with the heap,
it's worth pointing out that some heap-allocated objects in Rust have stack-based counterparts
provided by other crates. There are a number of cases where this may be helpful, so it's useful
to know that alternatives exist if you need them.
provided by other crates. If you need the functionality but want to avoid allocating,
there are some great alternatives.
When it comes to some of the standard library smart pointers
([`RwLock`](https://doc.rust-lang.org/std/sync/struct.RwLock.html) and
@ -198,3 +238,15 @@ may still be necessary if you're implementing an allocator (*cough cough* the au
because [`thread::current().id()`](https://doc.rust-lang.org/std/thread/struct.ThreadId.html)
[uses a `thread_local!` structure](https://doc.rust-lang.org/stable/src/std/sys_common/thread_info.rs.html#22-40)
that needs heap allocation.
# Tracing Allocators
When writing performance-sensitive code, there's no alternative to measuring your code.
[Measure first](https://youtu.be/nXaxk27zwlk?t=583), because you should never rely on
your instincts when [a microsecond is an eternity](https://www.youtube.com/watch?v=NH1Tta7purM).
Similarly, there's great work going on in Rust with allocators that keep track of what
they're doing. [`alloc_counter`](https://crates.io/crates/alloc_counter) was designed
for exactly this purpose. When it comes to tracking heap behavior, you shouldn't just
rely on the language; please measure and make sure that you have tools in place to catch
any issues that come up.
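To give a rough idea of what such a tool does under the hood, here is a minimal counting allocator built on `GlobalAlloc`. This is only a sketch and not how `alloc_counter` is implemented; `CountingAllocator`, `ALLOCATIONS`, and `allocations_during` are hypothetical names chosen for illustration:

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical names for illustration; `alloc_counter` has its own API.
static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

struct CountingAllocator;

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // Record the request, then defer to the system allocator
        ALLOCATIONS.fetch_add(1, Ordering::SeqCst);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static A: CountingAllocator = CountingAllocator;

// Runs `f` and returns how many allocations were requested while it ran.
// Allocations made by other threads in the same window are counted too.
pub fn allocations_during<F: FnOnce()>(f: F) -> usize {
    let before = ALLOCATIONS.load(Ordering::SeqCst);
    f();
    ALLOCATIONS.load(Ordering::SeqCst) - before
}
```

Wrapping a suspicious piece of code in `allocations_during(|| { ... })` and asserting on the result in a test is one way to keep allocation behavior from regressing silently.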

View File

@ -19,13 +19,13 @@ where we throw out all the rules and take the kid gloves off. As it turns out,
both the Rust compiler and the LLVM optimizers are incredibly sophisticated,
and we'll step back and let them do their job.
Similar to ["What Has My Compiler Done For Me Lately?"](https://www.youtube.com/watch?v=bSkpMdDe4g4),
we're focusing on interesting things the Rust language (and LLVM!) can do.
We'll still be looking at assembly code to understand what's going on,
but it's important to mention again: **please use automated tools like
[qadapt](https://crates.io/crates/qadapt) to double-check memory behavior**.
[alloc-counter](https://crates.io/crates/alloc_counter) to double-check memory behavior**.
It's far too easy to misread assembly in large code sections; you should
always have an automated tool verify behavior if you care about memory usage.
Similar to ["What Has My Compiler Done For Me Lately?"](https://www.youtube.com/watch?v=bSkpMdDe4g4),
we're just focused on interesting things the Rust language can do.
The guiding principle as we move forward is this: *optimizing compilers
won't produce worse assembly than we started with.* There won't be any
@ -35,19 +35,24 @@ There will, however, be an opera of optimization.
# The Case of the Disappearing Box
```rust
// Currently doesn't work, not sure why.
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicBool, Ordering};
fn allocate_box() {
let x = Box::new(0);
let _x = Box::new(0);
}
pub fn main() {
// Turn on panicking if we allocate on the heap
DO_PANIC.store(true, Ordering::SeqCst);
// This code will only run with the mode set to "Release".
// If you try running in "Debug", you'll get a panic.
allocate_box();
// Turn off panicking, as there are some deallocations
// when we exit main.
DO_PANIC.store(false, Ordering::SeqCst);
}
#[global_allocator]
@ -71,7 +76,81 @@ unsafe impl GlobalAlloc for PanicAllocator {
}
}
```
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=3fe2846dac6755dbb7bb90342d0bf135)
# Vectors of Usual Size
```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicBool, Ordering};

fn main() {
    // Turn on panicking if we allocate on the heap
    DO_PANIC.store(true, Ordering::SeqCst);

    // If the compiler can predict how large a vector will be,
    // it can optimize out the heap storage needed.
    let x: Vec<u64> = Vec::with_capacity(5);
    drop(x);

    // Turn off panicking, as there are some deallocations
    // when we exit main.
    DO_PANIC.store(false, Ordering::SeqCst);
}

#[global_allocator]
static A: PanicAllocator = PanicAllocator;
static DO_PANIC: AtomicBool = AtomicBool::new(false);
struct PanicAllocator;

unsafe impl GlobalAlloc for PanicAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        if DO_PANIC.load(Ordering::SeqCst) {
            panic!("Unexpected allocation.");
        }
        System.alloc(layout)
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        if DO_PANIC.load(Ordering::SeqCst) {
            panic!("Unexpected deallocation.");
        }
        System.dealloc(ptr, layout);
    }
}
```
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=5e9761b63243018d094829d901dd85c4)
# Dr. Array or: How I Learned to Love the Optimizer
```rust
#[derive(Default)]
struct TwoFiftySix {
    _a: [u64; 32]
}

#[derive(Default)]
struct EightK {
    _a: [TwoFiftySix; 32]
}

#[derive(Default)]
struct TwoFiftySixK {
    _a: [EightK; 32]
}

#[derive(Default)]
struct EightM {
    _a: [TwoFiftySixK; 32]
}

pub fn main() {
    // Normally this blows up because we can't reserve size on stack
    // for the `EightM` struct. But because the compiler notices we
    // never do anything with `_x`, it optimizes out the stack storage
    // and the program completes successfully.
    let _x = EightM::default();
}
```
-- [Compiler Explorer](https://godbolt.org/z/daHn7P)
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=4c253bf26072119896ab93c6ef064dc0)
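The struct names encode their sizes: 32 `u64` values take 256 bytes, and each level of nesting multiplies that by 32. A small sketch to confirm the arithmetic with `size_of`; the structs mirror the ones above (minus `Default`, which `size_of` doesn't need), and `struct_sizes` is a name invented for this example:

```rust
use std::mem::size_of;

// Same shapes as the structs above. No value is ever constructed,
// so nothing here touches the stack in any meaningful way.
pub struct TwoFiftySix {
    _a: [u64; 32],
}

pub struct EightK {
    _a: [TwoFiftySix; 32],
}

pub struct TwoFiftySixK {
    _a: [EightK; 32],
}

pub struct EightM {
    _a: [TwoFiftySixK; 32],
}

// Sizes in bytes, from smallest to largest struct.
pub fn struct_sizes() -> [usize; 4] {
    [
        size_of::<TwoFiftySix>(),
        size_of::<EightK>(),
        size_of::<TwoFiftySixK>(),
        size_of::<EightM>(),
    ]
}
```

The last entry works out to 8 MiB, which is in the range of typical default main-thread stack limits and explains why actually materializing an `EightM` on the stack is a problem.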

View File

@ -42,16 +42,16 @@ the faster stack-based allocation for variables.
With that in mind, let's get into the details. How do we know when Rust will or will not use
stack allocation for objects we create? Looking at other languages, it's often easy to delineate
between stack and heap. Managed memory languages (Python, Java,
[C#](https://blogs.msdn.microsoft.com/ericlippert/2010/09/30/the-truth-about-value-types/)) assume
everything is on the heap. JIT compilers ([PyPy](https://www.pypy.org/),
[C#](https://blogs.msdn.microsoft.com/ericlippert/2010/09/30/the-truth-about-value-types/))
place everything on the heap. JIT compilers ([PyPy](https://www.pypy.org/),
[HotSpot](https://www.oracle.com/technetwork/java/javase/tech/index-jsp-136373.html)) may
optimize some heap allocations away, but you should never assume it will happen.
C makes things clear with calls to special functions ([malloc(3)](https://linux.die.net/man/3/malloc)
is one) being the way to use heap memory. Old C++ has the [`new`](https://stackoverflow.com/a/655086/1454178)
keyword, though modern C++/C++11 is more complicated with [RAII](https://en.cppreference.com/w/cpp/language/raii).
For Rust specifically, the principle is this: *stack allocation will be used for everything
that doesn't involve "smart pointers" and collections.* If we're interested in dissecting it though,
For Rust specifically, the principle is this: **stack allocation will be used for everything
that doesn't involve "smart pointers" and collections.** If we're interested in dissecting it though,
there are three things we pay attention to:
1. Stack manipulation instructions (`push`, `pop`, and `add`/`sub` of the `rsp` register)
@ -101,9 +101,7 @@ With all that in mind, let's talk about situations in which we're guaranteed to
- [`Copy`](https://doc.rust-lang.org/std/marker/trait.Copy.html) types are guaranteed to be
stack-allocated, and copying them will be done in stack memory.
- [`Iterator`s](https://doc.rust-lang.org/std/iter/trait.Iterator.html) in the standard library
are stack-allocated. No worrying about some
["managed languages"](https://www.youtube.com/watch?v=bSkpMdDe4g4&feature=youtu.be&t=357)
creating garbage.
are stack-allocated even when iterating over heap-based collections.
# Structs
@ -491,3 +489,69 @@ struct NotCopyable {
# Iterators
In [managed memory languages](https://www.youtube.com/watch?v=bSkpMdDe4g4&feature=youtu.be&t=357)
(like Java), there's a subtle difference between these two code samples:
```java
import java.util.List;

public static long sum_for(List<Long> vals) {
    long sum = 0;
    // Regular for loop
    for (int i = 0; i < vals.size(); i++) {
        sum += vals.get(i);
    }
    return sum;
}

public static long sum_foreach(List<Long> vals) {
    long sum = 0;
    // "Foreach" loop - uses iteration
    for (Long l : vals) {
        sum += l;
    }
    return sum;
}
```
In the `sum_for` function, nothing terribly interesting happens. In `sum_foreach`,
an object of type [`Iterator`](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/Iterator.html)
is allocated on the heap, and will eventually be garbage-collected. This isn't a great design;
iterators are often transient objects that you need during a function and can discard
once the function ends. Sounds exactly like the issue stack-allocated objects address, no?
In Rust, iterators are allocated on the stack. The objects to iterate over are almost
certainly in heap memory, but the iterator itself
([`Iter`](https://doc.rust-lang.org/std/slice/struct.Iter.html)) doesn't need to use the heap.
In each of the examples below we iterate over a collection, but never need to allocate
an object on the heap that must be cleaned up later:
```rust
use std::collections::HashMap;

// There's a lot of assembly generated, but if you search in the text,
// there are no references to `real_drop_in_place` anywhere.

pub fn sum_vec(x: &Vec<u32>) {
    let mut s = 0;
    // Basic iteration over vectors doesn't need allocation
    for y in x {
        s += y;
    }
}

pub fn sum_enumerate(x: &Vec<u32>) {
    let mut s = 0;
    // More complex iterators are just fine too
    for (_i, y) in x.iter().enumerate() {
        s += y;
    }
}

pub fn sum_hm(x: &HashMap<u32, u32>) {
    let mut s = 0;
    // And it's not just Vec, all types will allocate the iterator
    // on stack memory
    for y in x.values() {
        s += y;
    }
}
```
-- [Compiler Explorer](https://godbolt.org/z/FTT3CT)
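Another way to convince yourself that the iterator is an ordinary stack value: its size is fixed at compile time and doesn't depend on how many elements it walks over. A minimal sketch, with `sum_slice` and `iter_size` as made-up names for this example:

```rust
use std::mem::size_of_val;

// Returns the sum so the loop has an observable result.
pub fn sum_slice(x: &[u32]) -> u32 {
    let mut s = 0;
    for y in x.iter() {
        s += y;
    }
    s
}

// Size in bytes of the iterator value itself, not of the data it visits.
pub fn iter_size(x: &[u32]) -> usize {
    let it = x.iter();
    size_of_val(&it)
}
```

`iter_size` should return the same number for a three-element slice and a thousand-element one, since the iterator only tracks its position in data that lives elsewhere.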