mirror of
https://github.com/bspeice/speice.io
synced 2025-07-01 22:06:26 -04:00
Allocations in Rust series
This commit is contained in:
148
blog/2019-02-08-compiler-optimizations/_article.md
Normal file
148
blog/2019-02-08-compiler-optimizations/_article.md
Normal file
@ -0,0 +1,148 @@
|
||||
---
|
||||
layout: post
|
||||
title: "Compiler Optimizations: What It's Done Lately"
|
||||
description: "A lot. The answer is a lot."
|
||||
category:
|
||||
tags: [rust, understanding-allocations]
|
||||
---
|
||||
|
||||
**Update 2019-02-10**: When debugging a
|
||||
[related issue](https://gitlab.com/sio4/code/alloc-counter/issues/1), it was discovered that the
|
||||
original code worked because LLVM optimized out the entire function, rather than just the allocation
|
||||
segments. The code has been updated with proper use of
|
||||
[`read_volatile`](https://doc.rust-lang.org/std/ptr/fn.read_volatile.html), and a previous section
|
||||
on vector capacity has been removed.
|
||||
|
||||
---
|
||||
|
||||
Up to this point, we've been discussing memory usage in the Rust language by focusing on simple
|
||||
rules that are mostly right for small chunks of code. We've spent time showing how those rules work
|
||||
themselves out in practice, and become familiar with reading the assembly code needed to see each
|
||||
memory type (global, stack, heap) in action.
|
||||
|
||||
Throughout the series so far, we've put a handicap on the code. In the name of consistent and
|
||||
understandable results, we've asked the compiler to pretty please leave the training wheels on. Now
|
||||
is the time where we throw out all the rules and take off the kid gloves. As it turns out, both the
|
||||
Rust compiler and the LLVM optimizers are incredibly sophisticated, and we'll step back and let them
|
||||
do their job.
|
||||
|
||||
Similar to
|
||||
["What Has My Compiler Done For Me Lately?"](https://www.youtube.com/watch?v=bSkpMdDe4g4), we're
|
||||
focusing on interesting things the Rust language (and LLVM!) can do with memory management. We'll
|
||||
still be looking at assembly code to understand what's going on, but it's important to mention
|
||||
again: **please use automated tools like [alloc-counter](https://crates.io/crates/alloc_counter) to
|
||||
double-check memory behavior if it's something you care about**. It's far too easy to mis-read
|
||||
assembly in large code sections, you should always verify behavior if you care about memory usage.
|
||||
|
||||
The guiding principal as we move forward is this: _optimizing compilers won't produce worse programs
|
||||
than we started with._ There won't be any situations where stack allocations get moved to heap
|
||||
allocations. There will, however, be an opera of optimization.
|
||||
|
||||
# The Case of the Disappearing Box
|
||||
|
||||
Our first optimization comes when LLVM can reason that the lifetime of an object is sufficiently
|
||||
short that heap allocations aren't necessary. In these cases, LLVM will move the allocation to the
|
||||
stack instead! The way this interacts with `#[inline]` attributes is a bit opaque, but the important
|
||||
part is that LLVM can sometimes do better than the baseline Rust language:
|
||||
|
||||
```rust
|
||||
use std::alloc::{GlobalAlloc, Layout, System};
|
||||
use std::sync::atomic::{AtomicBool, Ordering};
|
||||
|
||||
pub fn cmp(x: u32) {
|
||||
// Turn on panicking if we allocate on the heap
|
||||
DO_PANIC.store(true, Ordering::SeqCst);
|
||||
|
||||
// The compiler is able to see through the constant `Box`
|
||||
// and directly compare `x` to 24 - assembly line 73
|
||||
let y = Box::new(24);
|
||||
let equals = x == *y;
|
||||
|
||||
// This call to drop is eliminated
|
||||
drop(y);
|
||||
|
||||
// Need to mark the comparison result as volatile so that
|
||||
// LLVM doesn't strip out all the code. If `y` is marked
|
||||
// volatile instead, allocation will be forced.
|
||||
unsafe { std::ptr::read_volatile(&equals) };
|
||||
|
||||
// Turn off panicking, as there are some deallocations
|
||||
// when we exit main.
|
||||
DO_PANIC.store(false, Ordering::SeqCst);
|
||||
}
|
||||
|
||||
fn main() {
|
||||
cmp(12)
|
||||
}
|
||||
|
||||
#[global_allocator]
|
||||
static A: PanicAllocator = PanicAllocator;
|
||||
static DO_PANIC: AtomicBool = AtomicBool::new(false);
|
||||
struct PanicAllocator;
|
||||
|
||||
unsafe impl GlobalAlloc for PanicAllocator {
|
||||
unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
|
||||
if DO_PANIC.load(Ordering::SeqCst) {
|
||||
panic!("Unexpected allocation.");
|
||||
}
|
||||
System.alloc(layout)
|
||||
}
|
||||
|
||||
unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
|
||||
if DO_PANIC.load(Ordering::SeqCst) {
|
||||
panic!("Unexpected deallocation.");
|
||||
}
|
||||
System.dealloc(ptr, layout);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## -- [Compiler Explorer](https://godbolt.org/z/BZ_Yp3)
|
||||
|
||||
[Rust Playground](https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=4a765f753183d5b919f62c71d2109d5d)
|
||||
|
||||
# Dr. Array or: How I Learned to Love the Optimizer
|
||||
|
||||
Finally, this isn't so much about LLVM figuring out different memory behavior, but LLVM stripping
|
||||
out code that doesn't do anything. Optimizations of this type have a lot of nuance to them; if
|
||||
you're not careful, they can make your benchmarks look
|
||||
[impossibly good](https://www.youtube.com/watch?v=nXaxk27zwlk&feature=youtu.be&t=1199). In Rust, the
|
||||
`black_box` function (implemented in both
|
||||
[`libtest`](https://doc.rust-lang.org/1.1.0/test/fn.black_box.html) and
|
||||
[`criterion`](https://docs.rs/criterion/0.2.10/criterion/fn.black_box.html)) will tell the compiler
|
||||
to disable this kind of optimization. But if you let LLVM remove unnecessary code, you can end up
|
||||
running programs that previously caused errors:
|
||||
|
||||
```rust
|
||||
#[derive(Default)]
|
||||
struct TwoFiftySix {
|
||||
_a: [u64; 32]
|
||||
}
|
||||
|
||||
#[derive(Default)]
|
||||
struct EightK {
|
||||
_a: [TwoFiftySix; 32]
|
||||
}
|
||||
|
||||
#[derive(Default)]
|
||||
struct TwoFiftySixK {
|
||||
_a: [EightK; 32]
|
||||
}
|
||||
|
||||
#[derive(Default)]
|
||||
struct EightM {
|
||||
_a: [TwoFiftySixK; 32]
|
||||
}
|
||||
|
||||
pub fn main() {
|
||||
// Normally this blows up because we can't reserve size on stack
|
||||
// for the `EightM` struct. But because the compiler notices we
|
||||
// never do anything with `_x`, it optimizes out the stack storage
|
||||
// and the program completes successfully.
|
||||
let _x = EightM::default();
|
||||
}
|
||||
```
|
||||
|
||||
## -- [Compiler Explorer](https://godbolt.org/z/daHn7P)
|
||||
|
||||
[Rust Playground](https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=4c253bf26072119896ab93c6ef064dc0)
|
149
blog/2019-02-08-compiler-optimizations/index.mdx
Normal file
149
blog/2019-02-08-compiler-optimizations/index.mdx
Normal file
@ -0,0 +1,149 @@
|
||||
---
|
||||
title: "Allocations in Rust: Compiler optimizations"
|
||||
description: "A lot. The answer is a lot."
|
||||
date: 2019-02-08 12:00:00
|
||||
last_updated:
|
||||
date: 2019-02-10 12:00:00
|
||||
tags: []
|
||||
---
|
||||
|
||||
Up to this point, we've been discussing memory usage in the Rust language by focusing on simple
|
||||
rules that are mostly right for small chunks of code. We've spent time showing how those rules work
|
||||
themselves out in practice, and become familiar with reading the assembly code needed to see each
|
||||
memory type (global, stack, heap) in action.
|
||||
|
||||
Throughout the series so far, we've put a handicap on the code. In the name of consistent and
|
||||
understandable results, we've asked the compiler to pretty please leave the training wheels on. Now
|
||||
is the time where we throw out all the rules and take off the kid gloves. As it turns out, both the
|
||||
Rust compiler and the LLVM optimizers are incredibly sophisticated, and we'll step back and let them
|
||||
do their job.
|
||||
|
||||
<!-- truncate -->
|
||||
|
||||
Similar to
|
||||
["What Has My Compiler Done For Me Lately?"](https://www.youtube.com/watch?v=bSkpMdDe4g4), we're
|
||||
focusing on interesting things the Rust language (and LLVM!) can do with memory management. We'll
|
||||
still be looking at assembly code to understand what's going on, but it's important to mention
|
||||
again: **please use automated tools like [alloc-counter](https://crates.io/crates/alloc_counter) to
|
||||
double-check memory behavior if it's something you care about**. It's far too easy to mis-read
|
||||
assembly in large code sections, you should always verify behavior if you care about memory usage.
|
||||
|
||||
The guiding principal as we move forward is this: _optimizing compilers won't produce worse programs
|
||||
than we started with._ There won't be any situations where stack allocations get moved to heap
|
||||
allocations. There will, however, be an opera of optimization.
|
||||
|
||||
**Update 2019-02-10**: When debugging a
|
||||
[related issue](https://gitlab.com/sio4/code/alloc-counter/issues/1), it was discovered that the
|
||||
original code worked because LLVM optimized out the entire function, rather than just the allocation
|
||||
segments. The code has been updated with proper use of
|
||||
[`read_volatile`](https://doc.rust-lang.org/std/ptr/fn.read_volatile.html), and a previous section
|
||||
on vector capacity has been removed.
|
||||
|
||||
## The Case of the Disappearing Box
|
||||
|
||||
Our first optimization comes when LLVM can reason that the lifetime of an object is sufficiently
|
||||
short that heap allocations aren't necessary. In these cases, LLVM will move the allocation to the
|
||||
stack instead! The way this interacts with `#[inline]` attributes is a bit opaque, but the important
|
||||
part is that LLVM can sometimes do better than the baseline Rust language:
|
||||
|
||||
```rust
|
||||
use std::alloc::{GlobalAlloc, Layout, System};
|
||||
use std::sync::atomic::{AtomicBool, Ordering};
|
||||
|
||||
pub fn cmp(x: u32) {
|
||||
// Turn on panicking if we allocate on the heap
|
||||
DO_PANIC.store(true, Ordering::SeqCst);
|
||||
|
||||
// The compiler is able to see through the constant `Box`
|
||||
// and directly compare `x` to 24 - assembly line 73
|
||||
let y = Box::new(24);
|
||||
let equals = x == *y;
|
||||
|
||||
// This call to drop is eliminated
|
||||
drop(y);
|
||||
|
||||
// Need to mark the comparison result as volatile so that
|
||||
// LLVM doesn't strip out all the code. If `y` is marked
|
||||
// volatile instead, allocation will be forced.
|
||||
unsafe { std::ptr::read_volatile(&equals) };
|
||||
|
||||
// Turn off panicking, as there are some deallocations
|
||||
// when we exit main.
|
||||
DO_PANIC.store(false, Ordering::SeqCst);
|
||||
}
|
||||
|
||||
fn main() {
|
||||
cmp(12)
|
||||
}
|
||||
|
||||
#[global_allocator]
|
||||
static A: PanicAllocator = PanicAllocator;
|
||||
static DO_PANIC: AtomicBool = AtomicBool::new(false);
|
||||
struct PanicAllocator;
|
||||
|
||||
unsafe impl GlobalAlloc for PanicAllocator {
|
||||
unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
|
||||
if DO_PANIC.load(Ordering::SeqCst) {
|
||||
panic!("Unexpected allocation.");
|
||||
}
|
||||
System.alloc(layout)
|
||||
}
|
||||
|
||||
unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
|
||||
if DO_PANIC.load(Ordering::SeqCst) {
|
||||
panic!("Unexpected deallocation.");
|
||||
}
|
||||
System.dealloc(ptr, layout);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
-- [Compiler Explorer](https://godbolt.org/z/BZ_Yp3)
|
||||
|
||||
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=4a765f753183d5b919f62c71d2109d5d)
|
||||
|
||||
## Dr. Array or: how I learned to love the optimizer
|
||||
|
||||
Finally, this isn't so much about LLVM figuring out different memory behavior, but LLVM stripping
|
||||
out code that doesn't do anything. Optimizations of this type have a lot of nuance to them; if
|
||||
you're not careful, they can make your benchmarks look
|
||||
[impossibly good](https://www.youtube.com/watch?v=nXaxk27zwlk&feature=youtu.be&t=1199). In Rust, the
|
||||
`black_box` function (implemented in both
|
||||
[`libtest`](https://doc.rust-lang.org/1.1.0/test/fn.black_box.html) and
|
||||
[`criterion`](https://docs.rs/criterion/0.2.10/criterion/fn.black_box.html)) will tell the compiler
|
||||
to disable this kind of optimization. But if you let LLVM remove unnecessary code, you can end up
|
||||
running programs that previously caused errors:
|
||||
|
||||
```rust
|
||||
#[derive(Default)]
|
||||
struct TwoFiftySix {
|
||||
_a: [u64; 32]
|
||||
}
|
||||
|
||||
#[derive(Default)]
|
||||
struct EightK {
|
||||
_a: [TwoFiftySix; 32]
|
||||
}
|
||||
|
||||
#[derive(Default)]
|
||||
struct TwoFiftySixK {
|
||||
_a: [EightK; 32]
|
||||
}
|
||||
|
||||
#[derive(Default)]
|
||||
struct EightM {
|
||||
_a: [TwoFiftySixK; 32]
|
||||
}
|
||||
|
||||
pub fn main() {
|
||||
// Normally this blows up because we can't reserve size on stack
|
||||
// for the `EightM` struct. But because the compiler notices we
|
||||
// never do anything with `_x`, it optimizes out the stack storage
|
||||
// and the program completes successfully.
|
||||
let _x = EightM::default();
|
||||
}
|
||||
```
|
||||
|
||||
-- [Compiler Explorer](https://godbolt.org/z/daHn7P)
|
||||
|
||||
-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=4c253bf26072119896ab93c6ef064dc0)
|
Reference in New Issue
Block a user