mirror of
				https://github.com/bspeice/speice.io
				synced 2025-11-03 18:10:32 -05:00 
			
		
		
		
	Allocations in Rust series
This commit is contained in:
		@ -12,7 +12,7 @@ bit over a month ago, I was dispensing sage wisdom for the ages:
 | 
				
			|||||||
> I had a really great idea: build a custom allocator that allows you to track your own allocations.
 | 
					> I had a really great idea: build a custom allocator that allows you to track your own allocations.
 | 
				
			||||||
> I gave it a shot, but learned very quickly: **never write your own allocator.**
 | 
					> I gave it a shot, but learned very quickly: **never write your own allocator.**
 | 
				
			||||||
>
 | 
					>
 | 
				
			||||||
> -- [me](../2018-10-08-case-study-optimization)
 | 
					> -- [me](/2018/10/case-study-optimization)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
I proceeded to ignore it, because we never really learn from our mistakes.
 | 
					I proceeded to ignore it, because we never really learn from our mistakes.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
				
			|||||||
							
								
								
									
										113
									
								
								blog/2019-02-04-understanding-allocations-in-rust/_article.md
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
									
								
							@ -0,0 +1,113 @@
 | 
				
			|||||||
 | 
					---
 | 
				
			||||||
 | 
					layout: post
 | 
				
			||||||
 | 
					title: "Allocations in Rust"
 | 
				
			||||||
 | 
					description: "An introduction to the memory model."
 | 
				
			||||||
 | 
					category:
 | 
				
			||||||
 | 
					tags: [rust, understanding-allocations]
 | 
				
			||||||
 | 
					---
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					There's an alchemy of distilling complex technical topics into articles and videos that change the
 | 
				
			||||||
 | 
					way programmers see the tools they interact with on a regular basis. I knew what a linker was, but
 | 
				
			||||||
 | 
					there's a staggering amount of complexity in between
 | 
				
			||||||
 | 
					[the OS and `main()`](https://www.youtube.com/watch?v=dOfucXtyEsU). Rust programmers use the
 | 
				
			||||||
 | 
					[`Box`](https://doc.rust-lang.org/stable/std/boxed/struct.Box.html) type all the time, but there's a
 | 
				
			||||||
 | 
					rich history of the Rust language itself wrapped up in
 | 
				
			||||||
 | 
					[how special it is](https://manishearth.github.io/blog/2017/01/10/rust-tidbits-box-is-special/).
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					In a similar vein, this series attempts to look at code and understand how memory is used; the
 | 
				
			||||||
 | 
					complex choreography of operating system, compiler, and program that frees you to focus on
 | 
				
			||||||
 | 
					functionality far-flung from frivolous book-keeping. The Rust compiler relieves a great deal of the
 | 
				
			||||||
 | 
					cognitive burden associated with memory management, but we're going to step into its world for a
 | 
				
			||||||
 | 
					while.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Let's learn a bit about memory in Rust.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					# Table of Contents
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					This series is intended as both learning and reference material; we'll work through the different
 | 
				
			||||||
 | 
					memory types Rust uses, and explain the implications of each. Ultimately, a summary will be provided
 | 
				
			||||||
 | 
					as a cheat sheet for easy future reference. To that end, a table of contents is in order:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- Foreword
 | 
				
			||||||
 | 
					- [Global Memory Usage: The Whole World](/2019/02/the-whole-world.html)
 | 
				
			||||||
 | 
					- [Fixed Memory: Stacking Up](/2019/02/stacking-up.html)
 | 
				
			||||||
 | 
					- [Dynamic Memory: A Heaping Helping](/2019/02/a-heaping-helping.html)
 | 
				
			||||||
 | 
					- [Compiler Optimizations: What It's Done For You Lately](/2019/02/compiler-optimizations.html)
 | 
				
			||||||
 | 
					- [Summary: What Are the Rules?](/2019/02/summary.html)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					# Foreword
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Rust's three defining features of
 | 
				
			||||||
 | 
					[Performance, Reliability, and Productivity](https://www.rust-lang.org/) are all driven to a great
 | 
				
			||||||
 | 
					degree by how the Rust compiler understands memory usage. Unlike managed memory languages (Java,
 | 
				
			||||||
 | 
					Python), Rust
 | 
				
			||||||
 | 
					[doesn't really](https://words.steveklabnik.com/borrow-checking-escape-analysis-and-the-generational-hypothesis)
 | 
				
			||||||
 | 
					garbage collect; instead, it uses an
 | 
				
			||||||
 | 
					[ownership](https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html) system to reason about
 | 
				
			||||||
 | 
					how long objects will last in your program. In some cases, if the life of an object is fairly
 | 
				
			||||||
 | 
					transient, Rust can make use of a very fast region called the "stack." When that's not possible,
 | 
				
			||||||
 | 
					Rust uses
 | 
				
			||||||
 | 
					[dynamic (heap) memory](https://en.wikipedia.org/wiki/Memory_management#Dynamic_memory_allocation)
 | 
				
			||||||
 | 
					and the ownership system to ensure you can't accidentally corrupt memory. It's not as fast, but it
 | 
				
			||||||
 | 
					is important to have available.
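
To make the stack/heap distinction concrete, here is a minimal sketch. It is not taken from the series itself; the `Point` type and both function names are invented for illustration.

```rust
use std::mem::size_of;

// Hypothetical type used only for this illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: u64,
    y: u64,
}

// The value is returned directly; no allocator is involved.
fn on_stack() -> Point {
    Point { x: 1, y: 2 }
}

// `Box::new` asks the allocator for heap memory and moves the value there.
// The `Box` itself is just a pointer, and ownership decides when it is freed.
fn on_heap() -> Box<Point> {
    Box::new(Point { x: 1, y: 2 })
}

fn main() {
    let a = on_stack();
    let b = on_heap();
    assert_eq!(a, *b);

    println!("Point: {} bytes", size_of::<Point>());
    println!("Box<Point>: {} bytes", size_of::<Box<Point>>());
    // `b` goes out of scope here and its heap memory is released automatically.
}
```

Both functions produce the same value; the difference is where the bytes live and who is responsible for releasing them.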
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					That said, there are specific situations in Rust where you'd never need to worry about the
 | 
				
			||||||
 | 
					stack/heap distinction! If you:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					1. Never use `unsafe`
 | 
				
			||||||
 | 
					2. Never use `#![feature(alloc)]` or the [`alloc` crate](https://doc.rust-lang.org/alloc/index.html)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					...then it's not possible for you to use dynamic memory!
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					For some uses of Rust, typically embedded devices, these constraints are OK. They have very limited
 | 
				
			||||||
 | 
					memory, and the program binary size itself may significantly affect what's available! There's no
 | 
				
			||||||
 | 
					operating system able to manage this
 | 
				
			||||||
 | 
					["virtual memory"](https://en.wikipedia.org/wiki/Virtual_memory) thing, but that's not an issue
 | 
				
			||||||
 | 
					because there's only one running application. The
 | 
				
			||||||
 | 
					[embedonomicon](https://docs.rust-embedded.org/embedonomicon/preface.html) is ever in mind, and
 | 
				
			||||||
 | 
					interacting with the "real world" through extra peripherals is accomplished by reading and writing
 | 
				
			||||||
 | 
					to [specific memory addresses](https://bob.cs.sonoma.edu/IntroCompOrg-RPi/sec-gpio-mem.html).
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Most Rust programs find these requirements overly burdensome though. C++ developers would struggle
 | 
				
			||||||
 | 
					without access to [`std::vector`](https://en.cppreference.com/w/cpp/container/vector) (except those
 | 
				
			||||||
 | 
					hardcore no-STL people), and Rust developers would struggle without
 | 
				
			||||||
 | 
					[`std::vec::Vec`](https://doc.rust-lang.org/std/vec/struct.Vec.html). But with the constraints above,
 | 
				
			||||||
 | 
					`std::vec::Vec` is actually a part of the
 | 
				
			||||||
 | 
					[`alloc` crate](https://doc.rust-lang.org/alloc/vec/struct.Vec.html), and thus off-limits. `Box`,
 | 
				
			||||||
 | 
					`Rc`, etc., are also unusable for the same reason.
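
One way to see that these types really come from `alloc` is to compare them directly. The sketch below assumes a toolchain newer than the Rust 1.32 used in this series, one where `extern crate alloc;` works on stable without a feature flag; the function names are made up for this example.

```rust
// Assumption: a toolchain where the `alloc` crate is usable on stable.
extern crate alloc;

use std::any::TypeId;

// `std::vec::Vec` is a re-export of `alloc::vec::Vec`, so both paths
// name the same type.
fn vec_is_reexported() -> bool {
    TypeId::of::<std::vec::Vec<u8>>() == TypeId::of::<alloc::vec::Vec<u8>>()
}

// The same holds for `Box`.
fn box_is_reexported() -> bool {
    TypeId::of::<std::boxed::Box<u8>>() == TypeId::of::<alloc::boxed::Box<u8>>()
}

fn main() {
    println!("Vec re-exported from alloc: {}", vec_is_reexported());
    println!("Box re-exported from alloc: {}", box_is_reexported());
}
```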
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Whether writing code for embedded devices or not, the important thing in both situations is how much
 | 
				
			||||||
 | 
					you know _before your application starts_ about what its memory usage will look like. In embedded
 | 
				
			||||||
 | 
					devices, there's a small, fixed amount of memory to use. In a browser, you have no idea how large
 | 
				
			||||||
 | 
					[google.com](https://www.google.com)'s home page is until you start trying to download it. The
 | 
				
			||||||
 | 
					compiler uses this knowledge (or lack thereof) to optimize how memory is used; put simply, your code
 | 
				
			||||||
 | 
					runs faster when the compiler can guarantee exactly how much memory your program needs while it's
 | 
				
			||||||
 | 
					running. This series is all about understanding how the compiler reasons about your program, with an
 | 
				
			||||||
 | 
					emphasis on the implications for performance.
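
As a rough sketch of "known before the program starts" versus "known only while running" (the function names and the way `requested` is computed are invented for illustration):

```rust
// Size known at compile time: the compiler can reserve exactly 16 bytes.
fn fixed_buffer() -> [u8; 16] {
    [0u8; 16]
}

// Size known only at run time: the storage has to be requested from the
// allocator. `len` stands in for something like a download size.
fn dynamic_buffer(len: usize) -> Vec<u8> {
    vec![0u8; len]
}

fn main() {
    let fixed = fixed_buffer();
    // Pretend the length arrived from the outside world.
    let requested = std::env::args().count() * 100;
    let dynamic = dynamic_buffer(requested);
    println!("fixed: {} bytes, dynamic: {} bytes", fixed.len(), dynamic.len());
}
```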
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Now let's address some conditions and caveats before going much further:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- We'll focus on "safe" Rust only; `unsafe` lets you use platform-specific allocation APIs
 | 
				
			||||||
 | 
					  ([`malloc`](https://www.tutorialspoint.com/c_standard_library/c_function_malloc.htm)) that we'll
 | 
				
			||||||
 | 
					  ignore.
 | 
				
			||||||
 | 
					- We'll assume a "debug" build of Rust code (what you get with `cargo run` and `cargo test`) and
 | 
				
			||||||
 | 
					  address (pun intended) release mode at the end (`cargo run --release` and `cargo test --release`).
 | 
				
			||||||
 | 
					- All content will be run using Rust 1.32, as that's the highest currently supported in the
 | 
				
			||||||
 | 
					  [Compiler Explorer](https://godbolt.org/). As such, we'll avoid upcoming innovations like
 | 
				
			||||||
 | 
					  [compile-time evaluation of `static`](https://github.com/rust-lang/rfcs/blob/master/text/0911-const-fn.md)
 | 
				
			||||||
 | 
					  that are available in nightly.
 | 
				
			||||||
 | 
					- Because of the nature of the content, being able to read assembly is helpful. We'll keep it
 | 
				
			||||||
 | 
					  simple, but I [found](https://stackoverflow.com/a/4584131/1454178) a
 | 
				
			||||||
 | 
					  [refresher](https://stackoverflow.com/a/26026278/1454178) on the `push` and `pop`
 | 
				
			||||||
 | 
					  [instructions](http://www.cs.virginia.edu/~evans/cs216/guides/x86.html) was helpful while writing
 | 
				
			||||||
 | 
					  this.
 | 
				
			||||||
 | 
					- I've tried to be precise in saying only what I can prove using the tools (ASM, docs) that are
 | 
				
			||||||
 | 
					  available, but if there's something said in error it will be corrected expeditiously. Please let
 | 
				
			||||||
 | 
					  me know at [bradlee@speice.io](mailto:bradlee@speice.io).
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Finally, I'll do what I can to flag potential future changes but the Rust docs have a notice worth
 | 
				
			||||||
 | 
					repeating:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					> Rust does not currently have a rigorously and formally defined memory model.
 | 
				
			||||||
 | 
					>
 | 
				
			||||||
 | 
					> -- [the docs](https://doc.rust-lang.org/std/ptr/fn.read_volatile.html)
 | 
				
			||||||
							
								
								
									
										102
									
								
								blog/2019-02-04-understanding-allocations-in-rust/index.mdx
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
									
								
							@ -0,0 +1,102 @@
 | 
				
			|||||||
 | 
					---
 | 
				
			||||||
 | 
					slug: 2019/02/understanding-allocations-in-rust
 | 
				
			||||||
 | 
					title: "Allocations in Rust: Foreword"
 | 
				
			||||||
 | 
					date: 2019-02-04 12:00:00
 | 
				
			||||||
 | 
					authors: [bspeice]
 | 
				
			||||||
 | 
					tags: []
 | 
				
			||||||
 | 
					---
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					There's an alchemy of distilling complex technical topics into articles and videos that change the
 | 
				
			||||||
 | 
					way programmers see the tools they interact with on a regular basis. I knew what a linker was, but
 | 
				
			||||||
 | 
					there's a staggering amount of complexity in between
 | 
				
			||||||
 | 
					[the OS and `main()`](https://www.youtube.com/watch?v=dOfucXtyEsU). Rust programmers use the
 | 
				
			||||||
 | 
					[`Box`](https://doc.rust-lang.org/stable/std/boxed/struct.Box.html) type all the time, but there's a
 | 
				
			||||||
 | 
					rich history of the Rust language itself wrapped up in
 | 
				
			||||||
 | 
					[how special it is](https://manishearth.github.io/blog/2017/01/10/rust-tidbits-box-is-special/).
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					In a similar vein, this series attempts to look at code and understand how memory is used; the
 | 
				
			||||||
 | 
					complex choreography of operating system, compiler, and program that frees you to focus on
 | 
				
			||||||
 | 
					functionality far-flung from frivolous book-keeping. The Rust compiler relieves a great deal of the
 | 
				
			||||||
 | 
					cognitive burden associated with memory management, but we're going to step into its world for a
 | 
				
			||||||
 | 
					while.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Let's learn a bit about memory in Rust.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					<!-- truncate -->
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					---
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Rust's three defining features of
 | 
				
			||||||
 | 
					[Performance, Reliability, and Productivity](https://www.rust-lang.org/) are all driven to a great
 | 
				
			||||||
 | 
					degree by how the Rust compiler understands memory usage. Unlike managed memory languages (Java,
 | 
				
			||||||
 | 
					Python), Rust
 | 
				
			||||||
 | 
					[doesn't really](https://words.steveklabnik.com/borrow-checking-escape-analysis-and-the-generational-hypothesis)
 | 
				
			||||||
 | 
					garbage collect; instead, it uses an
 | 
				
			||||||
 | 
					[ownership](https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html) system to reason about
 | 
				
			||||||
 | 
					how long objects will last in your program. In some cases, if the life of an object is fairly
 | 
				
			||||||
 | 
					transient, Rust can make use of a very fast region called the "stack." When that's not possible,
 | 
				
			||||||
 | 
					Rust uses
 | 
				
			||||||
 | 
					[dynamic (heap) memory](https://en.wikipedia.org/wiki/Memory_management#Dynamic_memory_allocation)
 | 
				
			||||||
 | 
					and the ownership system to ensure you can't accidentally corrupt memory. It's not as fast, but it
 | 
				
			||||||
 | 
					is important to have available.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					That said, there are specific situations in Rust where you'd never need to worry about the
 | 
				
			||||||
 | 
					stack/heap distinction! If you:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					1. Never use `unsafe`
 | 
				
			||||||
 | 
					2. Never use `#![feature(alloc)]` or the [`alloc` crate](https://doc.rust-lang.org/alloc/index.html)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					...then it's not possible for you to use dynamic memory!
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					For some uses of Rust, typically embedded devices, these constraints are OK. They have very limited
 | 
				
			||||||
 | 
					memory, and the program binary size itself may significantly affect what's available! There's no
 | 
				
			||||||
 | 
					operating system able to manage this
 | 
				
			||||||
 | 
					["virtual memory"](https://en.wikipedia.org/wiki/Virtual_memory) thing, but that's not an issue
 | 
				
			||||||
 | 
					because there's only one running application. The
 | 
				
			||||||
 | 
					[embedonomicon](https://docs.rust-embedded.org/embedonomicon/preface.html) is ever in mind, and
 | 
				
			||||||
 | 
					interacting with the "real world" through extra peripherals is accomplished by reading and writing
 | 
				
			||||||
 | 
					to [specific memory addresses](https://bob.cs.sonoma.edu/IntroCompOrg-RPi/sec-gpio-mem.html).
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Most Rust programs find these requirements overly burdensome though. C++ developers would struggle
 | 
				
			||||||
 | 
					without access to [`std::vector`](https://en.cppreference.com/w/cpp/container/vector) (except those
 | 
				
			||||||
 | 
					hardcore no-STL people), and Rust developers would struggle without
 | 
				
			||||||
 | 
					[`std::vec::Vec`](https://doc.rust-lang.org/std/vec/struct.Vec.html). But with the constraints above,
 | 
				
			||||||
 | 
					`std::vec::Vec` is actually a part of the
 | 
				
			||||||
 | 
					[`alloc` crate](https://doc.rust-lang.org/alloc/vec/struct.Vec.html), and thus off-limits. `Box`,
 | 
				
			||||||
 | 
					`Rc`, etc., are also unusable for the same reason.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Whether writing code for embedded devices or not, the important thing in both situations is how much
 | 
				
			||||||
 | 
					you know _before your application starts_ about what its memory usage will look like. In embedded
 | 
				
			||||||
 | 
					devices, there's a small, fixed amount of memory to use. In a browser, you have no idea how large
 | 
				
			||||||
 | 
					[google.com](https://www.google.com)'s home page is until you start trying to download it. The
 | 
				
			||||||
 | 
					compiler uses this knowledge (or lack thereof) to optimize how memory is used; put simply, your code
 | 
				
			||||||
 | 
					runs faster when the compiler can guarantee exactly how much memory your program needs while it's
 | 
				
			||||||
 | 
					running. This series is all about understanding how the compiler reasons about your program, with an
 | 
				
			||||||
 | 
					emphasis on the implications for performance.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Now let's address some conditions and caveats before going much further:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- We'll focus on "safe" Rust only; `unsafe` lets you use platform-specific allocation APIs
 | 
				
			||||||
 | 
					  ([`malloc`](https://www.tutorialspoint.com/c_standard_library/c_function_malloc.htm)) that we'll
 | 
				
			||||||
 | 
					  ignore.
 | 
				
			||||||
 | 
					- We'll assume a "debug" build of Rust code (what you get with `cargo run` and `cargo test`) and
 | 
				
			||||||
 | 
					  address (pun intended) release mode at the end (`cargo run --release` and `cargo test --release`).
 | 
				
			||||||
 | 
					- All content will be run using Rust 1.32, as that's the highest currently supported in the
 | 
				
			||||||
 | 
					  [Compiler Explorer](https://godbolt.org/). As such, we'll avoid upcoming innovations like
 | 
				
			||||||
 | 
					  [compile-time evaluation of `static`](https://github.com/rust-lang/rfcs/blob/master/text/0911-const-fn.md)
 | 
				
			||||||
 | 
					  that are available in nightly.
 | 
				
			||||||
 | 
					- Because of the nature of the content, being able to read assembly is helpful. We'll keep it
 | 
				
			||||||
 | 
					  simple, but I [found](https://stackoverflow.com/a/4584131/1454178) a
 | 
				
			||||||
 | 
					  [refresher](https://stackoverflow.com/a/26026278/1454178) on the `push` and `pop`
 | 
				
			||||||
 | 
					  [instructions](http://www.cs.virginia.edu/~evans/cs216/guides/x86.html) was helpful while writing
 | 
				
			||||||
 | 
					  this.
 | 
				
			||||||
 | 
					- I've tried to be precise in saying only what I can prove using the tools (ASM, docs) that are
 | 
				
			||||||
 | 
					  available, but if there's something said in error it will be corrected expeditiously. Please let
 | 
				
			||||||
 | 
					  me know at [bradlee@speice.io](mailto:bradlee@speice.io).
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Finally, I'll do what I can to flag potential future changes but the Rust docs have a notice worth
 | 
				
			||||||
 | 
					repeating:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					> Rust does not currently have a rigorously and formally defined memory model.
 | 
				
			||||||
 | 
					>
 | 
				
			||||||
 | 
					> -- [the docs](https://doc.rust-lang.org/std/ptr/fn.read_volatile.html)
 | 
				
			||||||
							
								
								
									
										337
									
								
								blog/2019-02-05-the-whole-world/_article.md
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
									
								
							@ -0,0 +1,337 @@
 | 
				
			|||||||
 | 
					---
 | 
				
			||||||
 | 
					layout: post
 | 
				
			||||||
 | 
					title: "Global Memory Usage: The Whole World"
 | 
				
			||||||
 | 
					description: "Static considered slightly less harmful."
 | 
				
			||||||
 | 
					category:
 | 
				
			||||||
 | 
					tags: [rust, understanding-allocations]
 | 
				
			||||||
 | 
					---
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The first memory type we'll look at is pretty special: when Rust can prove that a _value_ is fixed
 | 
				
			||||||
 | 
					for the life of a program (`const`), and when a _reference_ is unique for the life of a program
 | 
				
			||||||
 | 
					(`static` as a declaration, not
 | 
				
			||||||
 | 
					[`'static`](https://doc.rust-lang.org/book/ch10-03-lifetime-syntax.html#the-static-lifetime) as a
 | 
				
			||||||
 | 
					lifetime), we can make use of global memory. This special section of data is embedded directly in
 | 
				
			||||||
 | 
					the program binary so that variables are ready to go once the program loads; no additional
 | 
				
			||||||
 | 
					computation is necessary.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Understanding the value/reference distinction is important for reasons we'll go into below, and
 | 
				
			||||||
 | 
					while the
 | 
				
			||||||
 | 
					[full specification](https://github.com/rust-lang/rfcs/blob/master/text/0246-const-vs-static.md) for
 | 
				
			||||||
 | 
					these two keywords is available, we'll take a hands-on approach to the topic.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					# **const**
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					When a _value_ is guaranteed to be unchanging in your program (where "value" may be scalars,
 | 
				
			||||||
 | 
					`struct`s, etc.), you can declare it `const`. This tells the compiler that it's safe to treat the
 | 
				
			||||||
 | 
					value as never changing, and enables some interesting optimizations; not only is there no
 | 
				
			||||||
 | 
					initialization cost to creating the value (it is loaded at the same time as the executable parts of
 | 
				
			||||||
 | 
					your program), but the compiler can also copy the value around if it speeds up the code.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The points we need to address when talking about `const` are:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- `const` values are stored in read-only memory; they're impossible to modify.
 | 
				
			||||||
 | 
					- Values resulting from calling a `const fn` are materialized at compile-time.
 | 
				
			||||||
 | 
					- The compiler may (or may not) copy `const` values wherever it chooses.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## Read-Only
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The first point is a bit strange - "read-only memory."
 | 
				
			||||||
 | 
					[The Rust book](https://doc.rust-lang.org/book/ch03-01-variables-and-mutability.html#differences-between-variables-and-constants)
 | 
				
			||||||
 | 
					mentions in a couple of places that using `mut` with constants is illegal, but it's also important to
 | 
				
			||||||
 | 
					demonstrate just how immutable they are. _Typically_ in Rust you can use
 | 
				
			||||||
 | 
					[interior mutability](https://doc.rust-lang.org/book/ch15-05-interior-mutability.html) to modify
 | 
				
			||||||
 | 
					things that aren't declared `mut`.
 | 
				
			||||||
 | 
					[`RefCell`](https://doc.rust-lang.org/std/cell/struct.RefCell.html) provides an example of this
 | 
				
			||||||
 | 
					pattern in action:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					use std::cell::RefCell;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					fn my_mutator(cell: &RefCell<u8>) {
 | 
				
			||||||
 | 
					    // Even though we're given an immutable reference,
 | 
				
			||||||
 | 
					    // the `replace` method allows us to modify the inner value.
 | 
				
			||||||
 | 
					    cell.replace(14);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					fn main() {
 | 
				
			||||||
 | 
					    let cell = RefCell::new(25);
 | 
				
			||||||
 | 
					    // Prints out 25
 | 
				
			||||||
 | 
					    println!("Cell: {:?}", cell);
 | 
				
			||||||
 | 
					    my_mutator(&cell);
 | 
				
			||||||
 | 
					    // Prints out 14
 | 
				
			||||||
 | 
					    println!("Cell: {:?}", cell);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					--
 | 
				
			||||||
 | 
					[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8e4bea1a718edaff4507944e825a54b2)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					When `const` is involved though, interior mutability is impossible:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					use std::cell::RefCell;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					const CELL: RefCell<u8> = RefCell::new(25);
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					fn my_mutator(cell: &RefCell<u8>) {
 | 
				
			||||||
 | 
					    cell.replace(14);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					fn main() {
 | 
				
			||||||
 | 
					    // First line prints 25 as expected
 | 
				
			||||||
 | 
					    println!("Cell: {:?}", &CELL);
 | 
				
			||||||
 | 
					    my_mutator(&CELL);
 | 
				
			||||||
 | 
					    // Second line *still* prints 25
 | 
				
			||||||
 | 
					    println!("Cell: {:?}", &CELL);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					--
 | 
				
			||||||
 | 
					[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=88fe98110c33c1b3a51e341f48b8ae00)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					And a second example using [`Once`](https://doc.rust-lang.org/std/sync/struct.Once.html):
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					use std::sync::Once;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					const SURPRISE: Once = Once::new();
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					fn main() {
 | 
				
			||||||
 | 
					    // This is how `Once` is supposed to be used
 | 
				
			||||||
 | 
					    SURPRISE.call_once(|| println!("Initializing..."));
 | 
				
			||||||
 | 
					    // Because `Once` is a `const` value, we never record it
 | 
				
			||||||
 | 
					    // having been initialized the first time, and this closure
 | 
				
			||||||
 | 
					    // will also execute.
 | 
				
			||||||
 | 
					    SURPRISE.call_once(|| println!("Initializing again???"));
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					--
 | 
				
			||||||
 | 
					[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=c3cc5979b5e5434eca0f9ec4a06ee0ed)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					When the
 | 
				
			||||||
 | 
					[`const` specification](https://github.com/rust-lang/rfcs/blob/26197104b7bb9a5a35db243d639aee6e46d35d75/text/0246-const-vs-static.md)
 | 
				
			||||||
 | 
					refers to ["rvalues"](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3055.pdf), this
 | 
				
			||||||
 | 
					behavior is what it means. [Clippy](https://github.com/rust-lang/rust-clippy) will treat this
 | 
				
			||||||
 | 
					as an error, but it's still something to be aware of.
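
The rvalue behavior is perhaps easiest to see side by side. This is a minimal sketch using `Cell` instead of `RefCell`; the names `START`, `through_const`, and `through_local` are invented for this example.

```rust
use std::cell::Cell;

const START: Cell<u32> = Cell::new(25);

// Every mention of `START` produces a fresh temporary `Cell`, so the
// `set` below modifies a value that is thrown away immediately.
fn through_const() -> u32 {
    START.set(14);
    START.get()
}

// Binding the constant to a local gives us one concrete `Cell` to work
// with, and interior mutability behaves normally again.
fn through_local() -> u32 {
    let local = START;
    local.set(14);
    local.get()
}

fn main() {
    println!("through const: {}", through_const()); // 25
    println!("through local: {}", through_local()); // 14
}
```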
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## Initialization == Compilation
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The next thing to mention is that `const` values are loaded into memory _as part of your program
 | 
				
			||||||
 | 
					binary_. Because of this, any `const` values declared in your program will be "realized" at
 | 
				
			||||||
 | 
					compile-time; accessing them may trigger a main-memory lookup (with a fixed address, so your CPU may
 | 
				
			||||||
 | 
					be able to prefetch the value), but that's it.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					use std::cell::RefCell;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					const CELL: RefCell<u32> = RefCell::new(24);
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn multiply(value: u32) -> u32 {
 | 
				
			||||||
 | 
					    // CELL is stored at `.L__unnamed_1`
 | 
				
			||||||
 | 
					    value * (*CELL.get_mut())
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/Th8boO)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The compiler creates one `RefCell`, uses it everywhere, and never needs to call the `RefCell::new`
 | 
				
			||||||
 | 
					function.
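
The same idea applies to values produced by a `const fn`. The sketch below is an illustration only; `square`, `AREA`, and `buffer` are hypothetical names, not part of the original examples.

```rust
// Hypothetical helper; a `const fn` body is limited to what the compiler
// can evaluate during compilation.
const fn square(x: u32) -> u32 {
    x * x
}

// Evaluated at compile time; the program never calls `square` to
// initialize this constant.
const AREA: u32 = square(12);

// Array lengths must be compile-time constants, so this signature only
// compiles because `AREA` is already known to the compiler.
fn buffer() -> [u8; AREA as usize] {
    [0u8; AREA as usize]
}

fn main() {
    println!("AREA = {}, buffer length = {}", AREA, buffer().len());
}
```

Using the constant as an array length is a convenient way to confirm the value exists before the program ever runs.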
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## Copying
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					If it's helpful though, the compiler can choose to copy `const` values.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					const FACTOR: u32 = 1000;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn multiply(value: u32) -> u32 {
 | 
				
			||||||
 | 
					    // See assembly line 4 for the `mov edi, 1000` instruction
 | 
				
			||||||
 | 
					    value * FACTOR
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn multiply_twice(value: u32) -> u32 {
 | 
				
			||||||
 | 
					    // See assembly lines 22 and 29 for `mov edi, 1000` instructions
 | 
				
			||||||
 | 
					    value * FACTOR * FACTOR
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/ZtS54X)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					In this example, the `FACTOR` value is turned into the `mov edi, 1000` instruction in both the
 | 
				
			||||||
 | 
					`multiply` and `multiply_twice` functions; the "1000" value is never "stored" anywhere, as it's
 | 
				
			||||||
 | 
					small enough to inline into the assembly instructions.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Finally, getting the address of a `const` value is possible, but not guaranteed to be unique
 | 
				
			||||||
 | 
					(because the compiler can choose to copy values). I was unable to get non-unique pointers in my
 | 
				
			||||||
 | 
					testing (even using different crates), but the specifications are clear enough: _don't rely on
 | 
				
			||||||
 | 
					pointers to `const` values being consistent_. To be frank, caring about locations for `const` values
 | 
				
			||||||
 | 
					is almost certainly a code smell.
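
As a minimal sketch of the address question (the names `FACTOR_CONST`, `FACTOR_STATIC`, and `static_addr` are hypothetical and not part of the examples above), the code below takes pointers to a `const` and to a `static`. It only asserts the `static` case, since that is the one with a guarantee; whether the two `const` pointers match is deliberately left unchecked.

```rust
const FACTOR_CONST: u32 = 1000;
static FACTOR_STATIC: u32 = 1000;

// Hypothetical helper: reports the address of the static as seen
// from a different function.
fn static_addr() -> *const u32 {
    &FACTOR_STATIC as *const u32
}

fn main() {
    let first = &FACTOR_CONST as *const u32;
    let second = &FACTOR_CONST as *const u32;
    // This may print `true` or `false`; nothing promises either result.
    println!("const addresses equal: {}", first == second);

    let here = &FACTOR_STATIC as *const u32;
    // A `static` is a single location in memory, so this always holds.
    assert_eq!(here, static_addr());
    println!("static addresses equal: {}", here == static_addr());
}
```

If code ever needs a stable address, that is usually a sign the value should be a `static` rather than a `const`.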
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					# **static**
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Static variables are related to `const` variables, but take a slightly different approach. When we
 | 
				
			||||||
 | 
					declare that a _reference_ is unique for the life of a program, we have a `static` variable
 | 
				
			||||||
 | 
					(unrelated to the `'static` lifetime). Because of the reference/value distinction with
 | 
				
			||||||
 | 
					`const`/`static`, static variables behave much more like typical "global" variables.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					But to understand `static`, here's what we'll look at:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- `static` variables are globally unique locations in memory.
 | 
				
			||||||
 | 
					- Like `const`, `static` variables are loaded at the same time as your program is read into
 | 
				
			||||||
 | 
					  memory.
 | 
				
			||||||
 | 
					- All `static` variables must implement the
 | 
				
			||||||
 | 
					  [`Sync`](https://doc.rust-lang.org/std/marker/trait.Sync.html) marker trait.
 | 
				
			||||||
 | 
					- Interior mutability is safe and acceptable when using `static` variables.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## Memory Uniqueness
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The single biggest difference between `const` and `static` is the guarantees provided about
 | 
				
			||||||
 | 
					uniqueness. Where `const` variables may or may not be copied in code, `static` variables are
 | 
				
			||||||
 | 
					guaranteed to be unique. If we take a previous `const` example and change it to `static`, the
 | 
				
			||||||
 | 
					difference should be clear:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					static FACTOR: u32 = 1000;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn multiply(value: u32) -> u32 {
 | 
				
			||||||
 | 
					    // In the assembly, `mul dword ptr [rip + example::FACTOR]` is how FACTOR gets used
 | 
				
			||||||
 | 
					    value * FACTOR
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn multiply_twice(value: u32) -> u32 {
 | 
				
			||||||
 | 
					    // In the assembly, `mul dword ptr [rip + example::FACTOR]` is how FACTOR gets used
 | 
				
			||||||
 | 
					    value * FACTOR * FACTOR
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/uxmiRQ)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Where [previously](#copying) there were plenty of references to multiplying by 1000, the new
 | 
				
			||||||
 | 
					assembly refers to `FACTOR` as a named memory location instead. No initialization work needs to be
 | 
				
			||||||
 | 
					done, but the compiler can no longer prove the value never changes during execution.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## Initialization == Compilation
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Next, let's talk about initialization. The simplest case is initializing static variables with
 | 
				
			||||||
 | 
					either scalar or struct notation:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					#[derive(Debug)]
 | 
				
			||||||
 | 
					struct MyStruct {
 | 
				
			||||||
 | 
					    x: u32
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					static MY_STRUCT: MyStruct = MyStruct {
 | 
				
			||||||
 | 
					    // You can even reference other statics
 | 
				
			||||||
 | 
					    // declared later
 | 
				
			||||||
 | 
					    x: MY_VAL
 | 
				
			||||||
 | 
					};
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					static MY_VAL: u32 = 24;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					fn main() {
 | 
				
			||||||
 | 
					    println!("Static MyStruct: {:?}", MY_STRUCT);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					--
 | 
				
			||||||
 | 
					[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=b538dbc46076f12db047af4f4403ee6e)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Things can get a bit weirder when using `const fn` though. In most cases, it just works:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					#[derive(Debug)]
 | 
				
			||||||
 | 
					struct MyStruct {
 | 
				
			||||||
 | 
					    x: u32
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					impl MyStruct {
 | 
				
			||||||
 | 
					    const fn new() -> MyStruct {
 | 
				
			||||||
 | 
					        MyStruct { x: 24 }
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					static MY_STRUCT: MyStruct = MyStruct::new();
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					fn main() {
 | 
				
			||||||
 | 
					    println!("const fn Static MyStruct: {:?}", MY_STRUCT);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					--
 | 
				
			||||||
 | 
					[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8c796a6e7fc273c12115091b707b0255)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					However, there's a caveat: you're currently not allowed to use `const fn` to initialize static
 | 
				
			||||||
 | 
					variables of types that aren't marked `Sync`. For example,
 | 
				
			||||||
 | 
					[`RefCell::new()`](https://doc.rust-lang.org/std/cell/struct.RefCell.html#method.new) is a
 | 
				
			||||||
 | 
					`const fn`, but because
 | 
				
			||||||
 | 
					[`RefCell` isn't `Sync`](https://doc.rust-lang.org/std/cell/struct.RefCell.html#impl-Sync), you'll
 | 
				
			||||||
 | 
					get an error at compile time:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					use std::cell::RefCell;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// error[E0277]: `std::cell::RefCell<u8>` cannot be shared between threads safely
 | 
				
			||||||
 | 
					static MY_LOCK: RefCell<u8> = RefCell::new(0);
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					--
 | 
				
			||||||
 | 
					[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=c76ef86e473d07117a1700e21fd45560)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					It's likely that this will
 | 
				
			||||||
 | 
					[change in the future](https://github.com/rust-lang/rfcs/blob/master/text/0911-const-fn.md) though.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## **Sync**
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Which leads well to the next point: static variable types must implement the
 | 
				
			||||||
 | 
					[`Sync` marker](https://doc.rust-lang.org/std/marker/trait.Sync.html). Because they're globally
 | 
				
			||||||
 | 
					unique, it must be safe for you to access static variables from any thread at any time. Most
 | 
				
			||||||
 | 
					`struct` definitions automatically implement the `Sync` trait because they contain only elements
 | 
				
			||||||
 | 
					which themselves implement `Sync` (read more in the
 | 
				
			||||||
 | 
					[Nomicon](https://doc.rust-lang.org/nomicon/send-and-sync.html)). This is why earlier examples could
 | 
				
			||||||
 | 
					get away with initializing statics, even though we never included an `impl Sync for MyStruct` in the
 | 
				
			||||||
 | 
					code. To demonstrate this property, Rust refuses to compile our earlier example if we add a
 | 
				
			||||||
 | 
					non-`Sync` element to the `struct` definition:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					use std::cell::RefCell;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					struct MyStruct {
 | 
				
			||||||
 | 
					    x: u32,
 | 
				
			||||||
 | 
					    y: RefCell<u8>,
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// error[E0277]: `std::cell::RefCell<u8>` cannot be shared between threads safely
 | 
				
			||||||
 | 
					static MY_STRUCT: MyStruct = MyStruct {
 | 
				
			||||||
 | 
					    x: 8,
 | 
				
			||||||
 | 
					    y: RefCell::new(8)
 | 
				
			||||||
 | 
					};
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					--
 | 
				
			||||||
 | 
					[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=40074d0248f056c296b662dbbff97cfc)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## Interior Mutability
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Finally, while `static mut` variables are allowed, mutating them is an `unsafe` operation. If we
 | 
				
			||||||
 | 
					want to stay in `safe` Rust, we can use interior mutability to accomplish similar goals:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					use std::sync::Once;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// This example adapted from https://doc.rust-lang.org/std/sync/struct.Once.html#method.call_once
 | 
				
			||||||
 | 
					static INIT: Once = Once::new();
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					fn main() {
 | 
				
			||||||
 | 
					    // Note that while `INIT` is declared immutable, we're still allowed
 | 
				
			||||||
 | 
					    // to mutate its interior
 | 
				
			||||||
 | 
					    INIT.call_once(|| println!("Initializing..."));
 | 
				
			||||||
 | 
					    // This code won't panic, as the interior of INIT was modified
 | 
				
			||||||
 | 
					    // as part of the previous `call_once`
 | 
				
			||||||
 | 
					    INIT.call_once(|| panic!("INIT was called twice!"));
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					--
 | 
				
			||||||
 | 
					[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=3ba003a981a7ed7400240caadd384d59)
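
`Once` is not the only type that works this way. As a hedged sketch (the `CALLS` counter and `record_call` function are hypothetical names introduced here), an atomic integer in a `static` gives a mutable global without `static mut` or any `unsafe` code:

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Declared without `mut`; all mutation goes through the atomic's own methods.
static CALLS: AtomicU32 = AtomicU32::new(0);

// Hypothetical helper: bumps the counter and returns the new total.
fn record_call() -> u32 {
    // `fetch_add` returns the previous value, so add one to get the new total.
    CALLS.fetch_add(1, Ordering::SeqCst) + 1
}

fn main() {
    record_call();
    record_call();
    println!("calls so far: {}", CALLS.load(Ordering::SeqCst));
}
```

Because `AtomicU32` is `Sync`, the compiler accepts it as a `static`, and every thread that touches `CALLS` sees the same memory location.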
 | 
				
			||||||
							
								
								
									
										339
									
								
								blog/2019-02-05-the-whole-world/index.mdx
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
									
								
							@ -0,0 +1,339 @@
 | 
				
			|||||||
 | 
					---
 | 
				
			||||||
 | 
					slug: 2019/02/the-whole-world
 | 
				
			||||||
 | 
					title: "Allocations in Rust: Global memory"
 | 
				
			||||||
 | 
					date: 2019-02-05 12:00:00
 | 
				
			||||||
 | 
					authors: [bspeice]
 | 
				
			||||||
 | 
					tags: []
 | 
				
			||||||
 | 
					---
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The first memory type we'll look at is pretty special: when Rust can prove that a _value_ is fixed
 | 
				
			||||||
 | 
					for the life of a program (`const`), and when a _reference_ is unique for the life of a program
 | 
				
			||||||
 | 
					(`static` as a declaration, not
 | 
				
			||||||
 | 
					[`'static`](https://doc.rust-lang.org/book/ch10-03-lifetime-syntax.html#the-static-lifetime) as a
 | 
				
			||||||
 | 
					lifetime), we can make use of global memory. This special section of data is embedded directly in
 | 
				
			||||||
 | 
					the program binary so that variables are ready to go once the program loads; no additional
 | 
				
			||||||
 | 
					computation is necessary.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Understanding the value/reference distinction is important for reasons we'll go into below, and
 | 
				
			||||||
 | 
					while the
 | 
				
			||||||
 | 
					[full specification](https://github.com/rust-lang/rfcs/blob/master/text/0246-const-vs-static.md) for
 | 
				
			||||||
 | 
					these two keywords is available, we'll take a hands-on approach to the topic.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					<!-- truncate -->
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## `const` values
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					When a _value_ is guaranteed to be unchanging in your program (where "value" may be scalars,
 | 
				
			||||||
 | 
					`struct`s, etc.), you can declare it `const`. This tells the compiler that it's safe to treat the
 | 
				
			||||||
 | 
					value as never changing, and enables some interesting optimizations; not only is there no
 | 
				
			||||||
 | 
					initialization cost to creating the value (it is loaded at the same time as the executable parts of
 | 
				
			||||||
 | 
					your program), but the compiler can also copy the value around if it speeds up the code.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The points we need to address when talking about `const` are:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- `const` values are stored in read-only memory - they're impossible to modify.
 | 
				
			||||||
 | 
					- Values resulting from calling a `const fn` are materialized at compile-time.
 | 
				
			||||||
 | 
					- The compiler may (or may not) copy `const` values wherever it chooses.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					### Read-Only
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The first point is a bit strange - "read-only memory."
 | 
				
			||||||
 | 
					[The Rust book](https://doc.rust-lang.org/book/ch03-01-variables-and-mutability.html#differences-between-variables-and-constants)
 | 
				
			||||||
 | 
					mentions in a couple of places that using `mut` with constants is illegal, but it's also important to
 | 
				
			||||||
 | 
					demonstrate just how immutable they are. _Typically_ in Rust you can use
 | 
				
			||||||
 | 
					[interior mutability](https://doc.rust-lang.org/book/ch15-05-interior-mutability.html) to modify
 | 
				
			||||||
 | 
					things that aren't declared `mut`.
 | 
				
			||||||
 | 
					[`RefCell`](https://doc.rust-lang.org/std/cell/struct.RefCell.html) provides an example of this
 | 
				
			||||||
 | 
					pattern in action:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					use std::cell::RefCell;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					fn my_mutator(cell: &RefCell<u8>) {
 | 
				
			||||||
 | 
					    // Even though we're given an immutable reference,
 | 
				
			||||||
 | 
					    // the `replace` method allows us to modify the inner value.
 | 
				
			||||||
 | 
					    cell.replace(14);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					fn main() {
 | 
				
			||||||
 | 
					    let cell = RefCell::new(25);
 | 
				
			||||||
 | 
					    // Prints out 25
 | 
				
			||||||
 | 
					    println!("Cell: {:?}", cell);
 | 
				
			||||||
 | 
					    my_mutator(&cell);
 | 
				
			||||||
 | 
					    // Prints out 14
 | 
				
			||||||
 | 
					    println!("Cell: {:?}", cell);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					--
 | 
				
			||||||
 | 
					[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8e4bea1a718edaff4507944e825a54b2)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					When `const` is involved though, interior mutability is impossible:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					use std::cell::RefCell;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					const CELL: RefCell<u8> = RefCell::new(25);
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					fn my_mutator(cell: &RefCell<u8>) {
 | 
				
			||||||
 | 
					    cell.replace(14);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					fn main() {
 | 
				
			||||||
 | 
					    // First line prints 25 as expected
 | 
				
			||||||
 | 
					    println!("Cell: {:?}", &CELL);
 | 
				
			||||||
 | 
					    my_mutator(&CELL);
 | 
				
			||||||
 | 
					    // Second line *still* prints 25
 | 
				
			||||||
 | 
					    println!("Cell: {:?}", &CELL);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					--
 | 
				
			||||||
 | 
					[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=88fe98110c33c1b3a51e341f48b8ae00)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					And a second example using [`Once`](https://doc.rust-lang.org/std/sync/struct.Once.html):
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					use std::sync::Once;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					const SURPRISE: Once = Once::new();
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					fn main() {
 | 
				
			||||||
 | 
					    // This is how `Once` is supposed to be used
 | 
				
			||||||
 | 
					    SURPRISE.call_once(|| println!("Initializing..."));
 | 
				
			||||||
 | 
					    // Because `Once` is a `const` value, we never record it
 | 
				
			||||||
 | 
					    // having been initialized the first time, and this closure
 | 
				
			||||||
 | 
					    // will also execute.
 | 
				
			||||||
 | 
					    SURPRISE.call_once(|| println!("Initializing again???"));
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					--
 | 
				
			||||||
 | 
					[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=c3cc5979b5e5434eca0f9ec4a06ee0ed)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					When the
 | 
				
			||||||
 | 
					[`const` specification](https://github.com/rust-lang/rfcs/blob/26197104b7bb9a5a35db243d639aee6e46d35d75/text/0246-const-vs-static.md)
 | 
				
			||||||
 | 
					refers to ["rvalues"](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3055.pdf), this
 | 
				
			||||||
 | 
					behavior is what it refers to. [Clippy](https://github.com/rust-lang/rust-clippy) will treat this
 | 
				
			||||||
 | 
					as an error, but it's still something to be aware of.
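
One way to picture the behavior described above, as a sketch rather than a statement about compiler internals: every mention of a `const` item acts like a fresh copy of its value. The names `COUNTER`, `set_then_read_const`, and `set_then_read_local` are hypothetical and exist only for this illustration.

```rust
use std::cell::Cell;

const COUNTER: Cell<u32> = Cell::new(0);

// Each mention of `COUNTER` yields a new temporary `Cell`, so the `set`
// below modifies a value that is thrown away immediately.
fn set_then_read_const() -> u32 {
    COUNTER.set(5);
    COUNTER.get()
}

// Binding the constant to a local gives a single `Cell` that persists
// for the rest of the function.
fn set_then_read_local() -> u32 {
    let local = COUNTER;
    local.set(5);
    local.get()
}

fn main() {
    println!("through the const: {}", set_then_read_const());
    println!("through a local:   {}", set_then_read_local());
}
```

The first function reports the original value and the second reports the updated one, which is the same effect the `RefCell` and `Once` examples show.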
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					### Initialization
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The next thing to mention is that `const` values are loaded into memory _as part of your program
 | 
				
			||||||
 | 
					binary_. Because of this, any `const` values declared in your program will be "realized" at
 | 
				
			||||||
 | 
					compile-time; accessing them may trigger a main-memory lookup (with a fixed address, so your CPU may
 | 
				
			||||||
 | 
					be able to prefetch the value), but that's it.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					use std::cell::RefCell;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					const CELL: RefCell<u32> = RefCell::new(24);
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn multiply(value: u32) -> u32 {
 | 
				
			||||||
 | 
					    // CELL is stored at `.L__unnamed_1`
 | 
				
			||||||
 | 
					    value * (*CELL.get_mut())
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/Th8boO)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The compiler creates one `RefCell`, uses it everywhere, and never needs to call the `RefCell::new`
 | 
				
			||||||
 | 
					function.
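
To see compile-time evaluation from the language side rather than from the assembly, here is a small sketch with hypothetical names (`square`, `AREA`, `GRID`). Array lengths must be known at compile time, so the `static` below only compiles because the `const fn` call has already been evaluated by then.

```rust
// A `const fn` can be evaluated by the compiler when used in a const context.
const fn square(n: usize) -> usize {
    n * n
}

// Evaluated during compilation; no call to `square` is needed at runtime.
const AREA: usize = square(4);

// Using `AREA` as an array length requires a compile-time value.
static GRID: [u8; AREA] = [0; AREA];

fn main() {
    println!("AREA = {}, GRID holds {} bytes", AREA, GRID.len());
}
```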
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					### Copying
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					If it's helpful though, the compiler can choose to copy `const` values.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					const FACTOR: u32 = 1000;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn multiply(value: u32) -> u32 {
 | 
				
			||||||
 | 
					    // See assembly line 4 for the `mov edi, 1000` instruction
 | 
				
			||||||
 | 
					    value * FACTOR
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn multiply_twice(value: u32) -> u32 {
 | 
				
			||||||
 | 
					    // See assembly lines 22 and 29 for `mov edi, 1000` instructions
 | 
				
			||||||
 | 
					    value * FACTOR * FACTOR
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/ZtS54X)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					In this example, the `FACTOR` value is turned into the `mov edi, 1000` instruction in both the
 | 
				
			||||||
 | 
					`multiply` and `multiply_twice` functions; the "1000" value is never "stored" anywhere, as it's
 | 
				
			||||||
 | 
					small enough to inline into the assembly instructions.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Finally, getting the address of a `const` value is possible, but not guaranteed to be unique
 | 
				
			||||||
 | 
					(because the compiler can choose to copy values). I was unable to get non-unique pointers in my
 | 
				
			||||||
 | 
					testing (even using different crates), but the specifications are clear enough: _don't rely on
 | 
				
			||||||
 | 
					pointers to `const` values being consistent_. To be frank, caring about locations for `const` values
 | 
				
			||||||
 | 
					is almost certainly a code smell.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## `static` values
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Static variables are related to `const` variables, but take a slightly different approach. When we
 | 
				
			||||||
 | 
					declare that a _reference_ is unique for the life of a program, we have a `static` variable
 | 
				
			||||||
 | 
					(unrelated to the `'static` lifetime). Because of the reference/value distinction with
 | 
				
			||||||
 | 
					`const`/`static`, static variables behave much more like typical "global" variables.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					But to understand `static`, here's what we'll look at:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- `static` variables are globally unique locations in memory.
 | 
				
			||||||
 | 
					- Like `const`, `static` variables are loaded at the same time as your program is read into
 | 
				
			||||||
 | 
					  memory.
 | 
				
			||||||
 | 
					- All `static` variables must implement the
 | 
				
			||||||
 | 
					  [`Sync`](https://doc.rust-lang.org/std/marker/trait.Sync.html) marker trait.
 | 
				
			||||||
 | 
					- Interior mutability is safe and acceptable when using `static` variables.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					### Memory Uniqueness
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The single biggest difference between `const` and `static` is the guarantees provided about
 | 
				
			||||||
 | 
					uniqueness. Where `const` variables may or may not be copied in code, `static` variables are
 | 
				
			||||||
 | 
					guaranteed to be unique. If we take a previous `const` example and change it to `static`, the
 | 
				
			||||||
 | 
					difference should be clear:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					static FACTOR: u32 = 1000;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn multiply(value: u32) -> u32 {
 | 
				
			||||||
 | 
					    // In the assembly, `mul dword ptr [rip + example::FACTOR]` is how FACTOR gets used
 | 
				
			||||||
 | 
					    value * FACTOR
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn multiply_twice(value: u32) -> u32 {
 | 
				
			||||||
 | 
					    // In the assembly, `mul dword ptr [rip + example::FACTOR]` is how FACTOR gets used
 | 
				
			||||||
 | 
					    value * FACTOR * FACTOR
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/uxmiRQ)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Where [previously](#copying) there were plenty of references to multiplying by 1000, the new
 | 
				
			||||||
 | 
					assembly refers to `FACTOR` as a named memory location instead. No initialization work needs to be
 | 
				
			||||||
 | 
					done, but the compiler can no longer prove the value never changes during execution.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					### Initialization
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Next, let's talk about initialization. The simplest case is initializing static variables with
 | 
				
			||||||
 | 
					either scalar or struct notation:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					#[derive(Debug)]
 | 
				
			||||||
 | 
					struct MyStruct {
 | 
				
			||||||
 | 
					    x: u32
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					static MY_STRUCT: MyStruct = MyStruct {
 | 
				
			||||||
 | 
					    // You can even reference other statics
 | 
				
			||||||
 | 
					    // declared later
 | 
				
			||||||
 | 
					    x: MY_VAL
 | 
				
			||||||
 | 
					};
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					static MY_VAL: u32 = 24;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					fn main() {
 | 
				
			||||||
 | 
					    println!("Static MyStruct: {:?}", MY_STRUCT);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					--
 | 
				
			||||||
 | 
					[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=b538dbc46076f12db047af4f4403ee6e)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Things can get a bit weirder when using `const fn` though. In most cases, it just works:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					#[derive(Debug)]
 | 
				
			||||||
 | 
					struct MyStruct {
 | 
				
			||||||
 | 
					    x: u32
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					impl MyStruct {
 | 
				
			||||||
 | 
					    const fn new() -> MyStruct {
 | 
				
			||||||
 | 
					        MyStruct { x: 24 }
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					static MY_STRUCT: MyStruct = MyStruct::new();
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					fn main() {
 | 
				
			||||||
 | 
					    println!("const fn Static MyStruct: {:?}", MY_STRUCT);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					--
 | 
				
			||||||
 | 
					[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8c796a6e7fc273c12115091b707b0255)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					However, there's a caveat: you're currently not allowed to use `const fn` to initialize static
 | 
				
			||||||
 | 
					variables of types that aren't marked `Sync`. For example,
 | 
				
			||||||
 | 
					[`RefCell::new()`](https://doc.rust-lang.org/std/cell/struct.RefCell.html#method.new) is a
 | 
				
			||||||
 | 
					`const fn`, but because
 | 
				
			||||||
 | 
					[`RefCell` isn't `Sync`](https://doc.rust-lang.org/std/cell/struct.RefCell.html#impl-Sync), you'll
 | 
				
			||||||
 | 
					get an error at compile time:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					use std::cell::RefCell;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// error[E0277]: `std::cell::RefCell<u8>` cannot be shared between threads safely
 | 
				
			||||||
 | 
					static MY_LOCK: RefCell<u8> = RefCell::new(0);
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					--
 | 
				
			||||||
 | 
					[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=c76ef86e473d07117a1700e21fd45560)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					It's likely that this will
 | 
				
			||||||
 | 
					[change in the future](https://github.com/rust-lang/rfcs/blob/master/text/0911-const-fn.md) though.
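
In the meantime, a `Sync` wrapper is one way to get a similar result. The sketch below swaps `RefCell` for `std::sync::Mutex`; it assumes Rust 1.63 or newer, where `Mutex::new` is a `const fn`, and the `bump` helper is a hypothetical name used only here.

```rust
use std::sync::Mutex;

// `Mutex<u8>` is `Sync`, so it is accepted where `RefCell<u8>` was rejected.
static MY_LOCK: Mutex<u8> = Mutex::new(0);

// Hypothetical helper: increments the shared value and returns the result.
fn bump() -> u8 {
    let mut guard = MY_LOCK.lock().unwrap();
    *guard += 1;
    *guard
}

fn main() {
    bump();
    println!("value after two bumps: {}", bump());
}
```

The trade-off is that every access now takes a lock, which `RefCell` avoids by being limited to a single thread.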
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					### The `Sync` marker
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Which leads well to the next point: static variable types must implement the
 | 
				
			||||||
 | 
					[`Sync` marker](https://doc.rust-lang.org/std/marker/trait.Sync.html). Because they're globally
 | 
				
			||||||
 | 
					unique, it must be safe for you to access static variables from any thread at any time. Most
 | 
				
			||||||
 | 
					`struct` definitions automatically implement the `Sync` trait because they contain only elements
 | 
				
			||||||
 | 
					which themselves implement `Sync` (read more in the
 | 
				
			||||||
 | 
					[Nomicon](https://doc.rust-lang.org/nomicon/send-and-sync.html)). This is why earlier examples could
 | 
				
			||||||
 | 
					get away with initializing statics, even though we never included an `impl Sync for MyStruct` in the
 | 
				
			||||||
 | 
					code. To demonstrate this property, Rust refuses to compile our earlier example if we add a
 | 
				
			||||||
 | 
					non-`Sync` element to the `struct` definition:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					use std::cell::RefCell;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					struct MyStruct {
 | 
				
			||||||
 | 
					    x: u32,
 | 
				
			||||||
 | 
					    y: RefCell<u8>,
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// error[E0277]: `std::cell::RefCell<u8>` cannot be shared between threads safely
 | 
				
			||||||
 | 
					static MY_STRUCT: MyStruct = MyStruct {
 | 
				
			||||||
 | 
					    x: 8,
 | 
				
			||||||
 | 
					    y: RefCell::new(8)
 | 
				
			||||||
 | 
					};
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					--
 | 
				
			||||||
 | 
					[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=40074d0248f056c296b662dbbff97cfc)
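
The automatic `Sync` implementation can also be checked directly. In this sketch, `is_sync`, `Plain`, and `Guarded` are hypothetical names: the helper has a `T: Sync` bound, so a call to it only compiles for types the compiler considers `Sync`.

```rust
use std::sync::Mutex;

#[allow(dead_code)]
struct Plain {
    x: u32,
}

#[allow(dead_code)]
struct Guarded {
    x: u32,
    y: Mutex<u8>,
}

// Hypothetical helper: the bound does the real work at compile time;
// the returned `true` just gives us something to print.
fn is_sync<T: Sync>() -> bool {
    true
}

fn main() {
    // Neither struct has an explicit `impl Sync`; both get it automatically
    // because every field is `Sync`.
    println!("Plain is Sync: {}", is_sync::<Plain>());
    println!("Guarded is Sync: {}", is_sync::<Guarded>());
    // Uncommenting the next line fails to compile with E0277:
    // is_sync::<std::cell::RefCell<u8>>();
}
```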
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					### Interior mutability
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Finally, while `static mut` variables are allowed, mutating them is an `unsafe` operation. If we
 | 
				
			||||||
 | 
					want to stay in `safe` Rust, we can use interior mutability to accomplish similar goals:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					use std::sync::Once;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// This example adapted from https://doc.rust-lang.org/std/sync/struct.Once.html#method.call_once
 | 
				
			||||||
 | 
					static INIT: Once = Once::new();
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					fn main() {
 | 
				
			||||||
 | 
					    // Note that while `INIT` is declared immutable, we're still allowed
 | 
				
			||||||
 | 
					    // to mutate its interior
 | 
				
			||||||
 | 
					    INIT.call_once(|| println!("Initializing..."));
 | 
				
			||||||
 | 
					    // This code won't panic, as the interior of INIT was modified
 | 
				
			||||||
 | 
					    // as part of the previous `call_once`
 | 
				
			||||||
 | 
					    INIT.call_once(|| panic!("INIT was called twice!"));
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					--
 | 
				
			||||||
 | 
					[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=3ba003a981a7ed7400240caadd384d59)
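`Once` is not the only option here. As a small sketch of the same idea, an atomic integer can also be mutated through a shared, immutable `static`. The names `CALLS` and `record_call` are made up for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Declared without `mut`, but the atomic's interior can still change
static CALLS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical helper: bump the counter and return the updated value.
// No `unsafe` is needed because `fetch_add` only takes `&self`.
pub fn record_call() -> usize {
    CALLS.fetch_add(1, Ordering::SeqCst) + 1
}
```

Each call to `record_call` observes a larger value than the one before it, even though `CALLS` was never declared `static mut`.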
 | 
				
			||||||
							
								
								
									
										601
									
								
								blog/2019-02-06-stacking-up/_article.md
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
							@ -0,0 +1,601 @@
 | 
				
			|||||||
 | 
					---
 | 
				
			||||||
 | 
					layout: post
 | 
				
			||||||
 | 
					title: "Fixed Memory: Stacking Up"
 | 
				
			||||||
 | 
					description: "We don't need no allocator."
 | 
				
			||||||
 | 
					category:
 | 
				
			||||||
 | 
					tags: [rust, understanding-allocations]
 | 
				
			||||||
 | 
					---
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					`const` and `static` are perfectly fine, but it's relatively rare that we know at compile-time about
 | 
				
			||||||
 | 
					either values or references that will be the same for the duration of our program. Put another way,
 | 
				
			||||||
 | 
					it's not often the case that either you or your compiler knows how much memory your entire program
 | 
				
			||||||
 | 
					will ever need.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					However, there are still some optimizations the compiler can do if it knows how much memory
 | 
				
			||||||
 | 
					individual functions will need. Specifically, the compiler can make use of "stack" memory (as
 | 
				
			||||||
 | 
					opposed to "heap" memory) which can be managed far faster in both the short- and long-term. When
 | 
				
			||||||
 | 
					requesting memory, the [`push` instruction](http://www.cs.virginia.edu/~evans/cs216/guides/x86.html)
 | 
				
			||||||
 | 
					can typically complete in [1 or 2 cycles](https://agner.org/optimize/instruction_tables.ods) (<1
 | 
				
			||||||
 | 
					nanosecond on modern CPUs). Contrast that to heap memory which requires an allocator (specialized
 | 
				
			||||||
 | 
					software to track what memory is in use) to reserve space. When you're finished with stack memory,
 | 
				
			||||||
 | 
					the `pop` instruction runs in 1-3 cycles, as opposed to an allocator needing to worry about memory
 | 
				
			||||||
 | 
					fragmentation and other issues with the heap. All sorts of incredibly sophisticated techniques have
 | 
				
			||||||
 | 
					been used to design allocators:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- [Garbage Collection](<https://en.wikipedia.org/wiki/Garbage_collection_(computer_science)>)
 | 
				
			||||||
 | 
					  strategies like [Tracing](https://en.wikipedia.org/wiki/Tracing_garbage_collection) (used in
 | 
				
			||||||
 | 
					  [Java](https://www.oracle.com/technetwork/java/javase/tech/g1-intro-jsp-135488.html)) and
 | 
				
			||||||
 | 
					  [Reference counting](https://en.wikipedia.org/wiki/Reference_counting) (used in
 | 
				
			||||||
 | 
					  [Python](https://docs.python.org/3/extending/extending.html#reference-counts))
 | 
				
			||||||
 | 
					- Thread-local structures to prevent locking the allocator in
 | 
				
			||||||
 | 
					  [tcmalloc](https://jamesgolick.com/2013/5/19/how-tcmalloc-works.html)
 | 
				
			||||||
 | 
					- Arena structures used in [jemalloc](http://jemalloc.net/), which
 | 
				
			||||||
 | 
					  [until recently](https://blog.rust-lang.org/2019/01/17/Rust-1.32.0.html#jemalloc-is-removed-by-default)
 | 
				
			||||||
 | 
					  was the primary allocator for Rust programs!
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					But no matter how fast your allocator is, the principle remains: the fastest allocator is the one
 | 
				
			||||||
 | 
					you never use. As such, we're not going to discuss how exactly the
 | 
				
			||||||
 | 
					[`push` and `pop` instructions work](http://www.cs.virginia.edu/~evans/cs216/guides/x86.html), but
 | 
				
			||||||
 | 
					we'll focus instead on the conditions that enable the Rust compiler to use faster stack-based
 | 
				
			||||||
 | 
					allocation for variables.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					So, **how do we know when Rust will or will not use stack allocation for objects we create?**
 | 
				
			||||||
 | 
					Looking at other languages, it's often easy to delineate between stack and heap. Managed memory
 | 
				
			||||||
 | 
					languages (Python, Java,
 | 
				
			||||||
 | 
					[C#](https://blogs.msdn.microsoft.com/ericlippert/2010/09/30/the-truth-about-value-types/)) place
 | 
				
			||||||
 | 
					everything on the heap. JIT compilers ([PyPy](https://www.pypy.org/),
 | 
				
			||||||
 | 
					[HotSpot](https://www.oracle.com/technetwork/java/javase/tech/index-jsp-136373.html)) may optimize
 | 
				
			||||||
 | 
					some heap allocations away, but you should never assume it will happen. C makes things clear with
 | 
				
			||||||
 | 
					calls to special functions (like [malloc(3)](https://linux.die.net/man/3/malloc)) needed to access
 | 
				
			||||||
 | 
					heap memory. Old C++ has the [`new`](https://stackoverflow.com/a/655086/1454178) keyword, though
 | 
				
			||||||
 | 
					modern C++/C++11 is more complicated with [RAII](https://en.cppreference.com/w/cpp/language/raii).
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					For Rust, we can summarize as follows: **stack allocation will be used for everything that doesn't
 | 
				
			||||||
 | 
					involve "smart pointers" and collections**. We'll skip over a precise definition of the term "smart
 | 
				
			||||||
 | 
					pointer" for now, and instead discuss what we should watch for to understand when stack and heap
 | 
				
			||||||
 | 
					memory regions are used:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					1. Stack manipulation instructions (`push`, `pop`, and `add`/`sub` of the `rsp` register) indicate
 | 
				
			||||||
 | 
					   allocation of stack memory:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					   ```rust
 | 
				
			||||||
 | 
					   pub fn stack_alloc(x: u32) -> u32 {
 | 
				
			||||||
 | 
					       // Space for `y` is allocated by subtracting from `rsp`,
 | 
				
			||||||
 | 
					       // and then populated
 | 
				
			||||||
 | 
					       let y = [1u8, 2, 3, 4];
 | 
				
			||||||
 | 
					       // Space for `y` is deallocated by adding back to `rsp`
 | 
				
			||||||
 | 
					       x
 | 
				
			||||||
 | 
					   }
 | 
				
			||||||
 | 
					   ```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					   -- [Compiler Explorer](https://godbolt.org/z/5WSgc9)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					2. Tracking when exactly heap allocation calls occur is difficult. It's typically easier to watch
 | 
				
			||||||
 | 
					   for `call core::ptr::real_drop_in_place`, and infer that a heap allocation happened in the recent
 | 
				
			||||||
 | 
					   past:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					   ```rust
 | 
				
			||||||
 | 
					   pub fn heap_alloc(x: usize) -> usize {
 | 
				
			||||||
 | 
					       // Space for elements in a vector has to be allocated
 | 
				
			||||||
 | 
					       // on the heap, and is then de-allocated once the
 | 
				
			||||||
 | 
					       // vector goes out of scope
 | 
				
			||||||
 | 
					       let y: Vec<u8> = Vec::with_capacity(x);
 | 
				
			||||||
 | 
					       x
 | 
				
			||||||
 | 
					   }
 | 
				
			||||||
 | 
					   ```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					   -- [Compiler Explorer](https://godbolt.org/z/epfgoQ) (`real_drop_in_place` happens on line 1317)
 | 
				
			||||||
 | 
					   <span style="font-size: .8em">Note: While the
 | 
				
			||||||
 | 
					   [`Drop` trait](https://doc.rust-lang.org/std/ops/trait.Drop.html) is
 | 
				
			||||||
 | 
					   [called for stack-allocated objects](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=87edf374d8983816eb3d8cfeac657b46),
 | 
				
			||||||
 | 
					   most `Drop` implementations in the Rust standard library belong to types that involve heap
 | 
				
			||||||
 | 
					   allocation.</span>
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					3. If you don't want to inspect the assembly, use a custom allocator that's able to track and alert
 | 
				
			||||||
 | 
					   when heap allocations occur. Crates like
 | 
				
			||||||
 | 
					   [`alloc_counter`](https://crates.io/crates/alloc_counter) are designed for exactly this purpose.
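To make the third approach concrete, here is a minimal hand-rolled sketch of a counting allocator built on the standard `GlobalAlloc` trait. It is not the `alloc_counter` API; the names `CountingAllocator`, `ALLOCATIONS`, and `allocations_during` are assumptions made up for this illustration, and the count can include allocations made by other threads.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical wrapper around the system allocator that counts requests
pub struct CountingAllocator;

pub static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // Record the request, then defer to the real allocator
        ALLOCATIONS.fetch_add(1, Ordering::SeqCst);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAllocator = CountingAllocator;

// Run `f` and report how many heap allocations happened while it ran
pub fn allocations_during<F: FnOnce()>(f: F) -> usize {
    let before = ALLOCATIONS.load(Ordering::SeqCst);
    f();
    ALLOCATIONS.load(Ordering::SeqCst) - before
}
```

Wrapping `Vec::with_capacity(16)` in `allocations_during` should report at least one allocation, while a closure that only builds a fixed-size array should normally report none in a single-threaded program.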
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					With all that in mind, let's talk about situations in which we're guaranteed to use stack memory:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- Structs are created on the stack.
 | 
				
			||||||
 | 
					- Function arguments are passed on the stack or in CPU registers, meaning the
 | 
				
			||||||
 | 
					  [`#[inline]` attribute](https://doc.rust-lang.org/reference/attributes.html#inline-attribute) will
 | 
				
			||||||
 | 
					  not change the memory region used.
 | 
				
			||||||
 | 
					- Enums and unions are stack-allocated.
 | 
				
			||||||
 | 
					- [Arrays](https://doc.rust-lang.org/std/primitive.array.html) are always stack-allocated.
 | 
				
			||||||
 | 
					- Closures capture their arguments on the stack.
 | 
				
			||||||
 | 
					- Generics will use stack allocation, even with dynamic dispatch.
 | 
				
			||||||
 | 
					- [`Copy`](https://doc.rust-lang.org/std/marker/trait.Copy.html) types are guaranteed to be
 | 
				
			||||||
 | 
					  stack-allocated, and copying them will be done in stack memory.
 | 
				
			||||||
 | 
					- [`Iterator`s](https://doc.rust-lang.org/std/iter/trait.Iterator.html) in the standard library are
 | 
				
			||||||
 | 
					  stack-allocated even when iterating over heap-based collections.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					# Structs
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The simplest case comes first. When creating vanilla `struct` objects, we use stack memory to hold
 | 
				
			||||||
 | 
					their contents:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					struct Point {
 | 
				
			||||||
 | 
					    x: u64,
 | 
				
			||||||
 | 
					    y: u64,
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					struct Line {
 | 
				
			||||||
 | 
					    a: Point,
 | 
				
			||||||
 | 
					    b: Point,
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn make_line() {
 | 
				
			||||||
 | 
					    // `origin` is stored in the first 16 bytes of memory
 | 
				
			||||||
 | 
					    // starting at location `rsp`
 | 
				
			||||||
 | 
					    let origin = Point { x: 0, y: 0 };
 | 
				
			||||||
 | 
					    // `point` makes up the next 16 bytes of memory
 | 
				
			||||||
 | 
					    let point = Point { x: 1, y: 2 };
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    // When creating `ray`, we just move the content out of
 | 
				
			||||||
 | 
					    // `origin` and `point` into the next 32 bytes of memory
 | 
				
			||||||
 | 
					    let ray = Line { a: origin, b: point };
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/vri9BE)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Note that while some extra-fancy instructions are used for memory manipulation in the assembly, the
 | 
				
			||||||
 | 
					`sub rsp, 64` instruction indicates we're still working with the stack.
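If reading assembly isn't appealing, the byte counts in the comments above can also be checked with `std::mem::size_of`. This is a small sketch; the helper name `struct_sizes` is made up for illustration.

```rust
use std::mem::size_of;

#[allow(dead_code)]
pub struct Point {
    x: u64,
    y: u64,
}

#[allow(dead_code)]
pub struct Line {
    a: Point,
    b: Point,
}

// Hypothetical helper: sizes of `Point` and `Line` in bytes
pub fn struct_sizes() -> (usize, usize) {
    (size_of::<Point>(), size_of::<Line>())
}
```

Two `Point`s at 16 bytes each plus one `Line` at 32 bytes account for the 64 bytes subtracted from `rsp`.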
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					# Function arguments
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Have you ever wondered how functions communicate with each other? Like, once the variables are given
 | 
				
			||||||
 | 
					to you, everything's fine. But how do you "give" those variables to another function? How do you get
 | 
				
			||||||
 | 
					the results back afterward? The answer: the compiler arranges memory and assembly instructions using
 | 
				
			||||||
 | 
					a pre-determined [calling convention](http://llvm.org/docs/LangRef.html#calling-conventions). This
 | 
				
			||||||
 | 
					convention governs the rules around where arguments needed by a function will be located (either in
 | 
				
			||||||
 | 
					memory offsets relative to the stack pointer `rsp`, or in other registers), and where the results
 | 
				
			||||||
 | 
					can be found once the function has finished. And when multiple languages agree on what the calling
 | 
				
			||||||
 | 
					conventions are, you can do things like having [Go call Rust code](https://blog.filippo.io/rustgo/)!
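As a rough sketch of what agreeing on a convention looks like from the Rust side, a function can opt into the platform's C calling convention with `extern "C"`. The function name and signature below are invented for illustration, and exposing it to another language would additionally require a stable symbol name and linking steps that are not shown here.

```rust
// Uses the C calling convention instead of Rust's unspecified default,
// so a caller written in another language knows where to place the
// arguments and where to find the result
pub extern "C" fn distance_squared(x1: i64, y1: i64, x2: i64, y2: i64) -> i64 {
    let dx = x1 - x2;
    let dy = y1 - y2;
    dx * dx + dy * dy
}
```

From within Rust, the function is still called like any other; only the generated argument-passing code changes.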
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Put simply: it's the compiler's job to figure out how to call other functions, and you can assume
 | 
				
			||||||
 | 
					that the compiler is good at its job.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					We can see this in action using a simple example:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					struct Point {
 | 
				
			||||||
 | 
					    x: i64,
 | 
				
			||||||
 | 
					    y: i64,
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// We use integer division operations to keep
 | 
				
			||||||
 | 
					// the assembly clean, understanding the result
 | 
				
			||||||
 | 
					// isn't accurate.
 | 
				
			||||||
 | 
					fn distance(a: &Point, b: &Point) -> i64 {
 | 
				
			||||||
 | 
					    // Immediately subtract from `rsp` the bytes needed
 | 
				
			||||||
 | 
					    // to hold all the intermediate results - this is
 | 
				
			||||||
 | 
					    // the stack allocation step
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    // The compiler used the `rdi` and `rsi` registers
 | 
				
			||||||
 | 
					    // to pass our arguments, so read them in
 | 
				
			||||||
 | 
					    let x1 = a.x;
 | 
				
			||||||
 | 
					    let x2 = b.x;
 | 
				
			||||||
 | 
					    let y1 = a.y;
 | 
				
			||||||
 | 
					    let y2 = b.y;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    // Do the actual math work
 | 
				
			||||||
 | 
					    let x_pow = (x1 - x2) * (x1 - x2);
 | 
				
			||||||
 | 
					    let y_pow = (y1 - y2) * (y1 - y2);
 | 
				
			||||||
 | 
					    let squared = x_pow + y_pow;
 | 
				
			||||||
 | 
					    squared / squared
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    // Our final result will be stored in the `rax` register
 | 
				
			||||||
 | 
					    // so that our caller knows where to retrieve it.
 | 
				
			||||||
 | 
					    // Finally, add back to `rsp` the stack memory that is
 | 
				
			||||||
 | 
					    // now ready to be used by other functions.
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn total_distance() {
 | 
				
			||||||
 | 
					    let start = Point { x: 1, y: 2 };
 | 
				
			||||||
 | 
					    let middle = Point { x: 3, y: 4 };
 | 
				
			||||||
 | 
					    let end = Point { x: 5, y: 6 };
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    let _dist_1 = distance(&start, &middle);
 | 
				
			||||||
 | 
					    let _dist_2 = distance(&middle, &end);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/Qmx4ST)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					As a consequence of function arguments never using heap memory, we can also infer that functions
 | 
				
			||||||
 | 
					using the `#[inline]` attribute also do not heap allocate. But better than inferring, we can look
 | 
				
			||||||
 | 
					at the assembly to prove it:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					struct Point {
 | 
				
			||||||
 | 
					    x: i64,
 | 
				
			||||||
 | 
					    y: i64,
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// Note that there is no `distance` function in the assembly output,
 | 
				
			||||||
 | 
					// and the total line count goes from 229 with inlining off
 | 
				
			||||||
 | 
					// to 306 with inline on. Even still, no heap allocations occur.
 | 
				
			||||||
 | 
					#[inline(always)]
 | 
				
			||||||
 | 
					fn distance(a: &Point, b: &Point) -> i64 {
 | 
				
			||||||
 | 
					    let x1 = a.x;
 | 
				
			||||||
 | 
					    let x2 = b.x;
 | 
				
			||||||
 | 
					    let y1 = a.y;
 | 
				
			||||||
 | 
					    let y2 = b.y;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    let x_pow = (a.x - b.x) * (a.x - b.x);
 | 
				
			||||||
 | 
					    let y_pow = (a.y - b.y) * (a.y - b.y);
 | 
				
			||||||
 | 
					    let squared = x_pow + y_pow;
 | 
				
			||||||
 | 
					    squared / squared
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn total_distance() {
 | 
				
			||||||
 | 
					    let start = Point { x: 1, y: 2 };
 | 
				
			||||||
 | 
					    let middle = Point { x: 3, y: 4 };
 | 
				
			||||||
 | 
					    let end = Point { x: 5, y: 6 };
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    let _dist_1 = distance(&start, &middle);
 | 
				
			||||||
 | 
					    let _dist_2 = distance(&middle, &end);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/30Sh66)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Finally, passing by value (copying a
					[`Copy`](https://doc.rust-lang.org/std/marker/trait.Copy.html) type or moving ownership) and passing
					by reference (passing a pointer) may have slightly different layouts in assembly, but will
 | 
				
			||||||
 | 
					still use either stack memory or CPU registers:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					pub struct Point {
 | 
				
			||||||
 | 
					    x: i64,
 | 
				
			||||||
 | 
					    y: i64,
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// Moving values
 | 
				
			||||||
 | 
					pub fn distance_moved(a: Point, b: Point) -> i64 {
 | 
				
			||||||
 | 
					    let x1 = a.x;
 | 
				
			||||||
 | 
					    let x2 = b.x;
 | 
				
			||||||
 | 
					    let y1 = a.y;
 | 
				
			||||||
 | 
					    let y2 = b.y;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    let x_pow = (x1 - x2) * (x1 - x2);
 | 
				
			||||||
 | 
					    let y_pow = (y1 - y2) * (y1 - y2);
 | 
				
			||||||
 | 
					    let squared = x_pow + y_pow;
 | 
				
			||||||
 | 
					    squared / squared
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// Borrowing values has two extra `mov` instructions on lines 21 and 22
 | 
				
			||||||
 | 
					pub fn distance_borrowed(a: &Point, b: &Point) -> i64 {
 | 
				
			||||||
 | 
					    let x1 = a.x;
 | 
				
			||||||
 | 
					    let x2 = b.x;
 | 
				
			||||||
 | 
					    let y1 = a.y;
 | 
				
			||||||
 | 
					    let y2 = b.y;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    let x_pow = (x1 - x2) * (x1 - x2);
 | 
				
			||||||
 | 
					    let y_pow = (y1 - y2) * (y1 - y2);
 | 
				
			||||||
 | 
					    let squared = x_pow + y_pow;
 | 
				
			||||||
 | 
					    squared / squared
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/06hGiv)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					# Enums
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					If you've ever worried that wrapping your types in
 | 
				
			||||||
 | 
					[`Option`](https://doc.rust-lang.org/stable/core/option/enum.Option.html) or
 | 
				
			||||||
 | 
					[`Result`](https://doc.rust-lang.org/stable/core/result/enum.Result.html) would finally make them
 | 
				
			||||||
 | 
					large enough that Rust decides to use heap allocation instead, fear no longer: `enum` and union
 | 
				
			||||||
 | 
					types don't use heap allocation:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					enum MyEnum {
 | 
				
			||||||
 | 
					    Small(u8),
 | 
				
			||||||
 | 
					    Large(u64)
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					struct MyStruct {
 | 
				
			||||||
 | 
					    x: MyEnum,
 | 
				
			||||||
 | 
					    y: MyEnum,
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn enum_compare() {
 | 
				
			||||||
 | 
					    let x = MyEnum::Small(0);
 | 
				
			||||||
 | 
					    let y = MyEnum::Large(0);
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    let z = MyStruct { x, y };
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    let opt = Option::Some(z);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/HK7zBx)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Because the size of an `enum` is bounded by the size of its largest variant plus a tag (and any padding), the compiler can
 | 
				
			||||||
 | 
					predict how much memory is used no matter which variant of an enum is currently stored in a
 | 
				
			||||||
 | 
					variable. Thus, enums and unions have no need of heap allocation. There's unfortunately not a great
 | 
				
			||||||
 | 
					way to show this in assembly, so I'll instead point you to the
 | 
				
			||||||
 | 
					[`core::mem::size_of`](https://doc.rust-lang.org/stable/core/mem/fn.size_of.html#size-of-enums)
 | 
				
			||||||
 | 
					documentation.
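A quick way to see this without assembly is to print the sizes directly. The sketch below reuses `MyEnum` from the example above; the helper name `enum_sizes` is made up, and exact numbers for `MyEnum` depend on the target (16 bytes is typical on 64-bit platforms).

```rust
use std::mem::size_of;

#[allow(dead_code)]
pub enum MyEnum {
    Small(u8),
    Large(u64),
}

// Hypothetical helper returning, in bytes:
// (size of MyEnum, size of &u8, size of Option<&u8>)
pub fn enum_sizes() -> (usize, usize, usize) {
    (
        // Largest variant (`u64`) plus room for the tag
        size_of::<MyEnum>(),
        size_of::<&u8>(),
        // References are never null, so `None` reuses the null value
        // and no separate tag is needed
        size_of::<Option<&u8>>(),
    )
}
```

In both cases the size is a compile-time constant, which is all the compiler needs to reserve stack space for the value.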
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					# Arrays
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The array type is guaranteed to be stack allocated, which is why the array size must be declared.
 | 
				
			||||||
 | 
					Interestingly enough, this can be used to cause safe Rust programs to crash:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					// 256 bytes
 | 
				
			||||||
 | 
					#[derive(Default)]
 | 
				
			||||||
 | 
					struct TwoFiftySix {
 | 
				
			||||||
 | 
					    _a: [u64; 32]
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// 8 kilobytes
 | 
				
			||||||
 | 
					#[derive(Default)]
 | 
				
			||||||
 | 
					struct EightK {
 | 
				
			||||||
 | 
					    _a: [TwoFiftySix; 32]
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// 256 kilobytes
 | 
				
			||||||
 | 
					#[derive(Default)]
 | 
				
			||||||
 | 
					struct TwoFiftySixK {
 | 
				
			||||||
 | 
					    _a: [EightK; 32]
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// 8 megabytes - exceeds space typically provided for the stack,
 | 
				
			||||||
 | 
					// though the kernel can be instructed to allocate more.
 | 
				
			||||||
 | 
					// On Linux, you can check stack size using `ulimit -s`
 | 
				
			||||||
 | 
					#[derive(Default)]
 | 
				
			||||||
 | 
					struct EightM {
 | 
				
			||||||
 | 
					    _a: [TwoFiftySixK; 32]
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					fn main() {
 | 
				
			||||||
 | 
					    // Because we already have things in stack memory
 | 
				
			||||||
 | 
					    // (like the current function call stack), allocating another
 | 
				
			||||||
 | 
					    // eight megabytes of stack memory crashes the program
 | 
				
			||||||
 | 
					    let _x = EightM::default();
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					--
 | 
				
			||||||
 | 
					[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=587a6380a4914bcbcef4192c90c01dc4)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					There aren't any security implications of this (no memory corruption occurs), but it's good to note
 | 
				
			||||||
 | 
					that the Rust compiler won't move arrays into heap memory even if they can be reasonably expected to
 | 
				
			||||||
 | 
					overflow the stack.
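If a buffer of that size is genuinely needed, the heap has to be requested explicitly. The sketch below is one common way to do so and is an addition to the original example; the function name `big_buffer` is made up. Note that `Box::new([0u64; 1 << 20])` is deliberately avoided, since the array may be built on the stack before being moved into the box.

```rust
// Hypothetical helper: eight megabytes of zeroed `u64`s on the heap
pub fn big_buffer() -> Box<[u64]> {
    // `vec!` asks the allocator for the memory directly, so the
    // full buffer never has to exist as a stack value first
    vec![0u64; 1 << 20].into_boxed_slice()
}
```

Only the pointer and length of the boxed slice live on the stack; the contents sit in heap memory.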
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					# Closures
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Rules for how anonymous functions capture their arguments are typically language-specific. In Java,
 | 
				
			||||||
 | 
					[Lambda Expressions](https://docs.oracle.com/javase/tutorial/java/javaOO/lambdaexpressions.html) are
 | 
				
			||||||
 | 
					actually objects created on the heap that capture local primitives by copying, and capture local
 | 
				
			||||||
 | 
					non-primitives as (`final`) references.
 | 
				
			||||||
 | 
					[Python](https://docs.python.org/3.7/reference/expressions.html#lambda) and
 | 
				
			||||||
 | 
					[JavaScript](https://javascriptweblog.wordpress.com/2010/10/25/understanding-javascript-closures/)
 | 
				
			||||||
 | 
					both bind _everything_ by reference normally, but Python can also
 | 
				
			||||||
 | 
					[capture values](https://stackoverflow.com/a/235764/1454178) and JavaScript has
 | 
				
			||||||
 | 
					[Arrow functions](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/Arrow_functions).
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					In Rust, arguments to closures are the same as arguments to other functions; closures are simply
 | 
				
			||||||
 | 
					functions that don't have a declared name. Some weird ordering of the stack may be required to
 | 
				
			||||||
 | 
					handle them, but it's the compiler's responsibility to figure that out.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Each example below has the same effect, but a different assembly implementation. In the simplest
 | 
				
			||||||
 | 
					case, we immediately run a closure returned by another function. Because we don't store a reference
 | 
				
			||||||
 | 
					to the closure, the stack memory needed to store the captured values is contiguous:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					fn my_func() -> impl FnOnce() {
 | 
				
			||||||
 | 
					    let x = 24;
 | 
				
			||||||
 | 
					    // Note that this closure in assembly looks exactly like
 | 
				
			||||||
 | 
					    // any other function; you even use the `call` instruction
 | 
				
			||||||
 | 
					    // to start running it.
 | 
				
			||||||
 | 
					    move || { x; }
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn immediate() {
 | 
				
			||||||
 | 
					    my_func()();
 | 
				
			||||||
 | 
					    my_func()();
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/mgJ2zl), 25 total assembly instructions
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					If we store a reference to the closure, the Rust compiler keeps values it needs in the stack memory
 | 
				
			||||||
 | 
					of the original function. Getting the details right is a bit harder, so the instruction count goes
 | 
				
			||||||
 | 
					up even though this code is functionally equivalent to our original example:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					pub fn simple_reference() {
 | 
				
			||||||
 | 
					    let x = my_func();
 | 
				
			||||||
 | 
					    let y = my_func();
 | 
				
			||||||
 | 
					    y();
 | 
				
			||||||
 | 
					    x();
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/K_dj5n), 55 total assembly instructions
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Even things like variable order can make a difference in instruction count:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					pub fn complex() {
 | 
				
			||||||
 | 
					    let x = my_func();
 | 
				
			||||||
 | 
					    let y = my_func();
 | 
				
			||||||
 | 
					    x();
 | 
				
			||||||
 | 
					    y();
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/p37qFl), 70 total assembly instructions
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					In every circumstance though, the compiler ensured that no heap allocations were necessary.
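One way to convince yourself of this without counting instructions is to measure a closure as a plain value. The sketch below uses `std::mem::size_of_val`; the helper name `closure_sizes` is an assumption for illustration, and closure layout is technically unspecified, so treat the numbers as what current compilers produce in practice.

```rust
use std::mem::size_of_val;

// Hypothetical helper returning, in bytes:
// (size of a closure with no captures, size of one capturing an `i32`)
pub fn closure_sizes() -> (usize, usize) {
    let no_capture = || 24;

    let x: i32 = 24;
    // `move` copies `x` into the closure value itself
    let with_capture = move || x;

    let sizes = (size_of_val(&no_capture), size_of_val(&with_capture));

    // Call both closures so neither is reported as unused
    let _ = (no_capture(), with_capture());
    sizes
}
```

A closure is just an anonymous struct holding its captures, so it is sized and stored like any other stack value.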
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					# Generics
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Traits in Rust come in two broad forms: static dispatch (monomorphization, `impl Trait`) and dynamic
 | 
				
			||||||
 | 
					dispatch (trait objects, `dyn Trait`). While dynamic dispatch is often _associated_ with trait
 | 
				
			||||||
 | 
					objects being stored in the heap, dynamic dispatch can be used with stack allocated objects as well:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					trait GetInt {
 | 
				
			||||||
 | 
					    fn get_int(&self) -> u64;
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// vtable stored at section L__unnamed_1
 | 
				
			||||||
 | 
					struct WhyNotU8 {
 | 
				
			||||||
 | 
					    x: u8
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					impl GetInt for WhyNotU8 {
 | 
				
			||||||
 | 
					    fn get_int(&self) -> u64 {
 | 
				
			||||||
 | 
					        self.x as u64
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// vtable stored at section L__unnamed_2
 | 
				
			||||||
 | 
					struct ActualU64 {
 | 
				
			||||||
 | 
					    x: u64
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					impl GetInt for ActualU64 {
 | 
				
			||||||
 | 
					    fn get_int(&self) -> u64 {
 | 
				
			||||||
 | 
					        self.x
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// `&dyn` declares that we want to use dynamic dispatch
 | 
				
			||||||
 | 
					// rather than monomorphization, so there is only one
 | 
				
			||||||
 | 
					// `retrieve_int` function that shows up in the final assembly.
 | 
				
			||||||
 | 
					// If we used generics, there would be one implementation of
 | 
				
			||||||
 | 
					// `retrieve_int` for each type that implements `GetInt`.
 | 
				
			||||||
 | 
					pub fn retrieve_int(u: &dyn GetInt) {
 | 
				
			||||||
 | 
					    // In the assembly, we just call an address given to us
 | 
				
			||||||
 | 
					    // in the `rsi` register and hope that it was set up
 | 
				
			||||||
 | 
					    // correctly when this function was invoked.
 | 
				
			||||||
 | 
					    let x = u.get_int();
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn do_call() {
 | 
				
			||||||
 | 
					    // Note that even though the vtable for `WhyNotU8` and
 | 
				
			||||||
 | 
					    // `ActualU64` includes a pointer to
 | 
				
			||||||
 | 
					    // `core::ptr::real_drop_in_place`, it is never invoked.
 | 
				
			||||||
 | 
					    let a = WhyNotU8 { x: 0 };
 | 
				
			||||||
 | 
					    let b = ActualU64 { x: 0 };
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    retrieve_int(&a);
 | 
				
			||||||
 | 
					    retrieve_int(&b);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/u_yguS)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					It's hard to imagine practical situations where dynamic dispatch would be used for objects that
 | 
				
			||||||
 | 
					aren't heap allocated, but it technically can be done.
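
One place it can come up is when mixing a few different concrete types in a fixed-size collection. The sketch below reuses the `GetInt`, `WhyNotU8`, and `ActualU64` definitions from above; the `sum_mixed` helper is a hypothetical name added for illustration. Everything here lives in stack memory, and the calls still go through a vtable:

```rust
trait GetInt {
    fn get_int(&self) -> u64;
}

struct WhyNotU8 {
    x: u8,
}
impl GetInt for WhyNotU8 {
    fn get_int(&self) -> u64 {
        u64::from(self.x)
    }
}

struct ActualU64 {
    x: u64,
}
impl GetInt for ActualU64 {
    fn get_int(&self) -> u64 {
        self.x
    }
}

// Hypothetical helper: an array of trait-object references.
// Both the array and the objects it points at are stack-allocated.
pub fn sum_mixed() -> u64 {
    let a = WhyNotU8 { x: 3 };
    let b = ActualU64 { x: 4 };

    // Each `&dyn GetInt` is a (data pointer, vtable pointer) pair
    let items: [&dyn GetInt; 2] = [&a, &b];
    items.iter().map(|item| item.get_int()).sum()
}
```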
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					# Copy types
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Understanding move semantics and copy semantics in Rust is weird at first. The Rust docs
 | 
				
			||||||
 | 
					[go into detail](https://doc.rust-lang.org/stable/core/marker/trait.Copy.html) far better than can
 | 
				
			||||||
 | 
					be addressed here, so I'll leave them to do the job. From a memory perspective though, their
 | 
				
			||||||
 | 
					guideline is reasonable:
 | 
				
			||||||
 | 
					[if your type can implement `Copy`, it should](https://doc.rust-lang.org/stable/core/marker/trait.Copy.html#when-should-my-type-be-copy).
 | 
				
			||||||
 | 
					While there are potential speed tradeoffs to _benchmark_ when discussing `Copy` (move semantics for
 | 
				
			||||||
 | 
					stack objects vs. copying stack pointers vs. copying stack `struct`s), _it's impossible for `Copy`
 | 
				
			||||||
 | 
					to introduce a heap allocation_.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					But why is this the case? Fundamentally, it's because the language controls what `Copy` means -
 | 
				
			||||||
 | 
					["the behavior of `Copy` is not overloadable"](https://doc.rust-lang.org/std/marker/trait.Copy.html#whats-the-difference-between-copy-and-clone)
 | 
				
			||||||
 | 
					because it's a marker trait. From there we'll note that a type
 | 
				
			||||||
 | 
					[can implement `Copy`](https://doc.rust-lang.org/std/marker/trait.Copy.html#when-can-my-type-be-copy)
 | 
				
			||||||
 | 
					if (and only if) its components implement `Copy`, and that
 | 
				
			||||||
 | 
					[no heap-allocated types implement `Copy`](https://doc.rust-lang.org/std/marker/trait.Copy.html#implementors).
 | 
				
			||||||
 | 
					Thus, assignments involving heap types are always move semantics, and new heap allocations won't
 | 
				
			||||||
 | 
					occur because of implicit operator behavior.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					#[derive(Clone)]
 | 
				
			||||||
 | 
					struct Cloneable {
 | 
				
			||||||
 | 
					    x: Box<u64>
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// error[E0204]: the trait `Copy` may not be implemented for this type
 | 
				
			||||||
 | 
					#[derive(Copy, Clone)]
 | 
				
			||||||
 | 
					struct NotCopyable {
 | 
				
			||||||
 | 
					    x: Box<u64>
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/VToRuK)
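
To make the difference a bit more concrete, here's a minimal sketch contrasting an implicit copy with an explicit clone. The `Pixel` and `Label` types and the `copy_vs_clone` function are made up for this illustration; the point is that a new heap buffer only shows up where we wrote `.clone()` ourselves:

```rust
// Plain data: every field is `Copy`, so the struct can be too
#[derive(Debug, Clone, Copy, PartialEq)]
struct Pixel {
    r: u8,
    g: u8,
    b: u8,
}

// Owns a heap buffer through `String`, so it can only be `Clone`
#[derive(Debug, Clone)]
struct Label {
    text: String,
}

pub fn copy_vs_clone() -> (bool, bool) {
    let p1 = Pixel { r: 1, g: 2, b: 3 };
    // Implicit bitwise copy; `p1` is still usable afterward
    let p2 = p1;

    let l1 = Label { text: String::from("heap") };
    // Explicit clone; this is where a second heap buffer is allocated
    let l2 = l1.clone();

    // The copies compare equal, and the two labels own distinct buffers
    (p1 == p2, l1.text.as_ptr() != l2.text.as_ptr())
}
```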
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					# Iterators
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					In managed memory languages (like
 | 
				
			||||||
 | 
					[Java](https://www.youtube.com/watch?v=bSkpMdDe4g4&feature=youtu.be&t=357)), there's a subtle
 | 
				
			||||||
 | 
					difference between these two code samples:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```java
 | 
				
			||||||
 | 
					public static long sum_for(List<Long> vals) {
 | 
				
			||||||
 | 
					    long sum = 0;
 | 
				
			||||||
 | 
					    // Regular for loop
 | 
				
			||||||
 | 
					    for (int i = 0; i < vals.size(); i++) {
					        sum += vals.get(i);
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					    return sum;
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					public static long sum_foreach(List<Long> vals) {
 | 
				
			||||||
 | 
					    long sum = 0;
 | 
				
			||||||
 | 
					    // "Foreach" loop - uses iteration
 | 
				
			||||||
 | 
					    for (Long l : vals) {
 | 
				
			||||||
 | 
					        sum += l;
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					    return sum;
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					In the `sum_for` function, nothing terribly interesting happens. In `sum_foreach`, an object of type
 | 
				
			||||||
 | 
					[`Iterator`](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/Iterator.html)
 | 
				
			||||||
 | 
					is allocated on the heap, and will eventually be garbage-collected. This isn't a great design;
 | 
				
			||||||
 | 
					iterators are often transient objects that you need during a function and can discard once the
 | 
				
			||||||
 | 
					function ends. Sounds exactly like the issue stack-allocated objects address, no?
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					In Rust, iterators are allocated on the stack. The objects to iterate over are almost certainly in
 | 
				
			||||||
 | 
					heap memory, but the iterator itself
 | 
				
			||||||
 | 
					([`Iter`](https://doc.rust-lang.org/std/slice/struct.Iter.html)) doesn't need to use the heap. In
 | 
				
			||||||
 | 
					each of the examples below we iterate over a collection, but never use heap allocation:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					use std::collections::HashMap;
 | 
				
			||||||
 | 
					// There's a lot of assembly generated, but if you search in the text,
 | 
				
			||||||
 | 
					// there are no references to `real_drop_in_place` anywhere.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn sum_vec(x: &Vec<u32>) {
 | 
				
			||||||
 | 
					    let mut s = 0;
 | 
				
			||||||
 | 
					    // Basic iteration over vectors doesn't need allocation
 | 
				
			||||||
 | 
					    for y in x {
 | 
				
			||||||
 | 
					        s += y;
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn sum_enumerate(x: &Vec<u32>) {
 | 
				
			||||||
 | 
					    let mut s = 0;
 | 
				
			||||||
 | 
					    // More complex iterators are just fine too
 | 
				
			||||||
 | 
					    for (_i, y) in x.iter().enumerate() {
 | 
				
			||||||
 | 
					        s += y;
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn sum_hm(x: &HashMap<u32, u32>) {
 | 
				
			||||||
 | 
					    let mut s = 0;
 | 
				
			||||||
 | 
					    // And it's not just Vec, all types will allocate the iterator
 | 
				
			||||||
 | 
					    // on stack memory
 | 
				
			||||||
 | 
					    for y in x.values() {
 | 
				
			||||||
 | 
					        s += y;
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/FTT3CT)
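
The same holds for iterators we write ourselves. As a rough sketch (the `Countdown` type and both helper functions are invented for this example), an iterator is just a struct with a `next` method, and that struct sits on the stack like any other:

```rust
// Hypothetical iterator: counts down from `remaining` to 1
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            None
        } else {
            let current = self.remaining;
            self.remaining -= 1;
            Some(current)
        }
    }
}

pub fn sum_countdown(start: u32) -> u32 {
    // The iterator state is a single `u32` in this stack frame
    let iter = Countdown { remaining: start };
    iter.sum()
}

pub fn countdown_size() -> usize {
    // The whole iterator is as large as its one field
    std::mem::size_of::<Countdown>()
}
```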
 | 
				
			||||||
							
								
								
									
										604
									
								
								blog/2019-02-06-stacking-up/index.mdx
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
							@ -0,0 +1,604 @@
 | 
				
			|||||||
 | 
					---
 | 
				
			||||||
 | 
					slug: 2019/02/stacking-up
 | 
				
			||||||
 | 
					title: "Allocations in Rust: Fixed memory"
 | 
				
			||||||
 | 
					date: 2019-02-06 12:00:00
 | 
				
			||||||
 | 
					authors: [bspeice]
 | 
				
			||||||
 | 
					tags: []
 | 
				
			||||||
 | 
					---
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					`const` and `static` are perfectly fine, but it's relatively rare that we know at compile-time about
 | 
				
			||||||
 | 
					either values or references that will be the same for the duration of our program. Put another way,
 | 
				
			||||||
 | 
					it's not often the case that either you or your compiler knows how much memory your entire program
 | 
				
			||||||
 | 
					will ever need.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					However, there are still some optimizations the compiler can do if it knows how much memory
 | 
				
			||||||
 | 
					individual functions will need. Specifically, the compiler can make use of "stack" memory (as
 | 
				
			||||||
 | 
					opposed to "heap" memory) which can be managed far faster in both the short- and long-term.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					<!-- truncate -->
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					When requesting memory, the [`push` instruction](http://www.cs.virginia.edu/~evans/cs216/guides/x86.html)
 | 
				
			||||||
 | 
					can typically complete in [1 or 2 cycles](https://agner.org/optimize/instruction_tables.ods) (<1ns
 | 
				
			||||||
 | 
					on modern CPUs). Contrast that to heap memory which requires an allocator (specialized
 | 
				
			||||||
 | 
					software to track what memory is in use) to reserve space. When you're finished with stack memory,
 | 
				
			||||||
 | 
					the `pop` instruction runs in 1-3 cycles, as opposed to an allocator needing to worry about memory
 | 
				
			||||||
 | 
					fragmentation and other issues with the heap. All sorts of incredibly sophisticated techniques have
 | 
				
			||||||
 | 
					been used to design allocators:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- [Garbage Collection](<https://en.wikipedia.org/wiki/Garbage_collection_(computer_science)>)
 | 
				
			||||||
 | 
					  strategies like [Tracing](https://en.wikipedia.org/wiki/Tracing_garbage_collection) (used in
 | 
				
			||||||
 | 
					  [Java](https://www.oracle.com/technetwork/java/javase/tech/g1-intro-jsp-135488.html)) and
 | 
				
			||||||
 | 
					  [Reference counting](https://en.wikipedia.org/wiki/Reference_counting) (used in
 | 
				
			||||||
 | 
					  [Python](https://docs.python.org/3/extending/extending.html#reference-counts))
 | 
				
			||||||
 | 
					- Thread-local structures to prevent locking the allocator in
 | 
				
			||||||
 | 
					  [tcmalloc](https://jamesgolick.com/2013/5/19/how-tcmalloc-works.html)
 | 
				
			||||||
 | 
					- Arena structures used in [jemalloc](http://jemalloc.net/), which
 | 
				
			||||||
 | 
					  [until recently](https://blog.rust-lang.org/2019/01/17/Rust-1.32.0.html#jemalloc-is-removed-by-default)
 | 
				
			||||||
 | 
					  was the primary allocator for Rust programs!
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					But no matter how fast your allocator is, the principle remains: the fastest allocator is the one
 | 
				
			||||||
 | 
					you never use. As such, we're not going to discuss how exactly the
 | 
				
			||||||
 | 
					[`push` and `pop` instructions work](http://www.cs.virginia.edu/~evans/cs216/guides/x86.html), but
 | 
				
			||||||
 | 
					we'll focus instead on the conditions that enable the Rust compiler to use faster stack-based
 | 
				
			||||||
 | 
					allocation for variables.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					So, **how do we know when Rust will or will not use stack allocation for objects we create?**
 | 
				
			||||||
 | 
					Looking at other languages, it's often easy to delineate between stack and heap. Managed memory
 | 
				
			||||||
 | 
					languages (Python, Java,
 | 
				
			||||||
 | 
					[C#](https://blogs.msdn.microsoft.com/ericlippert/2010/09/30/the-truth-about-value-types/)) place
 | 
				
			||||||
 | 
					nearly everything on the heap. JIT compilers ([PyPy](https://www.pypy.org/),
 | 
				
			||||||
 | 
					[HotSpot](https://www.oracle.com/technetwork/java/javase/tech/index-jsp-136373.html)) may optimize
 | 
				
			||||||
 | 
					some heap allocations away, but you should never assume it will happen. C makes things clear with
 | 
				
			||||||
 | 
					calls to special functions (like [malloc(3)](https://linux.die.net/man/3/malloc)) needed to access
 | 
				
			||||||
 | 
					heap memory. Old C++ has the [`new`](https://stackoverflow.com/a/655086/1454178) keyword, though
 | 
				
			||||||
 | 
					modern C++/C++11 is more complicated with [RAII](https://en.cppreference.com/w/cpp/language/raii).
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					For Rust, we can summarize as follows: **stack allocation will be used for everything that doesn't
 | 
				
			||||||
 | 
					involve "smart pointers" and collections**. We'll skip over a precise definition of the term "smart
 | 
				
			||||||
 | 
					pointer" for now, and instead discuss what we should watch for to understand when stack and heap
 | 
				
			||||||
 | 
					memory regions are used:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					1. Stack manipulation instructions (`push`, `pop`, and `add`/`sub` of the `rsp` register) indicate
 | 
				
			||||||
 | 
					   allocation of stack memory:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					   ```rust
 | 
				
			||||||
 | 
					   pub fn stack_alloc(x: u32) -> u32 {
 | 
				
			||||||
 | 
					       // Space for `y` is allocated by subtracting from `rsp`,
 | 
				
			||||||
 | 
					       // and then populated
 | 
				
			||||||
 | 
					       let y = [1u8, 2, 3, 4];
 | 
				
			||||||
 | 
					       // Space for `y` is deallocated by adding back to `rsp`
 | 
				
			||||||
 | 
					       x
 | 
				
			||||||
 | 
					   }
 | 
				
			||||||
 | 
					   ```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					   -- [Compiler Explorer](https://godbolt.org/z/5WSgc9)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					2. Tracking when exactly heap allocation calls occur is difficult. It's typically easier to watch
 | 
				
			||||||
 | 
					   for `call core::ptr::real_drop_in_place`, and infer that a heap allocation happened in the recent
 | 
				
			||||||
 | 
					   past:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					   ```rust
 | 
				
			||||||
 | 
					   pub fn heap_alloc(x: usize) -> usize {
 | 
				
			||||||
 | 
					       // Space for elements in a vector has to be allocated
 | 
				
			||||||
 | 
					       // on the heap, and is then de-allocated once the
 | 
				
			||||||
 | 
					       // vector goes out of scope
 | 
				
			||||||
 | 
					       let y: Vec<u8> = Vec::with_capacity(x);
 | 
				
			||||||
 | 
					       x
 | 
				
			||||||
 | 
					   }
 | 
				
			||||||
 | 
					   ```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					   -- [Compiler Explorer](https://godbolt.org/z/epfgoQ) (`real_drop_in_place` happens on line 1317)
 | 
				
			||||||
 | 
					   <small>Note: While the
 | 
				
			||||||
 | 
					   [`Drop` trait](https://doc.rust-lang.org/std/ops/trait.Drop.html) is
 | 
				
			||||||
 | 
					   [called for stack-allocated objects](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=87edf374d8983816eb3d8cfeac657b46),
 | 
				
			||||||
 | 
					   the Rust standard library only defines `Drop` implementations for types that involve heap
 | 
				
			||||||
 | 
					   allocation.</small>
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					3. If you don't want to inspect the assembly, use a custom allocator that's able to track and alert
 | 
				
			||||||
 | 
					   when heap allocations occur. Crates like
 | 
				
			||||||
 | 
					   [`alloc_counter`](https://crates.io/crates/alloc_counter) are designed for exactly this purpose.
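
For a sense of what the third option involves, here's a minimal sketch of a counting allocator built on the standard library's `GlobalAlloc` trait. This is not how `alloc_counter` is implemented or used; `CountingAllocator`, `ALLOCATIONS`, and `allocations_during` are hypothetical names, and the sketch simply forwards every request to the system allocator while bumping a counter:

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Total number of allocation requests seen so far (assumed name)
static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

struct CountingAllocator;

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        // Defer the real work to the system allocator
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAllocator = CountingAllocator;

// Hypothetical helper: how many allocations happened while `f` ran?
// Other threads allocating at the same time will also be counted.
pub fn allocations_during<F: FnOnce()>(f: F) -> usize {
    let before = ALLOCATIONS.load(Ordering::Relaxed);
    f();
    ALLOCATIONS.load(Ordering::Relaxed) - before
}
```

Wrapping `Vec::with_capacity(16)` in `allocations_during` should report at least one allocation, while a closure that only builds a `[u8; 4]` array would be expected to report none in a single-threaded program.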
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					With all that in mind, let's talk about situations in which we're guaranteed to use stack memory:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- Structs are created on the stack.
 | 
				
			||||||
 | 
					- Function arguments are passed in registers or on the stack, meaning the
 | 
				
			||||||
 | 
					  [`#[inline]` attribute](https://doc.rust-lang.org/reference/attributes.html#inline-attribute) will
 | 
				
			||||||
 | 
					  not change the memory region used.
 | 
				
			||||||
 | 
					- Enums and unions are stack-allocated.
 | 
				
			||||||
 | 
					- [Arrays](https://doc.rust-lang.org/std/primitive.array.html) are always stack-allocated.
 | 
				
			||||||
 | 
					- Closures capture their arguments on the stack.
 | 
				
			||||||
 | 
					- Generics will use stack allocation, even with dynamic dispatch.
 | 
				
			||||||
 | 
					- [`Copy`](https://doc.rust-lang.org/std/marker/trait.Copy.html) types are guaranteed to be
 | 
				
			||||||
 | 
					  stack-allocated, and copying them will be done in stack memory.
 | 
				
			||||||
 | 
					- [`Iterator`s](https://doc.rust-lang.org/std/iter/trait.Iterator.html) in the standard library are
 | 
				
			||||||
 | 
					  stack-allocated even when iterating over heap-based collections.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## Structs
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The simplest case comes first. When creating vanilla `struct` objects, we use stack memory to hold
 | 
				
			||||||
 | 
					their contents:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					struct Point {
 | 
				
			||||||
 | 
					    x: u64,
 | 
				
			||||||
 | 
					    y: u64,
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					struct Line {
 | 
				
			||||||
 | 
					    a: Point,
 | 
				
			||||||
 | 
					    b: Point,
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn make_line() {
 | 
				
			||||||
 | 
					    // `origin` is stored in the first 16 bytes of memory
 | 
				
			||||||
 | 
					    // starting at location `rsp`
 | 
				
			||||||
 | 
					    let origin = Point { x: 0, y: 0 };
 | 
				
			||||||
 | 
					    // `point` makes up the next 16 bytes of memory
 | 
				
			||||||
 | 
					    let point = Point { x: 1, y: 2 };
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    // When creating `ray`, we just move the content out of
 | 
				
			||||||
 | 
					    // `origin` and `point` into the next 32 bytes of memory
 | 
				
			||||||
 | 
					    let ray = Line { a: origin, b: point };
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/vri9BE)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Note that while some extra-fancy instructions are used for memory manipulation in the assembly, the
 | 
				
			||||||
 | 
					`sub rsp, 64` instruction indicates we're still working with the stack.
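
If reading assembly isn't appealing, the sizes quoted in the comments can also be checked from Rust itself. This is a small sketch using `std::mem::size_of`; the `struct_sizes` function is a name made up for the example, and the `Point` and `Line` definitions are repeated so the block stands alone:

```rust
use std::mem::size_of;

#[allow(dead_code)]
struct Point {
    x: u64,
    y: u64,
}

#[allow(dead_code)]
struct Line {
    a: Point,
    b: Point,
}

// Returns (size of `Point`, size of `Line`) in bytes
pub fn struct_sizes() -> (usize, usize) {
    // Two `u64` fields make 16 bytes; two `Point`s make 32 bytes.
    // Together, `origin`, `point`, and `ray` account for the 64
    // bytes subtracted from `rsp`.
    (size_of::<Point>(), size_of::<Line>())
}
```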
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## Function arguments
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Have you ever wondered how functions communicate with each other? Like, once the variables are given
 | 
				
			||||||
 | 
					to you, everything's fine. But how do you "give" those variables to another function? How do you get
 | 
				
			||||||
 | 
					the results back afterward? The answer: the compiler arranges memory and assembly instructions using
 | 
				
			||||||
 | 
					a pre-determined [calling convention](http://llvm.org/docs/LangRef.html#calling-conventions). This
 | 
				
			||||||
 | 
					convention governs the rules around where arguments needed by a function will be located (either in
 | 
				
			||||||
 | 
					memory offsets relative to the stack pointer `rsp`, or in other registers), and where the results
 | 
				
			||||||
 | 
					can be found once the function has finished. And when multiple languages agree on what the calling
 | 
				
			||||||
 | 
					conventions are, you can do things like having [Go call Rust code](https://blog.filippo.io/rustgo/)!
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Put simply: it's the compiler's job to figure out how to call other functions, and you can assume
 | 
				
			||||||
 | 
					that the compiler is good at its job.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					We can see this in action using a simple example:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					struct Point {
 | 
				
			||||||
 | 
					    x: i64,
 | 
				
			||||||
 | 
					    y: i64,
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// We use integer division operations to keep
 | 
				
			||||||
 | 
					// the assembly clean, understanding the result
 | 
				
			||||||
 | 
					// isn't accurate.
 | 
				
			||||||
 | 
					fn distance(a: &Point, b: &Point) -> i64 {
 | 
				
			||||||
 | 
					    // Immediately subtract from `rsp` the bytes needed
 | 
				
			||||||
 | 
					    // to hold all the intermediate results - this is
 | 
				
			||||||
 | 
					    // the stack allocation step
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    // The compiler used the `rdi` and `rsi` registers
 | 
				
			||||||
 | 
					    // to pass our arguments, so read them in
 | 
				
			||||||
 | 
					    let x1 = a.x;
 | 
				
			||||||
 | 
					    let x2 = b.x;
 | 
				
			||||||
 | 
					    let y1 = a.y;
 | 
				
			||||||
 | 
					    let y2 = b.y;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    // Do the actual math work
 | 
				
			||||||
 | 
					    let x_pow = (x1 - x2) * (x1 - x2);
 | 
				
			||||||
 | 
					    let y_pow = (y1 - y2) * (y1 - y2);
 | 
				
			||||||
 | 
					    let squared = x_pow + y_pow;
 | 
				
			||||||
 | 
					    squared / squared
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    // Our final result will be stored in the `rax` register
 | 
				
			||||||
 | 
					    // so that our caller knows where to retrieve it.
 | 
				
			||||||
 | 
					    // Finally, add back to `rsp` the stack memory that is
 | 
				
			||||||
 | 
					    // now ready to be used by other functions.
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn total_distance() {
 | 
				
			||||||
 | 
					    let start = Point { x: 1, y: 2 };
 | 
				
			||||||
 | 
					    let middle = Point { x: 3, y: 4 };
 | 
				
			||||||
 | 
					    let end = Point { x: 5, y: 6 };
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    let _dist_1 = distance(&start, &middle);
 | 
				
			||||||
 | 
					    let _dist_2 = distance(&middle, &end);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/Qmx4ST)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					As a consequence of function arguments never using heap memory, we can also infer that functions
 | 
				
			||||||
 | 
					using the `#[inline]` attribute also do not heap allocate. But better than inferring, we can look
 | 
				
			||||||
 | 
					at the assembly to prove it:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					struct Point {
 | 
				
			||||||
 | 
					    x: i64,
 | 
				
			||||||
 | 
					    y: i64,
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// Note that there is no `distance` function in the assembly output,
 | 
				
			||||||
 | 
					// and the total line count goes from 229 with inlining off
 | 
				
			||||||
 | 
					// to 306 with inlining on. Even so, no heap allocations occur.
 | 
				
			||||||
 | 
					#[inline(always)]
 | 
				
			||||||
 | 
					fn distance(a: &Point, b: &Point) -> i64 {
 | 
				
			||||||
 | 
					    let x1 = a.x;
 | 
				
			||||||
 | 
					    let x2 = b.x;
 | 
				
			||||||
 | 
					    let y1 = a.y;
 | 
				
			||||||
 | 
					    let y2 = b.y;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    let x_pow = (a.x - b.x) * (a.x - b.x);
 | 
				
			||||||
 | 
					    let y_pow = (a.y - b.y) * (a.y - b.y);
 | 
				
			||||||
 | 
					    let squared = x_pow + y_pow;
 | 
				
			||||||
 | 
					    squared / squared
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn total_distance() {
 | 
				
			||||||
 | 
					    let start = Point { x: 1, y: 2 };
 | 
				
			||||||
 | 
					    let middle = Point { x: 3, y: 4 };
 | 
				
			||||||
 | 
					    let end = Point { x: 5, y: 6 };
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    let _dist_1 = distance(&start, &middle);
 | 
				
			||||||
 | 
					    let _dist_2 = distance(&middle, &end);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/30Sh66)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Finally, passing by value (whether the argument is moved or is a
					[`Copy`](https://doc.rust-lang.org/std/marker/trait.Copy.html) type) and passing by reference
					(handing the callee a pointer) may have slightly different layouts in assembly, but will
					still use either stack memory or CPU registers:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					pub struct Point {
 | 
				
			||||||
 | 
					    x: i64,
 | 
				
			||||||
 | 
					    y: i64,
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// Moving values
 | 
				
			||||||
 | 
					pub fn distance_moved(a: Point, b: Point) -> i64 {
 | 
				
			||||||
 | 
					    let x1 = a.x;
 | 
				
			||||||
 | 
					    let x2 = b.x;
 | 
				
			||||||
 | 
					    let y1 = a.y;
 | 
				
			||||||
 | 
					    let y2 = b.y;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    let x_pow = (x1 - x2) * (x1 - x2);
 | 
				
			||||||
 | 
					    let y_pow = (y1 - y2) * (y1 - y2);
 | 
				
			||||||
 | 
					    let squared = x_pow + y_pow;
 | 
				
			||||||
 | 
					    squared / squared
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// Borrowing values has two extra `mov` instructions on lines 21 and 22
 | 
				
			||||||
 | 
					pub fn distance_borrowed(a: &Point, b: &Point) -> i64 {
 | 
				
			||||||
 | 
					    let x1 = a.x;
 | 
				
			||||||
 | 
					    let x2 = b.x;
 | 
				
			||||||
 | 
					    let y1 = a.y;
 | 
				
			||||||
 | 
					    let y2 = b.y;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    let x_pow = (x1 - x2) * (x1 - x2);
 | 
				
			||||||
 | 
					    let y_pow = (y1 - y2) * (y1 - y2);
 | 
				
			||||||
 | 
					    let squared = x_pow + y_pow;
 | 
				
			||||||
 | 
					    squared / squared
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/06hGiv)
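
On the calling convention point from the start of this section: Rust's default convention is unspecified, but a function can opt into the platform's C convention with `extern "C"`, which is the kind of agreement cross-language calls depend on. A minimal sketch, with `add_c`, `add_rust`, and `call_both` as made-up names:

```rust
// Uses the platform's C calling convention, so code written in
// other languages can agree on where the arguments and result live
pub extern "C" fn add_c(a: i64, b: i64) -> i64 {
    a + b
}

// Uses Rust's default (unspecified) calling convention
pub fn add_rust(a: i64, b: i64) -> i64 {
    a + b
}

pub fn call_both() -> (i64, i64) {
    // From Rust, both are called the same way; only the compiled
    // argument-passing rules differ. Neither touches the heap.
    (add_c(2, 3), add_rust(2, 3))
}
```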
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## Enums
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					If you've ever worried that wrapping your types in
 | 
				
			||||||
 | 
					[`Option`](https://doc.rust-lang.org/stable/core/option/enum.Option.html) or
 | 
				
			||||||
 | 
					[`Result`](https://doc.rust-lang.org/stable/core/result/enum.Result.html) would finally make them
 | 
				
			||||||
 | 
					large enough that Rust decides to use heap allocation instead, fear no longer: `enum` and union
 | 
				
			||||||
 | 
					types don't use heap allocation:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					enum MyEnum {
 | 
				
			||||||
 | 
					    Small(u8),
 | 
				
			||||||
 | 
					    Large(u64)
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					struct MyStruct {
 | 
				
			||||||
 | 
					    x: MyEnum,
 | 
				
			||||||
 | 
					    y: MyEnum,
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn enum_compare() {
 | 
				
			||||||
 | 
					    let x = MyEnum::Small(0);
 | 
				
			||||||
 | 
					    let y = MyEnum::Large(0);
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    let z = MyStruct { x, y };
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    let opt = Option::Some(z);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/HK7zBx)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Because the size of an `enum` is the size of its largest element plus a flag, the compiler can
 | 
				
			||||||
 | 
					predict how much memory is used no matter which variant of an enum is currently stored in a
 | 
				
			||||||
 | 
					variable. Thus, enums and unions have no need of heap allocation. There's unfortunately not a great
 | 
				
			||||||
 | 
					way to show this in assembly, so I'll instead point you to the
 | 
				
			||||||
 | 
					[`core::mem::size_of`](https://doc.rust-lang.org/stable/core/mem/fn.size_of.html#size-of-enums)
 | 
				
			||||||
 | 
					documentation.
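
What we can do instead is ask the compiler for the sizes directly. The sketch below assumes a typical 64-bit target, and `enum_sizes` is a hypothetical helper; `MyEnum` is repeated from above so the block stands alone:

```rust
use std::mem::size_of;

#[allow(dead_code)]
enum MyEnum {
    Small(u8),
    Large(u64),
}

// Returns (size of `u64`, size of `MyEnum`, whether `Option<Box<u64>>`
// is the same size as `Box<u64>`)
pub fn enum_sizes() -> (usize, usize, bool) {
    (
        // The largest variant payload: 8 bytes
        size_of::<u64>(),
        // Payload plus the flag, padded for alignment:
        // 16 bytes on x86-64
        size_of::<MyEnum>(),
        // `Box` can never be null, so `Option` reuses the null
        // pointer as `None` and needs no extra flag at all
        size_of::<Option<Box<u64>>>() == size_of::<Box<u64>>(),
    )
}
```

Whichever variant is stored, the size is fixed at compile time, which is all the compiler needs in order to reserve stack space.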
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## Arrays
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The array type is guaranteed to be stack allocated, which is why the array size must be declared.
 | 
				
			||||||
 | 
					Interestingly enough, this can be used to cause safe Rust programs to crash:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					// 256 bytes
 | 
				
			||||||
 | 
					#[derive(Default)]
 | 
				
			||||||
 | 
					struct TwoFiftySix {
 | 
				
			||||||
 | 
					    _a: [u64; 32]
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// 8 kilobytes
 | 
				
			||||||
 | 
					#[derive(Default)]
 | 
				
			||||||
 | 
					struct EightK {
 | 
				
			||||||
 | 
					    _a: [TwoFiftySix; 32]
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// 256 kilobytes
 | 
				
			||||||
 | 
					#[derive(Default)]
 | 
				
			||||||
 | 
					struct TwoFiftySixK {
 | 
				
			||||||
 | 
					    _a: [EightK; 32]
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// 8 megabytes - exceeds space typically provided for the stack,
 | 
				
			||||||
 | 
					// though the kernel can be instructed to allocate more.
 | 
				
			||||||
 | 
					// On Linux, you can check stack size using `ulimit -s`
 | 
				
			||||||
 | 
					#[derive(Default)]
 | 
				
			||||||
 | 
					struct EightM {
 | 
				
			||||||
 | 
					    _a: [TwoFiftySixK; 32]
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					fn main() {
 | 
				
			||||||
 | 
					    // Because we already have things in stack memory
 | 
				
			||||||
 | 
					    // (like the current function call stack), allocating another
 | 
				
			||||||
 | 
					    // eight megabytes of stack memory crashes the program
 | 
				
			||||||
 | 
					    let _x = EightM::default();
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					--
 | 
				
			||||||
 | 
					[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=587a6380a4914bcbcef4192c90c01dc4)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					There aren't any security implications of this (no memory corruption occurs), but it's good to note
 | 
				
			||||||
 | 
					that the Rust compiler won't move arrays into heap memory even if they can be reasonably expected to
 | 
				
			||||||
 | 
					overflow the stack.
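If a buffer of this size is genuinely needed, the usual workaround is to request heap memory explicitly. A minimal sketch, assuming a plain byte buffer is an acceptable stand-in for `EightM` (the function name `eight_megabytes_on_heap` is made up for illustration):

```rust
/// Hypothetical helper: builds an eight-megabyte, zero-filled buffer on the heap.
pub fn eight_megabytes_on_heap() -> Vec<u8> {
    // `vec!` asks the allocator for the whole buffer; only the `Vec` handle
    // (pointer, length, capacity) is kept in the caller's stack frame.
    vec![0u8; 8 * 1024 * 1024]
}
```

Note that `Box::new(EightM::default())` is not a reliable fix: the value may still be built on the stack before it is moved into the box, particularly in debug builds.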
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## Closures
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Rules for how anonymous functions capture their arguments are typically language-specific. In Java,
 | 
				
			||||||
 | 
					[Lambda Expressions](https://docs.oracle.com/javase/tutorial/java/javaOO/lambdaexpressions.html) are
 | 
				
			||||||
 | 
					actually objects created on the heap that capture local primitives by copying, and capture local
 | 
				
			||||||
 | 
					non-primitives as (`final`) references.
 | 
				
			||||||
 | 
					[Python](https://docs.python.org/3.7/reference/expressions.html#lambda) and
 | 
				
			||||||
 | 
					[JavaScript](https://javascriptweblog.wordpress.com/2010/10/25/understanding-javascript-closures/)
 | 
				
			||||||
 | 
					both bind _everything_ by reference normally, but Python can also
 | 
				
			||||||
 | 
					[capture values](https://stackoverflow.com/a/235764/1454178) and JavaScript has
 | 
				
			||||||
 | 
					[Arrow functions](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/Arrow_functions).
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					In Rust, arguments to closures are the same as arguments to other functions; closures are simply
 | 
				
			||||||
 | 
					functions that don't have a declared name. Some weird ordering of the stack may be required to
 | 
				
			||||||
 | 
					handle them, but it's the compiler's responsibility to figure that out.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Each example below has the same effect, but a different assembly implementation. In the simplest
 | 
				
			||||||
 | 
					case, we immediately run a closure returned by another function. Because we don't store a reference
 | 
				
			||||||
 | 
					to the closure, the stack memory needed to store the captured values is contiguous:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					fn my_func() -> impl FnOnce() {
 | 
				
			||||||
 | 
					    let x = 24;
 | 
				
			||||||
 | 
					    // Note that this closure in assembly looks exactly like
 | 
				
			||||||
 | 
					    // any other function; you even use the `call` instruction
 | 
				
			||||||
 | 
					    // to start running it.
 | 
				
			||||||
 | 
					    move || { x; }
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn immediate() {
 | 
				
			||||||
 | 
					    my_func()();
 | 
				
			||||||
 | 
					    my_func()();
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/mgJ2zl), 25 total assembly instructions
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					If we store a reference to the closure, the Rust compiler keeps values it needs in the stack memory
 | 
				
			||||||
 | 
					of the original function. Getting the details right is a bit harder, so the instruction count goes
 | 
				
			||||||
 | 
					up even though this code is functionally equivalent to our original example:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					pub fn simple_reference() {
 | 
				
			||||||
 | 
					    let x = my_func();
 | 
				
			||||||
 | 
					    let y = my_func();
 | 
				
			||||||
 | 
					    y();
 | 
				
			||||||
 | 
					    x();
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/K_dj5n), 55 total assembly instructions
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Even things like variable order can make a difference in instruction count:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					pub fn complex() {
 | 
				
			||||||
 | 
					    let x = my_func();
 | 
				
			||||||
 | 
					    let y = my_func();
 | 
				
			||||||
 | 
					    x();
 | 
				
			||||||
 | 
					    y();
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/p37qFl), 70 total assembly instructions
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					In every circumstance though, the compiler ensured that no heap allocations were necessary.
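One way to see why no heap is involved is to measure the closures themselves. The sketch below assumes a 32-bit captured integer and uses a made-up function name (`closure_sizes`). Closure layout is not formally specified, so treat the sizes as what current compilers typically produce rather than a guarantee:

```rust
use std::mem::size_of_val;

/// Returns the size in bytes of three closures: one capturing nothing,
/// one capturing an `i32` by value, and one capturing it by reference.
pub fn closure_sizes() -> (usize, usize, usize) {
    let x: i32 = 24;

    // No captured state at all.
    let captures_nothing = || 42;
    // `move` copies `x` into the closure, so the closure holds an `i32`.
    let captures_by_value = move || x + 1;
    // Without `move`, reading `x` only borrows it, so the closure holds a `&i32`.
    let captures_by_reference = || x + 1;

    (
        size_of_val(&captures_nothing),
        size_of_val(&captures_by_value),
        size_of_val(&captures_by_reference),
    )
}
```

A closure is an anonymous struct holding its captures, which is why it can live in the caller's stack frame like any other local value.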
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## Generics
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Traits in Rust come in two broad forms: static dispatch (monomorphization, `impl Trait`) and dynamic
 | 
				
			||||||
 | 
					dispatch (trait objects, `dyn Trait`). While dynamic dispatch is often _associated_ with trait
 | 
				
			||||||
 | 
					objects being stored in the heap, dynamic dispatch can be used with stack allocated objects as well:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					trait GetInt {
 | 
				
			||||||
 | 
					    fn get_int(&self) -> u64;
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// vtable stored at section L__unnamed_1
 | 
				
			||||||
 | 
					struct WhyNotU8 {
 | 
				
			||||||
 | 
					    x: u8
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					impl GetInt for WhyNotU8 {
 | 
				
			||||||
 | 
					    fn get_int(&self) -> u64 {
 | 
				
			||||||
 | 
					        self.x as u64
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// vtable stored at section L__unnamed_2
 | 
				
			||||||
 | 
					struct ActualU64 {
 | 
				
			||||||
 | 
					    x: u64
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					impl GetInt for ActualU64 {
 | 
				
			||||||
 | 
					    fn get_int(&self) -> u64 {
 | 
				
			||||||
 | 
					        self.x
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// `&dyn` declares that we want to use dynamic dispatch
 | 
				
			||||||
 | 
					// rather than monomorphization, so there is only one
 | 
				
			||||||
 | 
					// `retrieve_int` function that shows up in the final assembly.
 | 
				
			||||||
 | 
					// If we used generics, there would be one implementation of
 | 
				
			||||||
 | 
					// `retrieve_int` for each type that implements `GetInt`.
 | 
				
			||||||
 | 
					pub fn retrieve_int(u: &dyn GetInt) {
 | 
				
			||||||
 | 
					    // In the assembly, we just call an address given to us
 | 
				
			||||||
 | 
					    // in the `rsi` register and hope that it was set up
 | 
				
			||||||
 | 
					    // correctly when this function was invoked.
 | 
				
			||||||
 | 
					    let x = u.get_int();
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn do_call() {
 | 
				
			||||||
 | 
					    // Note that even though the vtable for `WhyNotU8` and
 | 
				
			||||||
 | 
					    // `ActualU64` includes a pointer to
 | 
				
			||||||
 | 
					    // `core::ptr::real_drop_in_place`, it is never invoked.
 | 
				
			||||||
 | 
					    let a = WhyNotU8 { x: 0 };
 | 
				
			||||||
 | 
					    let b = ActualU64 { x: 0 };
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    retrieve_int(&a);
 | 
				
			||||||
 | 
					    retrieve_int(&b);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/u_yguS)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					It's hard to imagine practical situations where dynamic dispatch would be used for objects that
 | 
				
			||||||
 | 
					aren't heap allocated, but it technically can be done.
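For contrast, here is a sketch that puts the monomorphized form next to the `&dyn` form, plus one stack-only use of trait objects: a fixed-size array of references to differently typed locals. The trait and structs mirror the example above; `retrieve_int_static`, `retrieve_int_dyn`, and `sum_mixed` are names invented for this illustration.

```rust
pub trait GetInt {
    fn get_int(&self) -> u64;
}

pub struct WhyNotU8 {
    pub x: u8,
}
impl GetInt for WhyNotU8 {
    fn get_int(&self) -> u64 {
        self.x as u64
    }
}

pub struct ActualU64 {
    pub x: u64,
}
impl GetInt for ActualU64 {
    fn get_int(&self) -> u64 {
        self.x
    }
}

// Static dispatch: the compiler emits one copy of this function
// for each concrete `T` it is called with.
pub fn retrieve_int_static<T: GetInt>(u: &T) -> u64 {
    u.get_int()
}

// Dynamic dispatch: a single function that looks up `get_int` in the vtable.
pub fn retrieve_int_dyn(u: &dyn GetInt) -> u64 {
    u.get_int()
}

pub fn sum_mixed() -> u64 {
    let a = WhyNotU8 { x: 1 };
    let b = ActualU64 { x: 2 };

    // The array of trait-object references and the values it points to
    // are all locals of this function; no `Box` is involved.
    let items: [&dyn GetInt; 2] = [&a, &b];
    let dynamic: u64 = items.iter().map(|i| retrieve_int_dyn(*i)).sum();

    dynamic + retrieve_int_static(&a)
}
```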
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## Copy types
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Understanding move semantics and copy semantics in Rust is weird at first. The Rust docs
 | 
				
			||||||
 | 
					[go into detail](https://doc.rust-lang.org/stable/core/marker/trait.Copy.html) far better than can
 | 
				
			||||||
 | 
					be addressed here, so I'll leave them to do the job. From a memory perspective though, their
 | 
				
			||||||
 | 
					guideline is reasonable:
 | 
				
			||||||
 | 
					[if your type can implement `Copy`, it should](https://doc.rust-lang.org/stable/core/marker/trait.Copy.html#when-should-my-type-be-copy).
 | 
				
			||||||
 | 
					While there are potential speed tradeoffs to _benchmark_ when discussing `Copy` (move semantics for
 | 
				
			||||||
 | 
					stack objects vs. copying stack pointers vs. copying stack `struct`s), _it's impossible for `Copy`
 | 
				
			||||||
 | 
					to introduce a heap allocation_.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					But why is this the case? Fundamentally, it's because the language controls what `Copy` means -
 | 
				
			||||||
 | 
					["the behavior of `Copy` is not overloadable"](https://doc.rust-lang.org/std/marker/trait.Copy.html#whats-the-difference-between-copy-and-clone)
 | 
				
			||||||
 | 
					because it's a marker trait. From there we'll note that a type
 | 
				
			||||||
 | 
					[can implement `Copy`](https://doc.rust-lang.org/std/marker/trait.Copy.html#when-can-my-type-be-copy)
 | 
				
			||||||
 | 
					if (and only if) its components implement `Copy`, and that
 | 
				
			||||||
 | 
					[no heap-allocated types implement `Copy`](https://doc.rust-lang.org/std/marker/trait.Copy.html#implementors).
 | 
				
			||||||
 | 
					Thus, assignments involving heap types are always move semantics, and new heap allocations won't
 | 
				
			||||||
 | 
					occur because of implicit operator behavior.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					#[derive(Clone)]
 | 
				
			||||||
 | 
					struct Cloneable {
 | 
				
			||||||
 | 
					    x: Box<u64>
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					// error[E0204]: the trait `Copy` may not be implemented for this type
 | 
				
			||||||
 | 
					#[derive(Copy, Clone)]
 | 
				
			||||||
 | 
					struct NotCopyable {
 | 
				
			||||||
 | 
					    x: Box<u64>
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/VToRuK)
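The flip side can be sketched as well: a `Copy` type is duplicated by plain assignment, while duplicating a heap-backed value needs an explicit `.clone()` call that is visible in the source. The `Point` type and both function names below are illustrative only.

```rust
#[derive(Copy, Clone, Debug, PartialEq)]
pub struct Point {
    pub x: u64,
    pub y: u64,
}

pub fn copy_then_use_both() -> (Point, Point) {
    let a = Point { x: 1, y: 2 };
    // Assignment copies the bytes of `a`; `a` remains usable afterwards.
    let b = a;
    (a, b)
}

pub fn clone_is_explicit() -> (u64, u64) {
    let boxed = Box::new(7u64);
    // `let other = boxed;` would move the box instead of duplicating it.
    // The second heap allocation only happens because we ask for it here.
    let other = boxed.clone();
    (*boxed, *other)
}
```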
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## Iterators
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					In managed memory languages (like
 | 
				
			||||||
 | 
					[Java](https://www.youtube.com/watch?v=bSkpMdDe4g4&feature=youtu.be&t=357)), there's a subtle
 | 
				
			||||||
 | 
					difference between these two code samples:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```java
 | 
				
			||||||
 | 
					public static long sum_for(List<Long> vals) {
 | 
				
			||||||
 | 
					    long sum = 0;
 | 
				
			||||||
 | 
					    // Regular for loop
 | 
				
			||||||
 | 
					    for (int i = 0; i < vals.size(); i++) {
 | 
				
			||||||
 | 
					        sum += vals.get(i);
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					    return sum;
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					public static long sum_foreach(List<Long> vals) {
 | 
				
			||||||
 | 
					    long sum = 0;
 | 
				
			||||||
 | 
					    // "Foreach" loop - uses iteration
 | 
				
			||||||
 | 
					    for (Long l : vals) {
 | 
				
			||||||
 | 
					        sum += l;
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					    return sum;
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					In the `sum_for` function, nothing terribly interesting happens. In `sum_foreach`, an object of type
 | 
				
			||||||
 | 
					[`Iterator`](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/Iterator.html)
 | 
				
			||||||
 | 
					is allocated on the heap, and will eventually be garbage-collected. This isn't a great design;
 | 
				
			||||||
 | 
					iterators are often transient objects that you need during a function and can discard once the
 | 
				
			||||||
 | 
					function ends. Sounds exactly like the issue stack-allocated objects address, no?
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					In Rust, iterators are allocated on the stack. The objects to iterate over are almost certainly in
 | 
				
			||||||
 | 
					heap memory, but the iterator itself
 | 
				
			||||||
 | 
					([`Iter`](https://doc.rust-lang.org/std/slice/struct.Iter.html)) doesn't need to use the heap. In
 | 
				
			||||||
 | 
					each of the examples below we iterate over a collection, but never use heap allocation:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					use std::collections::HashMap;
 | 
				
			||||||
 | 
					// There's a lot of assembly generated, but if you search in the text,
 | 
				
			||||||
 | 
					// there are no references to `real_drop_in_place` anywhere.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn sum_vec(x: &Vec<u32>) {
 | 
				
			||||||
 | 
					    let mut s = 0;
 | 
				
			||||||
 | 
					    // Basic iteration over vectors doesn't need allocation
 | 
				
			||||||
 | 
					    for y in x {
 | 
				
			||||||
 | 
					        s += y;
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn sum_enumerate(x: &Vec<u32>) {
 | 
				
			||||||
 | 
					    let mut s = 0;
 | 
				
			||||||
 | 
					    // More complex iterators are just fine too
 | 
				
			||||||
 | 
					    for (_i, y) in x.iter().enumerate() {
 | 
				
			||||||
 | 
					        s += y;
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn sum_hm(x: &HashMap<u32, u32>) {
 | 
				
			||||||
 | 
					    let mut s = 0;
 | 
				
			||||||
 | 
					    // And it's not just Vec, all types will allocate the iterator
 | 
				
			||||||
 | 
					    // on stack memory
 | 
				
			||||||
 | 
					    for y in x.values() {
 | 
				
			||||||
 | 
					        s += y;
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/FTT3CT)
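A rough way to convince yourself that the iterator is an ordinary stack value: its size is fixed by its type and does not depend on how many elements the collection holds. The sketch below uses made-up function names (`vec_iter_sizes`, `sum_values`) and, unlike the functions above, returns its result so the work is observable.

```rust
use std::collections::HashMap;
use std::mem::size_of_val;

/// Sizes in bytes of a slice iterator over a short and a long vector.
pub fn vec_iter_sizes() -> (usize, usize) {
    let small: Vec<u32> = vec![1, 2, 3];
    let large: Vec<u32> = vec![0; 100_000];
    // Both iterators have the same type, so they occupy the same
    // number of bytes regardless of the vector's length.
    (size_of_val(&small.iter()), size_of_val(&large.iter()))
}

/// Sums a map's values using the `Values` iterator directly.
pub fn sum_values(x: &HashMap<u32, u32>) -> u32 {
    x.values().sum()
}
```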
 | 
				
			||||||
							
								
								
									
										254
									
								
								blog/2019-02-07-a-heaping-helping/_article.md
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
									
								
							@ -0,0 +1,254 @@
 | 
				
			|||||||
 | 
					---
 | 
				
			||||||
 | 
					layout: post
 | 
				
			||||||
 | 
					title: "Dynamic Memory: A Heaping Helping"
 | 
				
			||||||
 | 
					description: "The reason Rust exists."
 | 
				
			||||||
 | 
					category:
 | 
				
			||||||
 | 
					tags: [rust, understanding-allocations]
 | 
				
			||||||
 | 
					---
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Managing dynamic memory is hard. Some languages assume users will do it themselves (C, C++), and
 | 
				
			||||||
 | 
					some languages go to extreme lengths to protect users from themselves (Java, Python). Rust
 | 
				
			||||||
 | 
					manages dynamic memory (also referred to as the **heap**) through a system called _ownership_.
 | 
				
			||||||
 | 
					And as the docs mention, ownership
 | 
				
			||||||
 | 
					[is Rust's most unique feature](https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html).
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The heap is used in two situations: when the compiler is unable to predict either the _total size of
 | 
				
			||||||
 | 
					memory needed_, or _how long the memory is needed for_, it allocates space in the heap. This happens
 | 
				
			||||||
 | 
					pretty frequently; if you want to download the Google home page, you won't know how large it is
 | 
				
			||||||
 | 
					until your program runs. And when you're finished with Google, the memory is deallocated so it can be
 | 
				
			||||||
 | 
					used to store other webpages. If you're interested in a slightly longer explanation of the heap,
 | 
				
			||||||
 | 
					check out
 | 
				
			||||||
 | 
					[The Stack and the Heap](https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html#the-stack-and-the-heap)
 | 
				
			||||||
 | 
					in Rust's documentation.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					We won't go into detail on how the heap is managed; the
 | 
				
			||||||
 | 
					[ownership documentation](https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html) does a
 | 
				
			||||||
 | 
					phenomenal job explaining both the "why" and "how" of memory management. Instead, we're going to
 | 
				
			||||||
 | 
					focus on understanding "when" heap allocations occur in Rust.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					To start off, take a guess for how many allocations happen in the program below:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					fn main() {}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					It's obviously a trick question; while no heap allocations occur as a result of that code, the setup
 | 
				
			||||||
 | 
					needed to call `main` does allocate on the heap. Here's a way to show it:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					// `AtomicU64` is stable as of Rust 1.34; older toolchains needed `#![feature(integer_atomics)]`.
 | 
				
			||||||
 | 
					use std::alloc::{GlobalAlloc, Layout, System};
 | 
				
			||||||
 | 
					use std::sync::atomic::{AtomicU64, Ordering};
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					static ALLOCATION_COUNT: AtomicU64 = AtomicU64::new(0);
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					struct CountingAllocator;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					unsafe impl GlobalAlloc for CountingAllocator {
 | 
				
			||||||
 | 
					    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
 | 
				
			||||||
 | 
					        ALLOCATION_COUNT.fetch_add(1, Ordering::SeqCst);
 | 
				
			||||||
 | 
					        System.alloc(layout)
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
 | 
				
			||||||
 | 
					        System.dealloc(ptr, layout);
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					#[global_allocator]
 | 
				
			||||||
 | 
					static A: CountingAllocator = CountingAllocator;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					fn main() {
 | 
				
			||||||
 | 
					    let x = ALLOCATION_COUNT.load(Ordering::SeqCst);
 | 
				
			||||||
 | 
					    println!("There were {} allocations before calling main!", x);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					--
 | 
				
			||||||
 | 
					[Rust Playground](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=fb5060025ba79fc0f906b65a4ef8eb8e)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					As of the time of writing, there are five allocations that happen before `main` is ever called.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					But when we want to understand more practically where heap allocation happens, we'll follow this
 | 
				
			||||||
 | 
					guide:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- Smart pointers hold their contents in the heap
 | 
				
			||||||
 | 
					- Collections are smart pointers for many objects at a time, and reallocate when they need to grow
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Finally, there are two "addendum" issues that are important to address when discussing Rust and the
 | 
				
			||||||
 | 
					heap:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- Non-heap alternatives to many standard library types are available.
 | 
				
			||||||
 | 
					- Special allocators to track memory behavior should be used to benchmark code.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					# Smart pointers
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The first things to note are the "smart pointer" types. When you have data that must outlive the
 | 
				
			||||||
 | 
					scope in which it is declared, or your data is of unknown or dynamic size, you'll make use of these
 | 
				
			||||||
 | 
					types.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The term [smart pointer](https://en.wikipedia.org/wiki/Smart_pointer) comes from C++, and while it's
 | 
				
			||||||
 | 
					closely linked to a general design pattern of
 | 
				
			||||||
 | 
					["Resource Acquisition Is Initialization"](https://en.cppreference.com/w/cpp/language/raii), we'll
 | 
				
			||||||
 | 
					use it here specifically to describe objects that are responsible for managing ownership of data
 | 
				
			||||||
 | 
					allocated on the heap. The smart pointers available in the `alloc` crate should look mostly
 | 
				
			||||||
 | 
					familiar:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- [`Box`](https://doc.rust-lang.org/alloc/boxed/struct.Box.html)
 | 
				
			||||||
 | 
					- [`Rc`](https://doc.rust-lang.org/alloc/rc/struct.Rc.html)
 | 
				
			||||||
 | 
					- [`Arc`](https://doc.rust-lang.org/alloc/sync/struct.Arc.html)
 | 
				
			||||||
 | 
					- [`Cow`](https://doc.rust-lang.org/alloc/borrow/enum.Cow.html)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The [standard library](https://doc.rust-lang.org/std/) also defines some smart pointers to manage
 | 
				
			||||||
 | 
					heap objects, though more than can be covered here. Some examples are:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- [`RwLock`](https://doc.rust-lang.org/std/sync/struct.RwLock.html)
 | 
				
			||||||
 | 
					- [`Mutex`](https://doc.rust-lang.org/std/sync/struct.Mutex.html)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Finally, there is one ["gotcha"](https://www.merriam-webster.com/dictionary/gotcha): **cell types**
 | 
				
			||||||
 | 
					(like [`RefCell`](https://doc.rust-lang.org/stable/core/cell/struct.RefCell.html)) look and behave
 | 
				
			||||||
 | 
					similarly, but **don't involve heap allocation**. The
 | 
				
			||||||
 | 
					[`core::cell` docs](https://doc.rust-lang.org/stable/core/cell/index.html) have more information.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					When a smart pointer is created, the data it is given is placed in heap memory and the location of
 | 
				
			||||||
 | 
					that data is recorded in the smart pointer. Once the smart pointer has determined it's safe to
 | 
				
			||||||
 | 
					deallocate that memory (when a `Box` has
 | 
				
			||||||
 | 
					[gone out of scope](https://doc.rust-lang.org/stable/std/boxed/index.html) or a reference count
 | 
				
			||||||
 | 
					[goes to zero](https://doc.rust-lang.org/alloc/rc/index.html)), the heap space is reclaimed. We can
 | 
				
			||||||
 | 
					prove these types use heap memory by looking at code:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					use std::rc::Rc;
 | 
				
			||||||
 | 
					use std::sync::Arc;
 | 
				
			||||||
 | 
					use std::borrow::Cow;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn my_box() {
 | 
				
			||||||
 | 
					    // Drop at assembly line 1640
 | 
				
			||||||
 | 
					    Box::new(0);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn my_rc() {
 | 
				
			||||||
 | 
					    // Drop at assembly line 1650
 | 
				
			||||||
 | 
					    Rc::new(0);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn my_arc() {
 | 
				
			||||||
 | 
					    // Drop at assembly line 1660
 | 
				
			||||||
 | 
					    Arc::new(0);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn my_cow() {
 | 
				
			||||||
 | 
					    // Drop at assembly line 1672 (this `Cow` is `Borrowed`; it only allocates once made owned)
 | 
				
			||||||
 | 
					    Cow::from("drop");
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/4AMQug)
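The "safe to deallocate" rule can also be observed without reading assembly. The sketch below assumes a hypothetical `Tracked` type whose `Drop` implementation bumps a counter. It shows a `Box` releasing its contents at the end of its scope and an `Rc` releasing them only when the last handle goes away.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical payload that records how many times it has been dropped.
struct Tracked<'a> {
    drops: &'a Cell<u32>,
}

impl<'a> Drop for Tracked<'a> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Returns the drop count after three points in time: the `Box` leaving scope,
/// the first `Rc` handle being dropped, and the second `Rc` handle being dropped.
pub fn drop_counts() -> (u32, u32, u32) {
    let drops = Cell::new(0);

    {
        let _boxed = Box::new(Tracked { drops: &drops });
        // `_boxed` goes out of scope here, so its heap contents are dropped.
    }
    let after_box = drops.get();

    let first = Rc::new(Tracked { drops: &drops });
    let second = Rc::clone(&first);
    // One handle remains, so the shared value must stay alive.
    drop(first);
    let after_first_rc = drops.get();
    // The reference count reaches zero and the value is dropped.
    drop(second);
    let after_second_rc = drops.get();

    (after_box, after_first_rc, after_second_rc)
}
```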
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					# Collections
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Collection types use heap memory because their contents have dynamic size; they will request more
 | 
				
			||||||
 | 
					memory [when needed](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.reserve), and can
 | 
				
			||||||
 | 
					[release memory](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.shrink_to_fit) when it's
 | 
				
			||||||
 | 
					no longer necessary. This dynamic property forces Rust to heap allocate everything they contain. In
 | 
				
			||||||
 | 
					a way, **collections are smart pointers for many objects at a time**. Common types that fall under
 | 
				
			||||||
 | 
					this umbrella are [`Vec`](https://doc.rust-lang.org/stable/alloc/vec/struct.Vec.html),
 | 
				
			||||||
 | 
					[`HashMap`](https://doc.rust-lang.org/stable/std/collections/struct.HashMap.html), and
 | 
				
			||||||
 | 
					[`String`](https://doc.rust-lang.org/stable/alloc/string/struct.String.html) (not
 | 
				
			||||||
 | 
					[`str`](https://doc.rust-lang.org/std/primitive.str.html)).
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					While collections store the objects they own in heap memory, _creating new collections will not
 | 
				
			||||||
 | 
					allocate on the heap_. This is a bit weird; if we call `Vec::new()`, the assembly shows a
 | 
				
			||||||
 | 
					corresponding call to `real_drop_in_place`:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					pub fn my_vec() {
 | 
				
			||||||
 | 
					    // Drop in place at line 481
 | 
				
			||||||
 | 
					    Vec::<u8>::new();
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/1WkNtC)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					But because the vector has no elements to manage, no calls to the allocator will ever be dispatched:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					use std::alloc::{GlobalAlloc, Layout, System};
 | 
				
			||||||
 | 
					use std::sync::atomic::{AtomicBool, Ordering};
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					fn main() {
 | 
				
			||||||
 | 
					    // Turn on panicking if we allocate on the heap
 | 
				
			||||||
 | 
					    DO_PANIC.store(true, Ordering::SeqCst);
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    // Interesting bit happens here
 | 
				
			||||||
 | 
					    let x: Vec<u8> = Vec::new();
 | 
				
			||||||
 | 
					    drop(x);
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    // Turn panicking back off, some deallocations occur
 | 
				
			||||||
 | 
					    // after main as well.
 | 
				
			||||||
 | 
					    DO_PANIC.store(false, Ordering::SeqCst);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					#[global_allocator]
 | 
				
			||||||
 | 
					static A: PanicAllocator = PanicAllocator;
 | 
				
			||||||
 | 
					static DO_PANIC: AtomicBool = AtomicBool::new(false);
 | 
				
			||||||
 | 
					struct PanicAllocator;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					unsafe impl GlobalAlloc for PanicAllocator {
 | 
				
			||||||
 | 
					    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
 | 
				
			||||||
 | 
					        if DO_PANIC.load(Ordering::SeqCst) {
 | 
				
			||||||
 | 
					            panic!("Unexpected allocation.");
 | 
				
			||||||
 | 
					        }
 | 
				
			||||||
 | 
					        System.alloc(layout)
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
 | 
				
			||||||
 | 
					        if DO_PANIC.load(Ordering::SeqCst) {
 | 
				
			||||||
 | 
					            panic!("Unexpected deallocation.");
 | 
				
			||||||
 | 
					        }
 | 
				
			||||||
 | 
					        System.dealloc(ptr, layout);
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					--
 | 
				
			||||||
 | 
					[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=831a297d176d015b1f9ace01ae416cc6)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Other standard library types follow the same behavior; make sure to check out
 | 
				
			||||||
 | 
					[`HashMap::new()`](https://doc.rust-lang.org/std/collections/hash_map/struct.HashMap.html#method.new),
 | 
				
			||||||
 | 
					and [`String::new()`](https://doc.rust-lang.org/std/string/struct.String.html#method.new).
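A lighter-weight check than a panicking allocator is to look at the reported capacity. This sketch (function names are illustrative) relies on the documented behavior that the `new()` constructors for these types start with zero capacity:

```rust
use std::collections::HashMap;

/// Capacities of freshly created, empty collections.
pub fn empty_capacities() -> (usize, usize, usize) {
    let v: Vec<u8> = Vec::new();
    let m: HashMap<u32, u32> = HashMap::new();
    let s = String::new();
    (v.capacity(), m.capacity(), s.capacity())
}

/// Capacity of a vector after its first element is pushed.
pub fn capacity_after_first_push() -> usize {
    let mut v: Vec<u8> = Vec::new();
    // The first push is what triggers the initial heap allocation.
    v.push(1);
    v.capacity()
}
```

The exact capacity chosen after the first push is an implementation detail, so it is better to rely only on it being at least one.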
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					# Heap Alternatives
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					While it is a bit strange to speak of the stack after spending time with the heap, it's worth
 | 
				
			||||||
 | 
					pointing out that some heap-allocated objects in Rust have stack-based counterparts provided by
 | 
				
			||||||
 | 
					other crates. If you have need of the functionality, but want to avoid allocating, there are
 | 
				
			||||||
 | 
					typically alternatives available.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					When it comes to some standard library smart pointers
 | 
				
			||||||
 | 
					([`RwLock`](https://doc.rust-lang.org/std/sync/struct.RwLock.html) and
 | 
				
			||||||
 | 
					[`Mutex`](https://doc.rust-lang.org/std/sync/struct.Mutex.html)), stack-based alternatives are
 | 
				
			||||||
 | 
					provided in crates like [parking_lot](https://crates.io/crates/parking_lot) and
 | 
				
			||||||
 | 
					[spin](https://crates.io/crates/spin). You can check out
 | 
				
			||||||
 | 
					[`lock_api::RwLock`](https://docs.rs/lock_api/0.1.5/lock_api/struct.RwLock.html),
 | 
				
			||||||
 | 
					[`lock_api::Mutex`](https://docs.rs/lock_api/0.1.5/lock_api/struct.Mutex.html), and
 | 
				
			||||||
 | 
					[`spin::Once`](https://mvdnes.github.io/rust-docs/spin-rs/spin/struct.Once.html) if you're in need
 | 
				
			||||||
 | 
					of synchronization primitives.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					[thread_id](https://crates.io/crates/thread-id) may be necessary if you're implementing an allocator
 | 
				
			||||||
 | 
					because [`thread::current().id()`](https://doc.rust-lang.org/std/thread/struct.ThreadId.html) uses a
 | 
				
			||||||
 | 
					[`thread_local!` structure](https://doc.rust-lang.org/stable/src/std/sys_common/thread_info.rs.html#17-36)
 | 
				
			||||||
 | 
					that needs heap allocation.
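To make the idea of a stack-based counterpart concrete, here is a deliberately small sketch of a fixed-capacity vector that stores its elements inline. `StackVec` is a hypothetical type written for this illustration, not an API from any of the crates mentioned above, and it only handles `u32` values to stay short.

```rust
/// Hypothetical fixed-capacity vector: elements live inside the struct,
/// so a local `StackVec` occupies stack memory only.
pub struct StackVec<const N: usize> {
    items: [u32; N],
    len: usize,
}

impl<const N: usize> StackVec<N> {
    pub fn new() -> Self {
        StackVec { items: [0; N], len: 0 }
    }

    /// Appends a value, or hands it back if the buffer is already full.
    pub fn push(&mut self, value: u32) -> Result<(), u32> {
        if self.len == N {
            return Err(value);
        }
        self.items[self.len] = value;
        self.len += 1;
        Ok(())
    }

    /// The elements pushed so far.
    pub fn as_slice(&self) -> &[u32] {
        &self.items[..self.len]
    }
}
```

The tradeoff is the one you'd expect: capacity is fixed at compile time, and running out of room becomes an error the caller must handle instead of a reallocation.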
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					# Tracing Allocators
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					When writing performance-sensitive code, there's no alternative to measuring your code. If you
 | 
				
			||||||
 | 
					didn't write a benchmark,
 | 
				
			||||||
 | 
					[you don't care about its performance](https://www.youtube.com/watch?v=2EWejmkKlxs&feature=youtu.be&t=263).
 | 
				
			||||||
 | 
					You should never rely on your instincts when
 | 
				
			||||||
 | 
					[a microsecond is an eternity](https://www.youtube.com/watch?v=NH1Tta7purM).
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Similarly, there's great work going on in Rust with allocators that keep track of what they're doing
 | 
				
			||||||
 | 
					(like [`alloc_counter`](https://crates.io/crates/alloc_counter)). When it comes to tracking heap
 | 
				
			||||||
 | 
					behavior, it's easy to make mistakes; please write tests and make sure you have tools to guard
 | 
				
			||||||
 | 
					against future issues.
 | 
				
			||||||
							
								
								
									
										258
									
								
								blog/2019-02-07-a-heaping-helping/index.mdx
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
									
								
							@ -0,0 +1,258 @@
 | 
				
			|||||||
 | 
					---
 | 
				
			||||||
 | 
					slug: 2019/02/a-heaping-helping
 | 
				
			||||||
 | 
					title: "Allocations in Rust: Dynamic memory"
 | 
				
			||||||
 | 
					date: 2019-02-07 12:00:00
 | 
				
			||||||
 | 
					authors: [bspeice]
 | 
				
			||||||
 | 
					tags: []
 | 
				
			||||||
 | 
					---
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Managing dynamic memory is hard. Some languages assume users will do it themselves (C, C++), and
 | 
				
			||||||
 | 
					some languages go to extreme lengths to protect users from themselves (Java, Python). In Rust, how
 | 
				
			||||||
 | 
					the language uses dynamic memory (also referred to as the **heap**) is a system called _ownership_.
 | 
				
			||||||
 | 
					And as the docs mention, ownership
 | 
				
			||||||
 | 
					[is Rust's most unique feature](https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html).
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The heap is used in two situations: when the compiler is unable to predict either the _total size of
 | 
				
			||||||
 | 
					memory needed_, or _how long the memory is needed for_, it allocates space in the heap.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					<!-- truncate -->
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					This happens
 | 
				
			||||||
 | 
					pretty frequently; if you want to download the Google home page, you won't know how large it is
 | 
				
			||||||
 | 
					until your program runs. And when you're finished with Google, we deallocate the memory so it can be
 | 
				
			||||||
 | 
					used to store other webpages. If you're interested in a slightly longer explanation of the heap,
 | 
				
			||||||
 | 
					check out
 | 
				
			||||||
 | 
					[The Stack and the Heap](https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html#the-stack-and-the-heap)
 | 
				
			||||||
 | 
					in Rust's documentation.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					We won't go into detail on how the heap is managed; the
 | 
				
			||||||
 | 
					[ownership documentation](https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html) does a
 | 
				
			||||||
 | 
					phenomenal job explaining both the "why" and "how" of memory management. Instead, we're going to
 | 
				
			||||||
 | 
					focus on understanding "when" heap allocations occur in Rust.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					To start off, take a guess for how many allocations happen in the program below:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					fn main() {}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					It's obviously a trick question; while no heap allocations occur as a result of that code, the setup
 | 
				
			||||||
 | 
					needed to call `main` does allocate on the heap. Here's a way to show it:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
 | 
				
			||||||
 | 
					use std::alloc::{GlobalAlloc, Layout, System};
 | 
				
			||||||
 | 
					use std::sync::atomic::{AtomicU64, Ordering};
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					static ALLOCATION_COUNT: AtomicU64 = AtomicU64::new(0);
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					struct CountingAllocator;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					unsafe impl GlobalAlloc for CountingAllocator {
 | 
				
			||||||
 | 
					    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
 | 
				
			||||||
 | 
					        ALLOCATION_COUNT.fetch_add(1, Ordering::SeqCst);
 | 
				
			||||||
 | 
					        System.alloc(layout)
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
 | 
				
			||||||
 | 
					        System.dealloc(ptr, layout);
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					#[global_allocator]
 | 
				
			||||||
 | 
					static A: CountingAllocator = CountingAllocator;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					fn main() {
 | 
				
			||||||
 | 
					    let x = ALLOCATION_COUNT.load(Ordering::SeqCst);
 | 
				
			||||||
 | 
					    println!("There were {} allocations before calling main!", x);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					--
 | 
				
			||||||
 | 
					[Rust Playground](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=fb5060025ba79fc0f906b65a4ef8eb8e)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					As of the time of writing, there are five allocations that happen before `main` is ever called.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					But when we want to understand more practically where heap allocation happens, we'll follow this
 | 
				
			||||||
 | 
					guide:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- Smart pointers hold their contents in the heap
 | 
				
			||||||
 | 
					- Collections are smart pointers for many objects at a time, and reallocate when they need to grow
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Finally, there are two "addendum" issues that are important to address when discussing Rust and the
 | 
				
			||||||
 | 
					heap:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- Non-heap alternatives to many standard library types are available.
 | 
				
			||||||
 | 
					- Special allocators to track memory behavior should be used to benchmark code.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## Smart pointers
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The first things to note are the "smart pointer" types. When you have data that must outlive the
 | 
				
			||||||
 | 
					scope in which it is declared, or your data is of unknown or dynamic size, you'll make use of these
 | 
				
			||||||
 | 
					types.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The term [smart pointer](https://en.wikipedia.org/wiki/Smart_pointer) comes from C++, and while it's
 | 
				
			||||||
 | 
					closely linked to a general design pattern of
 | 
				
			||||||
 | 
					["Resource Acquisition Is Initialization"](https://en.cppreference.com/w/cpp/language/raii), we'll
 | 
				
			||||||
 | 
					use it here specifically to describe objects that are responsible for managing ownership of data
 | 
				
			||||||
 | 
					allocated on the heap. The smart pointers available in the `alloc` crate should look mostly
 | 
				
			||||||
 | 
					familiar:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- [`Box`](https://doc.rust-lang.org/alloc/boxed/struct.Box.html)
 | 
				
			||||||
 | 
					- [`Rc`](https://doc.rust-lang.org/alloc/rc/struct.Rc.html)
 | 
				
			||||||
 | 
					- [`Arc`](https://doc.rust-lang.org/alloc/sync/struct.Arc.html)
 | 
				
			||||||
 | 
					- [`Cow`](https://doc.rust-lang.org/alloc/borrow/enum.Cow.html)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The [standard library](https://doc.rust-lang.org/std/) also defines some smart pointers to manage
 | 
				
			||||||
 | 
					heap objects, though there are more than can be covered here. Some examples are:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- [`RwLock`](https://doc.rust-lang.org/std/sync/struct.RwLock.html)
 | 
				
			||||||
 | 
					- [`Mutex`](https://doc.rust-lang.org/std/sync/struct.Mutex.html)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Finally, there is one ["gotcha"](https://www.merriam-webster.com/dictionary/gotcha): **cell types**
 | 
				
			||||||
 | 
					(like [`RefCell`](https://doc.rust-lang.org/stable/core/cell/struct.RefCell.html)) look and behave
 | 
				
			||||||
 | 
					similarly, but **don't involve heap allocation**. The
 | 
				
			||||||
 | 
					[`core::cell` docs](https://doc.rust-lang.org/stable/core/cell/index.html) have more information.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					When a smart pointer is created, the data it is given is placed in heap memory and the location of
 | 
				
			||||||
 | 
					that data is recorded in the smart pointer. Once the smart pointer has determined it's safe to
 | 
				
			||||||
 | 
					deallocate that memory (when a `Box` has
 | 
				
			||||||
 | 
					[gone out of scope](https://doc.rust-lang.org/stable/std/boxed/index.html) or a reference count
 | 
				
			||||||
 | 
					[goes to zero](https://doc.rust-lang.org/alloc/rc/index.html)), the heap space is reclaimed. We can
 | 
				
			||||||
 | 
					prove these types use heap memory by looking at code:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					use std::rc::Rc;
 | 
				
			||||||
 | 
					use std::sync::Arc;
 | 
				
			||||||
 | 
					use std::borrow::Cow;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn my_box() {
 | 
				
			||||||
 | 
					    // Drop at assembly line 1640
 | 
				
			||||||
 | 
					    Box::new(0);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn my_rc() {
 | 
				
			||||||
 | 
					    // Drop at assembly line 1650
 | 
				
			||||||
 | 
					    Rc::new(0);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn my_arc() {
 | 
				
			||||||
 | 
					    // Drop at assembly line 1660
 | 
				
			||||||
 | 
					    Arc::new(0);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn my_cow() {
 | 
				
			||||||
 | 
					    // Drop at assembly line 1672
 | 
				
			||||||
 | 
					    Cow::from("drop");
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/4AMQug)
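
If reading assembly isn't appealing, a counting allocator can make a similar point at runtime. The sketch below is an illustration under stated assumptions, not the code behind the Compiler Explorer link: `CountingAllocator`, `ALLOCATIONS`, and `allocations_during` are hypothetical names, and it relies on a const-initialized `thread_local!` counter and `std::hint::black_box`, both of which need a reasonably recent stable compiler.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::borrow::Cow;
use std::cell::{Cell, RefCell};
use std::hint::black_box;
use std::rc::Rc;
use std::sync::Arc;

thread_local! {
    // Per-thread counter (assumption: const-initialized and without a
    // destructor, so touching it inside the allocator does not allocate).
    static ALLOCATIONS: Cell<usize> = const { Cell::new(0) };
}

struct CountingAllocator;

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // Ignore the (unexpected) case where the counter is unavailable.
        let _ = ALLOCATIONS.try_with(|count| count.set(count.get() + 1));
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static ALLOCATOR: CountingAllocator = CountingAllocator;

/// Hypothetical helper: allocations made on this thread while `f` runs.
fn allocations_during<T>(f: impl FnOnce() -> T) -> usize {
    let before = ALLOCATIONS.with(|count| count.get());
    // `black_box` asks the optimizer not to elide the value or its allocation.
    let value = black_box(f());
    let after = ALLOCATIONS.with(|count| count.get());
    drop(value);
    after - before
}

fn main() {
    println!("Box:     {}", allocations_during(|| Box::new(0u64)));
    println!("Rc:      {}", allocations_during(|| Rc::new(0u64)));
    println!("Arc:     {}", allocations_during(|| Arc::new(0u64)));
    // Cell types store their contents inline.
    println!("RefCell: {}", allocations_during(|| RefCell::new(0u64)));
    // A `Cow` built from a `&str` only borrows; it allocates once it owns data.
    println!("Cow (borrowed): {}", allocations_during(|| Cow::<str>::from("drop")));
    println!("Cow (owned):    {}", allocations_during(|| Cow::<str>::from("drop").into_owned()));
}
```

With this setup, `Box`, `Rc`, and `Arc` should each report a single allocation and `RefCell` none. `Cow` is the odd one out: it should only touch the heap when it holds (or is converted into) owned data.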
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## Collections
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Collection types use heap memory because their contents have dynamic size; they will request more
 | 
				
			||||||
 | 
					memory [when needed](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.reserve), and can
 | 
				
			||||||
 | 
					[release memory](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.shrink_to_fit) when it's
 | 
				
			||||||
 | 
					no longer necessary. This dynamic property forces Rust to heap allocate everything they contain. In
 | 
				
			||||||
 | 
					a way, **collections are smart pointers for many objects at a time**. Common types that fall under
 | 
				
			||||||
 | 
					this umbrella are [`Vec`](https://doc.rust-lang.org/stable/alloc/vec/struct.Vec.html),
 | 
				
			||||||
 | 
					[`HashMap`](https://doc.rust-lang.org/stable/std/collections/struct.HashMap.html), and
 | 
				
			||||||
 | 
					[`String`](https://doc.rust-lang.org/stable/alloc/string/struct.String.html) (not
 | 
				
			||||||
 | 
					[`str`](https://doc.rust-lang.org/std/primitive.str.html)).
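
To make "request more memory when needed" a bit more concrete, here is a minimal sketch that watches a vector's capacity change as it grows and shrinks. The helper `capacity_steps` is a hypothetical name introduced for illustration, and the exact growth sequence is an implementation detail of the standard library that may differ between releases.

```rust
/// Hypothetical helper: push `n` bytes and record each distinct capacity the
/// vector passes through. A capacity change means the buffer was reallocated.
fn capacity_steps(n: usize) -> Vec<usize> {
    let mut data: Vec<u8> = Vec::new();
    let mut steps = vec![data.capacity()];
    for i in 0..n {
        data.push(i as u8);
        if data.capacity() != *steps.last().unwrap() {
            steps.push(data.capacity());
        }
    }
    steps
}

fn main() {
    // Every entry after the first corresponds to a trip to the allocator.
    println!("capacities while pushing 100 bytes: {:?}", capacity_steps(100));

    let mut data: Vec<u8> = Vec::new();
    data.reserve(100); // ask for room for at least 100 elements up front
    data.extend_from_slice(&[1, 2, 3]);
    let reserved = data.capacity();
    data.shrink_to_fit(); // hand unused space back to the allocator
    println!("reserved {}, after shrink_to_fit {}", reserved, data.capacity());
}
```

Reserving up front trades a possibly oversized buffer for fewer reallocations, which is usually the right call when the final size is roughly known.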
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					While collections store the objects they own in heap memory, _creating new collections will not
 | 
				
			||||||
 | 
					allocate on the heap_. This is a bit weird; if we call `Vec::new()`, the assembly shows a
 | 
				
			||||||
 | 
					corresponding call to `real_drop_in_place`:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					pub fn my_vec() {
 | 
				
			||||||
 | 
					    // Drop in place at line 481
 | 
				
			||||||
 | 
					    Vec::<u8>::new();
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/1WkNtC)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					But because the vector has no elements to manage, no calls to the allocator will ever be dispatched:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					use std::alloc::{GlobalAlloc, Layout, System};
 | 
				
			||||||
 | 
					use std::sync::atomic::{AtomicBool, Ordering};
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					fn main() {
 | 
				
			||||||
 | 
					    // Turn on panicking if we allocate on the heap
 | 
				
			||||||
 | 
					    DO_PANIC.store(true, Ordering::SeqCst);
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    // Interesting bit happens here
 | 
				
			||||||
 | 
					    let x: Vec<u8> = Vec::new();
 | 
				
			||||||
 | 
					    drop(x);
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    // Turn panicking back off, some deallocations occur
 | 
				
			||||||
 | 
					    // after main as well.
 | 
				
			||||||
 | 
					    DO_PANIC.store(false, Ordering::SeqCst);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					#[global_allocator]
 | 
				
			||||||
 | 
					static A: PanicAllocator = PanicAllocator;
 | 
				
			||||||
 | 
					static DO_PANIC: AtomicBool = AtomicBool::new(false);
 | 
				
			||||||
 | 
					struct PanicAllocator;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					unsafe impl GlobalAlloc for PanicAllocator {
 | 
				
			||||||
 | 
					    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
 | 
				
			||||||
 | 
					        if DO_PANIC.load(Ordering::SeqCst) {
 | 
				
			||||||
 | 
					            panic!("Unexpected allocation.");
 | 
				
			||||||
 | 
					        }
 | 
				
			||||||
 | 
					        System.alloc(layout)
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
 | 
				
			||||||
 | 
					        if DO_PANIC.load(Ordering::SeqCst) {
 | 
				
			||||||
 | 
					            panic!("Unexpected deallocation.");
 | 
				
			||||||
 | 
					        }
 | 
				
			||||||
 | 
					        System.dealloc(ptr, layout);
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					--
 | 
				
			||||||
 | 
					[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=831a297d176d015b1f9ace01ae416cc6)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Other standard library types follow the same behavior; make sure to check out
 | 
				
			||||||
 | 
					[`HashMap::new()`](https://doc.rust-lang.org/std/collections/hash_map/struct.HashMap.html#method.new)
 | 
				
			||||||
 | 
					and [`String::new()`](https://doc.rust-lang.org/std/string/struct.String.html#method.new).
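
A quick way to check this without a custom allocator is to look at `capacity()`: a capacity of zero means no buffer has been requested yet. The sketch below uses two hypothetical helper functions (`empty_capacities` and `capacities_after_first_insert`) and only relies on the documented behavior of the `new()` constructors.

```rust
use std::collections::HashMap;

/// Capacities of freshly constructed, empty collections.
fn empty_capacities() -> (usize, usize, usize) {
    let bytes: Vec<u8> = Vec::new();
    let text = String::new();
    let lookup: HashMap<u32, u32> = HashMap::new();
    (bytes.capacity(), text.capacity(), lookup.capacity())
}

/// Capacities after storing a single element in each collection.
fn capacities_after_first_insert() -> (usize, usize, usize) {
    let mut bytes: Vec<u8> = Vec::new();
    let mut text = String::new();
    let mut lookup: HashMap<u32, u32> = HashMap::new();
    bytes.push(1);
    text.push('a');
    lookup.insert(1, 1);
    (bytes.capacity(), text.capacity(), lookup.capacity())
}

fn main() {
    // Nothing has been stored yet, so no heap buffer exists.
    println!("empty: {:?}", empty_capacities());
    // The first element forces each collection to allocate.
    println!("after one insert: {:?}", capacities_after_first_insert());
}
```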
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## Heap Alternatives
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					While it is a bit strange to speak of the stack after spending time with the heap, it's worth
 | 
				
			||||||
 | 
					pointing out that some heap-allocated objects in Rust have stack-based counterparts provided by
 | 
				
			||||||
 | 
					other crates. If you need the functionality but want to avoid allocating, there are
 | 
				
			||||||
 | 
					typically alternatives available.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					When it comes to some standard library smart pointers
 | 
				
			||||||
 | 
					([`RwLock`](https://doc.rust-lang.org/std/sync/struct.RwLock.html) and
 | 
				
			||||||
 | 
					[`Mutex`](https://doc.rust-lang.org/std/sync/struct.Mutex.html)), stack-based alternatives are
 | 
				
			||||||
 | 
					provided in crates like [parking_lot](https://crates.io/crates/parking_lot) and
 | 
				
			||||||
 | 
					[spin](https://crates.io/crates/spin). You can check out
 | 
				
			||||||
 | 
					[`lock_api::RwLock`](https://docs.rs/lock_api/0.1.5/lock_api/struct.RwLock.html),
 | 
				
			||||||
 | 
					[`lock_api::Mutex`](https://docs.rs/lock_api/0.1.5/lock_api/struct.Mutex.html), and
 | 
				
			||||||
 | 
					[`spin::Once`](https://mvdnes.github.io/rust-docs/spin-rs/spin/struct.Once.html) if you're in need
 | 
				
			||||||
 | 
					of synchronization primitives.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					[thread_id](https://crates.io/crates/thread-id) may be necessary if you're implementing an allocator
 | 
				
			||||||
 | 
					because [`thread::current().id()`](https://doc.rust-lang.org/std/thread/struct.ThreadId.html) uses a
 | 
				
			||||||
 | 
					[`thread_local!` structure](https://doc.rust-lang.org/stable/src/std/sys_common/thread_info.rs.html#17-36)
 | 
				
			||||||
 | 
					that needs heap allocation.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## Tracing Allocators
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					When writing performance-sensitive code, there's no alternative to measuring your code. If you
 | 
				
			||||||
 | 
					didn't write a benchmark,
 | 
				
			||||||
 | 
					[you don't care about its performance](https://www.youtube.com/watch?v=2EWejmkKlxs&feature=youtu.be&t=263).
 | 
				
			||||||
 | 
					You should never rely on your instincts when
 | 
				
			||||||
 | 
					[a microsecond is an eternity](https://www.youtube.com/watch?v=NH1Tta7purM).
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Similarly, there's great work going on in Rust with allocators that keep track of what they're doing
 | 
				
			||||||
 | 
					(like [`alloc_counter`](https://crates.io/crates/alloc_counter)). When it comes to tracking heap
 | 
				
			||||||
 | 
					behavior, it's easy to make mistakes; please write tests and make sure you have tools to guard
 | 
				
			||||||
 | 
					against future issues.
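
As a rough idea of what such a guard could look like, here is a hand-rolled sketch that asserts a hot path stays off the heap. It is not the `alloc_counter` API; `TracingAllocator`, `allocations_in`, `checksum`, and `checksum_with_copy` are all hypothetical names, and the per-thread counter assumes a toolchain with const-initialized `thread_local!` and `std::hint::black_box`.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::cell::Cell;
use std::hint::black_box;

thread_local! {
    // Per-thread so that other threads (e.g. a test harness) don't add noise.
    static ALLOCATIONS: Cell<usize> = const { Cell::new(0) };
}

struct TracingAllocator;

unsafe impl GlobalAlloc for TracingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let _ = ALLOCATIONS.try_with(|count| count.set(count.get() + 1));
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static ALLOCATOR: TracingAllocator = TracingAllocator;

/// Hypothetical helper: allocations made on this thread while `f` runs.
fn allocations_in<T>(f: impl FnOnce() -> T) -> usize {
    let before = ALLOCATIONS.with(|count| count.get());
    let result = black_box(f());
    let after = ALLOCATIONS.with(|count| count.get());
    drop(result);
    after - before
}

/// Code under test: sums a slice without needing the heap.
fn checksum(values: &[u64]) -> u64 {
    values.iter().fold(0u64, |acc, v| acc.wrapping_add(*v))
}

/// A variant that copies its input first, and therefore allocates.
fn checksum_with_copy(values: &[u64]) -> u64 {
    // `black_box` keeps the optimizer from removing the copy.
    let copy = black_box(values.to_vec());
    checksum(&copy)
}

fn main() {
    let values = [1u64, 2, 3, 4];
    // The guard: fail loudly if the hot path ever starts allocating.
    assert_eq!(allocations_in(|| checksum(&values)), 0, "hot path must not allocate");
    println!("copying variant: {} allocation(s)", allocations_in(|| checksum_with_copy(&values)));
}
```

A check like this is cheap to keep in a test suite, and it turns an accidental `to_vec()` or `clone()` into a failing test rather than a surprise in production.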
 | 
				
			||||||
							
								
								
									
										148
									
								
								blog/2019-02-08-compiler-optimizations/_article.md
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
									
								
							@ -0,0 +1,148 @@
 | 
				
			|||||||
 | 
					---
 | 
				
			||||||
 | 
					layout: post
 | 
				
			||||||
 | 
					title: "Compiler Optimizations: What It's Done Lately"
 | 
				
			||||||
 | 
					description: "A lot. The answer is a lot."
 | 
				
			||||||
 | 
					category:
 | 
				
			||||||
 | 
					tags: [rust, understanding-allocations]
 | 
				
			||||||
 | 
					---
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					**Update 2019-02-10**: When debugging a
 | 
				
			||||||
 | 
					[related issue](https://gitlab.com/sio4/code/alloc-counter/issues/1), it was discovered that the
 | 
				
			||||||
 | 
					original code worked because LLVM optimized out the entire function, rather than just the allocation
 | 
				
			||||||
 | 
					segments. The code has been updated with proper use of
 | 
				
			||||||
 | 
					[`read_volatile`](https://doc.rust-lang.org/std/ptr/fn.read_volatile.html), and a previous section
 | 
				
			||||||
 | 
					on vector capacity has been removed.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					---
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Up to this point, we've been discussing memory usage in the Rust language by focusing on simple
 | 
				
			||||||
 | 
					rules that are mostly right for small chunks of code. We've spent time showing how those rules work
 | 
				
			||||||
 | 
					themselves out in practice, and become familiar with reading the assembly code needed to see each
 | 
				
			||||||
 | 
					memory type (global, stack, heap) in action.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Throughout the series so far, we've put a handicap on the code. In the name of consistent and
 | 
				
			||||||
 | 
					understandable results, we've asked the compiler to pretty please leave the training wheels on. Now
 | 
				
			||||||
 | 
					is the time where we throw out all the rules and take off the kid gloves. As it turns out, both the
 | 
				
			||||||
 | 
					Rust compiler and the LLVM optimizers are incredibly sophisticated, and we'll step back and let them
 | 
				
			||||||
 | 
					do their job.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Similar to
 | 
				
			||||||
 | 
					["What Has My Compiler Done For Me Lately?"](https://www.youtube.com/watch?v=bSkpMdDe4g4), we're
 | 
				
			||||||
 | 
					focusing on interesting things the Rust language (and LLVM!) can do with memory management. We'll
 | 
				
			||||||
 | 
					still be looking at assembly code to understand what's going on, but it's important to mention
 | 
				
			||||||
 | 
					again: **please use automated tools like [alloc-counter](https://crates.io/crates/alloc_counter) to
 | 
				
			||||||
 | 
					double-check memory behavior if it's something you care about**. It's far too easy to misread
 | 
				
			||||||
 | 
					assembly in large code sections; you should always verify behavior if you care about memory usage.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The guiding principle as we move forward is this: _optimizing compilers won't produce worse programs
 | 
				
			||||||
 | 
					than we started with._ There won't be any situations where stack allocations get moved to heap
 | 
				
			||||||
 | 
					allocations. There will, however, be an opera of optimization.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					# The Case of the Disappearing Box
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Our first optimization comes when LLVM can reason that the lifetime of an object is sufficiently
 | 
				
			||||||
 | 
					short that heap allocations aren't necessary. In these cases, LLVM will move the allocation to the
 | 
				
			||||||
 | 
					stack instead! The way this interacts with `#[inline]` attributes is a bit opaque, but the important
 | 
				
			||||||
 | 
					part is that LLVM can sometimes do better than the baseline Rust language:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					use std::alloc::{GlobalAlloc, Layout, System};
 | 
				
			||||||
 | 
					use std::sync::atomic::{AtomicBool, Ordering};
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn cmp(x: u32) {
 | 
				
			||||||
 | 
					    // Turn on panicking if we allocate on the heap
 | 
				
			||||||
 | 
					    DO_PANIC.store(true, Ordering::SeqCst);
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    // The compiler is able to see through the constant `Box`
 | 
				
			||||||
 | 
					    // and directly compare `x` to 24 - assembly line 73
 | 
				
			||||||
 | 
					    let y = Box::new(24);
 | 
				
			||||||
 | 
					    let equals = x == *y;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    // This call to drop is eliminated
 | 
				
			||||||
 | 
					    drop(y);
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    // Need to mark the comparison result as volatile so that
 | 
				
			||||||
 | 
					    // LLVM doesn't strip out all the code. If `y` is marked
 | 
				
			||||||
 | 
					    // volatile instead, allocation will be forced.
 | 
				
			||||||
 | 
					    unsafe { std::ptr::read_volatile(&equals) };
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    // Turn off panicking, as there are some deallocations
 | 
				
			||||||
 | 
					    // when we exit main.
 | 
				
			||||||
 | 
					    DO_PANIC.store(false, Ordering::SeqCst);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					fn main() {
 | 
				
			||||||
 | 
					    cmp(12)
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					#[global_allocator]
 | 
				
			||||||
 | 
					static A: PanicAllocator = PanicAllocator;
 | 
				
			||||||
 | 
					static DO_PANIC: AtomicBool = AtomicBool::new(false);
 | 
				
			||||||
 | 
					struct PanicAllocator;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					unsafe impl GlobalAlloc for PanicAllocator {
 | 
				
			||||||
 | 
					    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
 | 
				
			||||||
 | 
					        if DO_PANIC.load(Ordering::SeqCst) {
 | 
				
			||||||
 | 
					            panic!("Unexpected allocation.");
 | 
				
			||||||
 | 
					        }
 | 
				
			||||||
 | 
					        System.alloc(layout)
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
 | 
				
			||||||
 | 
					        if DO_PANIC.load(Ordering::SeqCst) {
 | 
				
			||||||
 | 
					            panic!("Unexpected deallocation.");
 | 
				
			||||||
 | 
					        }
 | 
				
			||||||
 | 
					        System.dealloc(ptr, layout);
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/BZ_Yp3)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					[Rust Playground](https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=4a765f753183d5b919f62c71d2109d5d)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					# Dr. Array or: How I Learned to Love the Optimizer
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Finally, this isn't so much about LLVM figuring out different memory behavior, but LLVM stripping
 | 
				
			||||||
 | 
					out code that doesn't do anything. Optimizations of this type have a lot of nuance to them; if
 | 
				
			||||||
 | 
					you're not careful, they can make your benchmarks look
 | 
				
			||||||
 | 
					[impossibly good](https://www.youtube.com/watch?v=nXaxk27zwlk&feature=youtu.be&t=1199). In Rust, the
 | 
				
			||||||
 | 
					`black_box` function (implemented in both
 | 
				
			||||||
 | 
					[`libtest`](https://doc.rust-lang.org/1.1.0/test/fn.black_box.html) and
 | 
				
			||||||
 | 
					[`criterion`](https://docs.rs/criterion/0.2.10/criterion/fn.black_box.html)) will tell the compiler
 | 
				
			||||||
 | 
					to disable this kind of optimization. But if you let LLVM remove unnecessary code, you can end up
 | 
				
			||||||
 | 
					running programs that previously caused errors:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					#[derive(Default)]
 | 
				
			||||||
 | 
					struct TwoFiftySix {
 | 
				
			||||||
 | 
					    _a: [u64; 32]
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					#[derive(Default)]
 | 
				
			||||||
 | 
					struct EightK {
 | 
				
			||||||
 | 
					    _a: [TwoFiftySix; 32]
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					#[derive(Default)]
 | 
				
			||||||
 | 
					struct TwoFiftySixK {
 | 
				
			||||||
 | 
					    _a: [EightK; 32]
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					#[derive(Default)]
 | 
				
			||||||
 | 
					struct EightM {
 | 
				
			||||||
 | 
					    _a: [TwoFiftySixK; 32]
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn main() {
 | 
				
			||||||
 | 
					    // Normally this blows up because we can't reserve size on the stack
 | 
				
			||||||
 | 
					    // for the `EightM` struct. But because the compiler notices we
 | 
				
			||||||
 | 
					    // never do anything with `_x`, it optimizes out the stack storage
 | 
				
			||||||
 | 
					    // and the program completes successfully.
 | 
				
			||||||
 | 
					    let _x = EightM::default();
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/daHn7P)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					[Rust Playground](https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=4c253bf26072119896ab93c6ef064dc0)
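
For the benchmark side of this, here is a minimal sketch of how `black_box` keeps work from being optimized away. It assumes `std::hint::black_box` (available on stable Rust since 1.66) rather than the `libtest` or `criterion` versions linked above, and `sum_of_squares` is a hypothetical workload chosen for illustration. The standard library documents `black_box` as best-effort, so treat the timing as a hint rather than a guarantee.

```rust
use std::hint::black_box;
use std::time::Instant;

/// Hypothetical workload: cheap enough that an optimizer could fold it away
/// entirely when both the input and the result are visible to it.
fn sum_of_squares(n: u64) -> u64 {
    (0..n).map(|i| i.wrapping_mul(i)).fold(0u64, u64::wrapping_add)
}

fn main() {
    let start = Instant::now();
    for _ in 0..1_000 {
        // `black_box` on the input hides the constant from the optimizer;
        // `black_box` on the output marks the result as used.
        black_box(sum_of_squares(black_box(10_000)));
    }
    println!("1000 iterations took {:?}", start.elapsed());
}
```

Dropping both `black_box` calls and building in release mode is a simple way to see how much of a benchmark the optimizer is willing to delete.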
 | 
				
			||||||
							
								
								
									
										149
									
								
								blog/2019-02-08-compiler-optimizations/index.mdx
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
									
								
							@ -0,0 +1,149 @@
 | 
				
			|||||||
 | 
					---
 | 
				
			||||||
 | 
					title: "Allocations in Rust: Compiler optimizations"
 | 
				
			||||||
 | 
					description: "A lot. The answer is a lot."
 | 
				
			||||||
 | 
					date: 2019-02-08 12:00:00
 | 
				
			||||||
 | 
					last_updated:
 | 
				
			||||||
 | 
					    date: 2019-02-10 12:00:00
 | 
				
			||||||
 | 
					tags: []
 | 
				
			||||||
 | 
					---
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Up to this point, we've been discussing memory usage in the Rust language by focusing on simple
 | 
				
			||||||
 | 
					rules that are mostly right for small chunks of code. We've spent time showing how those rules work
 | 
				
			||||||
 | 
					themselves out in practice, and become familiar with reading the assembly code needed to see each
 | 
				
			||||||
 | 
					memory type (global, stack, heap) in action.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Throughout the series so far, we've put a handicap on the code. In the name of consistent and
 | 
				
			||||||
 | 
					understandable results, we've asked the compiler to pretty please leave the training wheels on. Now
 | 
				
			||||||
 | 
					is the time where we throw out all the rules and take off the kid gloves. As it turns out, both the
 | 
				
			||||||
 | 
					Rust compiler and the LLVM optimizers are incredibly sophisticated, and we'll step back and let them
 | 
				
			||||||
 | 
					do their job.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					<!-- truncate -->
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Similar to
 | 
				
			||||||
 | 
					["What Has My Compiler Done For Me Lately?"](https://www.youtube.com/watch?v=bSkpMdDe4g4), we're
 | 
				
			||||||
 | 
					focusing on interesting things the Rust language (and LLVM!) can do with memory management. We'll
 | 
				
			||||||
 | 
					still be looking at assembly code to understand what's going on, but it's important to mention
 | 
				
			||||||
 | 
					again: **please use automated tools like [alloc-counter](https://crates.io/crates/alloc_counter) to
 | 
				
			||||||
 | 
					double-check memory behavior if it's something you care about**. It's far too easy to misread
 | 
				
			||||||
 | 
					assembly in large code sections, so you should always verify behavior if you care about memory usage.
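
As a rough illustration of what "verifying behavior" can look like without reading assembly, here is a minimal sketch built only on the standard library. It is not the `alloc_counter` API; the names `CountingAllocator`, `ALLOCATIONS`, and `count_allocations` are made up for this example, and the counter is global, so allocations made by other threads during the measurement are counted too.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical allocator wrapper that counts every heap allocation
/// before handing the request to the system allocator.
pub struct CountingAllocator;

static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::SeqCst);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAllocator = CountingAllocator;

/// Runs `f` and returns its result together with the number of heap
/// allocations observed (process-wide) while it was running.
pub fn count_allocations<T>(f: impl FnOnce() -> T) -> (T, usize) {
    let before = ALLOCATIONS.load(Ordering::SeqCst);
    let result = f();
    let after = ALLOCATIONS.load(Ordering::SeqCst);
    (result, after - before)
}
```

Calling `count_allocations(|| some_function())` from `main` or from a test gives a number you can assert on, which is much harder to misread than a page of assembly. Only one `#[global_allocator]` may exist per program, so this sketch cannot be combined with the `PanicAllocator` used later in this post.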
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					The guiding principle as we move forward is this: _optimizing compilers won't produce worse programs
 | 
				
			||||||
 | 
					than we started with._ There won't be any situations where stack allocations get moved to heap
 | 
				
			||||||
 | 
					allocations. There will, however, be an opera of optimization.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					**Update 2019-02-10**: When debugging a
 | 
				
			||||||
 | 
					[related issue](https://gitlab.com/sio4/code/alloc-counter/issues/1), it was discovered that the
 | 
				
			||||||
 | 
					original code worked because LLVM optimized out the entire function, rather than just the allocation
 | 
				
			||||||
 | 
					segments. The code has been updated with proper use of
 | 
				
			||||||
 | 
					[`read_volatile`](https://doc.rust-lang.org/std/ptr/fn.read_volatile.html), and a previous section
 | 
				
			||||||
 | 
					on vector capacity has been removed.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## The Case of the Disappearing Box
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Our first optimization comes when LLVM can reason that the lifetime of an object is sufficiently
 | 
				
			||||||
 | 
					short that heap allocations aren't necessary. In these cases, LLVM will move the allocation to the
 | 
				
			||||||
 | 
					stack instead! The way this interacts with `#[inline]` attributes is a bit opaque, but the important
 | 
				
			||||||
 | 
					part is that LLVM can sometimes do better than the baseline Rust language:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					use std::alloc::{GlobalAlloc, Layout, System};
 | 
				
			||||||
 | 
					use std::sync::atomic::{AtomicBool, Ordering};
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn cmp(x: u32) {
 | 
				
			||||||
 | 
					    // Turn on panicking if we allocate on the heap
 | 
				
			||||||
 | 
					    DO_PANIC.store(true, Ordering::SeqCst);
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    // The compiler is able to see through the constant `Box`
 | 
				
			||||||
 | 
					    // and directly compare `x` to 24 - assembly line 73
 | 
				
			||||||
 | 
					    let y = Box::new(24);
 | 
				
			||||||
 | 
					    let equals = x == *y;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    // This call to drop is eliminated
 | 
				
			||||||
 | 
					    drop(y);
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    // Need to mark the comparison result as volatile so that
 | 
				
			||||||
 | 
					    // LLVM doesn't strip out all the code. If `y` is marked
 | 
				
			||||||
 | 
					    // volatile instead, allocation will be forced.
 | 
				
			||||||
 | 
					    unsafe { std::ptr::read_volatile(&equals) };
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    // Turn off panicking, as there are some deallocations
 | 
				
			||||||
 | 
					    // when we exit main.
 | 
				
			||||||
 | 
					    DO_PANIC.store(false, Ordering::SeqCst);
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					fn main() {
 | 
				
			||||||
 | 
					    cmp(12)
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					#[global_allocator]
 | 
				
			||||||
 | 
					static A: PanicAllocator = PanicAllocator;
 | 
				
			||||||
 | 
					static DO_PANIC: AtomicBool = AtomicBool::new(false);
 | 
				
			||||||
 | 
					struct PanicAllocator;
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					unsafe impl GlobalAlloc for PanicAllocator {
 | 
				
			||||||
 | 
					    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
 | 
				
			||||||
 | 
					        if DO_PANIC.load(Ordering::SeqCst) {
 | 
				
			||||||
 | 
					            panic!("Unexpected allocation.");
 | 
				
			||||||
 | 
					        }
 | 
				
			||||||
 | 
					        System.alloc(layout)
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
 | 
				
			||||||
 | 
					        if DO_PANIC.load(Ordering::SeqCst) {
 | 
				
			||||||
 | 
					            panic!("Unexpected deallocation.");
 | 
				
			||||||
 | 
					        }
 | 
				
			||||||
 | 
					        System.dealloc(ptr, layout);
 | 
				
			||||||
 | 
					    }
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/BZ_Yp3)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=4a765f753183d5b919f62c71d2109d5d)
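
For contrast, here is a small sketch of the difference between a box that stays local and one that escapes. The function names are invented for illustration, and the comments describe what an optimizing build is allowed to do, not what it is guaranteed to do; check the generated code or count allocations if the distinction matters to you.

```rust
/// The box never leaves this function, so an optimizing build is free to
/// drop the allocation and compare `x` against 24 directly.
pub fn compare_local(x: u32) -> bool {
    let y = Box::new(24);
    x == *y
}

/// The box is handed back to the caller, so its contents must outlive this
/// stack frame. Unless the call is inlined and the box is dropped right
/// away, the heap allocation generally has to stay.
pub fn compare_escaping(x: u32) -> (bool, Box<u32>) {
    let y = Box::new(24);
    (x == *y, y)
}
```

Both functions compute the same comparison; the only difference is whether the lifetime of the box is short enough for the optimizer to see all of it.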
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					## Dr. Array or: how I learned to love the optimizer
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					Finally, this isn't so much about LLVM figuring out different memory behavior, but LLVM stripping
 | 
				
			||||||
 | 
					out code that doesn't do anything. Optimizations of this type have a lot of nuance to them; if
 | 
				
			||||||
 | 
					you're not careful, they can make your benchmarks look
 | 
				
			||||||
 | 
					[impossibly good](https://www.youtube.com/watch?v=nXaxk27zwlk&feature=youtu.be&t=1199). In Rust, the
 | 
				
			||||||
 | 
					`black_box` function (implemented in both
 | 
				
			||||||
 | 
					[`libtest`](https://doc.rust-lang.org/1.1.0/test/fn.black_box.html) and
 | 
				
			||||||
 | 
					[`criterion`](https://docs.rs/criterion/0.2.10/criterion/fn.black_box.html)) will tell the compiler
 | 
				
			||||||
 | 
					to disable this kind of optimization. But if you let LLVM remove unnecessary code, you can end up
 | 
				
			||||||
 | 
					running programs that previously caused errors:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					```rust
 | 
				
			||||||
 | 
					#[derive(Default)]
 | 
				
			||||||
 | 
					struct TwoFiftySix {
 | 
				
			||||||
 | 
					    _a: [u64; 32]
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					#[derive(Default)]
 | 
				
			||||||
 | 
					struct EightK {
 | 
				
			||||||
 | 
					    _a: [TwoFiftySix; 32]
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					#[derive(Default)]
 | 
				
			||||||
 | 
					struct TwoFiftySixK {
 | 
				
			||||||
 | 
					    _a: [EightK; 32]
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					#[derive(Default)]
 | 
				
			||||||
 | 
					struct EightM {
 | 
				
			||||||
 | 
					    _a: [TwoFiftySixK; 32]
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					pub fn main() {
 | 
				
			||||||
 | 
					    // Normally this blows up because we can't reserve space on the stack
 | 
				
			||||||
 | 
					    // for the `EightM` struct. But because the compiler notices we
 | 
				
			||||||
 | 
					    // never do anything with `_x`, it optimizes out the stack storage
 | 
				
			||||||
 | 
					    // and the program completes successfully.
 | 
				
			||||||
 | 
					    let _x = EightM::default();
 | 
				
			||||||
 | 
					}
 | 
				
			||||||
 | 
					```
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Compiler Explorer](https://godbolt.org/z/daHn7P)
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Rust Playground](https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=4c253bf26072119896ab93c6ef064dc0)
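
To make the `black_box` idea concrete, here is a minimal sketch. It assumes a toolchain where `std::hint::black_box` is available (stable since Rust 1.66) instead of the `libtest` or `criterion` versions linked above, and the function names are made up for this example. `black_box` is documented as a best-effort hint, so treat it as a benchmarking aid, not a guarantee.

```rust
use std::hint::black_box;

/// Sums `0..n`. When `n` is known at compile time, nothing stops the
/// optimizer from folding the whole loop into a constant.
pub fn sum_plain(n: u64) -> u64 {
    (0..n).sum()
}

/// Same computation, but the input and every partial sum pass through
/// `black_box`, which asks the compiler to treat them as opaque values.
pub fn sum_opaque(n: u64) -> u64 {
    let mut total = 0u64;
    for i in 0..black_box(n) {
        total = black_box(total + i);
    }
    total
}
```

Both functions return the same result; the second one is simply harder for the optimizer to delete, which is what you want when timing the loop itself.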
 | 
				
			||||||
							
								
								
									
										35
									
								
								blog/2019-02-09-summary/_article.md
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
									
								
							@ -0,0 +1,35 @@
 | 
				
			|||||||
 | 
					---
 | 
				
			||||||
 | 
					layout: post
 | 
				
			||||||
 | 
					title: "Summary: What are the Allocation Rules?"
 | 
				
			||||||
 | 
					description: "A synopsis and reference."
 | 
				
			||||||
 | 
					category:
 | 
				
			||||||
 | 
					tags: [rust, understanding-allocations]
 | 
				
			||||||
 | 
					---
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					While there's a lot of interesting detail captured in this series, it's often helpful to have a
 | 
				
			||||||
 | 
					document that answers some "yes/no" questions. You may not care about what an `Iterator` looks like
 | 
				
			||||||
 | 
					in assembly; you just need to know whether it allocates an object on the heap or not. And while Rust
 | 
				
			||||||
 | 
					will prioritize the fastest behavior it can, here are the rules for each memory type:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					**Heap Allocation**:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- Smart pointers (`Box`, `Rc`, `Mutex`, etc.) allocate their contents in heap memory.
 | 
				
			||||||
 | 
					- Collections (`HashMap`, `Vec`, `String`, etc.) allocate their contents in heap memory.
 | 
				
			||||||
 | 
					- Some smart pointers in the standard library have counterparts in other crates that don't need heap
 | 
				
			||||||
 | 
					  memory. If possible, use those.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					**Stack Allocation**:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- Everything not using a smart pointer will be allocated on the stack.
 | 
				
			||||||
 | 
					- Structs, enums, iterators, arrays, and closures are all stack allocated.
 | 
				
			||||||
 | 
					- Cell types (`RefCell`) behave like smart pointers, but are stack-allocated.
 | 
				
			||||||
 | 
					- Inlining (`#[inline]`) will not affect allocation behavior for better or worse.
 | 
				
			||||||
 | 
					- Types that are marked `Copy` are guaranteed to have their contents stack-allocated.
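
One quick way to sanity-check the stack rules above is to look at type sizes. The sketch below uses a hypothetical 256-byte `Payload` struct: a `RefCell` stores its contents inline (plus a borrow flag), while a `Box` is only a pointer to contents stored elsewhere.

```rust
use std::cell::RefCell;
use std::mem::size_of;

/// Hypothetical 256-byte payload used only for this illustration.
pub struct Payload {
    _a: [u64; 32],
}

/// Returns the sizes of `Payload`, `RefCell<Payload>`, and `Box<Payload>`.
pub fn inline_sizes() -> (usize, usize, usize) {
    (
        size_of::<Payload>(),          // the raw payload
        size_of::<RefCell<Payload>>(), // payload stored inline, plus a borrow flag
        size_of::<Box<Payload>>(),     // just a pointer; the payload lives on the heap
    )
}
```

The `RefCell` is larger than the payload it wraps, while the `Box` stays pointer-sized no matter how big the payload gets.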
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					**Global Allocation**:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- `const` is a fixed value; the compiler is allowed to copy it wherever useful.
 | 
				
			||||||
 | 
					- `static` is a fixed reference; the compiler will guarantee it is unique.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					 --
 | 
				
			||||||
 | 
					[Raph Levien](https://docs.google.com/presentation/d/1q-c7UAyrUlM-eZyTo1pd8SZ0qwA_wYxmPZVOQkoDmH4/edit?usp=sharing)
 | 
				
			||||||
							
								
								
									
										1
									
								
								blog/2019-02-09-summary/container-size.svg
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
									
								
							
										
											
												File diff suppressed because one or more lines are too long
											
										
									
								
							| 
		 After Width: | Height: | Size: 426 KiB  | 
							
								
								
									
										39
									
								
								blog/2019-02-09-summary/index.mdx
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
									
								
							@ -0,0 +1,39 @@
 | 
				
			|||||||
 | 
					---
 | 
				
			||||||
 | 
					slug: 2019/02/summary
 | 
				
			||||||
 | 
					title: "Allocations in Rust: Summary"
 | 
				
			||||||
 | 
					date: 2019-02-09 12:00:00
 | 
				
			||||||
 | 
					authors: [bspeice]
 | 
				
			||||||
 | 
					tags: []
 | 
				
			||||||
 | 
					---
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					While there's a lot of interesting detail captured in this series, it's often helpful to have a
 | 
				
			||||||
 | 
					document that answers some "yes/no" questions. You may not care about what an `Iterator` looks like
 | 
				
			||||||
 | 
					in assembly; you just need to know whether it allocates an object on the heap or not. And while Rust
 | 
				
			||||||
 | 
					will prioritize the fastest behavior it can, here are the rules for each memory type:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					<!-- truncate -->
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					**Global Allocation**:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- `const` is a fixed value; the compiler is allowed to copy it wherever useful.
 | 
				
			||||||
 | 
					- `static` is a fixed reference; the compiler will guarantee it is unique.
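
A small sketch of the `const`/`static` distinction, with names invented for the example: a `const` behaves like a value that is pasted in at each use site, while a `static` is one location in memory with one address.

```rust
/// A `const` is a value; each use may get its own copy.
pub const LIMIT: u32 = 24;

/// A `static` is a single memory location with a stable address.
pub static START: u32 = 24;

/// Every call returns the address of the same `START` item.
pub fn static_address() -> *const u32 {
    &START as *const u32
}

/// Returns a copy of the constant's value.
pub fn limit_copy() -> u32 {
    LIMIT
}
```

Comparing two results of `static_address()` always yields the same pointer, whereas taking the address of a `const` in two places gives no such guarantee.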
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					**Stack Allocation**:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- Everything not using a smart pointer will be allocated on the stack.
 | 
				
			||||||
 | 
					- Structs, enums, iterators, arrays, and closures are all stack allocated.
 | 
				
			||||||
 | 
					- Cell types (`RefCell`) behave like smart pointers, but are stack-allocated.
 | 
				
			||||||
 | 
					- Inlining (`#[inline]`) will not affect allocation behavior for better or worse.
 | 
				
			||||||
 | 
					- Types that are marked `Copy` are guaranteed to have their contents stack-allocated.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					**Heap Allocation**:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					- Smart pointers (`Box`, `Rc`, `Mutex`, etc.) allocate their contents in heap memory.
 | 
				
			||||||
 | 
					- Collections (`HashMap`, `Vec`, `String`, etc.) allocate their contents in heap memory.
 | 
				
			||||||
 | 
					- Some smart pointers in the standard library have counterparts in other crates that don't need heap
 | 
				
			||||||
 | 
					  memory. If possible, use those.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					-- [Raph Levien](https://docs.google.com/presentation/d/1q-c7UAyrUlM-eZyTo1pd8SZ0qwA_wYxmPZVOQkoDmH4/edit?usp=sharing)
 | 
				
			||||||