mirror of
				https://github.com/bspeice/speice.io
				synced 2025-10-31 09:30:32 -04:00 
			
		
		
		
	Second to final draft
This commit is contained in:
		| @ -6,42 +6,42 @@ category: | |||||||
| tags: [] | tags: [] | ||||||
| --- | --- | ||||||
|  |  | ||||||
| One of my first conversations about programming went like this: | One of my earliest conversations about programming went like this: | ||||||
|  |  | ||||||
| > Programmers have it too easy these days. They should learn to develop | > Programmers have it too easy these days. They should learn to develop | ||||||
| > in low memory environments and be efficient. | > in low memory environments and be more efficient. | ||||||
| > | > | ||||||
| > -- My Father (paraphrased) | > -- My Father (paraphrased) | ||||||
|  |  | ||||||
| Though it's not like the first code I wrote was for a | ...though it's not like the first code I wrote was for a | ||||||
| [graphing calculator](https://education.ti.com/en/products/calculators/graphing-calculators/ti-84-plus-se), | [graphing calculator](https://education.ti.com/en/products/calculators/graphing-calculators/ti-84-plus-se) | ||||||
| packing a whole 24KB of RAM. By the way, *what are you doing on my lawn?* | packing a whole 24KB of RAM. By the way, *what are you doing on my lawn?* | ||||||
|  |  | ||||||
| The principle remains though: be efficient with the resources you're given, because | The principle remains though: be efficient with the resources you're given, because | ||||||
| [what Intel giveth, Microsoft taketh away](http://exo-blog.blogspot.com/2007/09/what-intel-giveth-microsoft-taketh-away.html). | [what Intel giveth, Microsoft taketh away](http://exo-blog.blogspot.com/2007/09/what-intel-giveth-microsoft-taketh-away.html). | ||||||
| My professional work has been focused on this kind of efficiency; low-latency financial markets demand that | My professional work is focused on this kind of efficiency; low-latency financial markets demand that | ||||||
| you understand at a deep level *exactly* what your code is doing. As I've been experimenting with Rust for | you understand at a deep level *exactly* what your code is doing. As I continue experimenting with Rust for | ||||||
| personal projects, it's exciting to bring that mindset with me. There's flexibility for the times where I'd rather | personal projects, it's exciting to bring a utilitarian mindset with me: there's flexibility for the times I pretend | ||||||
| have a garbage collector, and flexibility for the times that I really care about efficiency. | to have a garbage collector, and flexibility for the times that I really care about efficiency. | ||||||
|  |  | ||||||
| This post is a (small) case study in how I went from the former to the latter. And it's an attempt to prove how easy | This post is a (small) case study in how I went from the former to the latter. And it's an attempt to prove how easy | ||||||
| it is for you to do the same. | it is for you to do the same. | ||||||
|  |  | ||||||
| # The Starting Line | # Curiosity | ||||||
|  |  | ||||||
| When I first started building the [dtparse] crate, my intention was to mirror as closely as possible  | When I first started building the [dtparse] crate, my intention was to mirror as closely as possible  | ||||||
| the equivalent [Python library][dateutil]. Python, as you may know, is garbage collected. Very rarely is memory | the equivalent [Python library][dateutil]. Python, as you may know, is garbage collected. Very rarely is memory | ||||||
| usage considered in Python, and I likewise wasn't paying too much attention when `dtparse` was first being built. | usage considered in Python, and I likewise wasn't paying too much attention when `dtparse` was first being built. | ||||||
|  |  | ||||||
| That works out well enough, and I'm not planning on making that crate hyper-efficient. | That works out well enough, and I'm not planning on making that `dtparse` hyper-efficient. | ||||||
| But every so often I've wondered: "what exactly is going on in memory?" With the advent of Rust 1.28 and the | But every so often, I've wondered: "what exactly is going on in memory?" With the advent of Rust 1.28 and the | ||||||
| [Global Allocator trait](https://doc.rust-lang.org/std/alloc/trait.GlobalAlloc.html), I had a really great idea: | [Global Allocator trait](https://doc.rust-lang.org/std/alloc/trait.GlobalAlloc.html), I had a really great idea: | ||||||
| *build a custom allocator that allows you to track your own allocations.* That way, you can do things like | *build a custom allocator that allows you to track your own allocations.* That way, you can do things like | ||||||
| writing tests for both correct results and correct memory usage. I gave it a [shot][qadapt], but learned | writing tests for both correct results and correct memory usage. I gave it a [shot][qadapt], but learned | ||||||
| very quickly: **never write your own allocator**. It went from "fun weekend project" into | very quickly: **never write your own allocator**. It went from "fun weekend project" to | ||||||
| "I have literally no idea what my computer is doing" at breakneck speed. | "I have literally no idea what my computer is doing" at breakneck speed. | ||||||
|  |  | ||||||
| Instead, let's highlight another (easier) way you can make sense of your memory usage: [heaptrack] | Instead, I'll highlight a separate path I took to make sense of my memory usage: [heaptrack]. | ||||||
|  |  | ||||||
| # Turning on the System Allocator | # Turning on the System Allocator | ||||||
|  |  | ||||||
| @ -59,12 +59,12 @@ use std::alloc::System; | |||||||
| static GLOBAL: System = System; | static GLOBAL: System = System; | ||||||
| ``` | ``` | ||||||
|  |  | ||||||
| Or look [here](https://blog.rust-lang.org/2018/08/02/Rust-1.28.html) for another example. | ...and that's it. Everything else comes essentially for free. | ||||||
|  |  | ||||||
| # Running heaptrack | # Running heaptrack | ||||||
|  |  | ||||||
| Assuming you've installed heaptrack <span style="font-size: .6em;">(Homebrew in Mac, package manager in Linux, ??? in Windows)</span>, | Assuming you've installed heaptrack <span style="font-size: .6em;">(Homebrew in Mac, package manager in Linux, ??? in Windows)</span>, | ||||||
| all that's left is to fire it up: | all that's left is to fire up your application: | ||||||
|  |  | ||||||
| ``` | ``` | ||||||
| heaptrack my_application | heaptrack my_application | ||||||
| @ -84,14 +84,10 @@ And even these pretty colors: | |||||||
|  |  | ||||||
| # Reading Flamegraphs | # Reading Flamegraphs | ||||||
|  |  | ||||||
| We're going to focus on the heap ["flamegraph"](http://www.brendangregg.com/flamegraphs.html), | To make sense of our memory usage, we're going to focus on that last picture - it's called | ||||||
| which is the last picture I showed above. Normally these charts are used to show how much time | a ["flamegraph"](http://www.brendangregg.com/flamegraphs.html). These charts are typically | ||||||
| you spend executing different functions, but the focus for now is to show how much memory | used to show how much time you spend executing different functions, but they're used here | ||||||
| was allocated during those functions. | to show how much memory was allocated during those functions. | ||||||
|  |  | ||||||
| As a quick introduction to reading flamegraphs, the idea is this: |  | ||||||
| The width of the bar is how much memory was allocated by that function, and all functions |  | ||||||
| that it calls. |  | ||||||
|  |  | ||||||
| For example, we can see that all executions happened during the `main` function: | For example, we can see that all executions happened during the `main` function: | ||||||
|  |  | ||||||
| @ -101,7 +97,7 @@ For example, we can see that all executions happened during the `main` function: | |||||||
|  |  | ||||||
|  |  | ||||||
|  |  | ||||||
| ...and within *that*, allocations happened in two main places: | ...and within *that*, allocations happened in two different places: | ||||||
|  |  | ||||||
|  |  | ||||||
|  |  | ||||||
| @ -112,7 +108,7 @@ as an issue: **what the heck is the `Default` thing doing?** | |||||||
|  |  | ||||||
| # Optimizing dtparse | # Optimizing dtparse | ||||||
|  |  | ||||||
| See, I knew that there were some allocations that happen during the `dtparse::parse` method, | See, I knew that there were some allocations during calls to `dtparse::parse`, | ||||||
| but I was totally wrong about where the bulk of allocations occurred in my program. | but I was totally wrong about where the bulk of allocations occurred in my program. | ||||||
| Let me post the code and see if you can spot the mistake: | Let me post the code and see if you can spot the mistake: | ||||||
|  |  | ||||||
| @ -132,14 +128,13 @@ pub fn parse(timestr: &str) -> ParseResult<(NaiveDateTime, Option<FixedOffset>)> | |||||||
|  |  | ||||||
| --- | --- | ||||||
|  |  | ||||||
| The issue is that I keep on creating a new `Parser` every time you call the `parse()` function! | Because `Parser::parse` requires a mutable reference to itself, I have to create a new parser | ||||||
|  | every time it receives a string. This seems excessive! We'd rather have an immutable parser | ||||||
| Now this is a bit excessive, but was necessary at the time because `Parser.parse()` used `&mut self`. | that can be re-used, and avoid needing to allocate memory in the first place. | ||||||
| In order to properly parse a string, the parser itself required mutable state. |  | ||||||
|  |  | ||||||
| Armed with that information, I put some time in to | Armed with that information, I put some time in to | ||||||
| [make the parser immutable](https://github.com/bspeice/dtparse/commit/741afa34517d6bc1155713bbc5d66905fea13fad#diff-b4aea3e418ccdb71239b96952d9cddb6). | [make the parser immutable](https://github.com/bspeice/dtparse/commit/741afa34517d6bc1155713bbc5d66905fea13fad#diff-b4aea3e418ccdb71239b96952d9cddb6). | ||||||
| Now I can re-use the same parser over and over! And would you believe it? No more allocations of default parsers: | Now that I can re-use the same parser over and over, the allocations disappear: | ||||||
|  |  | ||||||
|  |  | ||||||
|  |  | ||||||
| @ -153,12 +148,12 @@ All the way down to 300KB: | |||||||
|  |  | ||||||
| # Conclusion | # Conclusion | ||||||
|  |  | ||||||
| In the end, you don't need to write a custom allocator to test memory performance. Rather, there are some | In the end, you don't need to write a custom allocator to be efficient with memory, great tools | ||||||
| great tools that already exist you can put to work! | already exist to help you understand what your program is doing. | ||||||
|  |  | ||||||
| **Use them.** | **Use them.** | ||||||
|  |  | ||||||
| Now that [Moore's Law](https://en.wikipedia.org/wiki/Moore%27s_law) | Given that [Moore's Law](https://en.wikipedia.org/wiki/Moore%27s_law) | ||||||
| is [dead](https://www.technologyreview.com/s/601441/moores-law-is-dead-now-what/), we've all got to | is [dead](https://www.technologyreview.com/s/601441/moores-law-is-dead-now-what/), we've all got to | ||||||
| do our part to take back what Microsoft stole. | do our part to take back what Microsoft stole. | ||||||
|  |  | ||||||
|  | |||||||
		Reference in New Issue
	
	Block a user