speice.io/_posts/2018-10-08-case-study-optimization.md

165 lines
6.9 KiB
Markdown
Raw Normal View History

2018-10-04 22:20:31 -04:00
---
layout: post
title: "A Case Study in Heaptrack"
2018-10-07 22:01:30 -04:00
description: "...because you don't need no garbage collection"
2020-06-29 15:51:23 -04:00
category:
2018-10-04 22:20:31 -04:00
tags: []
---
2018-10-06 17:53:14 -04:00
One of my earliest conversations about programming went like this:
2018-10-04 22:20:31 -04:00
> Programmers have it too easy these days. They should learn to develop
2018-10-06 17:53:14 -04:00
> in low memory environments and be more efficient.
2018-10-04 22:20:31 -04:00
>
> -- My Father (paraphrased)
2018-10-06 17:53:14 -04:00
...though it's not like the first code I wrote was for a
[graphing calculator](https://education.ti.com/en/products/calculators/graphing-calculators/ti-84-plus-se)
2020-06-29 15:51:23 -04:00
packing a whole 24KB of RAM. By the way, _what are you doing on my lawn?_
2018-10-04 22:20:31 -04:00
2018-10-07 22:01:30 -04:00
The principle remains though: be efficient with the resources you have, because
2018-10-04 22:20:31 -04:00
[what Intel giveth, Microsoft taketh away](http://exo-blog.blogspot.com/2007/09/what-intel-giveth-microsoft-taketh-away.html).
2018-10-06 17:53:14 -04:00
My professional work is focused on this kind of efficiency; low-latency financial markets demand that
2020-06-29 15:51:23 -04:00
you understand at a deep level _exactly_ what your code is doing. As I continue experimenting with Rust for
2018-10-06 17:53:14 -04:00
personal projects, it's exciting to bring a utilitarian mindset with me: there's flexibility for the times I pretend
2018-10-07 22:01:30 -04:00
to have a garbage collector, and flexibility for the times that I really care about how memory is used.
2018-10-04 22:20:31 -04:00
2018-10-07 22:01:30 -04:00
This post is a (small) case study in how I went from the former to the latter. And ultimately, it's intended
to be a starting toolkit to empower analysis of your own code.
2018-10-04 22:20:31 -04:00
2018-10-06 17:53:14 -04:00
# Curiosity
2018-10-04 22:20:31 -04:00
2020-06-29 15:51:23 -04:00
When I first started building the [dtparse] crate, my intention was to mirror as closely as possible
2018-10-04 22:20:31 -04:00
the equivalent [Python library][dateutil]. Python, as you may know, is garbage collected. Very rarely is memory
2018-10-05 23:28:00 -04:00
usage considered in Python, and I likewise wasn't paying too much attention when `dtparse` was first being built.
2018-10-04 22:20:31 -04:00
2018-10-07 22:01:30 -04:00
This lackadaisical approach to memory works well enough, and I'm not planning on making `dtparse` hyper-efficient.
2018-10-06 17:53:14 -04:00
But every so often, I've wondered: "what exactly is going on in memory?" With the advent of Rust 1.28 and the
2018-10-04 22:20:31 -04:00
[Global Allocator trait](https://doc.rust-lang.org/std/alloc/trait.GlobalAlloc.html), I had a really great idea:
2020-06-29 15:51:23 -04:00
_build a custom allocator that allows you to track your own allocations._ That way, you can do things like
2018-10-04 22:20:31 -04:00
writing tests for both correct results and correct memory usage. I gave it a [shot][qadapt], but learned
2018-10-06 17:53:14 -04:00
very quickly: **never write your own allocator**. It went from "fun weekend project" to
2018-10-05 23:28:00 -04:00
"I have literally no idea what my computer is doing" at breakneck speed.
2018-10-04 22:20:31 -04:00
2018-10-06 17:53:14 -04:00
Instead, I'll highlight a separate path I took to make sense of my memory usage: [heaptrack].
2018-10-04 22:20:31 -04:00
# Turning on the System Allocator
This is the hardest part of the post. Because Rust uses
[its own allocator](https://github.com/rust-lang/rust/pull/27400#issue-41256384) by default,
2018-10-07 22:01:30 -04:00
`heaptrack` is unable to properly record unmodified Rust code. To remedy this, we'll make use
of the `#[global_allocator]` attribute.
2018-10-04 22:20:31 -04:00
2018-10-07 22:01:30 -04:00
Specifically, in `lib.rs` or `main.rs`, add this:
2018-10-04 22:20:31 -04:00
```rust
use std::alloc::System;
#[global_allocator]
static GLOBAL: System = System;
```
2018-10-06 17:53:14 -04:00
...and that's it. Everything else comes essentially for free.
2018-10-04 22:20:31 -04:00
# Running heaptrack
Assuming you've installed heaptrack <span style="font-size: .6em;">(Homebrew in Mac, package manager in Linux, ??? in Windows)</span>,
2018-10-06 17:53:14 -04:00
all that's left is to fire up your application:
2018-10-04 22:20:31 -04:00
```
heaptrack my_application
```
It's that easy. After the program finishes, you'll see a file in your local directory with a name
like `heaptrack.my_appplication.XXXX.gz`. If you load that up in `heaptrack_gui`, you'll see
something like this:
![heaptrack](/assets/images/2018-10-heaptrack/heaptrack-before.png)
---
And even these pretty colors:
![pretty colors](/assets/images/2018-10-heaptrack/heaptrack-flamegraph.png)
# Reading Flamegraphs
2018-10-06 17:53:14 -04:00
To make sense of our memory usage, we're going to focus on that last picture - it's called
a ["flamegraph"](http://www.brendangregg.com/flamegraphs.html). These charts are typically
2018-10-07 22:01:30 -04:00
used to show how much time your program spends executing each function, but they're used here
to show how much memory was allocated during those functions instead.
2018-10-04 22:20:31 -04:00
For example, we can see that all executions happened during the `main` function:
![allocations in main](/assets/images/2018-10-heaptrack/heaptrack-main-colorized.png)
...and within that, all allocations happened during `dtparse::parse`:
![allocations in dtparse](/assets/images/2018-10-heaptrack/heaptrack-dtparse-colorized.png)
2020-06-29 15:51:23 -04:00
...and within _that_, allocations happened in two different places:
2018-10-04 22:20:31 -04:00
![allocations in parseinfo](/assets/images/2018-10-heaptrack/heaptrack-parseinfo-colorized.png)
Now I apologize that it's hard to see, but there's one area specifically that stuck out
as an issue: **what the heck is the `Default` thing doing?**
![pretty colors](/assets/images/2018-10-heaptrack/heaptrack-flamegraph-default.png)
# Optimizing dtparse
2018-10-06 17:53:14 -04:00
See, I knew that there were some allocations during calls to `dtparse::parse`,
2018-10-04 22:20:31 -04:00
but I was totally wrong about where the bulk of allocations occurred in my program.
Let me post the code and see if you can spot the mistake:
```rust
/// Main entry point for using `dtparse`.
pub fn parse(timestr: &str) -> ParseResult<(NaiveDateTime, Option<FixedOffset>)> {
let res = Parser::default().parse(
timestr, None, None, false, false,
None, false,
&HashMap::new(),
)?;
Ok((res.0, res.1))
}
```
2020-06-29 15:51:23 -04:00
2018-10-04 22:20:31 -04:00
> [dtparse](https://github.com/bspeice/dtparse/blob/4d7c5dd99572823fa4a390b483c38ab020a2172f/src/lib.rs#L1286)
---
2018-10-07 22:01:30 -04:00
Because `Parser::parse` requires a mutable reference to itself, I have to create a new `Parser::default`
every time it receives a string. This is excessive! We'd rather have an immutable parser
that can be re-used, and avoid allocating memory in the first place.
2018-10-04 22:20:31 -04:00
2018-10-05 23:28:00 -04:00
Armed with that information, I put some time in to
[make the parser immutable](https://github.com/bspeice/dtparse/commit/741afa34517d6bc1155713bbc5d66905fea13fad#diff-b4aea3e418ccdb71239b96952d9cddb6).
2018-10-06 17:53:14 -04:00
Now that I can re-use the same parser over and over, the allocations disappear:
2018-10-04 22:20:31 -04:00
![allocations cleaned up](/assets/images/2018-10-heaptrack/heaptrack-flamegraph-after.png)
2018-10-07 22:01:30 -04:00
In total, we went from requiring 2 MB of memory in [version 1.0.2](https://crates.io/crates/dtparse/1.0.2):
2018-10-04 22:20:31 -04:00
![memory before](/assets/images/2018-10-heaptrack/heaptrack-closeup.png)
2018-10-07 22:01:30 -04:00
All the way down to 300KB in [version 1.0.3](https://crates.io/crates/dtparse/1.0.3):
2018-10-04 22:20:31 -04:00
![memory after](/assets/images/2018-10-heaptrack/heaptrack-closeup-after.png)
# Conclusion
2018-10-06 17:53:14 -04:00
In the end, you don't need to write a custom allocator to be efficient with memory, great tools
already exist to help you understand what your program is doing.
2018-10-04 22:20:31 -04:00
**Use them.**
2018-10-06 17:53:14 -04:00
Given that [Moore's Law](https://en.wikipedia.org/wiki/Moore%27s_law)
2018-10-04 22:20:31 -04:00
is [dead](https://www.technologyreview.com/s/601441/moores-law-is-dead-now-what/), we've all got to
do our part to take back what Microsoft stole.
[dtparse]: https://crates.io/crates/dtparse
[dateutil]: https://github.com/dateutil/dateutil
[heaptrack]: https://github.com/KDE/heaptrack
[qadapt]: https://crates.io/crates/qadapt