<!doctype html><html lang=en dir=ltr class="blog-wrapper blog-post-page plugin-blog plugin-id-default" data-has-hydrated=false><meta charset=UTF-8><meta name=generator content="Docusaurus v3.7.0"><title data-rh=true>A case study in heaptrack | The Old Speice Guy</title><meta data-rh=true name=viewport content="width=device-width, initial-scale=1.0"><meta data-rh=true name=twitter:card content=summary_large_image><meta data-rh=true property=og:url content=https://speice.io/2018/10/case-study-optimization/><meta data-rh=true property=og:locale content=en><meta data-rh=true name=docusaurus_locale content=en><meta data-rh=true name=docusaurus_tag content=default><meta data-rh=true name=docsearch:language content=en><meta data-rh=true name=docsearch:docusaurus_tag content=default><meta data-rh=true property=og:title content="A case study in heaptrack | The Old Speice Guy"><meta data-rh=true name=description content="I remember early in my career someone joking that:"><meta data-rh=true property=og:description content="I remember early in my career someone joking that:"><meta data-rh=true property=og:type content=article><meta data-rh=true property=article:published_time content=2018-10-08T12:00:00.000Z><link data-rh=true rel=icon href=/img/favicon.ico><link data-rh=true rel=canonical href=https://speice.io/2018/10/case-study-optimization/><link data-rh=true rel=alternate href=https://speice.io/2018/10/case-study-optimization/ hreflang=en><link data-rh=true rel=alternate href=https://speice.io/2018/10/case-study-optimization/ hreflang=x-default><script data-rh=true type=application/ld+json>{"@context":"https://schema.org","@id":"https://speice.io/2018/10/case-study-optimization","@type":"BlogPosting","author":{"@type":"Person","name":"Bradlee Speice"},"dateModified":"2024-11-09T22:02:02.000Z","datePublished":"2018-10-08T12:00:00.000Z","description":"I remember early in my career someone joking that:","headline":"A case study in 
heaptrack","isPartOf":{"@id":"https://speice.io/","@type":"Blog","name":"Blog"},"keywords":[],"mainEntityOfPage":"https://speice.io/2018/10/case-study-optimization","name":"A case study in heaptrack","url":"https://speice.io/2018/10/case-study-optimization"}</script><link rel=alternate type=application/rss+xml href=/rss.xml title="The Old Speice Guy RSS Feed"><link rel=alternate type=application/atom+xml href=/atom.xml title="The Old Speice Guy Atom Feed"><link rel=stylesheet href=/katex/katex.min.css type=text/css><link rel=stylesheet href=/assets/css/styles.24ac2c37.css><script src=/assets/js/runtime~main.75ada3c5.js defer></script><script src=/assets/js/main.d0bb06d2.js defer></script><body class=navigation-with-keyboard><script>!function(){var t,e=function(){try{return new URLSearchParams(window.location.search).get("docusaurus-theme")}catch(t){}}()||function(){try{return window.localStorage.getItem("theme")}catch(t){}}();t=null!==e?e:"light",document.documentElement.setAttribute("data-theme",t)}(),function(){try{for(var[t,e]of new URLSearchParams(window.location.search).entries())if(t.startsWith("docusaurus-data-")){var a=t.replace("docusaurus-data-","data-");document.documentElement.setAttribute(a,e)}}catch(t){}}()</script><div id=__docusaurus><div role=region aria-label="Skip to main content"><a class=skipToContent_fXgn href=#__docusaurus_skipToContent_fallback>Skip to main content</a></div><nav aria-label=Main class="navbar navbar--fixed-top"><div class=navbar__inner><div class=navbar__items><button aria-label="Toggle navigation bar" aria-expanded=false class="navbar__toggle clean-btn" type=button><svg width=30 height=30 viewBox="0 0 30 30" aria-hidden=true><path stroke=currentColor stroke-linecap=round stroke-miterlimit=10 stroke-width=2 d="M4 7h22M4 15h22M4 23h22"/></svg></button><a class=navbar__brand href=/><div class=navbar__logo><img src=/img/logo.svg alt="Sierpinski Gasket" class="themedComponent_mlkZ themedComponent--light_NVdE"><img 
src=/img/logo-dark.svg alt="Sierpinski Gasket" class="themedComponent_mlkZ themedComponent--dark_xIcU"></div><b class="navbar__title text--truncate">The Ol
<blockquote>
<p>Programmers have it too easy these days. They should learn to develop in low memory environments
and be more efficient.</p>
</blockquote>
<p>...though it's not like the first code I wrote was for a
<a href=https://web.archive.org/web/20180924060530/https://education.ti.com/en/products/calculators/graphing-calculators/ti-84-plus-se target=_blank rel="noopener noreferrer">graphing calculator</a>
packing a whole 24 KB of RAM.</p>
<p>But the principle remains: be efficient with the resources you have, because
<a href=http://exo-blog.blogspot.com/2007/09/what-intel-giveth-microsoft-taketh-away.html target=_blank rel="noopener noreferrer">what Intel giveth, Microsoft taketh away</a>.</p>
<p>My professional work is focused on this kind of efficiency; low-latency financial markets demand
that you understand at a deep level <em>exactly</em> what your code is doing. As I continue experimenting
with Rust for personal projects, it's exciting to bring a utilitarian mindset with me: there's
flexibility for the times I pretend to have a garbage collector, and flexibility for the times that
I really care about how memory is used.</p>
<p>This post is a (small) case study in how I went from the former to the latter. And ultimately, it's
intended to be a starting toolkit to empower analysis of your own code.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id=curiosity>Curiosity<a href=#curiosity class=hash-link aria-label="Direct link to Curiosity" title="Direct link to Curiosity"></a></h2>
<p>When I first started building the <a href=https://crates.io/crates/dtparse target=_blank rel="noopener noreferrer">dtparse</a> crate, my intention was to mirror as closely as possible
the equivalent <a href=https://github.com/dateutil/dateutil target=_blank rel="noopener noreferrer">Python library</a>. Python, as you may know, is garbage collected. Very
rarely is memory usage considered in Python, and I likewise wasn't paying too much attention when
<code>dtparse</code> was first being built.</p>
<p>This lackadaisical approach to memory works well enough, and I'm not planning on making <code>dtparse</code>
hyper-efficient. But every so often, I've wondered: "what exactly is going on in memory?" With the
advent of Rust 1.28 and the
<a href=https://doc.rust-lang.org/std/alloc/trait.GlobalAlloc.html target=_blank rel="noopener noreferrer">Global Allocator trait</a>, I had a really
great idea: <em>build a custom allocator that allows you to track your own allocations.</em> That way, you
can do things like writing tests for both correct results and correct memory usage. I gave it a
<a href=https://crates.io/crates/qadapt target=_blank rel="noopener noreferrer">shot</a>, but learned very quickly: <strong>never write your own allocator</strong>. It went from "fun
weekend project" to "I have literally no idea what my computer is doing" at breakneck speed.</p>
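<p>For a sense of what the idea looks like in its most basic form, here is a minimal sketch. It is not how <code>qadapt</code> works; the names <code>CountingAlloc</code>, <code>ALLOC_CALLS</code>, <code>ALLOC_BYTES</code> and <code>count_allocations</code> are made up for illustration. It simply forwards every request to the system allocator and counts along the way, and it assumes nothing else is allocating on another thread while you measure:</p>

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper (not qadapt's implementation): forwards every request
/// to the system allocator and counts calls and bytes along the way.
pub struct CountingAlloc;

pub static ALLOC_CALLS: AtomicUsize = AtomicUsize::new(0);
pub static ALLOC_BYTES: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOC_CALLS.fetch_add(1, Ordering::Relaxed);
        ALLOC_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
        // The counters are plain atomics, so nothing here allocates recursively.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `f` and returns how many allocation calls happened while it ran.
/// Only meaningful when no other thread allocates concurrently.
pub fn count_allocations<F: FnOnce()>(f: F) -> usize {
    let before = ALLOC_CALLS.load(Ordering::Relaxed);
    f();
    ALLOC_CALLS.load(Ordering::Relaxed) - before
}
```

<p>A test could then wrap a call in <code>count_allocations</code> and assert on the result. The hard parts (threads, reentrancy, reporting where an allocation came from) are exactly what this sketch leaves out.</p>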
<p>Instead, I'll highlight a separate path I took to make sense of my memory usage: <a href=https://github.com/KDE/heaptrack target=_blank rel="noopener noreferrer">heaptrack</a>.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id=turning-on-the-system-allocator>Turning on the System Allocator<a href=#turning-on-the-system-allocator class=hash-link aria-label="Direct link to Turning on the System Allocator" title="Direct link to Turning on the System Allocator"></a></h2>
<p>This is the hardest part of the post. Because Rust versions before 1.32 use
<a href=https://github.com/rust-lang/rust/pull/27400#issue-41256384 target=_blank rel="noopener noreferrer">their own allocator</a> (jemalloc) by default,
<code>heaptrack</code> is unable to properly record unmodified Rust code built with them. To remedy this, we'll make use of the
<code>#[global_allocator]</code> attribute to select the system allocator instead.</p>
<p>Specifically, in <code>lib.rs</code> or <code>main.rs</code>, add this:</p>
<div class="language-rust codeBlockContainer_Ckt0 theme-code-block" style="--prism-background-color:hsl(230, 1%, 98%);--prism-color:hsl(230, 8%, 24%)"><div class=codeBlockContent_biex><pre tabindex=0 class="prism-code language-rust codeBlock_bY9V thin-scrollbar" style="background-color:hsl(230, 1%, 98%);color:hsl(230, 8%, 24%)"><code class=codeBlockLines_e6Vv><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token keyword" style="color:hsl(301, 63%, 40%)">use</span><span class="token plain"> </span><span class="token namespace">std</span><span class="token namespace punctuation" style="color:hsl(119, 34%, 47%)">::</span><span class="token namespace">alloc</span><span class="token namespace punctuation" style="color:hsl(119, 34%, 47%)">::</span><span class="token class-name" style="color:hsl(35, 99%, 36%)">System</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">;</span><span class="token plain"></span><br></span><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token plain" style=display:inline-block></span><br></span><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token plain"></span><span class="token attribute attr-name" style="color:hsl(35, 99%, 36%)">#[global_allocator]</span><span class="token plain"></span><br></span><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token plain"></span><span class="token keyword" style="color:hsl(301, 63%, 40%)">static</span><span class="token plain"> </span><span class="token constant" style="color:hsl(35, 99%, 36%)">GLOBAL</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">:</span><span class="token plain"> </span><span class="token class-name" style="color:hsl(35, 99%, 36%)">System</span><span class="token plain"> </span><span class="token operator" style="color:hsl(221, 87%, 60%)">=</span><span class="token plain"> </span><span class="token class-name" style="color:hsl(35, 99%, 36%)">System</span><span 
class="token punctuation" style="color:hsl(119, 34%, 47%)">;</span><br></span></code></pre><div class=buttonGroup__atx><button type=button aria-label="Copy code to clipboard" title=Copy class=clean-btn><span class=copyButtonIcons_eSgA aria-hidden=true><svg viewBox="0 0 24 24" class=copyButtonIcon_y97N><path fill=currentColor d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"/></svg><svg viewBox="0 0 24 24" class=copyButtonSuccessIcon_LjdS><path fill=currentColor d=M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z /></svg></span></button></div></div></div>
<p>...and that's it. Everything else comes essentially for free.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id=running-heaptrack>Running heaptrack<a href=#running-heaptrack class=hash-link aria-label="Direct link to Running heaptrack" title="Direct link to Running heaptrack"></a></h2>
<p>Assuming you've installed heaptrack <small>(Homebrew on Mac, your package manager
on Linux, ??? on Windows)</small>, all that's left is to fire up your application:</p>
<div class="codeBlockContainer_Ckt0 theme-code-block" style="--prism-background-color:hsl(230, 1%, 98%);--prism-color:hsl(230, 8%, 24%)"><div class=codeBlockContent_biex><pre tabindex=0 class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="background-color:hsl(230, 1%, 98%);color:hsl(230, 8%, 24%)"><code class=codeBlockLines_e6Vv><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token plain">heaptrack my_application</span><br></span></code></pre><div class=buttonGroup__atx><button type=button aria-label="Copy code to clipboard" title=Copy class=clean-btn><span class=copyButtonIcons_eSgA aria-hidden=true><svg viewBox="0 0 24 24" class=copyButtonIcon_y97N><path fill=currentColor d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"/></svg><svg viewBox="0 0 24 24" class=copyButtonSuccessIcon_LjdS><path fill=currentColor d=M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z /></svg></span></button></div></div></div>
<p>It's that easy. After the program finishes, you'll see a file in your local directory with a name
like <code>heaptrack.my_application.XXXX.gz</code>. If you load that up in <code>heaptrack_gui</code>, you'll see
something like this:</p>
<p><img decoding=async loading=lazy alt=heaptrack src=/assets/images/heaptrack-before-11fba190f97831448cc539ebb32fa579.png width=1312 height=320 class=img_ev3q></p>
<hr>
<p>And even these pretty colors:</p>
<p><img decoding=async loading=lazy alt="pretty colors" src=/assets/images/heaptrack-flamegraph-5094664fa79faaf2664b38505c15ac1f.png width=1284 height=715 class=img_ev3q></p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id=reading-flamegraphs>Reading Flamegraphs<a href=#reading-flamegraphs class=hash-link aria-label="Direct link to Reading Flamegraphs" title="Direct link to Reading Flamegraphs"></a></h2>
<p>To make sense of our memory usage, we're going to focus on that last picture - it's called a
<a href=http://www.brendangregg.com/flamegraphs.html target=_blank rel="noopener noreferrer">"flamegraph"</a>. These charts are typically used to
show how much time your program spends executing each function, but they're used here to show how
much memory was allocated during those functions instead.</p>
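<p>If it helps to see the idea in code, here is a simplified model of how the boxes in such a chart get their widths. This is not heaptrack's implementation: <code>Sample</code>, <code>frame_totals</code> and <code>example_samples</code> are hypothetical names, and the numbers are made up rather than measured. Every allocation is attributed to its full call stack, and each frame's box is as wide as the total of everything allocated beneath it:</p>

```rust
use std::collections::BTreeMap;

/// One recorded allocation: the call stack (outermost frame first) and its size.
pub struct Sample {
    pub stack: Vec<&'static str>,
    pub bytes: u64,
}

/// Sums bytes for every stack prefix. Each prefix corresponds to one box in
/// the flamegraph, and its total is what determines that box's width.
pub fn frame_totals(samples: &[Sample]) -> BTreeMap<String, u64> {
    let mut totals = BTreeMap::new();
    for sample in samples {
        for depth in 1..=sample.stack.len() {
            let key = sample.stack[..depth].join(";");
            *totals.entry(key).or_insert(0) += sample.bytes;
        }
    }
    totals
}

/// Made-up samples for illustration; these are not measurements of dtparse.
pub fn example_samples() -> Vec<Sample> {
    vec![
        Sample { stack: vec!["main", "parse", "default"], bytes: 700 },
        Sample { stack: vec!["main", "parse", "tokenize"], bytes: 200 },
        Sample { stack: vec!["main", "parse"], bytes: 100 },
    ]
}
```

<p>With those samples, <code>main</code> and <code>main;parse</code> both total 1000 bytes and span the full width, while <code>main;parse;default</code> takes up 700 of them. A wide box near the top of the stack is the thing to go looking for.</p>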
<p>For example, we can see that all allocations happened during the <code>main</code> function:</p>
<p><img decoding=async loading=lazy alt="allocations in main" src=/assets/images/heaptrack-main-colorized-cfe5d7d345d32cfc1a0f297580619718.png width=654 height=343 class=img_ev3q></p>
<p>...and within that, all allocations happened during <code>dtparse::parse</code>:</p>
<p><img decoding=async loading=lazy alt="allocations in dtparse" src=/assets/images/heaptrack-dtparse-colorized-e6caf224f50df2dd56981f5b02970325.png width=654 height=315 class=img_ev3q></p>
<p>...and within <em>that</em>, allocations happened in two different places:</p>
<p><img decoding=async loading=lazy alt="allocations in parseinfo" src=/assets/images/heaptrack-parseinfo-colorized-a1898beaf28a3997ac86810f872539b7.png width=654 height=372 class=img_ev3q></p>
<p>Now I apologize that it's hard to see, but there's one area specifically that stuck out as an issue:
<strong>what the heck is the <code>Default</code> thing doing?</strong></p>
<p><img decoding=async loading=lazy alt="pretty colors" src=/assets/images/heaptrack-flamegraph-default-26cc411d387f58f50cb548f8e81df1a1.png width=1284 height=715 class=img_ev3q></p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id=optimizing-dtparse>Optimizing dtparse<a href=#optimizing-dtparse class=hash-link aria-label="Direct link to Optimizing dtparse" title="Direct link to Optimizing dtparse"></a></h2>
<p>See, I knew that there were some allocations during calls to <code>dtparse::parse</code>, but I was totally
wrong about where the bulk of allocations occurred in my program. Let me post the code and see if
you can spot the mistake:</p>
<div class="language-rust codeBlockContainer_Ckt0 theme-code-block" style="--prism-background-color:hsl(230, 1%, 98%);--prism-color:hsl(230, 8%, 24%)"><div class=codeBlockContent_biex><pre tabindex=0 class="prism-code language-rust codeBlock_bY9V thin-scrollbar" style="background-color:hsl(230, 1%, 98%);color:hsl(230, 8%, 24%)"><code class=codeBlockLines_e6Vv><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token comment" style="color:hsl(230, 4%, 64%)">/// Main entry point for using `dtparse`.</span><span class="token plain"></span><br></span><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token plain"></span><span class="token keyword" style="color:hsl(301, 63%, 40%)">pub</span><span class="token plain"> </span><span class="token keyword" style="color:hsl(301, 63%, 40%)">fn</span><span class="token plain"> </span><span class="token function-definition function" style="color:hsl(221, 87%, 60%)">parse</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">(</span><span class="token plain">timestr</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">:</span><span class="token plain"> </span><span class="token operator" style="color:hsl(221, 87%, 60%)">&</span><span class="token keyword" style="color:hsl(301, 63%, 40%)">str</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">)</span><span class="token plain"> </span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">-></span><span class="token plain"> </span><span class="token class-name" style="color:hsl(35, 99%, 36%)">ParseResult</span><span class="token operator" style="color:hsl(221, 87%, 60%)">&lt;</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">(</span><span class="token class-name" style="color:hsl(35, 99%, 36%)">NaiveDateTime</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">,</span><span class="token plain"> </span><span class="token class-name" 
style="color:hsl(35, 99%, 36%)">Option</span><span class="token operator" style="color:hsl(221, 87%, 60%)">&lt;</span><span class="token class-name" style="color:hsl(35, 99%, 36%)">FixedOffset</span><span class="token operator" style="color:hsl(221, 87%, 60%)">></span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">)</span><span class="token operator" style="color:hsl(221, 87%, 60%)">></span><span class="token plain"> </span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">{</span><span class="token plain"></span><br></span><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token plain"> </span><span class="token keyword" style="color:hsl(301, 63%, 40%)">let</span><span class="token plain"> res </span><span class="token operator" style="color:hsl(221, 87%, 60%)">=</span><span class="token plain"> </span><span class="token class-name" style="color:hsl(35, 99%, 36%)">Parser</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">::</span><span class="token function" style="color:hsl(221, 87%, 60%)">default</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">(</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">)</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">.</span><span class="token function" style="color:hsl(221, 87%, 60%)">parse</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">(</span><span class="token plain"></span><br></span><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token plain"> timestr</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">,</span><span class="token plain"> </span><span class="token class-name" style="color:hsl(35, 99%, 36%)">None</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">,</span><span class="token plain"> </span><span class="token class-name" style="color:hsl(35, 99%, 36%)">None</span><span class="token punctuation" 
style="color:hsl(119, 34%, 47%)">,</span><span class="token plain"> …</span><br></span></code></pre></div></div>
<blockquote>
<p><a href=https://github.com/bspeice/dtparse/blob/4d7c5dd99572823fa4a390b483c38ab020a2172f/src/lib.rs#L1286 target=_blank rel="noopener noreferrer">dtparse</a></p>
</blockquote>
<hr>
<p>Because <code>Parser::parse</code> requires a mutable reference to itself, the top-level
<code>parse</code> function has to build a new parser with <code>Parser::default()</code> every time it receives a
string, and every one of those parsers allocates. This is excessive! We'd rather have an immutable
parser that can be re-used, and avoid allocating memory in the first place.</p>
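<p>The shape of the problem can be sketched with a toy parser. Everything here is hypothetical and heavily simplified: this <code>Parser</code>, its fields, and the helpers <code>parse_rebuilding</code> and <code>parse_all</code> are stand-ins, not dtparse's actual code, and the per-call scratch field is only an assumed reason for needing <code>&amp;mut self</code>:</p>

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a parser with heap-allocated lookup tables.
/// The names and fields are illustrative, not dtparse's actual types.
pub struct Parser {
    months: HashMap<&'static str, u32>,
    /// Scratch state; only the `&mut self` variant needs it on the struct.
    last_token: String,
}

impl Default for Parser {
    fn default() -> Self {
        // Building the table allocates, so every `default()` call costs heap memory.
        let months = HashMap::from([("jan", 1), ("feb", 2), ("mar", 3)]);
        Parser { months, last_token: String::new() }
    }
}

impl Parser {
    /// Stores per-call state on `self`, so it needs `&mut self`.
    pub fn parse_mut(&mut self, token: &str) -> Option<u32> {
        self.last_token = token.to_lowercase();
        self.months.get(self.last_token.as_str()).copied()
    }

    /// Keeps per-call state in a local, so `&self` is enough and one
    /// `Parser` can be shared across calls.
    pub fn parse(&self, token: &str) -> Option<u32> {
        let lowered = token.to_lowercase();
        self.months.get(lowered.as_str()).copied()
    }
}

/// Before: a fresh parser (and a fresh table) for every input.
pub fn parse_rebuilding(token: &str) -> Option<u32> {
    Parser::default().parse_mut(token)
}

/// After: the caller builds one parser and reuses it for every input.
pub fn parse_all(parser: &Parser, tokens: &[&str]) -> Vec<Option<u32>> {
    tokens.iter().map(|t| parser.parse(t)).collect()
}
```

<p>Both variants return the same answers; the difference is that <code>parse_rebuilding</code> pays for the lookup table on every call, while <code>parse_all</code> pays for it once.</p>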
<p>Armed with that information, I put some time in to
<a href=https://github.com/bspeice/dtparse/commit/741afa34517d6bc1155713bbc5d66905fea13fad#diff-b4aea3e418ccdb71239b96952d9cddb6 target=_blank rel="noopener noreferrer">make the parser immutable</a>.
Now that I can re-use the same parser over and over, the allocations disappear:</p>
<p><img decoding=async loading=lazy alt="allocations cleaned up" src=/assets/images/heaptrack-flamegraph-after-cedc4c3519313f5af538364165e92c34.png width=1272 height=712 class=img_ev3q></p>
<p>In total, we went from requiring 2 MB of memory in
<a href=https://crates.io/crates/dtparse/1.0.2 target=_blank rel="noopener noreferrer">version 1.0.2</a>:</p>
<p><img decoding=async loading=lazy alt="memory before" src=/assets/images/heaptrack-closeup-12ae3897c033ccb3684a88dd45592e14.png width=717 height=116 class=img_ev3q></p>
<p>All the way down to 300 KB in <a href=https://crates.io/crates/dtparse/1.0.3 target=_blank rel="noopener noreferrer">version 1.0.3</a>:</p>
<p><img decoding=async loading=lazy alt="memory after" src=/assets/images/heaptrack-closeup-after-967bc4596c480bcc9e8410b0a7a64a00.png width=739 height=123 class=img_ev3q></p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id=conclusion>Conclusion<a href=#conclusion class=hash-link aria-label="Direct link to Conclusion" title="Direct link to Conclusion"></a></h2>
<p>In the end, you don't need to write a custom allocator to be efficient with memory; great tools
already exist to help you understand what your program is doing.</p>
<p><strong>Use them.</strong></p>
<p>Given that <a href=https://en.wikipedia.org/wiki/Moore%27s_law target=_blank rel="noopener noreferrer">Moore's Law</a> is
<a href=https://www.technologyreview.com/s/601441/moores-law-is-dead-now-what/ target=_blank rel="noopener noreferrer">dead</a>, we've all got to do
our part to take back what Microsoft stole.</div></article><nav class="pagination-nav docusaurus-mt-lg" aria-label="Blog post page navigation"><a class="pagination-nav__link pagination-nav__link--prev" href=/2018/09/isomorphic-apps/><div class=pagination-nav__sublabel>Older post</div><div class=pagination-nav__label>Isomorphic desktop apps with Rust</div></a><a class="pagination-nav__link pagination-nav__link--next" href=/2018/12/what-small-business-really-means/><div class=pagination-nav__sublabel>Newer post</div><div class=pagination-nav__label>More "what companies really mean"</div></a></nav></main><div class="col col--2"><div class="tableOfContents_bqdL thin-scrollbar"><ul class="table-of-contents table-of-contents__left-border"><li><a href=#curiosity class="table-of-contents__link toc-highlight">Curiosity</a><li><a href=#turning-on-the-system-allocator class="table-of-contents__link toc-highlight">Turning on the System Allocator</a><li><a href=#running-heaptrack class="table-of-contents__link toc-highlight">Running heaptrack</a><li><a href=#reading-flamegraphs class="table-of-contents__link toc-highlight">Reading Flamegraphs</a><li><a href=#optimizing-dtparse class="table-of-contents__link toc-highlight">Optimizing dtparse</a><li><a href=#conclusion class="table-of-contents__link toc-highlight">Conclusion</a></ul></div></div></div></div></div><footer class=footer><div class="container container-fluid"><div class="footer__bottom text--center"><div class=footer__copyright>Copyright © 2025 Bradlee Speice</div></div></div></footer></div>