<!doctype html><htmllang=endir=ltrclass="blog-wrapper blog-post-page plugin-blog plugin-id-default"data-has-hydrated=false><metacharset=UTF-8><metaname=generatorcontent="Docusaurus v3.7.0"><titledata-rh=true>A case study in heaptrack | The Old Speice Guy</title><metadata-rh=truename=viewportcontent="width=device-width, initial-scale=1.0"><metadata-rh=truename=twitter:cardcontent=summary_large_image><metadata-rh=trueproperty=og:urlcontent=https://speice.io/2018/10/case-study-optimization><metadata-rh=trueproperty=og:localecontent=en><metadata-rh=truename=docusaurus_localecontent=en><metadata-rh=truename=docusaurus_tagcontent=default><metadata-rh=truename=docsearch:languagecontent=en><metadata-rh=truename=docsearch:docusaurus_tagcontent=default><metadata-rh=trueproperty=og:titlecontent="A case study in heaptrack | The Old Speice Guy"><metadata-rh=truename=descriptioncontent="I remember early in my career someone joking that:"><metadata-rh=trueproperty=og:descriptioncontent="I remember early in my career someone joking that:"><metadata-rh=trueproperty=og:typecontent=article><metadata-rh=trueproperty=article:published_timecontent=2018-10-08T12:00:00.000Z><linkdata-rh=truerel=iconhref=/img/favicon.ico><linkdata-rh=truerel=canonicalhref=https://speice.io/2018/10/case-study-optimization><linkdata-rh=truerel=alternatehref=https://speice.io/2018/10/case-study-optimizationhreflang=en><linkdata-rh=truerel=alternatehref=https://speice.io/2018/10/case-study-optimizationhreflang=x-default><scriptdata-rh=truetype=application/ld+json>{"@context":"https://schema.org","@id":"https://speice.io/2018/10/case-study-optimization","@type":"BlogPosting","author":{"@type":"Person","name":"Bradlee Speice"},"dateModified":"2024-11-09T22:02:02.000Z","datePublished":"2018-10-08T12:00:00.000Z","description":"I remember early in my career someone joking that:","headline":"A case study in 
heaptrack","isPartOf":{"@id":"https://speice.io/","@type":"Blog","name":"Blog"},"keywords":[],"mainEntityOfPage":"https://speice.io/2018/10/case-study-optimization","name":"A case study in heaptrack","url":"https://speice.io/2018/10/case-study-optimization"}</script><linkrel=alternatetype=application/rss+xmlhref=/rss.xmltitle="The Old Speice Guy RSS Feed"><linkrel=alternatetype=application/atom+xmlhref=/atom.xmltitle="The Old Speice Guy Atom Feed"><linkrel=stylesheethref=/katex/katex.min.csstype=text/css><linkrel=stylesheethref=/assets/css/styles.24ac2c37.css><scriptsrc=/assets/js/runtime~main.8ba92cdd.jsdefer></script><scriptsrc=/assets/js/main.a392e665.jsdefer></script><bodyclass=navigation-with-keyboard><script>!function(){vart,e=function(){try{returnnewURLSearchParams(window.location.search).get("docusaurus-theme")}catch(t){}}()||function(){try{returnwindow.localStorage.getItem("theme")}catch(t){}}();t=null!==e?e:"light",document.documentElement.setAttribute("data-theme",t)}(),function(){try{for(var[t,e]ofnewURLSearchParams(window.location.search).entries())if(t.startsWith("docusaurus-data-")){vara=t.replace("docusaurus-data-","data-");document.documentElement.setAttribute(a,e)}}catch(t){}}()</script><divid=__docusaurus><divrole=regionaria-label="Skip to main content"><aclass=skipToContent_fXgnhref=#__docusaurus_skipToContent_fallback>Skip to main content</a></div><navaria-label=Mainclass="navbar navbar--fixed-top"><divclass=navbar__inner><divclass=navbar__items><buttonaria-label="Toggle navigation bar"aria-expanded=falseclass="navbar__toggle clean-btn"type=button><svgwidth=30height=30viewBox="0 0 30 30"aria-hidden=true><pathstroke=currentColorstroke-linecap=roundstroke-miterlimit=10stroke-width=2d="M4 7h22M4 15h22M4 23h22"/></svg></button><aclass=navbar__brandhref=/><divclass=navbar__logo><imgsrc=/img/logo.svgalt="Sierpinski Gasket"class="themedComponent_mlkZ themedComponent--light_NVdE"><imgsrc=/img/logo-dark.svgalt="Sierpinski 
Gasket"class="themedComponent_mlkZ themedComponent--dark_xIcU"></div><bclass="navbar__title text--truncate">The Old Sp
<p>But the principle remains: be efficient with the resources you have, because
<ahref=http://exo-blog.blogspot.com/2007/09/what-intel-giveth-microsoft-taketh-away.htmltarget=_blankrel="noopener noreferrer">what Intel giveth, Microsoft taketh away</a>.</p>
<p>My professional work is focused on this kind of efficiency; low-latency financial markets demand
that you understand at a deep level <em>exactly</em> what your code is doing. As I continue experimenting
with Rust for personal projects, it's exciting to bring a utilitarian mindset with me: there's
flexibility for the times I pretend to have a garbage collector, and flexibility for the times that
I really care about how memory is used.</p>
<p>This post is a (small) case study in how I went from the former to the latter. And ultimately, it's
intended to be a starting toolkit to empower analysis of your own code.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id=curiosity>Curiosity<ahref=#curiosityclass=hash-linkaria-label="Direct link to Curiosity"title="Direct link to Curiosity"></a></h2>
<p>When I first started building the <ahref=https://crates.io/crates/dtparsetarget=_blankrel="noopener noreferrer">dtparse</a> crate, my intention was to mirror as closely as possible
the equivalent <ahref=https://github.com/dateutil/dateutiltarget=_blankrel="noopener noreferrer">Python library</a>. Python, as you may know, is garbage collected. Very
rarely is memory usage considered in Python, and I likewise wasn't paying too much attention when
<code>dtparse</code> was first being built.</p>
<p>This lackadaisical approach to memory works well enough, and I'm not planning on making <code>dtparse</code>
hyper-efficient. But every so often, I've wondered: "what exactly is going on in memory?" With the
advent of Rust 1.28 and the
<ahref=https://doc.rust-lang.org/std/alloc/trait.GlobalAlloc.htmltarget=_blankrel="noopener noreferrer">Global Allocator trait</a>, I had a really
great idea: <em>build a custom allocator that allows you to track your own allocations.</em> That way, you
can do things like writing tests for both correct results and correct memory usage. I gave it a
<ahref=https://crates.io/crates/qadapttarget=_blankrel="noopener noreferrer">shot</a>, but learned very quickly: <strong>never write your own allocator</strong>. It went from "fun
weekend project" to "I have literally no idea what my computer is doing" at breakneck speed.</p>
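<p>For a sense of what that idea involves, here's a minimal sketch of a counting allocator: a wrapper that forwards to the system allocator and bumps a counter on every allocation. It's an illustration only, not how <code>qadapt</code> is implemented; the names <code>CountingAlloc</code>, <code>ALLOCATIONS</code>, and <code>count_allocations</code> are made up for the example.</p>

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper: forwards to the system allocator and counts requests.
struct CountingAlloc;

/// Number of allocation requests seen so far, across the whole process.
static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `f` and returns how many allocations were recorded while it ran.
fn count_allocations<F: FnOnce()>(f: F) -> usize {
    let before = ALLOCATIONS.load(Ordering::Relaxed);
    f();
    ALLOCATIONS.load(Ordering::Relaxed) - before
}
```

<p>A test could wrap the code under inspection in <code>count_allocations</code> and assert on the result. The counter is process-wide, so other threads allocating at the same time will inflate it, which is a small taste of why the real thing gets complicated quickly.</p>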
<p>Instead, I'll highlight a separate path I took to make sense of my memory usage: <ahref=https://github.com/KDE/heaptracktarget=_blankrel="noopener noreferrer">heaptrack</a>.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id=turning-on-the-system-allocator>Turning on the System Allocator<ahref=#turning-on-the-system-allocatorclass=hash-linkaria-label="Direct link to Turning on the System Allocator"title="Direct link to Turning on the System Allocator"></a></h2>
<p>This is the hardest part of the post. Because Rust uses
<ahref=https://github.com/rust-lang/rust/pull/27400#issue-41256384target=_blankrel="noopener noreferrer">its own allocator</a> by default,
<code>heaptrack</code> is unable to properly record unmodified Rust code. To remedy this, we'll make use of the
<code>#[global_allocator]</code> attribute.</p>
<p>Specifically, in <code>lib.rs</code> or <code>main.rs</code>, add this:</p>
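<p>A minimal sketch of that opt-in, assuming the standard library's <code>std::alloc::System</code> is the allocator you want (the name <code>GLOBAL</code> is arbitrary). On toolchains from Rust 1.32 onward the system allocator is already the default, so this only matters for older compilers or when another allocator has been swapped in.</p>

```rust
use std::alloc::System;

// Route every heap allocation in this binary through the platform's
// allocator, which is the one heaptrack hooks into.
#[global_allocator]
static GLOBAL: System = System;
```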
<p>...and that's it. Everything else comes essentially for free.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id=running-heaptrack>Running heaptrack<ahref=#running-heaptrackclass=hash-linkaria-label="Direct link to Running heaptrack"title="Direct link to Running heaptrack"></a></h2>
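<p>Assuming you've installed heaptrack <small>(Homebrew on Mac, package manager
on Linux, ??? on Windows)</small>, all that's left is to fire up your application under it: put <code>heaptrack</code> in front of the command you'd normally run, then open the recording it writes in the heaptrack GUI.</p>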
<h2class="anchor anchorWithStickyNavbar_LWe7"id=reading-flamegraphs>Reading Flamegraphs<ahref=#reading-flamegraphsclass=hash-linkaria-label="Direct link to Reading Flamegraphs"title="Direct link to Reading Flamegraphs"></a></h2>
<p>To make sense of our memory usage, we're going to focus on one view in particular - it's called a
<ahref=http://www.brendangregg.com/flamegraphs.htmltarget=_blankrel="noopener noreferrer">"flamegraph"</a>. These charts are typically used to
show how much time your program spends executing each function, but they're used here to show how
much memory was allocated during those functions instead.</p>
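<p>One way to think about what the chart encodes: each allocation is credited to every frame on its call stack, and a frame's width is the total over all stacks that pass through it. Here's a toy sketch of that bookkeeping; <code>bytes_per_frame</code> and the sample format are assumptions for illustration, not heaptrack's actual data model.</p>

```rust
use std::collections::BTreeMap;

/// Sums allocated bytes for every call-stack prefix.
/// Each sample is (call stack from outermost to innermost frame, bytes).
fn bytes_per_frame(samples: &[(&[&str], u64)]) -> BTreeMap<String, u64> {
    let mut totals = BTreeMap::new();
    for (stack, bytes) in samples {
        // Credit the allocation to every frame it passed through.
        for depth in 1..=stack.len() {
            let key = stack[..depth].join(";");
            *totals.entry(key).or_insert(0) += bytes;
        }
    }
    totals
}
```

<p>Feeding it two made-up stacks that share <code>main</code> and <code>dtparse::parse</code> gives those two frames the combined total, while each leaf keeps only its own bytes. That's the "wide at the bottom, narrower at the top" shape you see in the chart.</p>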
<p>For example, we can see that all allocations happened during the <code>main</code> function:</p>
<p><imgdecoding=asyncloading=lazyalt="allocations in main"src=/assets/images/heaptrack-main-colorized-cfe5d7d345d32cfc1a0f297580619718.pngwidth=654height=343class=img_ev3q></p>
<p>...and within that, all allocations happened during <code>dtparse::parse</code>:</p>
<p><imgdecoding=asyncloading=lazyalt="allocations in dtparse"src=/assets/images/heaptrack-dtparse-colorized-e6caf224f50df2dd56981f5b02970325.pngwidth=654height=315class=img_ev3q></p>
<p>...and within <em>that</em>, allocations happened in two different places:</p>
<p><imgdecoding=asyncloading=lazyalt="allocations in parseinfo"src=/assets/images/heaptrack-parseinfo-colorized-a1898beaf28a3997ac86810f872539b7.pngwidth=654height=372class=img_ev3q></p>
<p>Now I apologize that it's hard to see, but there's one area specifically that stuck out as an issue:
<strong>what the heck is the <code>Default</code> thing doing?</strong></p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id=optimizing-dtparse>Optimizing dtparse<ahref=#optimizing-dtparseclass=hash-linkaria-label="Direct link to Optimizing dtparse"title="Direct link to Optimizing dtparse"></a></h2>
<p>See, I knew that there were some allocations during calls to <code>dtparse::parse</code>, but I was totally
wrong about where the bulk of allocations occurred in my program. Let me post the code and see if
you can spot the mistake:</p>
<p>Because <code>Parser::parse</code> requires a mutable reference to itself, I have to create a new
<code>Parser::default</code> every time it receives a string. This is excessive! We'd rather have an immutable
parser that can be re-used, and avoid allocating memory in the first place.</p>
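<p>The pattern described above can be sketched roughly like this. The <code>Parser</code> here is a hypothetical stand-in with invented fields, not dtparse's real type; the point is only that a <code>&amp;mut self</code> method pushes the convenience function into building a fresh parser on every call.</p>

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a parser with heap-allocated lookup tables.
/// This is not dtparse's real `Parser`; the fields are invented.
struct Parser {
    months: HashMap<String, u32>,
    calls: usize,
}

impl Default for Parser {
    fn default() -> Self {
        // Building the table allocates a map plus one String per key.
        let months = ["jan", "feb", "mar"]
            .iter()
            .zip(1u32..)
            .map(|(name, number)| (name.to_string(), number))
            .collect();
        Parser { months, calls: 0 }
    }
}

impl Parser {
    /// Takes `&mut self` because it updates internal state while parsing.
    fn parse(&mut self, input: &str) -> Option<u32> {
        self.calls += 1;
        self.months.get(input).copied()
    }
}

/// Convenience entry point: every call rebuilds the tables from scratch.
fn parse(input: &str) -> Option<u32> {
    Parser::default().parse(input)
}
```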
<p>Armed with that information, I put some time in to
<ahref=https://github.com/bspeice/dtparse/commit/741afa34517d6bc1155713bbc5d66905fea13fad#diff-b4aea3e418ccdb71239b96952d9cddb6target=_blankrel="noopener noreferrer">make the parser immutable</a>.
Now that I can re-use the same parser over and over, the allocations disappear:</p>
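<p>In sketch form, the immutable shape looks something like the following. Again, this is a hypothetical stand-in rather than the code from the linked commit, and <code>parse_all</code> is a made-up helper: once parsing only needs <code>&amp;self</code>, one parser can serve any number of inputs and its tables are built a single time.</p>

```rust
use std::collections::HashMap;

/// Hypothetical stand-in again; not dtparse's actual types.
struct Parser {
    months: HashMap<String, u32>,
}

impl Default for Parser {
    fn default() -> Self {
        // The tables are still heap-allocated, but only when a parser is built.
        let months = ["jan", "feb", "mar"]
            .iter()
            .zip(1u32..)
            .map(|(name, number)| (name.to_string(), number))
            .collect();
        Parser { months }
    }
}

impl Parser {
    /// `&self` is enough: parsing reads the tables but never mutates them.
    fn parse(&self, input: &str) -> Option<u32> {
        self.months.get(input).copied()
    }
}

/// Parses many inputs with a single parser, so the tables are built once.
fn parse_all(inputs: &[&str]) -> Vec<Option<u32>> {
    let parser = Parser::default();
    inputs.iter().map(|input| parser.parse(input)).collect()
}
```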
<h2class="anchor anchorWithStickyNavbar_LWe7"id=conclusion>Conclusion<ahref=#conclusionclass=hash-linkaria-label="Direct link to Conclusion"title="Direct link to Conclusion"></a></h2>
<p>In the end, you don't need to write a custom allocator to be efficient with memory; great tools
already exist to help you understand what your program is doing.</p>
<p><strong>Use them.</strong></p>
<p>Given that <ahref=https://en.wikipedia.org/wiki/Moore%27s_lawtarget=_blankrel="noopener noreferrer">Moore's Law</a> is
<ahref=https://www.technologyreview.com/s/601441/moores-law-is-dead-now-what/target=_blankrel="noopener noreferrer">dead</a>, we've all got to do