speice.io/2019/02/08/compiler-optimizations/index.html

46 lines
54 KiB
HTML
Raw Permalink Normal View History

<!doctype html><html lang=en dir=ltr class="blog-wrapper blog-post-page plugin-blog plugin-id-default" data-has-hydrated=false><meta charset=UTF-8><meta name=generator content="Docusaurus v3.6.1"><title data-rh=true>Allocations in Rust: Compiler optimizations | The Old Speice Guy</title><meta data-rh=true name=viewport content="width=device-width,initial-scale=1.0"><meta data-rh=true name=twitter:card content=summary_large_image><meta data-rh=true property=og:url content=https://speice.io/2019/02/08/compiler-optimizations><meta data-rh=true property=og:locale content=en><meta data-rh=true name=docusaurus_locale content=en><meta data-rh=true name=docusaurus_tag content=default><meta data-rh=true name=docsearch:language content=en><meta data-rh=true name=docsearch:docusaurus_tag content=default><meta data-rh=true property=og:title content="Allocations in Rust: Compiler optimizations | The Old Speice Guy"><meta data-rh=true name=description content="A lot. The answer is a lot."><meta data-rh=true property=og:description content="A lot. The answer is a lot."><meta data-rh=true property=og:type content=article><meta data-rh=true property=article:published_time content=2019-02-08T12:00:00.000Z><link data-rh=true rel=icon href=/img/favicon.ico><link data-rh=true rel=canonical href=https://speice.io/2019/02/08/compiler-optimizations><link data-rh=true rel=alternate href=https://speice.io/2019/02/08/compiler-optimizations hreflang=en><link data-rh=true rel=alternate href=https://speice.io/2019/02/08/compiler-optimizations hreflang=x-default><script data-rh=true type=application/ld+json>{"@context":"https://schema.org","@id":"https://speice.io/2019/02/08/compiler-optimizations","@type":"BlogPosting","author":[],"dateModified":"2024-11-10T02:05:00.000Z","datePublished":"2019-02-08T12:00:00.000Z","description":"A lot. The answer is a lot.","headline":"Allocations in Rust: Compiler optimizations","isPartOf":{"@id":"https://speice.io/","@type":"Blog","name":"Blog"},"keywords":[],"mainEntityOfPage":"https://speice.io/2019/02/08/compiler-optimizations","name":"Allocations in Rust: Compiler optimizations","url":"https://speice.io/2019/02/08/compiler-optimizations"}</script><link rel=alternate type=application/rss+xml href=/rss.xml title="The Old Speice Guy RSS Feed"><link rel=alternate type=application/atom+xml href=/atom.xml title="The Old Speice Guy Atom Feed"><link rel=stylesheet href=/katex/katex.min.css><link rel=stylesheet href=/assets/css/styles.16c3428d.css><script src=/assets/js/runtime~main.29a27dcf.js defer></script><script src=/assets/js/main.d461af80.js defer></script><body class=navigation-with-keyboard><script>!function(){var t,e=function(){try{return new URLSearchParams(window.location.search).get("docusaurus-theme")}catch(t){}}()||function(){try{return window.localStorage.getItem("theme")}catch(t){}}();t=null!==e?e:"light",document.documentElement.setAttribute("data-theme",t)}(),function(){try{for(var[t,e]of new URLSearchParams(window.location.search).entries())if(t.startsWith("docusaurus-data-")){var a=t.replace("docusaurus-data-","data-");document.documentElement.setAttribute(a,e)}}catch(t){}}()</script><div id=__docusaurus><div role=region aria-label="Skip to main content"><a class=skipToContent_fXgn href=#__docusaurus_skipToContent_fallback>Skip to main content</a></div><nav aria-label=Main class="navbar navbar--fixed-top"><div class=navbar__inner><div class=navbar__items><button aria-label="Toggle navigation bar" aria-expanded=false class="navbar__toggle clean-btn" type=button><svg width=30 height=30 viewBox="0 0 30 30" aria-hidden=true><path stroke=currentColor stroke-linecap=round stroke-miterlimit=10 stroke-width=2 d="M4 7h22M4 15h22M4 23h22"/></svg></button><a class=navbar__brand href=/><div class=navbar__logo><img src=/img/logo.svg alt="Sierpinski Gasket" class="themedComponent_mlkZ themedComponent--light_NVdE"><img src=/img/logo-dark.svg alt="Sierpinski Gasket" class="themedComponent_mlkZ themedComponent--dark_xIcU"></div><b class="navbar__title text--truncate">The Old Speice Guy</b></a></div><div class="navb
rules that are mostly right for small chunks of code. We've spent time showing how those rules work
themselves out in practice, and become familiar with reading the assembly code needed to see each
memory type (global, stack, heap) in action.</p>
<p>Throughout the series so far, we've put a handicap on the code. In the name of consistent and
understandable results, we've asked the compiler to pretty please leave the training wheels on. Now
is the time where we throw out all the rules and take off the kid gloves. As it turns out, both the
Rust compiler and the LLVM optimizers are incredibly sophisticated, and we'll step back and let them
do their job.</p>
<p>Similar to
<a href="https://www.youtube.com/watch?v=bSkpMdDe4g4" target=_blank rel="noopener noreferrer">"What Has My Compiler Done For Me Lately?"</a>, we're
focusing on interesting things the Rust language (and LLVM!) can do with memory management. We'll
still be looking at assembly code to understand what's going on, but it's important to mention
again: <strong>please use automated tools like <a href=https://crates.io/crates/alloc_counter target=_blank rel="noopener noreferrer">alloc-counter</a> to
double-check memory behavior if it's something you care about</strong>. It's far too easy to mis-read
assembly in large code sections, you should always verify behavior if you care about memory usage.</p>
<p>The guiding principal as we move forward is this: <em>optimizing compilers won't produce worse programs
than we started with.</em> There won't be any situations where stack allocations get moved to heap
allocations. There will, however, be an opera of optimization.</p>
<p><strong>Update 2019-02-10</strong>: When debugging a
<a href=https://gitlab.com/sio4/code/alloc-counter/issues/1 target=_blank rel="noopener noreferrer">related issue</a>, it was discovered that the
original code worked because LLVM optimized out the entire function, rather than just the allocation
segments. The code has been updated with proper use of
<a href=https://doc.rust-lang.org/std/ptr/fn.read_volatile.html target=_blank rel="noopener noreferrer"><code>read_volatile</code></a>, and a previous section
on vector capacity has been removed.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id=the-case-of-the-disappearing-box>The Case of the Disappearing Box<a href=#the-case-of-the-disappearing-box class=hash-link aria-label="Direct link to The Case of the Disappearing Box" title="Direct link to The Case of the Disappearing Box"></a></h2>
<p>Our first optimization comes when LLVM can reason that the lifetime of an object is sufficiently
short that heap allocations aren't necessary. In these cases, LLVM will move the allocation to the
stack instead! The way this interacts with <code>#[inline]</code> attributes is a bit opaque, but the important
part is that LLVM can sometimes do better than the baseline Rust language:</p>
<div class="language-rust codeBlockContainer_Ckt0 theme-code-block" style="--prism-background-color:hsl(230, 1%, 98%);--prism-color:hsl(230, 8%, 24%)"><div class=codeBlockContent_biex><pre tabindex=0 class="prism-code language-rust codeBlock_bY9V thin-scrollbar" style="background-color:hsl(230, 1%, 98%);color:hsl(230, 8%, 24%)"><code class=codeBlockLines_e6Vv><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token keyword" style="color:hsl(301, 63%, 40%)">use</span><span class="token plain"> </span><span class="token namespace">std</span><span class="token namespace punctuation" style="color:hsl(119, 34%, 47%)">::</span><span class="token namespace">alloc</span><span class="token namespace punctuation" style="color:hsl(119, 34%, 47%)">::</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">{</span><span class="token class-name" style="color:hsl(35, 99%, 36%)">GlobalAlloc</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">,</span><span class="token plain"> </span><span class="token class-name" style="color:hsl(35, 99%, 36%)">Layout</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">,</span><span class="token plain"> </span><span class="token class-name" style="color:hsl(35, 99%, 36%)">System</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">}</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">;</span><span class="token plain"></span><br></span><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token plain"></span><span class="token keyword" style="color:hsl(301, 63%, 40%)">use</span><span class="token plain"> </span><span class="token namespace">std</span><span class="token namespace punctuation" style="color:hsl(119, 34%, 47%)">::</span><span class="token namespace">sync</span><span class="token namespace punctuation" style="color:hsl(119, 34%, 47%)">::</span><span class="token namespace">atomic</span><span class="token namespace punctuation" style="color:hsl(119, 34%, 47%)">::</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">{</span><span class="token class-name" style="color:hsl(35, 99%, 36%)">AtomicBool</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">,</span><span class="token plain"> </span><span class="token class-name" style="color:hsl(35, 99%, 36%)">Ordering</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">}</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">;</span><span class="token plain"></span><br></span><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token plain" style=display:inline-block></span><br></span><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token plain"></span><span class="token keyword" style="color:hsl(301, 63%, 40%)">pub</span><span class="token plain"> </span><span class="token keyword" style="color:hsl(301, 63%, 40%)">fn</span><span class="token plain"> </span><span class="token function-definition function" style="color:hsl(221, 87%, 60%)">cmp</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">(</span><span class="token plain">x</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">:</span><span class="token plain"> </span><span class="token keyword" style="color:hsl(301, 63%, 40%)">u32</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">)</span><span class="token plain"> </span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">{</span><span class="token plain"></span><br></span><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token plain"> </span><span class="token comment" style="color:hsl(230, 4%, 64%)">// Turn on panicking if we allocate on the heap</span><span class="token plain"></span><br></span><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token plain"> </span><span class="token constant" style="color:hsl(35, 99%, 36%)">DO_PANIC</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">.</span><
<p>-- <a href=https://godbolt.org/z/BZ_Yp3 target=_blank rel="noopener noreferrer">Compiler Explorer</a></p>
<p>-- <a href="https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=4a765f753183d5b919f62c71d2109d5d" target=_blank rel="noopener noreferrer">Rust Playground</a></p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id=dr-array-or-how-i-learned-to-love-the-optimizer>Dr. Array or: how I learned to love the optimizer<a href=#dr-array-or-how-i-learned-to-love-the-optimizer class=hash-link aria-label="Direct link to Dr. Array or: how I learned to love the optimizer" title="Direct link to Dr. Array or: how I learned to love the optimizer"></a></h2>
<p>Finally, this isn't so much about LLVM figuring out different memory behavior, but LLVM stripping
out code that doesn't do anything. Optimizations of this type have a lot of nuance to them; if
you're not careful, they can make your benchmarks look
<a href="https://www.youtube.com/watch?v=nXaxk27zwlk&feature=youtu.be&t=1199" target=_blank rel="noopener noreferrer">impossibly good</a>. In Rust, the
<code>black_box</code> function (implemented in both
<a href=https://doc.rust-lang.org/1.1.0/test/fn.black_box.html target=_blank rel="noopener noreferrer"><code>libtest</code></a> and
<a href=https://docs.rs/criterion/0.2.10/criterion/fn.black_box.html target=_blank rel="noopener noreferrer"><code>criterion</code></a>) will tell the compiler
to disable this kind of optimization. But if you let LLVM remove unnecessary code, you can end up
running programs that previously caused errors:</p>
<div class="language-rust codeBlockContainer_Ckt0 theme-code-block" style="--prism-background-color:hsl(230, 1%, 98%);--prism-color:hsl(230, 8%, 24%)"><div class=codeBlockContent_biex><pre tabindex=0 class="prism-code language-rust codeBlock_bY9V thin-scrollbar" style="background-color:hsl(230, 1%, 98%);color:hsl(230, 8%, 24%)"><code class=codeBlockLines_e6Vv><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token attribute attr-name" style="color:hsl(35, 99%, 36%)">#[derive(Default)]</span><span class="token plain"></span><br></span><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token plain"></span><span class="token keyword" style="color:hsl(301, 63%, 40%)">struct</span><span class="token plain"> </span><span class="token type-definition class-name" style="color:hsl(35, 99%, 36%)">TwoFiftySix</span><span class="token plain"> </span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">{</span><span class="token plain"></span><br></span><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token plain"> _a</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">:</span><span class="token plain"> </span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">[</span><span class="token keyword" style="color:hsl(301, 63%, 40%)">u64</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">;</span><span class="token plain"> </span><span class="token number" style="color:hsl(35, 99%, 36%)">32</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">]</span><span class="token plain"></span><br></span><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token plain"></span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">}</span><span class="token plain"></span><br></span><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token plain" style=display:inline-block></span><br></span><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token plain"></span><span class="token attribute attr-name" style="color:hsl(35, 99%, 36%)">#[derive(Default)]</span><span class="token plain"></span><br></span><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token plain"></span><span class="token keyword" style="color:hsl(301, 63%, 40%)">struct</span><span class="token plain"> </span><span class="token type-definition class-name" style="color:hsl(35, 99%, 36%)">EightK</span><span class="token plain"> </span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">{</span><span class="token plain"></span><br></span><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token plain"> _a</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">:</span><span class="token plain"> </span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">[</span><span class="token class-name" style="color:hsl(35, 99%, 36%)">TwoFiftySix</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">;</span><span class="token plain"> </span><span class="token number" style="color:hsl(35, 99%, 36%)">32</span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">]</span><span class="token plain"></span><br></span><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token plain"></span><span class="token punctuation" style="color:hsl(119, 34%, 47%)">}</span><span class="token plain"></span><br></span><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token plain" style=display:inline-block></span><br></span><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token plain"></span><span class="token attribute attr-name" style="color:hsl(35, 99%, 36%)">#[derive(Default)]</span><span class="token plain"></span><br></span><span class=token-line style="color:hsl(230, 8%, 24%)"><span class="token plain"></span><span class="token keyword" style="color:hsl(301, 63%, 40%)">struct</span><span class="token plain"> </span><span class="token type-definition class-name" style="
<p>-- <a href=https://godbolt.org/z/daHn7P target=_blank rel="noopener noreferrer">Compiler Explorer</a></p>
<p>-- <a href="https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=4c253bf26072119896ab93c6ef064dc0" target=_blank rel="noopener noreferrer">Rust Playground</a></div></article><nav class="pagination-nav docusaurus-mt-lg" aria-label="Blog post page navigation"><a class="pagination-nav__link pagination-nav__link--prev" href=/2019/02/a-heaping-helping><div class=pagination-nav__sublabel>Older post</div><div class=pagination-nav__label>Allocations in Rust: Dynamic memory</div></a><a class="pagination-nav__link pagination-nav__link--next" href=/2019/02/summary><div class=pagination-nav__sublabel>Newer post</div><div class=pagination-nav__label>Allocations in Rust: Summary</div></a></nav></main><div class="col col--2"><div class="tableOfContents_bqdL thin-scrollbar"><ul class="table-of-contents table-of-contents__left-border"><li><a href=#the-case-of-the-disappearing-box class="table-of-contents__link toc-highlight">The Case of the Disappearing Box</a><li><a href=#dr-array-or-how-i-learned-to-love-the-optimizer class="table-of-contents__link toc-highlight">Dr. Array or: how I learned to love the optimizer</a></ul></div></div></div></div></div><footer class=footer><div class="container container-fluid"><div class="footer__bottom text--center"><div class=footer__copyright>Copyright © 2024 Bradlee Speice</div></div></div></footer></div>