<description><![CDATA[So far, our plot() function has been fairly simple: map a fractal flame coordinate to a specific pixel,]]></description>
<content:encoded><![CDATA[<p>So far, our <code>plot()</code> function has been fairly simple: map a fractal flame coordinate to a specific pixel,
and color in that pixel. This works well for simple function systems (like Sierpinski's Gasket),
but more complex systems (like the reference parameters) produce grainy images.</p>
<p>In this post, we'll refine the image quality and add color to really make things shine.</p>
<h2 id="image-histograms">Image histograms</h2>
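<p>The fix starts with a histogram: instead of switching a pixel on the first time the chaos game
reaches it, we count every visit. A minimal sketch of the idea (the image dimensions and the
[-2, 2] coordinate range are assumptions for illustration):</p>
<pre><code class="language-typescript">// Hypothetical image dimensions for this sketch
const width = 600;
const height = 600;

// Count how many times the chaos game visits each pixel,
// instead of just switching the pixel "on" at the first visit
const histogram: number[] = new Array(width * height).fill(0);

function plotHistogram(x: number, y: number): void {
  // Map flame coordinates (assumed to span [-2, 2]) to pixel coordinates
  const pixelX = Math.floor(((x + 2) / 4) * width);
  const pixelY = Math.floor(((y + 2) / 4) * height);
  if (pixelX < 0 || pixelX >= width || pixelY < 0 || pixelY >= height) {
    return;
  }

  histogram[pixelY * width + pixelX] += 1;
}
</code></pre>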
<h2 id="tone-mapping">Tone mapping</h2>
<p>While using a histogram reduces the "graining," it also leads to some parts vanishing entirely.
In the reference parameters, the outer circle is still there, but the interior is gone!</p>
<p>To fix this, we'll introduce the second major innovation of the fractal flame algorithm: <a href="https://en.wikipedia.org/wiki/Tone_mapping" target="_blank" rel="noopener noreferrer">tone mapping</a>.
This is a technique used in computer graphics to compensate for the difference between how
computers represent brightness and how people actually perceive it.</p>
<p>As a concrete example, high-dynamic-range (HDR) photography uses this technique to capture
scenes with a wide range of brightnesses. To take a picture of something dark,
you need a long exposure time. However, long exposures lead to "hot spots" (sections that are pure white).
By taking multiple pictures with different exposure times, we can combine them to create
a final image where everything is visible.</p>
<p>In fractal flames, this "tone map" is accomplished by scaling brightness according to the <em>logarithm</em>
of how many times we encounter a pixel. This way, "cold spots" (pixels the chaos game visits infrequently)
are still visible, and "hot spots" (pixels the chaos game visits frequently) won't wash out.</p>
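<p>As a sketch (building on the histogram above), the log-scale brightness for each pixel
might look like this:</p>
<pre><code class="language-typescript">// Convert raw visit counts into log-scaled brightness values in [0, 1]
function logScale(histogram: number[]): number[] {
  const maxVisits = histogram.reduce((max, visits) => Math.max(max, visits), 0);
  const maxBrightness = Math.log(maxVisits + 1);
  if (maxBrightness === 0) {
    return histogram.map(() => 0); // nothing was plotted
  }

  // log(visits + 1) keeps unvisited pixels black while compressing
  // the difference between hot spots and cold spots
  return histogram.map((visits) => Math.log(visits + 1) / maxBrightness);
}
</code></pre>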
<details><summary>Log-scale vibrancy also explains why fractal flames appear to be 3D...</summary><p>As mentioned in the paper:</p><blockquote>
<p>Where one branch of the fractal crosses another, one may appear to occlude the other
if their densities are different enough because the lesser density is inconsequential in sum.
For example, branches of densities 1000 and 100 might have brightnesses of 30 and 20.
Where they cross the density is 1100, whose brightness is 30.4, which is
hardly distinguishable from 30.</p>
</blockquote></details>
<h2 id="color">Color</h2>
<p>Now we'll introduce the last innovation of the fractal flame algorithm: color.
By including a third coordinate ($c$) in the chaos game, we can illustrate the transforms
responsible for the image.</p>
<h3 id="color-coordinate">Color coordinate</h3>
<p>Color in a fractal flame is continuous on the range $[0, 1]$. This is important for two reasons:</p>
<ul>
<li>It helps blend colors together in the final image. Slight changes in the color value lead to
slight changes in the actual color</li>
<li>It allows us to swap in new color palettes easily. We're free to choose what actual colors
each value represents</li>
</ul>
<p>We'll give each transform a color value ($c_i$) in the $[0, 1]$ range.
The final transform gets a value too ($c_f$).
Then, at each step in the chaos game, we'll set the current color
to the average of the previous color and the selected transform's color: $c := (c + c_i) / 2$.</p>
<h3 id="color-speed">Color speed</h3>
<p><strong>Warning:</strong> Color speed isn't introduced in the Fractal Flame Algorithm paper.
It is included here because <a href="https://github.com/scottdraves/flam3/blob/7fb50c82e90e051f00efcc3123d0e06de26594b2/variations.c#L2140" target="_blank" rel="noopener noreferrer"><code>flam3</code> implements it</a>,
and because it's fun to play with.</p>
<p>Next, we'll add a parameter to each transform that controls how much it changes the current color.
This is known as the "color speed" ($s_i$):</p>
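<p>Written out (consistent with <code>flam3</code> and the description below), each step blends
the current color with the transform color according to the color speed:</p>
$$
c := s_i \cdot c_i + (1 - s_i) \cdot c
$$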
<p>Color speed values work just like transform weights. A value of 1
means we take the transform color and ignore the previous color state.
A value of 0 means we keep the current color state and ignore the
transform color.</p>
<h3 id="palette">Palette</h3>
<p>Now, we need to map the color coordinate to a pixel color. Fractal flames typically use
256 colors (each color has 3 values: red, green, blue) to define a palette.
The color coordinate then becomes an index into the palette.</p>
<p>There's one small complication: the color coordinate is continuous, but the palette
uses discrete colors. How do we handle situations where the color coordinate is
"in between" the colors of our palette?</p>
<p>One way to handle this is a step function. In the code below, we multiply the color coordinate
by the number of colors in the palette, then truncate that value. This gives us a discrete index:</p>
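<p>A sketch of that lookup (the flat red/green/blue array layout is an assumption):</p>
<pre><code class="language-typescript">// Palette stored as a flat array: [r0, g0, b0, r1, g1, b1, ...]
function colorFromPalette(
  palette: number[],
  colorCoordinate: number, // continuous, in [0, 1]
): [number, number, number] {
  const colorCount = palette.length / 3;
  // Step function: scale to the palette size, then truncate to a discrete index
  const index = Math.min(colorCount - 1, Math.floor(colorCoordinate * colorCount));
  return [
    palette[index * 3], // red
    palette[index * 3 + 1], // green
    palette[index * 3 + 2], // blue
  ];
}
</code></pre>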
<details><summary>As an alternative...</summary><p>...you could interpolate between colors in the palette.
For example, <code>flam3</code> uses <a href="https://github.com/scottdraves/flam3/blob/7fb50c82e90e051f00efcc3123d0e06de26594b2/rect.c#L483-L486" target="_blank" rel="noopener noreferrer">linear interpolation</a>.</p></details>
<p>In the diagram below, each color in the palette is plotted on a small vertical strip.
Putting the strips side by side shows the full palette used by the reference parameters:</p>
<p><em>[Interactive figure in the original post: the reference palette rendered as side-by-side vertical color strips.]</em></p>
<h3 id="plotting">Plotting</h3>
<p>We're now ready to plot our $(x_f, y_f, c_f)$ coordinates. This time, we'll use a histogram
for each color channel (red, green, blue, alpha). After translating from color coordinate ($c_f$)
to an RGB color, we add that color to the red, green, and blue histograms, and increment the
alpha histogram to record the visit:</p>
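<p>A sketch of the plotting step, reusing the dimensions and <code>colorFromPalette()</code>
sketches from above (the coordinate mapping is still an assumption):</p>
<pre><code class="language-typescript">declare const palette: number[]; // the 256-color palette from the section above

// One histogram per channel
const red: number[] = new Array(width * height).fill(0);
const green: number[] = new Array(width * height).fill(0);
const blue: number[] = new Array(width * height).fill(0);
const alpha: number[] = new Array(width * height).fill(0);

function plotColor(x: number, y: number, c: number): void {
  const pixelX = Math.floor(((x + 2) / 4) * width);
  const pixelY = Math.floor(((y + 2) / 4) * height);
  if (pixelX < 0 || pixelX >= width || pixelY < 0 || pixelY >= height) {
    return;
  }

  const index = pixelY * width + pixelX;
  const [r, g, b] = colorFromPalette(palette, c);
  red[index] += r;
  green[index] += g;
  blue[index] += b;
  alpha[index] += 1; // visit count, log-scaled later for tone mapping
}
</code></pre>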
<h2 id="summary">Summary</h2>
<p>Tone mapping is the second major innovation of the fractal flame algorithm.
By tracking how often the chaos game encounters each pixel, we can adjust
brightness/transparency to reduce the visual "graining" of previous images.</p>
<p>Next, introducing a third coordinate to the chaos game makes color images possible,
the third major innovation of the fractal flame algorithm. Using a continuous
color scale and color palette adds a splash of excitement to the image.</p>
<p>The Fractal Flame Algorithm paper goes on to describe more techniques
not covered here. For example, image quality can be improved with density estimation
and filtering. New parameters can be generated by "mutating" existing
fractal flames. And fractal flames can even be animated to produce videos!</p>
<p>That said, I think this is a good place to wrap up. We went from
an introduction to the mathematics of fractal systems all the way to
generating full-color images. Fractal flames are a challenging topic,
but it's extremely rewarding to learn about how they work.</p>]]></content:encoded>
</item>
<item>
<title><![CDATA[Playing with fire: Transforms and variations]]></title>
If you're interested in tweaking the parameters, or creating your own, <a href="https://sourceforge.net/projects/apophysis/" target="_blank" rel="noopener noreferrer">Apophysis</a>
can load that file.</p>
<h2 id="variations">Variations</h2>
<p>Just like transforms, variations ($V_j$) are functions that take in $(x, y)$ coordinates
and give back new $(x, y)$ coordinates.
However, the sky is the limit for what happens between input and output.
The Fractal Flame paper lists 49 variation functions,
and the official <code>flam3</code> implementation supports <a href="https://github.com/scottdraves/flam3/blob/7fb50c82e90e051f00efcc3123d0e06de26594b2/variations.c" target="_blank" rel="noopener noreferrer">98 different variations</a>.</p>
<p>To draw our reference image, we'll focus on just four:</p>
<h3 id="linear-variation-0">Linear (variation 0)</h3>
<p>This variation is dead simple: return the $x$ and $y$ coordinates as-is.</p>
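<p>From the Fractal Flame paper:</p>
$$
V_0(x, y) = (x, y)
$$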
<p>Because the linear variation leaves its input unchanged, a transform that uses only this variation will simply
apply the affine coefficients to the input point and use that as the output.</p>
<h3 id="julia-variation-13">Julia (variation 13)</h3>
<p>This variation is a good example of a non-linear function. It uses both trigonometry
and probability to produce interesting shapes:</p>
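<p>From the Fractal Flame paper, with $r = \sqrt{x^2 + y^2}$, $\theta = \arctan(x / y)$,
and $\Omega$ chosen randomly as either $0$ or $\pi$:</p>
$$
V_{13}(x, y) = \sqrt{r} \cdot (\cos(\theta / 2 + \Omega),\ \sin(\theta / 2 + \Omega))
$$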
<h3 id="popcorn-variation-17">Popcorn (variation 17)</h3>
<p>Some variations rely on knowing the transform's affine coefficients; they're called "dependent variations."
For this variation, we use $c$ and $f$:</p>
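<p>From the Fractal Flame paper:</p>
$$
V_{17}(x, y) = (x + c \cdot \sin(\tan 3y),\ y + f \cdot \sin(\tan 3x))
$$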
<h3 id="pdj-variation-24">PDJ (variation 24)</h3>
<p>Some variations have extra parameters we can choose; they're called "parametric variations."
For the PDJ variation, there are four extra parameters:</p>
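<p>From the Fractal Flame paper, with parameters $p_1$ through $p_4$:</p>
$$
V_{24}(x, y) = (\sin(p_1 y) - \cos(p_2 x),\ \sin(p_3 x) - \cos(p_4 y))
$$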
<h2 id="blending">Blending</h2>
<p>Now, one variation is fun, but we can also combine variations in a process called "blending."
Each variation receives the same $x$ and $y$ inputs, and we add together each variation's $x$ and $y$ outputs.
We'll also give each variation a weight ($v_{ij}$) that changes how much it contributes to the result:</p>
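<p>From the Fractal Flame paper, a transform's output is the weighted sum of its variations:</p>
$$
F_i(x, y) = \sum_{j} v_{ij} V_j(x, y)
$$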
<h2 id="post-transforms">Post transforms</h2>
<p>Next, we'll introduce a second affine transform applied <em>after</em> variation blending. This is called a "post transform."</p>
<p>We'll use some new variables, but the post transform should look familiar:</p>
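<p>From the Fractal Flame paper, the post transform is another affine transform, written with
Greek letters to distinguish its coefficients:</p>
$$
P_i(x, y) = (\alpha_i x + \beta_i y + \gamma_i,\ \delta_i x + \epsilon_i y + \zeta_i)
$$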
<p>The image below uses the same transforms/variations as the previous fractal flame,
but allows changing the post-transform coefficients:</p>
<details><summary>If you want to test your understanding...</summary><ul>
<li>What post-transform coefficients will give us the previous image?</li>
<li>What post-transform coefficients will give us a <em>mirrored</em> image?</li>
</ul></details>
<h2 id="final-transforms">Final transforms</h2>
<p>The last step is to introduce a "final transform" ($F_{final}$) that is applied
regardless of which regular transform ($F_i$) the chaos game selects.
It's just like a normal transform (composition of affine transform, variation blend, and post transform),
but it doesn't affect the chaos game state.</p>
<p>After adding the final transform, our chaos game algorithm looks like this:</p>
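<p>A sketch of the loop (the helper names are illustrative; the transforms, weighted selection,
and <code>plot()</code> come from earlier in this series):</p>
<pre><code class="language-typescript">type Transform = (x: number, y: number) => [number, number];

// Illustrative declarations; real implementations come from earlier posts
declare const transforms: Transform[];
declare const finalTransform: Transform;
declare function chooseByWeight(transforms: Transform[]): Transform;
declare function plot(x: number, y: number): void;

function chaosGame(iterations: number): void {
  let x = Math.random() * 2 - 1; // random starting point in [-1, 1)
  let y = Math.random() * 2 - 1;

  for (let i = 0; i < iterations; i++) {
    [x, y] = chooseByWeight(transforms)(x, y);

    // The final transform changes what we plot,
    // but not the chaos game state (x, y)
    const [xFinal, yFinal] = finalTransform(x, y);
    if (i >= 20) {
      // skip the first iterations while the point settles into the solution set
      plot(xFinal, yFinal);
    }
  }
}
</code></pre>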
<h2 id="summary">Summary</h2>
<p>Variations are the fractal flame algorithm's first major innovation.
By blending variation functions and post/final transforms, we generate unique images.</p>
<p>However, these images are grainy and unappealing. In the next post, we'll clean up
the image quality and add some color.</p>]]></content:encoded>
</item>
<item>
<title><![CDATA[Playing with fire: The fractal flame algorithm]]></title>
<p>I don't remember when exactly I first learned about fractal flames, but I do remember being entranced by the images they created.
I also remember their unique appeal to my young engineering mind; this was an art form I could participate in.</p>
<p>The <a href="https://flam3.com/flame_draves.pdf" target="_blank" rel="noopener noreferrer">Fractal Flame Algorithm paper</a> describing their structure was too much
for me to handle at the time (I was ~12 years old), so I was content to play around and enjoy the pictures.
But the desire to understand it stuck around. Now, with a graduate degree under my belt, I wanted to revisit it.</p>
<p>This guide is my attempt to explain how fractal flames work so that younger me — and others interested in the art —
can understand without too much prior knowledge.</p>
<hr>
<h2 id="iterated-function-systems">Iterated function systems</h2>
<p>As mentioned, fractal flames are a type of "<a href="https://en.wikipedia.org/wiki/Iterated_function_system" target="_blank" rel="noopener noreferrer">iterated function system</a>,"
or IFS. The formula for an IFS is short, but takes some time to work through:</p>
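<p>From the paper, for a system of $n$ transform functions:</p>
$$
S = \bigcup_{i=0}^{n-1} F_i(S)
$$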
<h3 id="solution-set">Solution set</h3>
<p>First, $S$. $S$ is the set of points in two dimensions (in math terms, $S \in \mathbb{R}^2$)
that represent a "solution" of some kind to our equation.
Our goal is to find all the points in $S$, plot them, and display that image.</p>
<p>For example, if we say $S = \{(0,0), (1,1), (2,2)\}$, there are three points to plot:</p>
<p><em>[Scatter plot in the original post showing the three points $(0,0)$, $(1,1)$, and $(2,2)$.]</em></p>
<p>With fractal flames, rather than listing individual points, we use functions to describe the solution.
This means there are an infinite number of points, but if we find <em>enough</em> points to plot, we get a nice picture.
And if the functions change, the solution also changes, and we get something new.</p>
<h3 id="transform-functions">Transform functions</h3>
<p>Second, the $F_i(S)$ functions, also known as "transforms."
Each transform takes in a 2-dimensional point and gives a new point back.
While you could theoretically use any function, we'll focus on a specific kind of function
called an "<a href="https://en.wikipedia.org/wiki/Affine_transformation" target="_blank" rel="noopener noreferrer">affine transformation</a>." Every transform uses the same formula:</p>
<p>Applying this transform to the original points gives us a new set of points:</p>
<p><em>[Scatter plot in the original post comparing the original points $(x, y)$ with the transformed points $F(x, y)$.]</em></p>
<p>Fractal flames use more complex functions, but they all start with this structure.</p>
<h3 id="fixed-set">Fixed set</h3>
<p>With those definitions in place, let's revisit the initial problem:</p>
<blockquote>
<p>Our solution, $S$, is the union of all sets produced by applying each function, $F_i$,
to points in the solution.</p>
</blockquote>
<p>There's just one small problem: to find the solution, we must already know which points are in the solution.
What?</p>
<p>John E. Hutchinson provides an explanation in the <a href="https://maths-people.anu.edu.au/~john/Assets/Research%20Papers/fractals_self-similarity.pdf" target="_blank" rel="noopener noreferrer">original paper</a>
defining the mathematics of iterated function systems:</p>
<blockquote>
<p>Furthermore, $S$ is compact and is the closure of the set of fixed points $s_{i_1...i_p}$
of finite compositions $F_{i_1...i_p}$ of members of $F$.</p>
</blockquote>
<p>Before your eyes glaze over, let's unpack this:</p>
<ul>
<li><strong>Furthermore, $S$ is <a href="https://en.wikipedia.org/wiki/Compact_space" target="_blank" rel="noopener noreferrer">compact</a>...</strong>: All points in our solution will be in a finite range</li>
<li><strong>...and is the <a href="https://en.wikipedia.org/wiki/Closure_(mathematics)" target="_blank" rel="noopener noreferrer">closure</a> of the set of <a href="https://en.wikipedia.org/wiki/Fixed_point_(mathematics)" target="_blank" rel="noopener noreferrer">fixed points</a></strong>:
Applying our functions to points in the solution will give us other points that are in the solution</li>
<li><strong>...of finite compositions $F_{i_1...i_p}$ of members of $F$</strong>: By composing our functions (that is,
using the output of one function as input to the next), we will arrive at the points in the solution</li>
</ul>
<p>Thus, by applying the functions to fixed points of our system, we will find the other points we care about.</p>
<details class="details_lb9f alert alert--info details_b_Ee" data-collapsed="true"><summary>If you want a bit more math...</summary><div><div class="collapsibleContent_i85q"><p>...then there are some extra details I've glossed over so far.</p><p>First, the Hutchinson paper requires that the functions <em>F<sub>i</sub></em> be <em>contractive</em> for the solution set to exist.
That is, applying the function to a point must bring it closer to other points. However, as the fractal flame
algorithm demonstrates, we only need functions to be contractive <em>on average</em>. At worst, the system will
degenerate and produce a bad image.</p><p>Second, we're focused on ℝ<sup>2</sup> because we're generating images, but the math
allows for arbitrary dimensions; you could also have 3-dimensional fractal flames.</p><p>Finally, there's a close relationship between fractal flames and <a href="https://en.wikipedia.org/wiki/Attractor" target="_blank" rel="noopener noreferrer">attractors</a>.
Specifically, the fixed points of <em>S</em> act as attractors for the chaos game (explained below).</p></div></div></details>
<p>This is still a bit vague, so let's work through an example.</p>
<h2 id="sierpinskis-gasket"><a href="https://www.britannica.com/biography/Waclaw-Sierpinski" target="_blank" rel="noopener noreferrer">Sierpinski's gasket</a></h2>
<p>The Fractal Flame paper gives three functions to use for a first IFS:</p>
<p><em>F</em>₁(<em>x</em>, <em>y</em>) = (<em>x</em>/2, <em>y</em>/2)<br>
<em>F</em>₂(<em>x</em>, <em>y</em>) = ((<em>x</em> + 1)/2, <em>y</em>/2)<br>
<em>F</em>₃(<em>x</em>, <em>y</em>) = (<em>x</em>/2, (<em>y</em> + 1)/2)</p>
<h3 id="the-chaos-game">The chaos game</h3>
<p>Now, how do we find the "fixed points" mentioned earlier? The paper lays out an algorithm called the "<a href="https://en.wikipedia.org/wiki/Chaos_game" target="_blank" rel="noopener noreferrer">chaos game</a>": start at a random point, repeatedly apply a randomly-chosen function to it, and plot where it lands.</p>
<p>Let's turn this into code, one piece at a time.</p>
<p>To start, we need to generate some random numbers. The "bi-unit square" is the range [-1, 1] in both dimensions,
so we need a way to pick random values in that range:</p>
<p>Next, we need to choose a random integer from 0 to <em>n</em> - 1:</p>
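<p>A minimal sketch of both helpers (the names are mine, and <code>Math.random()</code> stands in for whatever randomness source you prefer):</p>
<pre><code class="language-typescript">// Random value in the bi-unit square range [-1, 1)
function randomBiUnit(): number {
  return Math.random() * 2 - 1;
}

// Random integer in [0, max)
function randomInteger(max: number): number {
  return Math.floor(Math.random() * max);
}</code></pre>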
<h3 id="plotting">Plotting</h3>
<p>Finally, we need to implement the <code>plot</code> function. This blog series is interactive,
so everything displays directly in the browser. As an alternative,
software like <code>flam3</code> and Apophysis can "plot" by saving an image to disk.</p>
<p>To see the results, we'll use the <a href="https://developer.mozilla.org/en-US/docs/Web/API/Canvas_API" target="_blank" rel="noopener noreferrer">Canvas API</a>.
This allows us to manipulate individual pixels in an image and show it on screen.</p>
<p>First, we need to convert from fractal flame coordinates to pixel coordinates.
To simplify things, we'll assume that we're plotting a square image
with range [0, 1] for both <em>x</em> and <em>y</em>:</p>
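<p>A sketch of that conversion (the function name is mine, not necessarily the one used in the interactive version):</p>
<pre><code class="language-typescript">// Map a point in [0, 1] x [0, 1] to an index in a square image's pixel array
function imageIndex(x: number, y: number, size: number): number {
  const pixelX = Math.floor(x * size);
  const pixelY = Math.floor(y * size);
  return pixelY * size + pixelX;
}</code></pre>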
<p>Next, we'll store the pixel data in an <ahref="https://developer.mozilla.org/en-US/docs/Web/API/ImageData"target="_blank"rel="noopener noreferrer"><code>ImageData</code> object</a>.
Each pixel on screen has a corresponding index in the <code>data</code> array.
To plot a point, we set that pixel to be black:</p>
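<p>As a sketch (each pixel occupies four consecutive bytes of <code>data</code>, in RGBA order):</p>
<pre><code class="language-typescript">function plot(x: number, y: number, image: ImageData) {
  // Ignore points that fall outside the unit square
  if (x &lt; 0 || x &gt;= 1 || y &lt; 0 || y &gt;= 1) return;

  const index = imageIndex(x, y, image.width) * 4;
  image.data[index] = 0;       // red
  image.data[index + 1] = 0;   // green
  image.data[index + 2] = 0;   // blue
  image.data[index + 3] = 255; // alpha (fully opaque)
}</code></pre>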
<p><small>The image here is slightly different than in the paper.
I think the paper has an error, so I'm plotting the image
like the <a href="https://github.com/scottdraves/flam3/blob/7fb50c82e90e051f00efcc3123d0e06de26594b2/rect.c#L440-L441" target="_blank" rel="noopener noreferrer">reference implementation</a>.</small></p>
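<p>Putting the pieces together, the chaos game loop looks roughly like this (a sketch using the helpers above; the blog's interactive version differs in details):</p>
<pre><code class="language-typescript">type Transform = (x: number, y: number) =&gt; [number, number];

function chaosGame(transforms: Transform[], iterations: number, image: ImageData) {
  // Start at a random point in the bi-unit square
  let [x, y] = [randomBiUnit(), randomBiUnit()];

  for (let i = 0; i &lt; iterations; i++) {
    // Pick one of the transforms at random and apply it
    const transform = transforms[randomInteger(transforms.length)];
    [x, y] = transform(x, y);

    // Skip the first 20 iterations while the point settles into the solution set
    if (i &gt; 20) {
      plot(x, y, image);
    }
  }
}</code></pre>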
<h3 id="weights">Weights</h3>
<p>There's one last step before we finish the introduction. So far, each transform has
the same chance of being picked in the chaos game.
We can change that by giving them a "weight" (<em>w<sub>i</sub></em>) instead:</p>
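<p>A sketch of weighted selection (a simple linear scan, fine for a handful of transforms; the names are mine):</p>
<pre><code class="language-typescript">// Pick a transform with probability proportional to its weight
function randomChoice(choices: [number, Transform][]): Transform {
  const total = choices.reduce((sum, [weight]) =&gt; sum + weight, 0);
  let remaining = Math.random() * total;

  for (const [weight, choice] of choices) {
    remaining -= weight;
    if (remaining &lt;= 0) return choice;
  }

  // Guard against floating-point rounding
  return choices[choices.length - 1][1];
}</code></pre>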
<h2 id="summary">Summary</h2>
<p>Studying the foundations of fractal flames is challenging,
but we now have an understanding of the mathematics
and the implementation of iterated function systems.</p>
<p>In the next post, we'll look at the first innovation of the fractal flame algorithm: variations.</p>]]></content:encoded>
<description><![CDATA[This started because I wanted to build a synthesizer. Setting a goal of "digital DX7" was ambitious, but I needed something unrelated to the day job. Beyond that, working with audio seemed like a good challenge. I enjoy performance-focused code, and performance problems in audio are conspicuous. Building a web project was an obvious choice because of the web audio API documentation and independence from a large Digital Audio Workstation (DAW).]]></description>
<content:encoded><![CDATA[<p>This started because I wanted to build a synthesizer. Setting a goal of "digital DX7" was ambitious, but I needed something unrelated to the day job. Beyond that, working with audio seemed like a good challenge. I enjoy performance-focused code, and performance problems in audio are conspicuous. Building a web project was an obvious choice because of the web audio API documentation and independence from a large Digital Audio Workstation (DAW).</p>
<p>The project was soon derailed trying to sort out technical issues unrelated to the original purpose. Finding a resolution was a frustrating journey, and it's still not clear whether those problems were my fault. As a result, I'm writing this to try making sense of it, as a case study/reference material, and to salvage something from the process.</p>
<h2 id="starting-strong">Starting strong</h2>
<p>The sole starting requirement was to write everything in TypeScript. Not because of project scale, but because guardrails help with unfamiliar territory. Keeping that in mind, the first question was: how does one start a new project? All I actually need is "compile TypeScript, show it in a browser."</p>
<p>Create React App (CRA) came to the rescue and the rest of that evening was a joy. My TypeScript/JavaScript skills were rusty, but the online documentation was helpful. I had never understood the appeal of JSX (why put a DOM in JavaScript?) until it made connecting an <code>onEvent</code> handler and a function easy.</p>
<p>Some quick dimensional analysis later and there was a sine wave oscillator playing A=440 through the speakers. I specifically remember thinking "modern browsers are magical."</p>
<h2 id="continuing-on">Continuing on</h2>
<p>Now comes the first mistake: I began to worry about "scale" before encountering an actual problem. Rather than rendering audio in the main thread, why not use audio worklets and render in a background thread instead?</p>
<p>The first sign something was amiss came from the TypeScript compiler errors showing the audio worklet API <a href="https://github.com/microsoft/TypeScript/issues/28308" target="_blank" rel="noopener noreferrer">was missing</a>. After searching out GitHub issues and (unsuccessfully) tweaking the <code>tsconfig.json</code> settings, I settled on installing a package and moving on.</p>
<p>The next problem came from actually using the API. Worklets must load from separate "modules," but it wasn't clear how to guarantee the worklet code stayed separate from the application. I saw recommendations to use <code>new URL(<local path>, import.meta.url)</code> and it worked! Well, kind of:</p>
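<p>For reference, the recommended pattern looks roughly like this (the file name and processor name are hypothetical):</p>
<pre><code class="language-typescript">async function startAudio() {
  const context = new AudioContext();

  // Bundlers are supposed to recognize the `new URL(..., import.meta.url)`
  // pattern and emit the referenced file as a separate asset
  const workletUrl = new URL("./sine-processor.js", import.meta.url);
  await context.audioWorklet.addModule(workletUrl.toString());

  const node = new AudioWorkletNode(context, "sine-processor");
  node.connect(context.destination);
}</code></pre>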
<p>That file has the audio processor code, so why does it get served with <code>Content-Type: video/mp2t</code>?</p>
<h2 id="floundering-about">Floundering about</h2>
<p>Now comes the second mistake: even though I didn't understand the error, I ignored recommendations to <a href="https://hackernoon.com/implementing-audioworklets-with-react-8a80a470474" target="_blank" rel="noopener noreferrer">just use JavaScript</a> and stuck by the original TypeScript requirement.</p>
<p>I tried different project structures. Moving the worklet code to a new folder didn't help, nor did setting up a monorepo and placing it in a new package.</p>
<p>I tried three different CRA tools - <code>react-app-rewired</code>, <code>craco</code>, <code>customize-cra</code> - but got the same problem. Each has varying levels of compatibility with recent CRA versions, so it wasn't clear if I had the right solution but implemented it incorrectly. After attempting to eject the application and panicking after seeing the configuration, I abandoned that as well.</p>
<p>I tried changing the webpack configuration: using <a href="https://github.com/webpack/webpack/issues/11543#issuecomment-917673256" target="_blank" rel="noopener noreferrer">new</a> <a href="https://github.com/popelenkow/worker-url" target="_blank" rel="noopener noreferrer">loaders</a>, setting <a href="https://github.com/webpack/webpack/discussions/14093#discussioncomment-1257149" target="_blank" rel="noopener noreferrer">asset rules</a>, even <a href="https://github.com/webpack/webpack/issues/11543#issuecomment-826897590" target="_blank" rel="noopener noreferrer">changing how webpack detects worker resources</a>. In hindsight, entry points may have been the answer. But because CRA actively resists attempts to change its webpack configuration, and I couldn't find audio worklet examples in any other framework, I gave up.</p>
<p>I tried so many application frameworks. Next.js looked like a good candidate, but added its own <a href="https://github.com/vercel/next.js/issues/24907" target="_blank" rel="noopener noreferrer">bespoke webpack complexity</a> to the existing confusion. Astro had the best "getting started" experience, but I refuse to install an IDE-specific plugin. I first used Deno while exploring Lume, but it couldn't import the audio worklet types (maybe because of module compatibility?). Each framework was unique in its own way (shout-out to SvelteKit) but I couldn't figure out how to make them work.</p>
<h2 id="learning-and-reflecting">Learning and reflecting</h2>
<p>I ended up using Vite and vite-plugin-react-pages to handle both "build the app" and "bundle worklets," but the specific tool choice isn't important. Instead, the focus should be on lessons learned.</p>
<p>For myself:</p>
<ul>
<li>I'm obsessed with tooling, to the point it can derail the original goal. While it comes from a good place (for example: "types are awesome"), it can get in the way of more important work</li>
<li>I tend to reach for online resources right after seeing a new problem. While finding help online is often faster, spending time understanding the problem would have been more productive than cycling through (often outdated) blog posts</li>
</ul>
<p>For the tools:</p>
<ul>
<li>Resource bundling is great and solves a genuine challenge. I've heard too many horror stories of developers writing modules by hand to believe this is unnecessary complexity</li>
<li>Webpack is a build system and modern frameworks are deeply dependent on it (hence the "webpack industrial complex"). While this often saves users from unnecessary complexity, there's no path forward if something breaks</li>
<li>There's little ability to mix and match tools across frameworks. Next.js and Gatsby let users extend webpack, but because each framework adds its own modules, changes aren't portable. After spending a week looking at webpack, I had an example running with parcel in thirty minutes, but couldn't integrate it</li>
</ul>
<p>In the end, learning new systems is fun, but a focus on tools that "just work" can leave users out in the cold if they break down.</p>]]></content:encoded>
<description><![CDATA[Complaining about the Global Interpreter Lock]]></description>
<content:encoded><![CDATA[<p>Complaining about the <a href="https://wiki.python.org/moin/GlobalInterpreterLock" target="_blank" rel="noopener noreferrer">Global Interpreter Lock</a>
(GIL) seems like a rite of passage for Python developers. It's easy to criticize a design decision
made before multi-core CPUs were widely available, but the fact that it's still around indicates
that it generally works <a href="https://wiki.c2.com/?PrematureOptimization" target="_blank" rel="noopener noreferrer">Good</a>
<a href="https://wiki.c2.com/?YouArentGonnaNeedIt" target="_blank" rel="noopener noreferrer">Enough</a>. Besides, there are simple and effective
workarounds; it's not hard to start a
<a href="https://docs.python.org/3/library/multiprocessing.html" target="_blank" rel="noopener noreferrer">new process</a> and use message passing to
synchronize code running in parallel.</p>
<p>Still, wouldn't it be nice to have more than a single active interpreter thread? In an age of
asynchronicity and <em>M:N</em> threading, Python seems lacking. The ideal scenario is to take advantage of
both Python's productivity and the modern CPU's parallel capabilities.</p>
<p>Presented below are two strategies for releasing the GIL's icy grip without giving up on what makes
Python a nice language to start with. Bear in mind: these are just the tools; no claim is made about
whether it's a good idea to use them. Very often, unlocking the GIL is an
<a href="https://en.wikipedia.org/wiki/XY_problem" target="_blank" rel="noopener noreferrer">XY problem</a>; you want application performance, and the
GIL seems like an obvious bottleneck. Remember that any gains from running code in parallel come at
the expense of project complexity; messing with the GIL is ultimately messing with Python's memory model.</p>
<h2 id="cython">Cython</h2>
<p>Put simply, <a href="https://cython.org/" target="_blank" rel="noopener noreferrer">Cython</a> is a programming language that looks a lot like Python,
gets <a href="https://en.wikipedia.org/wiki/Source-to-source_compiler" target="_blank" rel="noopener noreferrer">transpiled</a> to C/C++, and integrates
well with the <a href="https://en.wikipedia.org/wiki/CPython" target="_blank" rel="noopener noreferrer">CPython</a> API. It's great for building Python
wrappers to C and C++ libraries, writing optimized code for numerical processing, and tons more. And
when it comes to managing the GIL, there are two special features: the <code>nogil</code> function annotation,
and the <code>with nogil</code> context manager. Interacting with Python objects still requires holding
the GIL, so these conditions can be used to synchronize access. Finally, Cython's documentation for
<a href="https://cython.readthedocs.io/en/latest/src/userguide/external_C_code.html#acquiring-and-releasing-the-gil" target="_blank" rel="noopener noreferrer">external C code</a>
contains more detail on how to safely manage the GIL.</p>
<p>To conclude: use Cython's <code>nogil</code> annotation to assert that functions are safe for calling when the
GIL is unlocked, and <code>with nogil</code> to actually unlock the GIL and run those functions.</p>
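<p>A minimal sketch of both features (illustrative only, not the post's original benchmark code):</p>
<pre><code class="language-python"># cython: language_level=3
# `nogil` marks this function as safe to run without the GIL held;
# it can only use C-level types and operations.
cdef unsigned long fibonacci(unsigned long n) nogil:
    cdef unsigned long a = 0, b = 1, tmp, i
    for i in range(n):
        tmp = a + b
        a = b
        b = tmp
    return a

def fibonacci_released(unsigned long n):
    cdef unsigned long result
    # `with nogil` actually unlocks the GIL while the block runs,
    # letting other Python threads execute in parallel.
    with nogil:
        result = fibonacci(n)
    return result</code></pre>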
<h2 id="numba">Numba</h2>
<p>Like Cython, <a href="https://numba.pydata.org/" target="_blank" rel="noopener noreferrer">Numba</a> is a "compiled Python." Where Cython works by
compiling a Python-like language to C/C++, Numba compiles Python bytecode <em>directly to machine code</em>
at runtime. Behavior is controlled with a special <code>@jit</code> decorator; calling a decorated function
first compiles it to machine code before running. Calling the function a second time re-uses that
machine code unless the argument types have changed.</p>
<p>Numba works best when a <code>nopython=True</code> argument is added to the <code>@jit</code> decorator; functions
compiled in <a href="http://numba.pydata.org/numba-doc/latest/user/jit.html?#nopython" target="_blank" rel="noopener noreferrer"><code>nopython</code></a> mode
avoid the CPython API and have performance comparable to C. Further, adding <code>nogil=True</code> to the
<code>@jit</code> decorator unlocks the GIL while that function is running. Note that <code>nogil</code> and <code>nopython</code>
are separate arguments; while it is necessary for code to be compiled in <code>nopython</code> mode in order to
release the lock, the GIL will remain locked if <code>nogil=False</code> (the default).</p>
<p>Let's repeat the same experiment, this time using Numba instead of Cython:</p>
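<p>A sketch of what that looks like (again illustrative, not the original benchmark code):</p>
<pre><code class="language-python">from numba import jit

# nopython=True avoids the CPython API entirely; nogil=True releases
# the GIL while the compiled function runs.
@jit(nopython=True, nogil=True)
def fibonacci(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# The first call triggers compilation; later calls with the same
# argument types re-use the machine code and run GIL-free.
fibonacci(10_000)</code></pre>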
<h2 id="conclusion">Conclusion</h2>
<p>Before finishing, it's important to address pain points that will show up if these techniques are
used in a more realistic project:</p>
<p>First, code running in a GIL-free context will likely also need non-trivial data structures;
GIL-free functions aren't useful if they're constantly interacting with Python objects whose access
requires the GIL. Cython provides
<ahref="http://docs.cython.org/en/latest/src/tutorial/cdef_classes.html"target="_blank"rel="noopener noreferrer">extension types</a> and Numba
provides a <ahref="https://numba.pydata.org/numba-doc/dev/user/jitclass.html"target="_blank"rel="noopener noreferrer"><code>@jitclass</code></a> decorator to
address this need.</p>
<p>Second, building and distributing applications that make use of Cython/Numba can be complicated.
Cython packages require running the compiler, (potentially) linking/packaging external dependencies,
and distributing a binary wheel. Numba is generally simpler because the code being distributed is
pure Python, but can be tricky since errors aren't detected until runtime.</p>
<p>Finally, while unlocking the GIL is often a solution in search of a problem, both Cython and Numba
provide tools to directly manage the GIL when appropriate. This enables true parallelism (not just
<ahref="https://stackoverflow.com/a/1050257"target="_blank"rel="noopener noreferrer">concurrency</a>) that is impossible in vanilla Python.</p>]]></content:encoded>
<description><![CDATA[I've found that in many personal projects,]]></description>
<content:encoded><![CDATA[<p>I've found that in many personal projects,
<ahref="https://en.wikipedia.org/wiki/Analysis_paralysis"target="_blank"rel="noopener noreferrer">analysis paralysis</a> is particularly deadly.
Making good decisions in the beginning avoids pain and suffering later; if extra research prevents
future problems, I'm happy to continue <del>procrastinating</del> researching indefinitely.</p>
<p>So let's say you're in need of a binary serialization format. Data will be going over the network,
not just in memory, so having a schema document and code generation is a must. Performance is
crucial, so formats that support zero-copy de/serialization are given priority. And the more
languages supported, the better; I use Rust, but can't predict what other languages this could
interact with.</p>
<p>Given these requirements, the candidates I could find were:</p>
<ol>
<li><ahref="https://capnproto.org/"target="_blank"rel="noopener noreferrer">Cap'n Proto</a> has been around the longest, and is the most established</li>
<li><ahref="https://google.github.io/flatbuffers/"target="_blank"rel="noopener noreferrer">Flatbuffers</a> is the newest, and claims to have a simpler
encoding</li>
<li><ahref="https://github.com/real-logic/simple-binary-encoding"target="_blank"rel="noopener noreferrer">Simple Binary Encoding</a> has the simplest
encoding, but the Rust implementation is unmaintained</li>
</ol>
<p>Any one of these will satisfy the project requirements: easy to transmit over a network, reasonably
fast, and polyglot support. But how do you actually pick one? It's impossible to know what issues
will follow that choice, so I tend to avoid commitment until the last possible moment.</p>
<p>Still, a choice must be made. Instead of worrying about which is "the best," I decided to build a
small proof-of-concept system in each format and pit them against each other. All code can be found
in the <a href="https://github.com/speice-io/marketdata-shootout" target="_blank" rel="noopener noreferrer">repository</a> for this post.</p>
<p>We'll discuss more in detail, but a quick preview of the results:</p>
<ul>
<li>Cap'n Proto: Theoretically performs incredibly well, but the implementation had issues</li>
<li>Flatbuffers: Has some quirks, but largely lived up to its "zero-copy" promises</li>
<li>SBE: Best median and worst-case performance, but the message structure has a limited feature set</li>
</ul>
<h2 id="prologue-binary-parsing-with-nom">Prologue: Binary Parsing with Nom</h2>
<p>Our benchmark system will be a simple data processor; given depth-of-book market data from
<ahref="https://iextrading.com/trading/market-data/#deep"target="_blank"rel="noopener noreferrer">IEX</a>, serialize each message into the schema
format, read it back, and calculate total size of stock traded and the lowest/highest quoted prices.
This test isn't complex, but is representative of the project I need a binary format for.</p>
<p>But before we make it to that point, we have to actually read in the market data. To do so, I'm
using a library called <a href="https://github.com/Geal/nom" target="_blank" rel="noopener noreferrer"><code>nom</code></a>. Version 5.0 was recently released and
brought some big changes, so this was an opportunity to build a non-trivial program and get
familiar.</p>
<p>If you don't already know about <code>nom</code>, it's a parser combinator library. By combining different smaller
parsers, you can assemble a parser to handle complex structures without writing tedious code by hand.</p>
<p>Ultimately, because the <code>nom</code> code in this shootout was the same for all formats, we're not too
interested in its performance. Still, it's worth mentioning that building the market data parser was
actually fun; I didn't have to write tons of boring code by hand.</p>
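<p>To give a flavor of the combinator style, here's a toy example (not the shootout's actual parser; the "header" layout is made up):</p>
<pre><code class="language-rust">use nom::bytes::complete::take;
use nom::number::complete::{be_u16, be_u64};
use nom::IResult;

// A toy "message header": an 8-byte timestamp, a 2-byte length,
// then `length` bytes of payload. Each step consumes part of the
// input slice and returns the remainder.
fn header(input: &[u8]) -> IResult<&[u8], (u64, &[u8])> {
    let (input, timestamp) = be_u64(input)?;
    let (input, length) = be_u16(input)?;
    let (input, payload) = take(length)(input)?;
    Ok((input, (timestamp, payload)))
}</code></pre>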
<h2 id="capn-proto">Cap'n Proto</h2>
<p>Now it's time to get into the meaty part of the story. Cap'n Proto was the first format I tried
because of how long it has supported Rust (thanks to <a href="https://github.com/dwrensha" target="_blank" rel="noopener noreferrer">dwrensha</a> for
maintaining the Rust port since
<a href="https://github.com/capnproto/capnproto-rust/releases/tag/rustc-0.10" target="_blank" rel="noopener noreferrer">2014!</a>). However, I had a ton
of performance concerns once I started using it.</p>
<p>To serialize new messages, Cap'n Proto uses a "builder" object. This builder allocates memory on the
heap to hold the message content, but because builders
<ahref="https://github.com/capnproto/capnproto-rust/issues/111"target="_blank"rel="noopener noreferrer">can't be re-used</a>, we have to allocate a
new buffer for every single message. I was able to work around this with a
<ahref="https://doc.rust-lang.org/std/mem/fn.transmute.html"target="_blank"rel="noopener noreferrer"><code>std::mem::transmute</code></a> to bypass Rust's borrow
checker.</p>
<p>The process of reading messages was better, but still had issues. Cap'n Proto has two message
encodings: a <a href="https://capnproto.org/encoding.html#packing" target="_blank" rel="noopener noreferrer">"packed"</a> representation, and an
"unpacked" version. When reading "packed" messages, we need a buffer to unpack the message into
before we can use it; Cap'n Proto allocates a new buffer for each message we unpack, and I wasn't
able to figure out a way around that. In contrast, the unpacked message format should be where Cap'n
Proto shines; its main selling point is that there's <a href="https://capnproto.org/" target="_blank" rel="noopener noreferrer">no decoding step</a>.
However, accomplishing zero-copy deserialization required code in the private API
(<ahref="https://github.com/capnproto/capnproto-rust/issues/148"target="_blank"rel="noopener noreferrer">since fixed</a>), and we allocate a vector on
every read for the segment table.</p>
<p>In the end, I put in significant work to make Cap'n Proto as fast as possible, but there were too
many issues for me to feel comfortable using it long-term.</p>
<h2 id="flatbuffers">Flatbuffers</h2>
<p>This is the new kid on the block. After a
<a href="https://github.com/google/flatbuffers/pull/3894" target="_blank" rel="noopener noreferrer">first attempt</a> didn't pan out, official support
was <a href="https://github.com/google/flatbuffers/pull/4898" target="_blank" rel="noopener noreferrer">recently launched</a>. Flatbuffers intends to
address the same problems as Cap'n Proto: high-performance, polyglot, binary messaging. The
difference is that Flatbuffers claims to have a simpler wire format. In practice, I ran into two issues.
First, nested messages were difficult to build up incrementally, so I ended up caching them
in a <code>SmallVec</code> before building the final <code>MultiMessage</code>, but it was a painful process that I
believe contributed to poor serialization performance.</p>
<p>Second, streaming support in Flatbuffers seems to be something of an
<ahref="https://github.com/google/flatbuffers/issues/3898"target="_blank"rel="noopener noreferrer">afterthought</a>. Where Cap'n Proto in Rust handles
reading messages from a stream as part of the API, Flatbuffers just sticks a <code>u32</code> at the front of
each message to indicate the size. Not specifically a problem, but calculating message size without
that tag is nigh on impossible.</p>
<p>Ultimately, I enjoyed using Flatbuffers, and had to do significantly less work to make it perform
well.</p>
<h2 id="simple-binary-encoding">Simple Binary Encoding</h2>
<p>Support for SBE was added by the author of one of my favorite
<a href="https://web.archive.org/web/20190427124806/https://polysync.io/blog/session-types-for-hearty-codecs/" target="_blank" rel="noopener noreferrer">Rust blog posts</a>.
I've <a href="https://speice.io/2019/06/high-performance-systems">talked previously</a> about how important
variance is in high-performance systems, so it was encouraging to read about a format that
<a href="https://github.com/real-logic/simple-binary-encoding/wiki/Why-Low-Latency" target="_blank" rel="noopener noreferrer">directly addressed</a> my
concerns. SBE has by far the simplest binary format, but it does make some tradeoffs.</p>
<p>Both Cap'n Proto and Flatbuffers use <a href="https://capnproto.org/encoding.html#structs" target="_blank" rel="noopener noreferrer">message offsets</a>
to handle variable-length data, <a href="https://capnproto.org/language.html#unions" target="_blank" rel="noopener noreferrer">unions</a>, and various
other features. In contrast, messages in SBE are essentially flat structs, with fields laid out in order.
However, if you don't need union types, and can accept that schemas are XML documents, it's still
worth using. SBE's implementation had the best streaming support of all formats I tested, and
doesn't trigger allocation during de/serialization.</p>
<h2 id="results">Results</h2>
<p>After building a test harness for each
<a href="https://github.com/speice-io/marketdata-shootout/blob/master/src/sbe_runner.rs" target="_blank" rel="noopener noreferrer">format</a>, it was
time to actually take them for a spin. I used
<a href="https://github.com/speice-io/marketdata-shootout/blob/master/run_shootout.sh" target="_blank" rel="noopener noreferrer">this script</a> to run
the benchmarks, and the raw results are
<a href="https://github.com/speice-io/marketdata-shootout/blob/master/shootout.csv" target="_blank" rel="noopener noreferrer">here</a>. All data reported
below is the average of 10 runs on a single day of IEX data. Results were validated to make sure
that each format parsed the data correctly.</p>
<h3 id="serialization">Serialization</h3>
<p>This test measures, on a per-message basis,
how long it takes to serialize the IEX message into the desired format and write to a pre-allocated
buffer.</p>
<table><thead><tr><th style="text-align:left">Schema</th><th style="text-align:left">Median</th><th style="text-align:left">99th Pctl</th><th style="text-align:left">99.9th Pctl</th><th style="text-align:left">Total</th></tr></thead><tbody><tr><td style="text-align:left">Cap'n Proto Packed</td><td style="text-align:left">413ns</td><td style="text-align:left">1751ns</td><td style="text-align:left">2943ns</td><td style="text-align:left">14.80s</td></tr><tr><td style="text-align:left">Cap'n Proto Unpacked</td><td style="text-align:left">273ns</td><td style="text-align:left">1828ns</td><td style="text-align:left">2836ns</td><td style="text-align:left">10.65s</td></tr><tr><td style="text-align:left">Flatbuffers</td><td style="text-align:left">355ns</td><td style="text-align:left">2185ns</td><td style="text-align:left">3497ns</td><td style="text-align:left">14.31s</td></tr><tr><td style="text-align:left">SBE</td><td style="text-align:left">91ns</td><td style="text-align:left">1535ns</td><td style="text-align:left">2423ns</td><td style="text-align:left">3.91s</td></tr></tbody></table>
<h3 id="deserialization">Deserialization</h3>
<p>This test measures, on a per-message basis,
how long it takes to read the previously-serialized message and perform some basic aggregation. The
aggregation code is the same for each format, so any performance differences are due solely to the
format implementation.</p>
<table><thead><tr><th style="text-align:left">Schema</th><th style="text-align:left">Median</th><th style="text-align:left">99th Pctl</th><th style="text-align:left">99.9th Pctl</th><th style="text-align:left">Total</th></tr></thead><tbody><tr><td style="text-align:left">Cap'n Proto Packed</td><td style="text-align:left">539ns</td><td style="text-align:left">1216ns</td><td style="text-align:left">2599ns</td><td style="text-align:left">18.92s</td></tr><tr><td style="text-align:left">Cap'n Proto Unpacked</td><td style="text-align:left">366ns</td><td style="text-align:left">737ns</td><td style="text-align:left">1583ns</td><td style="text-align:left">12.32s</td></tr><tr><td style="text-align:left">Flatbuffers</td><td style="text-align:left">173ns</td><td style="text-align:left">421ns</td><td style="text-align:left">1007ns</td><td style="text-align:left">6.00s</td></tr><tr><td style="text-align:left">SBE</td><td style="text-align:left">116ns</td><td style="text-align:left">286ns</td><td style="text-align:left">659ns</td><td style="text-align:left">4.05s</td></tr></tbody></table>
<h2 id="conclusion">Conclusion</h2>
<p>Building a benchmark turned out to be incredibly helpful in making a decision; because a "union"
type isn't important to me, I can be confident that SBE best addresses my needs.</p>
<p>While SBE was the fastest in terms of both median and worst-case performance, its worst-case
performance was, relative to its median, proportionately far higher than any other format's. It seems that
de/serialization time scales with message size, but I'll need to do some more research to understand
what exactly is going on.</p>]]></content:encoded>
</item>
<item>
<title><![CDATA[On building high performance systems]]></title>
<description><![CDATA[Prior to working in the trading industry, my assumption was that High Frequency Trading (HFT) is]]></description>
<content:encoded><![CDATA[<p>Prior to working in the trading industry, my assumption was that High Frequency Trading (HFT) is
made up of people who have access to secret techniques mortal developers could only dream of. There
had to be some secret art that could only be learned if one had an appropriately tragic backstory.</p>
<p><img alt="Kung Fu fight" src="https://speice.io/assets/images/kung-fu-5715f30eef7bf3aaa26770b1247024dc.webp" width="426" height="240"></p>
<blockquote>
<p>How I assumed HFT people learn their secret techniques</p>
</blockquote>
<p>How else do you explain people working on systems that complete the round trip of market data in to
orders out (a.k.a. tick-to-trade) consistently within
<ahref="https://stackoverflow.com/a/22082528/1454178"target="_blank"rel="noopener noreferrer">750-800 nanoseconds</a>? In roughly the time it takes a
trading systems are capable of reading the market data packets, deciding what orders to send, doing
risk checks, creating new packets for exchange-specific protocols, and putting those packets on the
wire.</p>
<p>Having now worked in the trading industry, I can confirm the developers aren't super-human; I've
made some simple mistakes at the very least. Instead, what shows up in public discussions is that
philosophy, not technique, separates high-performance systems from everything else.
Performance-critical systems don't rely on "this one cool C++ optimization trick" to make code fast
(though micro-optimizations have their place); there's a lot more to worry about than just the code
written for the project.</p>
<p>The framework I'd propose is this: <strong>If you want to build high-performance systems, focus first on
reducing performance variance</strong> (reducing the gap between the fastest and slowest runs of the same
code), <strong>and only look at average latency once variance is at an acceptable level</strong>.</p>
<p>Don't get me wrong, I'm a much happier person when things are fast. Computer goes from booting in 20
seconds down to 10 because I installed a solid-state drive? Awesome. But if every fifth day it takes
a full minute to boot because of corrupted sectors? Not so great. Average speed over the course of a
week is the same in each situation, but you're painfully aware of that minute when it happens. When
it comes to code, the principle is the same: speeding up a function by an average of 10 milliseconds
doesn't mean much if there's a 100ms difference between your fastest and slowest runs. When
performance matters, you need to respond quickly <em>every time</em>, not just in aggregate.
High-performance systems should first optimize for time variance. Once you're consistent at the time
scale you care about, then focus on improving average time.</p>
<p>This focus on variance shows up all the time in industry too (emphasis added in all quotes below):</p>
<ul>
<li>
<p>In <a href="https://business.nasdaq.com/market-tech/marketplaces/trading" target="_blank" rel="noopener noreferrer">marketing materials</a> for
NASDAQ's matching engine, the most performance-sensitive component of the exchange, dependability
is highlighted in addition to instantaneous metrics:</p>
<blockquote>
<p>Able to <strong>consistently sustain</strong> an order rate of over 100,000 orders per second at sub-40
microsecond average latency</p>
</blockquote>
</li>
<li>
<p>The <a href="https://github.com/real-logic/aeron" target="_blank" rel="noopener noreferrer">Aeron</a> message bus has this to say about performance:</p>
<blockquote>
<p>Performance is the key focus. Aeron is designed to be the highest throughput with the lowest and
<strong>most predictable latency possible</strong> of any messaging system</p>
</blockquote>
</li>
<li>
<p>The company PolySync, which is working on autonomous vehicles,
<ahref="https://polysync.io/blog/session-types-for-hearty-codecs/"target="_blank"rel="noopener noreferrer">mentions why</a> they picked their
specific messaging format:</p>
<blockquote>
<p>In general, high performance is almost always desirable for serialization. But in the world of
autonomous vehicles, <strong>steady timing performance is even more important</strong> than peak throughput.
This is because safe operation is sensitive to timing outliers. Nobody wants the system that
decides when to slam on the brakes to occasionally take 100 times longer than usual to encode
its commands.</p>
</blockquote>
</li>
<li>
<p><ahref="https://solarflare.com/"target="_blank"rel="noopener noreferrer">Solarflare</a>, which makes highly-specialized network hardware, points out
<p>The high stakes world of electronic trading, investment banks, market makers, hedge funds and
exchanges demand the <strong>lowest possible latency and jitter</strong> while utilizing the highest
bandwidth and return on their investment.</p>
</blockquote>
</li>
</ul>
<p>And to further clarify: we're not discussing <em>total run-time</em>, but variance of total run-time. There
are situations where it's not reasonably possible to make things faster, and you'd much rather be
consistent. For example, trading firms use
<ahref="https://sniperinmahwah.wordpress.com/2017/06/07/network-effects-part-i/"target="_blank"rel="noopener noreferrer">wireless networks</a> because
the speed of light through air is faster than through fiber-optic cables. There's still at <em>absolute
minimum</em> a <ahref="http://tinyurl.com/y2vd7tn8"target="_blank"rel="noopener noreferrer">~33.76 millisecond</a> delay required to send data between,
say,
<ahref="https://www.theice.com/market-data/connectivity-and-feeds/wireless/tokyo-chicago"target="_blank"rel="noopener noreferrer">Chicago and Tokyo</a>.
If a trading system in Chicago calls the function for "send order to Tokyo" and waits to see if a
trade occurs, there's a physical limit to how long that will take. In this situation, the focus is
on keeping variance of <em>additional processing</em> to a minimum, since speed of light is the limiting
factor.</p>
<p>So how does one go about looking for and eliminating performance variance? To tell the truth, I
don't think a systematic answer or flow-chart exists. There's no substitute for (A) building a deep
understanding of the entire technology stack, and (B) actually measuring system performance (though
(C) watching a lot of <a href="https://www.youtube.com/channel/UCMlGfpWw-RUdWX_JbLCukXg" target="_blank" rel="noopener noreferrer">CppCon</a> videos for
inspiration never hurt). Even then, every project cares about performance to a different degree; you
may need to build an entire
<ahref="https://www.youtube.com/watch?v=NH1Tta7purM&feature=youtu.be&t=3015"target="_blank"rel="noopener noreferrer">replica production system</a> to
accurately benchmark at nanosecond precision, or you may be content to simply
<ahref="https://www.youtube.com/watch?v=BD9cRbxWQx8&feature=youtu.be&t=1335"target="_blank"rel="noopener noreferrer">avoid garbage collection</a> in
your Java code.</p>
<p>Even though everyone has different needs, there are still common things to look for when trying to
isolate and eliminate variance. In no particular order, these are my focus areas when thinking about
high-performance systems:</p>
<p><strong>Update 2019-09-21</strong>: Added notes on <code>isolcpus</code> and <code>systemd</code> affinity.</p>
<h2 id="language-specific">Language-specific</h2>
<p><strong>Garbage Collection</strong>: How often does garbage collection happen? When is it triggered? What are the
impacts?</p>
<ul>
<li><ahref="https://rushter.com/blog/python-garbage-collector/"target="_blank"rel="noopener noreferrer">In Python</a>, individual objects are collected
if the reference count reaches 0, and each generation is collected if
<code>num_alloc - num_dealloc > gc_threshold</code> whenever an allocation happens. The GIL is acquired for
<ahref="https://gperftools.github.io/gperftools/tcmalloc.html"target="_blank"rel="noopener noreferrer">tcmalloc</a>) that might run faster than the
operating system default.</p>
<p><strong>Data Layout</strong>: How your data is arranged in memory matters;
<ahref="https://www.youtube.com/watch?v=yy8jQgmhbAU"target="_blank"rel="noopener noreferrer">data-oriented design</a> and
<ahref="https://www.youtube.com/watch?v=2EWejmkKlxs&feature=youtu.be&t=1185"target="_blank"rel="noopener noreferrer">cache locality</a> can have huge
impacts on performance. The C family of languages (C, value types in C#, C++) and Rust all have
guarantees about the shape every object takes in memory that others (e.g. Java and Python) can't
make. <a href="http://valgrind.org/docs/manual/cg-manual.html" target="_blank" rel="noopener noreferrer">Cachegrind</a> and kernel
<a href="https://perf.wiki.kernel.org/index.php/Main_Page" target="_blank" rel="noopener noreferrer">perf</a> counters are both great for understanding
how performance relates to memory layout.</p>
<p><strong>Just-In-Time Compilation</strong>: Languages that are compiled on the fly (LuaJIT, C#, Java, PyPy) are
great because they optimize your program for how it's actually being used, rather than how a
compiler expects it to be used. However, there's a variance problem if the program stops executing
while waiting for translation from VM bytecode to native code. As a remedy, many languages support
ahead-of-time compilation in addition to the JIT versions
(<ahref="https://github.com/dotnet/corert"target="_blank"rel="noopener noreferrer">CoreRT</a> in C# and <ahref="https://www.graalvm.org/"target="_blank"rel="noopener noreferrer">GraalVM</a> in Java).
which theoretically brings JIT benefits to non-JIT languages. Finally, be careful to avoid comparing
apples and oranges during benchmarks; you don't want your code to suddenly speed up because the JIT
compiler kicked in.</p>
<p><strong>Programming Tricks</strong>: These won't make or break performance, but can be useful in specific
circumstances. For example, C++ can use
<ahref="https://www.youtube.com/watch?v=NH1Tta7purM&feature=youtu.be&t=1206"target="_blank"rel="noopener noreferrer">templates instead of branches</a>
in critical sections.</p>
<h2 id="kernel">Kernel</h2>
<p>Code you wrote is almost certainly not the <em>only</em> code running on your hardware. There are many ways
the operating system interacts with your program, from interrupts to system calls, that are
important to watch for. These are written from a Linux perspective, but Windows does typically have
equivalent functionality.</p>
<p><strong>Scheduling</strong>: The kernel is normally free to schedule any process on any core, so it's important
to reserve CPU cores exclusively for the important programs. There are a few parts to this: first,
limit the CPU cores that non-critical processes are allowed to run on by excluding cores from the
scheduler (using the <code>isolcpus</code> kernel command-line option), or by setting the <code>init</code> process CPU affinity
(<a href="https://access.redhat.com/solutions/2884991" target="_blank" rel="noopener noreferrer"><code>systemd</code> example</a>). Second, set critical processes
to run on the isolated cores by setting the
<ahref="https://en.wikipedia.org/wiki/Processor_affinity"target="_blank"rel="noopener noreferrer">processor affinity</a> using
<ahref="https://linux.die.net/man/1/taskset"target="_blank"rel="noopener noreferrer">taskset</a>. Finally, use
<ahref="https://github.com/torvalds/linux/blob/master/Documentation/timers/NO_HZ.txt"target="_blank"rel="noopener noreferrer"><code>NO_HZ</code></a> or
<ahref="https://linux.die.net/man/1/chrt"target="_blank"rel="noopener noreferrer"><code>chrt</code></a> to disable scheduling interrupts. Turning off
hyper-threading is also likely beneficial.</p>
<p><strong>System calls</strong>: Reading from a UNIX socket? Writing to a file? In addition to not knowing how long
the I/O operation takes, these all trigger expensive
<ahref="https://en.wikipedia.org/wiki/System_call"target="_blank"rel="noopener noreferrer">system calls (syscalls)</a>. To handle these, the CPU must
<ahref="https://en.wikipedia.org/wiki/Context_switch"target="_blank"rel="noopener noreferrer">context switch</a> to the kernel, let the kernel
operation complete, then context switch back to your program. We'd rather keep these
<ahref="https://www.destroyallsoftware.com/talks/the-birth-and-death-of-javascript"target="_blank"rel="noopener noreferrer">to a minimum</a> (see
timestamp 18:20). <ahref="https://linux.die.net/man/1/strace"target="_blank"rel="noopener noreferrer">Strace</a> is your friend for understanding when
and where syscalls happen.</p>
<p><strong>Signal Handling</strong>: Far less likely to be an issue, but signals do trigger a context switch if your
code has a handler registered. This will be highly dependent on the application, but you can
block signals in performance-critical threads if they turn out to be an issue.</p>
<p><strong>Interrupts</strong>: System interrupts are how devices connected to your computer notify the CPU that
something has happened. The CPU will then choose a processor core to pause and context switch to the
OS to handle the interrupt. Make sure that
<ahref="http://www.alexonlinux.com/smp-affinity-and-proper-interrupt-handling-in-linux"target="_blank"rel="noopener noreferrer">SMP affinity</a> is
set so that interrupts are handled on a CPU core not running the program you care about.</p>
<p><strong><a href="https://www.kernel.org/doc/html/latest/vm/numa.html" target="_blank" rel="noopener noreferrer">NUMA</a></strong>: While NUMA is good at making
multi-cell systems transparent, there are variance implications; if the kernel moves a process
across nodes, future memory accesses must wait for the controller on the original node. Use
<a href="https://linux.die.net/man/8/numactl" target="_blank" rel="noopener noreferrer">numactl</a> to handle memory-/cpu-cell pinning so this doesn't
happen.</p>
<h2 id="hardware">Hardware</h2>
<p><strong>CPU Pipelining/Speculation</strong>: Speculative execution in modern processors gave us vulnerabilities
like Spectre, but it also gave us performance improvements like
<ahref="https://stackoverflow.com/a/11227902/1454178"target="_blank"rel="noopener noreferrer">branch prediction</a>. And if the CPU mis-speculates
your code, there's variance associated with rewind and replay. While the compiler knows a lot about
how your CPU <ahref="https://youtu.be/nAbCKa0FzjQ?t=4467"target="_blank"rel="noopener noreferrer">pipelines instructions</a>, code can be
<ahref="https://www.youtube.com/watch?v=NH1Tta7purM&feature=youtu.be&t=755"target="_blank"rel="noopener noreferrer">structured to help</a> the branch
predictor.</p>
<p><strong>Paging</strong>: For most systems, virtual memory is incredible. Applications live in their own worlds,
and the CPU/<a href="https://en.wikipedia.org/wiki/Memory_management_unit" target="_blank" rel="noopener noreferrer">MMU</a> figures out the details.
However, there's a variance penalty associated with memory paging and caching; if you access more
memory pages than the <a href="https://en.wikipedia.org/wiki/Translation_lookaside_buffer" target="_blank" rel="noopener noreferrer">TLB</a> can store,
you'll have to wait for the page walk. Kernel perf tools are necessary to figure out if this is an
issue, but using <a href="https://blog.pythian.com/performance-tuning-hugepages-in-linux/" target="_blank" rel="noopener noreferrer">huge pages</a> can
reduce TLB burdens. Alternately, running applications in a hypervisor like
<a href="https://github.com/siemens/jailhouse" target="_blank" rel="noopener noreferrer">Jailhouse</a> allows one to skip virtual memory entirely, but
this is probably more work than the benefits are worth.</p>
<p><strong>Network Interfaces</strong>: When more than one computer is involved, variance can go up dramatically.
Tuning kernel
<ahref="https://github.com/leandromoreira/linux-network-performance-parameters"target="_blank"rel="noopener noreferrer">network parameters</a> may be
helpful, but modern systems more frequently opt to skip the kernel altogether with a technique
called <ahref="https://blog.cloudflare.com/kernel-bypass/"target="_blank"rel="noopener noreferrer">kernel bypass</a>. This typically requires
specialized hardware and <ahref="https://www.openonload.org/"target="_blank"rel="noopener noreferrer">drivers</a>, but even industries like
<ahref="https://www.bbc.co.uk/rd/blog/2018-04-high-speed-networking-open-source-kernel-bypass"target="_blank"rel="noopener noreferrer">telecom</a> are
finding the benefits.</p>
<h2 id="networks">Networks</h2>
<p><strong>Routing</strong>: There's a reason financial firms are willing to pay
<ahref="https://sniperinmahwah.wordpress.com/2019/03/26/4-les-moeres-english-version/"target="_blank"rel="noopener noreferrer">millions of euros</a>
for rights to a small plot of land - having a straight-line connection from point A to point B means
the path their data takes is the shortest possible. In contrast, there are currently 6 computers in
between me and Google, but that may change at any moment if my ISP realizes a
<ahref="https://en.wikipedia.org/wiki/Border_Gateway_Protocol"target="_blank"rel="noopener noreferrer">more efficient route</a> is available. Whether
designs, switches will begin forwarding data as soon as they know where the destination is,
checksums be damned. This means there's a fixed cost (at the switch) for network traffic, no matter
the size.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="final-thoughts">Final Thoughts<ahref="https://speice.io/2019/06/high-performance-systems#final-thoughts"class="hash-link"aria-label="Direct link to Final Thoughts"title="Direct link to Final Thoughts"></a></h2>
<p>High-performance systems, regardless of industry, are not magical. They do require extreme precision
and attention to detail, but they're designed, built, and operated by regular people, using a lot of
tools that are publicly available. Interested in seeing how context switching affects performance of
your benchmarks? <code>taskset</code> should be installed in all modern Linux distributions, and can be used to
make sure the OS never migrates your process. Curious how often garbage collection triggers during a
crucial operation? Your language of choice will typically expose details of its operations.</p>
<p><imgdecoding="async"loading="lazy"alt="White dough with bubbles"src="https://speice.io/assets/images/white-dough-rising-after-fold-d7a27f12c1d2be572807105d6d7321f3.jpg"width="432"height="428"class="img_ev3q"></p>
<p>After shaping the dough, I've got two loaves ready:</p>
<p>I've been writing a lot more during this break, so I'm looking forward to sharing that in the
future. In the meantime, I'm planning on making a sandwich.</p>]]></content:encoded>
</item>
<item>
<title><![CDATA[Allocations in Rust: Summary]]></title>
<link>https://speice.io/2019/02/summary</link>
<guid>https://speice.io/2019/02/summary</guid>
<pubDate>Sat, 09 Feb 2019 12:00:00 GMT</pubDate>
<description><![CDATA[While there's a lot of interesting detail captured in this series, it's often helpful to have a]]></description>
<content:encoded><![CDATA[<p>While there's a lot of interesting detail captured in this series, it's often helpful to have a
document that answers some "yes/no" questions. You may not care about what an <code>Iterator</code> looks like
in assembly, you just need to know whether it allocates an object on the heap or not. And while Rust
will prioritize the fastest behavior it can, here are the rules for each memory type:</p>
<p><strong>Global Allocation</strong>:</p>
<ul>
<li><code>const</code> is a fixed value; the compiler is allowed to copy it wherever useful.</li>
<li><code>static</code> is a fixed reference; the compiler will guarantee it is unique.</li>
</ul>
<p><strong>Stack Allocation</strong>:</p>
<ul>
<li>Everything not using a smart pointer will be allocated on the stack.</li>
<li>Structs, enums, iterators, arrays, and closures are all stack allocated.</li>
<li>Cell types (<code>RefCell</code>) behave like smart pointers, but are stack-allocated.</li>
<li>Inlining (<code>#[inline]</code>) will not affect allocation behavior for better or worse.</li>
<li>Types that are marked <code>Copy</code> are guaranteed to have their contents stack-allocated.</li>
</ul>
<p><strong>Heap Allocation</strong>:</p>
<ul>
<li>Smart pointers (<code>Box</code>, <code>Rc</code>, <code>Mutex</code>, etc.) allocate their contents in heap memory.</li>
<li>Collections (<code>HashMap</code>, <code>Vec</code>, <code>String</code>, etc.) allocate their contents in heap memory.</li>
<li>Some smart pointers in the standard library have counterparts in other crates that don't need heap
memory. If possible, use those.</li>
</ul>
<p><imgdecoding="async"loading="lazy"alt="Container Sizes in Rust"src="https://speice.io/assets/images/container-size-7fd54cbb2391e3e7310b0424c5f92cc1.svg"width="960"height="540"class="img_ev3q"></p>
<description><![CDATA[A lot. The answer is a lot.]]></description>
<content:encoded><![CDATA[<p>Up to this point, we've been discussing memory usage in the Rust language by focusing on simple
rules that are mostly right for small chunks of code. We've spent time showing how those rules work
themselves out in practice, and become familiar with reading the assembly code needed to see each
memory type (global, stack, heap) in action.</p>
<p>Throughout the series so far, we've put a handicap on the code. In the name of consistent and
understandable results, we've asked the compiler to pretty please leave the training wheels on. Now
is the time where we throw out all the rules and take off the kid gloves. As it turns out, both the
Rust compiler and the LLVM optimizers are incredibly sophisticated, and we'll step back and let them
do their job.</p>
<p>Similar to
<ahref="https://www.youtube.com/watch?v=bSkpMdDe4g4"target="_blank"rel="noopener noreferrer">"What Has My Compiler Done For Me Lately?"</a>, we're
focusing on interesting things the Rust language (and LLVM!) can do with memory management. We'll
still be looking at assembly code to understand what's going on, but it's important to mention
again: <strong>please use automated tools like <ahref="https://crates.io/crates/alloc_counter"target="_blank"rel="noopener noreferrer">alloc-counter</a> to
double-check memory behavior if it's something you care about</strong>. It's far too easy to misread
assembly in large code sections; you should always verify behavior if you care about memory usage.</p>
<p>The guiding principle as we move forward is this: <em>optimizing compilers won't produce worse programs
than we started with.</em> There won't be any situations where stack allocations get moved to heap
allocations. There will, however, be an opera of optimization.</p>
<p><strong>Update 2019-02-10</strong>: When debugging a
<ahref="https://gitlab.com/sio4/code/alloc-counter/issues/1"target="_blank"rel="noopener noreferrer">related issue</a>, it was discovered that the
original code worked because LLVM optimized out the entire function, rather than just the allocation
segments. The code has been updated with proper use of
<ahref="https://doc.rust-lang.org/std/ptr/fn.read_volatile.html"target="_blank"rel="noopener noreferrer"><code>read_volatile</code></a>, and a previous section
on vector capacity has been removed.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="the-case-of-the-disappearing-box">The Case of the Disappearing Box<ahref="https://speice.io/2019/02/08/compiler-optimizations#the-case-of-the-disappearing-box"class="hash-link"aria-label="Direct link to The Case of the Disappearing Box"title="Direct link to The Case of the Disappearing Box"></a></h2>
<p>Our first optimization comes when LLVM can reason that the lifetime of an object is sufficiently
short that heap allocations aren't necessary. In these cases, LLVM will move the allocation to the
stack instead! The way this interacts with <code>#[inline]</code> attributes is a bit opaque, but the important
part is that LLVM can sometimes do better than the baseline Rust language:</p>
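<p>The post's original snippet isn't preserved in this feed; as a minimal sketch (function name illustrative), code along these lines lets LLVM elide the allocation in release builds:</p>
<pre><code class="language-rust">pub fn short_lived() -> u8 {
    // `Box::new` would normally call the allocator, but the box
    // never escapes this function, so LLVM can prove the heap
    // allocation is unnecessary and keep the value in a register.
    let x = Box::new(24u8);
    *x + 1
}
</code></pre>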
<h2class="anchor anchorWithStickyNavbar_LWe7"id="dr-array-or-how-i-learned-to-love-the-optimizer">Dr. Array or: how I learned to love the optimizer<ahref="https://speice.io/2019/02/08/compiler-optimizations#dr-array-or-how-i-learned-to-love-the-optimizer"class="hash-link"aria-label="Direct link to Dr. Array or: how I learned to love the optimizer"title="Direct link to Dr. Array or: how I learned to love the optimizer"></a></h2>
<p>Finally, this isn't so much about LLVM figuring out different memory behavior, but LLVM stripping
out code that doesn't do anything. Optimizations of this type have a lot of nuance to them; if
you're not careful, they can make your benchmarks look
<ahref="https://www.youtube.com/watch?v=nXaxk27zwlk&feature=youtu.be&t=1199"target="_blank"rel="noopener noreferrer">impossibly good</a>. In Rust, the
<code>black_box</code> function (implemented in both
<ahref="https://doc.rust-lang.org/1.1.0/test/fn.black_box.html"target="_blank"rel="noopener noreferrer"><code>libtest</code></a> and
<ahref="https://docs.rs/criterion/0.2.10/criterion/fn.black_box.html"target="_blank"rel="noopener noreferrer"><code>criterion</code></a>) will tell the compiler
to disable this kind of optimization. But if you let LLVM remove unnecessary code, you can end up
running programs that previously caused errors:</p>
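<p>As a hedged sketch of the idea (not the post's original example): in a debug build, the array below overflows the stack, but in a release build LLVM can see it is never used and strip it out, so the program runs without error:</p>
<pre><code class="language-rust">fn main() {
    // 16 MiB exceeds the default 8 MiB main-thread stack on
    // most platforms; only dead-code elimination saves us here.
    let _x = [0u8; 16 * 1024 * 1024];
}
</code></pre>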
<description><![CDATA[Managing dynamic memory is hard. Some languages assume users will do it themselves (C, C++), and]]></description>
<content:encoded><![CDATA[<p>Managing dynamic memory is hard. Some languages assume users will do it themselves (C, C++), and
some languages go to extreme lengths to protect users from themselves (Java, Python). In Rust, how
the language uses dynamic memory (also referred to as the <strong>heap</strong>) is a system called <em>ownership</em>.
And as the docs mention, ownership
<ahref="https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html"target="_blank"rel="noopener noreferrer">is Rust's most unique feature</a>.</p>
<p>The heap is used in two situations: when the compiler is unable to predict either the <em>total size of
memory needed</em> or <em>how long the memory is needed for</em>, it allocates space in the heap.</p>
<p>This happens
pretty frequently; if you want to download the Google home page, you won't know how large it is
until your program runs. And when you're finished with Google, you deallocate the memory so it can be
used to store other webpages. If you're interested in a slightly longer explanation of the heap,
check out
<ahref="https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html#the-stack-and-the-heap"target="_blank"rel="noopener noreferrer">The Stack and the Heap</a>
in Rust's documentation.</p>
<p>We won't go into detail on how the heap is managed; the
<ahref="https://doc.rust-lang.org/book/ch04-01-what-is-ownership.html"target="_blank"rel="noopener noreferrer">ownership documentation</a> does a
phenomenal job explaining both the "why" and "how" of memory management. Instead, we're going to
focus on understanding "when" heap allocations occur in Rust.</p>
<p>To start off, take a guess for how many allocations happen in the program below:</p>
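<p>The program itself was omitted from this feed; presumably it was as trivial as an empty <code>main</code> (a reconstruction, not the original):</p>
<pre><code class="language-rust">fn main() {}
</code></pre>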
<p>As of the time of writing, there are five allocations that happen before <code>main</code> is ever called.</p>
<p>But when we want to understand more practically where heap allocation happens, we'll follow this
guide:</p>
<ul>
<li>Smart pointers hold their contents in the heap</li>
<li>Collections are smart pointers for many objects at a time, and reallocate when they need to grow</li>
</ul>
<p>Finally, there are two "addendum" issues that are important to address when discussing Rust and the
heap:</p>
<ul>
<li>Non-heap alternatives to many standard library types are available.</li>
<li>Special allocators to track memory behavior should be used to benchmark code.</li>
</ul>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="smart-pointers">Smart pointers<ahref="https://speice.io/2019/02/a-heaping-helping#smart-pointers"class="hash-link"aria-label="Direct link to Smart pointers"title="Direct link to Smart pointers"></a></h2>
<p>The first thing to note are the "smart pointer" types. When you have data that must outlive the
scope in which it is declared, or your data is of unknown or dynamic size, you'll make use of these
types.</p>
<p>The term <ahref="https://en.wikipedia.org/wiki/Smart_pointer"target="_blank"rel="noopener noreferrer">smart pointer</a> comes from C++, and while it's
closely linked to a general design pattern of
<ahref="https://en.cppreference.com/w/cpp/language/raii"target="_blank"rel="noopener noreferrer">"Resource Acquisition Is Initialization"</a>, we'll
use it here specifically to describe objects that are responsible for managing ownership of data
allocated on the heap. The smart pointers available in the <code>alloc</code> crate should look mostly familiar: <code>Box</code>, <code>Rc</code>, <code>Arc</code>, and <code>Cow</code>.</p>
<p>Finally, there is one <ahref="https://www.merriam-webster.com/dictionary/gotcha"target="_blank"rel="noopener noreferrer">"gotcha"</a>: <strong>cell types</strong>
(like <ahref="https://doc.rust-lang.org/stable/core/cell/struct.RefCell.html"target="_blank"rel="noopener noreferrer"><code>RefCell</code></a>) look and behave
similarly, but <strong>don't involve heap allocation</strong>. The
<ahref="https://doc.rust-lang.org/stable/core/cell/index.html"target="_blank"rel="noopener noreferrer"><code>core::cell</code> docs</a> have more information.</p>
<p>When a smart pointer is created, the data it is given is placed in heap memory and the location of
that data is recorded in the smart pointer. Once the smart pointer has determined it's safe to
deallocate that memory (when a <code>Box</code> has
<ahref="https://doc.rust-lang.org/stable/std/boxed/index.html"target="_blank"rel="noopener noreferrer">gone out of scope</a> or a reference count
<ahref="https://doc.rust-lang.org/alloc/rc/index.html"target="_blank"rel="noopener noreferrer">goes to zero</a>), the heap space is reclaimed. We can
prove these types use heap memory by looking at code:</p>
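<p>The original assembly listing isn't preserved here, but a minimal sketch of the kind of code in question:</p>
<pre><code class="language-rust">pub fn boxed() -> Box<u8> {
    // Compiling this produces a call into the global allocator
    // (visible as `__rust_alloc` in the assembly) rather than a
    // simple stack-pointer adjustment.
    Box::new(0)
}
</code></pre>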
<h2class="anchor anchorWithStickyNavbar_LWe7"id="collections">Collections<ahref="https://speice.io/2019/02/a-heaping-helping#collections"class="hash-link"aria-label="Direct link to Collections"title="Direct link to Collections"></a></h2>
<p>Collection types use heap memory because their contents have dynamic size; they will request more
memory <ahref="https://doc.rust-lang.org/std/vec/struct.Vec.html#method.reserve"target="_blank"rel="noopener noreferrer">when needed</a>, and can
<ahref="https://doc.rust-lang.org/std/vec/struct.Vec.html#method.shrink_to_fit"target="_blank"rel="noopener noreferrer">release memory</a> when it's
no longer necessary. This dynamic property forces Rust to heap allocate everything they contain. In
a way, <strong>collections are smart pointers for many objects at a time</strong>. Common types that fall under
this umbrella are <ahref="https://doc.rust-lang.org/stable/alloc/vec/struct.Vec.html"target="_blank"rel="noopener noreferrer"><code>Vec</code></a>,
<ahref="https://doc.rust-lang.org/stable/std/collections/struct.HashMap.html"target="_blank"rel="noopener noreferrer"><code>HashMap</code></a>, and
and <ahref="https://doc.rust-lang.org/std/string/struct.String.html#method.new"target="_blank"rel="noopener noreferrer"><code>String::new()</code></a>.</p>
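<p>A small sketch of that reallocation behavior (capacity values illustrative):</p>
<pre><code class="language-rust">fn main() {
    // One heap allocation up front...
    let mut v: Vec<u8> = Vec::with_capacity(1);
    v.push(1);
    // ...and a second when we outgrow the capacity: the contents
    // are copied into a larger heap buffer behind the scenes.
    v.push(2);
}
</code></pre>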
<h2class="anchor anchorWithStickyNavbar_LWe7"id="heap-alternatives">Heap Alternatives<ahref="https://speice.io/2019/02/a-heaping-helping#heap-alternatives"class="hash-link"aria-label="Direct link to Heap Alternatives"title="Direct link to Heap Alternatives"></a></h2>
<p>While it is a bit strange to speak of the stack after spending time with the heap, it's worth
pointing out that some heap-allocated objects in Rust have stack-based counterparts provided by
other crates. If you have need of the functionality, but want to avoid allocating, there are
typically alternatives available.</p>
<p>When it comes to some standard library smart pointers
(<ahref="https://doc.rust-lang.org/std/sync/struct.RwLock.html"target="_blank"rel="noopener noreferrer"><code>RwLock</code></a> and
<ahref="https://doc.rust-lang.org/std/sync/struct.Mutex.html"target="_blank"rel="noopener noreferrer"><code>Mutex</code></a>), stack-based alternatives are
provided in crates like <ahref="https://crates.io/crates/parking_lot"target="_blank"rel="noopener noreferrer">parking_lot</a> and
<ahref="https://crates.io/crates/spin"target="_blank"rel="noopener noreferrer">spin</a>. You can check out
<ahref="https://docs.rs/lock_api/0.1.5/lock_api/struct.Mutex.html"target="_blank"rel="noopener noreferrer"><code>lock_api::Mutex</code></a>, and
<ahref="https://mvdnes.github.io/rust-docs/spin-rs/spin/struct.Once.html"target="_blank"rel="noopener noreferrer"><code>spin::Once</code></a> if you're in need
of synchronization primitives.</p>
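<p>For example, assuming the <code>spin</code> crate is added to <code>Cargo.toml</code>, a spinlock-based mutex can live entirely outside the heap:</p>
<pre><code class="language-rust">use spin::Mutex;

// No heap allocation: the lock and its contents are stored inline.
static COUNT: Mutex<u32> = Mutex::new(0);

fn main() {
    *COUNT.lock() += 1;
    println!("{}", *COUNT.lock());
}
</code></pre>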
<p><ahref="https://crates.io/crates/thread-id"target="_blank"rel="noopener noreferrer">thread_id</a> may be necessary if you're implementing an allocator
because <ahref="https://doc.rust-lang.org/std/thread/struct.ThreadId.html"target="_blank"rel="noopener noreferrer"><code>thread::current().id()</code></a> uses a
<code>thread_local!</code> structure that needs heap allocation.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="tracing-allocators">Tracing Allocators<ahref="https://speice.io/2019/02/a-heaping-helping#tracing-allocators"class="hash-link"aria-label="Direct link to Tracing Allocators"title="Direct link to Tracing Allocators"></a></h2>
<p>When writing performance-sensitive code, there's no alternative to measuring your code. If you
didn't write a benchmark,
<ahref="https://www.youtube.com/watch?v=2EWejmkKlxs&feature=youtu.be&t=263"target="_blank"rel="noopener noreferrer">you don't care about it's performance</a>
You should never rely on your instincts when
<ahref="https://www.youtube.com/watch?v=NH1Tta7purM"target="_blank"rel="noopener noreferrer">a microsecond is an eternity</a>.</p>
<p>Similarly, there's great work going on in Rust with allocators that keep track of what they're doing
(like <ahref="https://crates.io/crates/alloc_counter"target="_blank"rel="noopener noreferrer"><code>alloc_counter</code></a>). When it comes to tracking heap
behavior, it's easy to make mistakes; please write tests and make sure you have tools to guard
against future issues.</p>]]></content:encoded>
</item>
<item>
<title><![CDATA[Allocations in Rust: Fixed memory]]></title>
<description><![CDATA[const and static are perfectly fine, but it's relatively rare that we know at compile-time about]]></description>
<content:encoded><![CDATA[<p><code>const</code> and <code>static</code> are perfectly fine, but it's relatively rare that we know at compile-time about
either values or references that will be the same for the duration of our program. Put another way,
it's not often the case that either you or your compiler knows how much memory your entire program
will ever need.</p>
<p>However, there are still some optimizations the compiler can do if it knows how much memory
individual functions will need. Specifically, the compiler can make use of "stack" memory (as
opposed to "heap" memory) which can be managed far faster in both the short- and long-term.</p>
<p>When requesting memory, the <ahref="http://www.cs.virginia.edu/~evans/cs216/guides/x86.html"target="_blank"rel="noopener noreferrer"><code>push</code> instruction</a>
can typically complete in <ahref="https://agner.org/optimize/instruction_tables.ods"target="_blank"rel="noopener noreferrer">1 or 2 cycles</a> (<1ns
on modern CPUs). Contrast that to heap memory which requires an allocator (specialized
software to track what memory is in use) to reserve space. When you're finished with stack memory,
the <code>pop</code> instruction runs in 1-3 cycles, as opposed to an allocator needing to worry about memory
fragmentation and other issues with the heap. All sorts of incredibly sophisticated techniques have
been used to make allocators fast.</p>
<p>But no matter how fast your allocator is, the principle remains: the fastest allocator is the one
you never use. As such, we're not going to discuss how exactly the
<ahref="http://www.cs.virginia.edu/~evans/cs216/guides/x86.html"target="_blank"rel="noopener noreferrer"><code>push</code> and <code>pop</code> instructions work</a>, but
we'll focus instead on the conditions that enable the Rust compiler to use faster stack-based
allocation for variables.</p>
<p>So, <strong>how do we know when Rust will or will not use stack allocation for objects we create?</strong>
Looking at other languages, it's often easy to delineate between stack and heap. Managed memory
languages (Python, Java,
<ahref="https://blogs.msdn.microsoft.com/ericlippert/2010/09/30/the-truth-about-value-types/"target="_blank"rel="noopener noreferrer">C#</a>) place
everything on the heap. JIT compilers (<ahref="https://www.pypy.org/"target="_blank"rel="noopener noreferrer">PyPy</a>,
<ahref="https://www.oracle.com/technetwork/java/javase/tech/index-jsp-136373.html"target="_blank"rel="noopener noreferrer">HotSpot</a>) may optimize
some heap allocations away, but you should never assume it will happen. C makes things clear with
calls to special functions (like <ahref="https://linux.die.net/man/3/malloc"target="_blank"rel="noopener noreferrer">malloc(3)</a>) needed to access
heap memory. Old C++ has the <ahref="https://stackoverflow.com/a/655086/1454178"target="_blank"rel="noopener noreferrer"><code>new</code></a> keyword, though
modern C++/C++11 is more complicated with <ahref="https://en.cppreference.com/w/cpp/language/raii"target="_blank"rel="noopener noreferrer">RAII</a>.</p>
<p>For Rust, we can summarize as follows: <strong>stack allocation will be used for everything that doesn't
involve "smart pointers" and collections</strong>. We'll skip over a precise definition of the term "smart
pointer" for now, and instead discuss what we should watch for to understand when stack and heap
memory regions are used:</p>
<ol>
<li>
<p>Stack manipulation instructions (<code>push</code>, <code>pop</code>, and <code>add</code>/<code>sub</code> of the <code>rsp</code> register) indicate allocation of stack memory.</p>
<p>Tracking when exactly heap allocation calls occur is difficult. It's typically easier to watch
for <code>call core::ptr::real_drop_in_place</code>, and infer that a heap allocation happened in the recent
past:</p>
<divclass="language-rust codeBlockContainer_Ckt0 theme-code-block"style="--prism-background-color:hsl(230, 1%, 98%);--prism-color:hsl(230, 8%, 24%)"><divclass="codeBlockContent_biex"><pretabindex="0"class="prism-code language-rust codeBlock_bY9V thin-scrollbar"style="background-color:hsl(230, 1%, 98%);color:hsl(230, 8%, 24%)"><codeclass="codeBlockLines_e6Vv"><spanclass="token-line"style="color:hsl(230, 8%, 24%)"><spanclass="token keyword"style="color:hsl(301, 63%, 40%)">pub</span><spanclass="token plain"></span><spanclass="token keyword"style="color:hsl(301, 63%, 40%)">fn</span><spanclass="token plain"></span><spanclass="token function-definition function"style="color:hsl(221, 87%, 60%)">heap_alloc</span><spanclass="token punctuation"style="color:hsl(119, 34%, 47%)">(</span><spanclass="token plain">x</span><spanclass="token punctuation"style="color:hsl(119, 34%, 47%)">:</span><spanclass="token plain"></span><spanclass="token keyword"style="color:hsl(301, 63%, 40%)">usize</span><spanclass="token punctuation"style="color:hsl(119, 34%, 47%)">)</span><spanclass="token plain"></span><spanclass="token punctuation"style="color:hsl(119, 34%, 47%)">-></span><spanclass="token plain"></span><spanclass="token keyword"style="color:hsl(301, 63%, 40%)">usize</span><spanclass="token plain"></span><spanclass="token punctuation"style="color:hsl(119, 34%, 47%)">{</span><spanclass="token plain"></span><br></span><spanclass="token-line"style="color:hsl(230, 8%, 24%)"><spanclass="token plain"></span><spanclass="token comment"style="color:hsl(230, 4%, 64%)">// Space for elements in a vector has to be allocated</span><spanclass="token plain"></span><br></span><spanclass="token-line"style="color:hsl(230, 8%, 24%)"><spanclass="token plain"></span><spanclass="token comment"style="color:hsl(230, 4%, 64%)">// on the heap, and is then de-allocated once the</span><spanclass="token plain"></span><br></span><spanclass="token-line"style="color:hsl(230, 8%, 24%)"><spanclass="token plain"></span><spanclass="token comment"style="color:hsl(230, 4%, 64%)">// vector goes out of scope</span><spanclass="token plain"></span><br></span><spanclass="token-line"style="color:hsl(230, 8%, 24%)"><spanclass="token plain"></span><spanclass="token keyword"style="color:hsl(301, 63%, 40%)">let</span><spanclass="token plain"> y</span><spanclass="token punctuation"style="color:hsl(119, 34%, 47%)">:</span><spanclass="token plain"></span><spanclass="token class-name"style="color:hsl(35, 99%, 36%)">Vec</span><spanclass="token operator"style="color:hsl(221, 87%, 60%)"><</span><spanclass="token keyword"style="color:hsl(301, 63%, 40%)">u8</span><spanclass="token operator"style="color:hsl(221, 87%, 60%)">></span><spanclass="token plain"></span><spanclass="token operator"style="color:hsl(221, 87%, 60%)">=</span><spanclass="token plain"></span><spanclass="token class-name"style="color:hsl(35, 99%, 36%)">Vec</span><spanclass="token punctuation"style="color:hsl(119, 34%, 47%)">::</span><spanclass="token function"style="color:hsl(221, 87%, 60%)">with_capacity</span><spanclass="token punctuation"style="color:hsl(119, 34%, 47%)">(</span><spanclass="token plain">x</span><spanclass="token punctuation"style="color:hsl(119, 34%, 47%)">)</span><spanclass="token punctuation"style="color:hsl(119, 34%, 47%)">;</span><spanclass="token plain"></span><br></span><spanclass="token-line"style="color:hsl(230, 8%, 24%)"><spanclass="token plain"> x</span><br></span><spanclass="token-line"style="color:hsl(230, 8%, 24%)"><spanclass="token plain"></span><spanclass="token 
punctuation"style="color:hsl(119, 34%, 47%)">}</span><br></span></code></pre><divclass="buttonGroup__atx"><buttontype="button"aria-label="Copy code to clipboard"title="Copy"class="clean-btn"><spanclass="copyButtonIcons_eSgA"aria-hidden="true"><svgviewBox="0 0 24 24"class="copyButtonIcon_y97N"><pathfill="currentColor"d="M19,21H8V7H19M19,5H8A2,200,06,7V21A2,200,08,23H19A2,200,021,21V7A2,200
<p>-- <ahref="https://godbolt.org/z/epfgoQ"target="_blank"rel="noopener noreferrer">Compiler Explorer</a> (<code>real_drop_in_place</code> happens on line 1317)
<small>Note: While the
<ahref="https://doc.rust-lang.org/std/ops/trait.Drop.html"target="_blank"rel="noopener noreferrer"><code>Drop</code> trait</a> is
<ahref="https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=87edf374d8983816eb3d8cfeac657b46"target="_blank"rel="noopener noreferrer">called for stack-allocated objects</a>,
the Rust standard library only defines <code>Drop</code> implementations for types that involve heap
allocation.</small></p>
</li>
<li>
<p>If you don't want to inspect the assembly, use a custom allocator that's able to track and alert
when heap allocations occur. Crates like
<ahref="https://crates.io/crates/alloc_counter"target="_blank"rel="noopener noreferrer"><code>alloc_counter</code></a> are designed for exactly this purpose.</p>
</li>
</ol>
<p>With all that in mind, let's talk about situations in which we're guaranteed to use stack memory:</p>
<ul>
<li>Structs are created on the stack.</li>
<li>Function arguments are passed on the stack, meaning the
<ahref="https://doc.rust-lang.org/reference/attributes.html#inline-attribute"target="_blank"rel="noopener noreferrer"><code>#[inline]</code> attribute</a> will
not change the memory region used.</li>
<li>Enums and unions are stack-allocated.</li>
<li><ahref="https://doc.rust-lang.org/std/primitive.array.html"target="_blank"rel="noopener noreferrer">Arrays</a> are always stack-allocated.</li>
<li>Closures capture their arguments on the stack.</li>
<li>Generics will use stack allocation, even with dynamic dispatch.</li>
<li><ahref="https://doc.rust-lang.org/std/marker/trait.Copy.html"target="_blank"rel="noopener noreferrer"><code>Copy</code></a> types are guaranteed to be
stack-allocated, and copying them will be done in stack memory.</li>
<li><ahref="https://doc.rust-lang.org/std/iter/trait.Iterator.html"target="_blank"rel="noopener noreferrer"><code>Iterator</code>s</a> in the standard library are
stack-allocated even when iterating over heap-based collections.</li>
</ul>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="structs">Structs<ahref="https://speice.io/2019/02/stacking-up#structs"class="hash-link"aria-label="Direct link to Structs"title="Direct link to Structs"></a></h2>
<p>The simplest case comes first. When creating vanilla <code>struct</code> objects, we use stack memory to hold their contents:</p>
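<p>The post's example code is omitted in this feed; a minimal sketch of a "vanilla struct" (field layout illustrative):</p>
<pre><code class="language-rust">pub struct Point {
    x: u64,
    y: u64,
}

pub fn make_point() -> Point {
    // Constructing the struct shows up in assembly as stack
    // manipulation (e.g. `sub rsp, ...`), not allocator calls.
    Point { x: 1, y: 2 }
}
</code></pre>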
<p>Note that while some extra-fancy instructions are used for memory manipulation in the assembly, the
<code>sub rsp, 64</code> instruction indicates we're still working with the stack.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="function-arguments">Function arguments<ahref="https://speice.io/2019/02/stacking-up#function-arguments"class="hash-link"aria-label="Direct link to Function arguments"title="Direct link to Function arguments"></a></h2>
<p>Have you ever wondered how functions communicate with each other? Like, once the variables are given
to you, everything's fine. But how do you "give" those variables to another function? How do you get
the results back afterward? The answer: the compiler arranges memory and assembly instructions using
a pre-determined <ahref="http://llvm.org/docs/LangRef.html#calling-conventions"target="_blank"rel="noopener noreferrer">calling convention</a>. This
convention governs the rules around where arguments needed by a function will be located (either in
memory offsets relative to the stack pointer <code>rsp</code>, or in other registers), and where the results
can be found once the function has finished. And when multiple languages agree on what the calling
conventions are, you can do things like having <ahref="https://blog.filippo.io/rustgo/"target="_blank"rel="noopener noreferrer">Go call Rust code</a>!</p>
<p>Put simply: it's the compiler's job to figure out how to call other functions, and you can assume
that the compiler is good at its job.</p>
<p>We can see this in action using a simple example:</p>
<ahref="https://doc.rust-lang.org/std/marker/trait.Copy.html"target="_blank"rel="noopener noreferrer"><code>Copy</code></a>) and passing by reference (either
moving ownership or passing a pointer) may have slightly different layouts in assembly, but will
still use either stack memory or CPU registers:</p>
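<p>A minimal sketch of both calling styles (names illustrative, not the post's original example):</p>
<pre><code class="language-rust">#[derive(Clone, Copy)]
pub struct Pair {
    a: u64,
    b: u64,
}

pub fn by_value(p: Pair) -> u64 {
    p.a + p.b
}

pub fn by_reference(p: &Pair) -> u64 {
    p.a + p.b
}

fn main() {
    let p = Pair { a: 1, b: 2 };
    // Both calls communicate through registers and stack memory only.
    println!("{} {}", by_value(p), by_reference(&p));
}
</code></pre>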
<h2class="anchor anchorWithStickyNavbar_LWe7"id="enums">Enums<ahref="https://speice.io/2019/02/stacking-up#enums"class="hash-link"aria-label="Direct link to Enums"title="Direct link to Enums"></a></h2>
<p>If you've ever worried that wrapping your types in
<ahref="https://doc.rust-lang.org/stable/core/option/enum.Option.html"target="_blank"rel="noopener noreferrer"><code>Option</code></a> or
<ahref="https://doc.rust-lang.org/stable/core/result/enum.Result.html"target="_blank"rel="noopener noreferrer"><code>Result</code></a> would finally make them
large enough that Rust decides to use heap allocation instead, fear no longer: <code>enum</code> and union types are stack-allocated as well:
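<p>A quick sketch: even an enum with a large payload stays on the stack:</p>
<pre><code class="language-rust">pub enum Message {
    Ping,
    // A comparatively large variant; the whole enum is still
    // created in stack memory.
    Payload([u8; 128]),
}

pub fn make() -> Message {
    Message::Payload([0; 128])
}
</code></pre>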
<h2class="anchor anchorWithStickyNavbar_LWe7"id="arrays">Arrays<ahref="https://speice.io/2019/02/stacking-up#arrays"class="hash-link"aria-label="Direct link to Arrays"title="Direct link to Arrays"></a></h2>
<p>The array type is guaranteed to be stack allocated, which is why the array size must be declared.
Interestingly enough, this can be used to cause safe Rust programs to crash:</p>
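<p>The original example is omitted here; a minimal sketch that overflows the stack on most platforms (size chosen to exceed the common 8 MiB default):</p>
<pre><code class="language-rust">fn main() {
    // This aborts with a stack overflow instead of the array
    // being promoted to the heap.
    let x = [0u8; 16 * 1024 * 1024];
    println!("{}", x[0]);
}
</code></pre>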
<p>There aren't any security implications of this (no memory corruption occurs), but it's good to note
that the Rust compiler won't move arrays into heap memory even if they can be reasonably expected to
overflow the stack.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="closures">Closures<ahref="https://speice.io/2019/02/stacking-up#closures"class="hash-link"aria-label="Direct link to Closures"title="Direct link to Closures"></a></h2>
<p>Rules for how anonymous functions capture their arguments are typically language-specific. In Java,
<ahref="https://docs.oracle.com/javase/tutorial/java/javaOO/lambdaexpressions.html"target="_blank"rel="noopener noreferrer">Lambda Expressions</a> are
actually objects created on the heap that capture local primitives by copying, and capture local
non-primitives as (<code>final</code>) references.
<ahref="https://docs.python.org/3.7/reference/expressions.html#lambda"target="_blank"rel="noopener noreferrer">Python</a> and
<p>-- <ahref="https://godbolt.org/z/p37qFl"target="_blank"rel="noopener noreferrer">Compiler Explorer</a>, 70 total assembly instructions</p>
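<p>A small sketch of stack-captured closures in Rust (not the 70-instruction example referenced above):</p>
<pre><code class="language-rust">fn main() {
    let x = 24u8;
    // The closure captures `x` on the stack; calling it involves
    // no heap allocation.
    let add_one = || x + 1;
    println!("{}", add_one());
}
</code></pre>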
<p>In every circumstance though, the compiler ensured that no heap allocations were necessary.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="generics">Generics<ahref="https://speice.io/2019/02/stacking-up#generics"class="hash-link"aria-label="Direct link to Generics"title="Direct link to Generics"></a></h2>
<p>Traits in Rust come in two broad forms: static dispatch (monomorphization, <code>impl Trait</code>) and dynamic
dispatch (trait objects, <code>dyn Trait</code>). While dynamic dispatch is often <em>associated</em> with trait
objects being stored in the heap, dynamic dispatch can be used with stack allocated objects as well:</p>
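<p>A minimal sketch of dynamic dispatch over a stack-allocated value (trait and type names illustrative):</p>
<pre><code class="language-rust">trait Speak {
    fn speak(&self) -> &'static str;
}

struct Dog;

impl Speak for Dog {
    fn speak(&self) -> &'static str {
        "woof"
    }
}

fn main() {
    let dog = Dog; // stack-allocated
    // A trait object is just a (pointer, vtable) pair; it can
    // point at the stack as easily as the heap.
    let animal: &dyn Speak = &dog;
    println!("{}", animal.speak());
}
</code></pre>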
<p>It's hard to imagine practical situations where dynamic dispatch would be used for objects that
aren't heap allocated, but it technically can be done.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="copy-types">Copy types<ahref="https://speice.io/2019/02/stacking-up#copy-types"class="hash-link"aria-label="Direct link to Copy types"title="Direct link to Copy types"></a></h2>
<p>Understanding move semantics and copy semantics in Rust is weird at first. The Rust docs
<ahref="https://doc.rust-lang.org/stable/core/marker/trait.Copy.html"target="_blank"rel="noopener noreferrer">go into detail</a> far better than can
be addressed here, so I'll leave them to do the job. From a memory perspective though, their
guideline is reasonable:
<ahref="https://doc.rust-lang.org/stable/core/marker/trait.Copy.html#when-should-my-type-be-copy"target="_blank"rel="noopener noreferrer">if your type can implemement <code>Copy</code>, it should</a>.
While there are potential speed tradeoffs to <em>benchmark</em> when discussing <code>Copy</code> (move semantics for
stack objects vs. copying stack pointers vs. copying stack <code>struct</code>s), <em>it's impossible for <code>Copy</code>
to introduce a heap allocation</em>.</p>
<p>But why is this the case? Fundamentally, it's because the language controls what <code>Copy</code> means -
<ahref="https://doc.rust-lang.org/std/marker/trait.Copy.html#whats-the-difference-between-copy-and-clone"target="_blank"rel="noopener noreferrer">"the behavior of <code>Copy</code> is not overloadable"</a>
because it's a marker trait. From there we'll note that a type can only implement <code>Copy</code> if all of its components implement <code>Copy</code>, and that <code>Copy</code> and <code>Drop</code> are mutually exclusive.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="iterators">Iterators<ahref="https://speice.io/2019/02/stacking-up#iterators"class="hash-link"aria-label="Direct link to Iterators"title="Direct link to Iterators"></a></h2>
<p>In managed memory languages (like
<ahref="https://www.youtube.com/watch?v=bSkpMdDe4g4&feature=youtu.be&t=357"target="_blank"rel="noopener noreferrer">Java</a>), there's a subtle
is allocated on the heap, and will eventually be garbage-collected. This isn't a great design;
iterators are often transient objects that you need during a function and can discard once the
function ends. Sounds exactly like the issue stack-allocated objects address, no?</p>
<p>In Rust, iterators are allocated on the stack. The objects to iterate over are almost certainly in
heap memory, but the iterator itself
(<ahref="https://doc.rust-lang.org/std/slice/struct.Iter.html"target="_blank"rel="noopener noreferrer"><code>Iter</code></a>) doesn't need to use the heap. In
each of the examples below we iterate over a collection, but never use heap allocation:</p>
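<p>One such sketch (the post's full set of examples isn't preserved in this feed):</p>
<pre><code class="language-rust">fn main() {
    let v = vec![1u8, 2, 3]; // contents live on the heap
    // `iter()` builds a small `Iter` struct on the stack; no
    // further heap allocation happens while iterating.
    let sum: u8 = v.iter().sum();
    println!("{}", sum);
}
</code></pre>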
<description><![CDATA[The first memory type we'll look at is pretty special: when Rust can prove that a value is fixed]]></description>
<content:encoded><![CDATA[<p>The first memory type we'll look at is pretty special: when Rust can prove that a <em>value</em> is fixed
for the life of a program (<code>const</code>), and when a <em>reference</em> is unique for the life of a program
(<code>static</code> as a declaration, not
<ahref="https://doc.rust-lang.org/book/ch10-03-lifetime-syntax.html#the-static-lifetime"target="_blank"rel="noopener noreferrer"><code>'static</code></a> as a
lifetime), we can make use of global memory. This special section of data is embedded directly in
the program binary so that variables are ready to go once the program loads; no additional
computation is necessary.</p>
<p>Understanding the value/reference distinction is important for reasons we'll go into below, and
while the
<ahref="https://github.com/rust-lang/rfcs/blob/master/text/0246-const-vs-static.md"target="_blank"rel="noopener noreferrer">full specification</a> for
these two keywords is available, we'll take a hands-on approach to the topic.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="const-values"><code>const</code> values<ahref="https://speice.io/2019/02/the-whole-world#const-values"class="hash-link"aria-label="Direct link to const-values"title="Direct link to const-values"></a></h2>
<p>When a <em>value</em> is guaranteed to be unchanging in your program (where "value" may be scalars,
<code>struct</code>s, etc.), you can declare it <code>const</code>. This tells the compiler that it's safe to treat the
value as never changing, and enables some interesting optimizations; not only is there no
initialization cost to creating the value (it is loaded at the same time as the executable parts of
your program), but the compiler can also copy the value around if it speeds up the code.</p>
<p>The points we need to address when talking about <code>const</code> are:</p>
<ul>
<li><code>const</code> values are stored in read-only memory; it's impossible to modify them.</li>
<li>Values resulting from calling a <code>const fn</code> are materialized at compile-time.</li>
<li>The compiler may (or may not) copy <code>const</code> values wherever it chooses.</li>
</ul>
<h3class="anchor anchorWithStickyNavbar_LWe7"id="read-only">Read-Only<ahref="https://speice.io/2019/02/the-whole-world#read-only"class="hash-link"aria-label="Direct link to Read-Only"title="Direct link to Read-Only"></a></h3>
<p>The first point is a bit strange: "read-only memory." The Rust book
mentions in a couple of places that using <code>mut</code> with constants is illegal, but it's also important to
demonstrate just how immutable they are. <em>Typically</em> in Rust you can use
<ahref="https://doc.rust-lang.org/book/ch15-05-interior-mutability.html"target="_blank"rel="noopener noreferrer">interior mutability</a> to modify
things that aren't declared <code>mut</code>.
<ahref="https://doc.rust-lang.org/std/cell/struct.RefCell.html"target="_blank"rel="noopener noreferrer"><code>RefCell</code></a> provides an example of this pattern:</p>
<p>And a second example using <ahref="https://doc.rust-lang.org/std/sync/struct.Once.html"target="_blank"rel="noopener noreferrer"><code>Once</code></a>:</p>
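<p>That code is likewise omitted; a sketch using <code>Once</code>:</p>
<pre><code class="language-rust">use std::sync::Once;

const INIT: Once = Once::new();

fn main() {
    // Because INIT is a const, each use is a brand-new copy of
    // the Once, so the closure runs both times.
    INIT.call_once(|| println!("initializing..."));
    INIT.call_once(|| println!("initializing..."));
}
</code></pre>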
<p>When documentation refers to <ahref="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3055.pdf"target="_blank"rel="noopener noreferrer">"rvalues"</a>, this
behavior is what they refer to. <ahref="https://github.com/rust-lang/rust-clippy"target="_blank"rel="noopener noreferrer">Clippy</a> will treat this
as an error, but it's still something to be aware of.</p>
<h3class="anchor anchorWithStickyNavbar_LWe7"id="initialization">Initialization<ahref="https://speice.io/2019/02/the-whole-world#initialization"class="hash-link"aria-label="Direct link to Initialization"title="Direct link to Initialization"></a></h3>
<p>The next thing to mention is that <code>const</code> values are loaded into memory <em>as part of your program
binary</em>. Because of this, any <code>const</code> values declared in your program will be "realized" at
compile-time; accessing them may trigger a main-memory lookup (with a fixed address, so your CPU may
be able to prefetch the value), but that's it.</p>
<p>The compiler creates one <code>RefCell</code>, uses it everywhere, and never needs to call the <code>RefCell::new</code>
function.</p>
<h3class="anchor anchorWithStickyNavbar_LWe7"id="copying">Copying<ahref="https://speice.io/2019/02/the-whole-world#copying"class="hash-link"aria-label="Direct link to Copying"title="Direct link to Copying"></a></h3>
<p>If it's helpful though, the compiler can choose to copy <code>const</code> values.</p>
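<p>The snippet itself didn't survive in this feed; a reconstruction consistent with the description below:</p>
<pre><code class="language-rust">const FACTOR: u32 = 1000;

pub fn multiply(value: u32) -> u32 {
    value * FACTOR
}

pub fn multiply_twice(value: u32) -> u32 {
    value * FACTOR * FACTOR
}
</code></pre>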
<p>In this example, the <code>FACTOR</code> value is turned into the <code>mov edi, 1000</code> instruction in both the
<code>multiply</code> and <code>multiply_twice</code> functions; the "1000" value is never "stored" anywhere, as it's
small enough to inline into the assembly instructions.</p>
<p>Finally, getting the address of a <code>const</code> value is possible, but not guaranteed to be unique
(because the compiler can choose to copy values). I was unable to get non-unique pointers in my
testing (even using different crates), but the specifications are clear enough: <em>don't rely on
pointers to <code>const</code> values being consistent</em>. To be frank, caring about locations for <code>const</code> values
is almost certainly a code smell.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="static-values"><code>static</code> values<ahref="https://speice.io/2019/02/the-whole-world#static-values"class="hash-link"aria-label="Direct link to static-values"title="Direct link to static-values"></a></h2>
<p>Static variables are related to <code>const</code> variables, but take a slightly different approach. When we
declare that a <em>reference</em> is unique for the life of a program, you have a <code>static</code> variable
(unrelated to the <code>'static</code> lifetime). Because of the reference/value distinction with
<code>const</code>/<code>static</code>, static variables behave much more like typical "global" variables.</p>
<p>But to understand <code>static</code>, here's what we'll look at:</p>
<ul>
<li><code>static</code> variables are globally unique locations in memory.</li>
<li>Like <code>const</code>, <code>static</code> variables are loaded at the same time as your program being read into
memory.</li>
<li>All <code>static</code> variables must implement the <ahref="https://doc.rust-lang.org/std/marker/trait.Sync.html"target="_blank"rel="noopener noreferrer"><code>Sync</code></a> marker trait.</li>
<li>Interior mutability is safe and acceptable when using <code>static</code> variables.</li>
</ul>
<h3class="anchor anchorWithStickyNavbar_LWe7"id="memory-uniqueness">Memory Uniqueness<ahref="https://speice.io/2019/02/the-whole-world#memory-uniqueness"class="hash-link"aria-label="Direct link to Memory Uniqueness"title="Direct link to Memory Uniqueness"></a></h3>
<p>The single biggest difference between <code>const</code> and <code>static</code> is the guarantees provided about
uniqueness. Where <code>const</code> variables may or may not be copied in code, <code>static</code> variables are
guaranteed to be unique. If we take a previous <code>const</code> example and change it to <code>static</code>, the
difference is immediately apparent:
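<p>A reconstruction of that change (mirroring the <code>const</code> sketch above):</p>
<pre><code class="language-rust">static FACTOR: u32 = 1000;

pub fn multiply(value: u32) -> u32 {
    // The assembly now loads FACTOR from a named memory
    // location instead of inlining the value 1000.
    value * FACTOR
}

pub fn multiply_twice(value: u32) -> u32 {
    value * FACTOR * FACTOR
}
</code></pre>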
<p>Where <ahref="https://speice.io/2019/02/the-whole-world#copying">previously</a> there were plenty of references to multiplying by 1000, the new
assembly refers to <code>FACTOR</code> as a named memory location instead. No initialization work needs to be
done, but the compiler can no longer prove the value never changes during execution.</p>
<h3class="anchor anchorWithStickyNavbar_LWe7"id="initialization-1">Initialization<ahref="https://speice.io/2019/02/the-whole-world#initialization-1"class="hash-link"aria-label="Direct link to Initialization"title="Direct link to Initialization"></a></h3>
<p>Next, let's talk about initialization. The simplest case is initializing static variables with either scalar or struct notation.</p>
<p>However, there's a caveat: you're currently not allowed to use <code>const fn</code> to initialize static
variables of types that aren't marked <code>Sync</code>. For example,
<ahref="https://doc.rust-lang.org/std/cell/struct.RefCell.html#method.new"target="_blank"rel="noopener noreferrer"><code>RefCell::new()</code></a> is a
<ahref="https://github.com/rust-lang/rfcs/blob/master/text/0911-const-fn.md"target="_blank"rel="noopener noreferrer">change in the future</a> though.</p>
<h3class="anchor anchorWithStickyNavbar_LWe7"id="the-sync-marker">The <code>Sync</code> marker<ahref="https://speice.io/2019/02/the-whole-world#the-sync-marker"class="hash-link"aria-label="Direct link to the-sync-marker"title="Direct link to the-sync-marker"></a></h3>
<p>Which leads well to the next point: static variable types must implement the
<ahref="https://doc.rust-lang.org/std/marker/trait.Sync.html"target="_blank"rel="noopener noreferrer"><code>Sync</code> marker</a>. Because they're globally
unique, it must be safe for you to access static variables from any thread at any time. Most
<code>struct</code> definitions automatically implement the <code>Sync</code> trait because they contain only elements
which themselves implement <code>Sync</code> (read more in the
<ahref="https://doc.rust-lang.org/nomicon/send-and-sync.html"target="_blank"rel="noopener noreferrer">Nomicon</a>). This is why earlier examples could
get away with initializing statics, even though we never included an <code>impl Sync for MyStruct</code> in the
code. To demonstrate this property, Rust refuses to compile our earlier example if we add a
non-<code>Sync</code> element to the <code>struct</code> definition:</p>
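<p>A sketch of that failure (struct contents illustrative); this does not compile:</p>
<pre><code class="language-rust">use std::rc::Rc;

struct MyStruct {
    x: u32,
    y: Rc<u32>, // Rc is not Sync...
}

// ...so this is rejected: `Rc<u32>` cannot be shared between
// threads safely.
static VALUE: MyStruct = MyStruct { x: 0, y: Rc::new(0) };

fn main() {}
</code></pre>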
<h3class="anchor anchorWithStickyNavbar_LWe7"id="interior-mutability">Interior mutability<ahref="https://speice.io/2019/02/the-whole-world#interior-mutability"class="hash-link"aria-label="Direct link to Interior mutability"title="Direct link to Interior mutability"></a></h3>
<p>Finally, while <code>static mut</code> variables are allowed, mutating them is an <code>unsafe</code> operation. If we
want to stay in <code>safe</code> Rust, we can use interior mutability to accomplish similar goals:</p>
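<p>A minimal sketch using an atomic (one of the few <code>Sync</code> types with interior mutability):</p>
<pre><code class="language-rust">use std::sync::atomic::{AtomicUsize, Ordering};

static COUNTER: AtomicUsize = AtomicUsize::new(0);

fn main() {
    // No `static mut` and no `unsafe`: atomics mutate safely
    // through a shared reference.
    COUNTER.fetch_add(1, Ordering::SeqCst);
    println!("{}", COUNTER.load(Ordering::SeqCst));
}
</code></pre>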
<description><![CDATA[There's an alchemy of distilling complex technical topics into articles and videos that change the]]></description>
<content:encoded><![CDATA[<p>There's an alchemy of distilling complex technical topics into articles and videos that change the
way programmers see the tools they interact with on a regular basis. I knew what a linker was, but
there's a staggering amount of complexity in between
<ahref="https://www.youtube.com/watch?v=dOfucXtyEsU"target="_blank"rel="noopener noreferrer">the OS and <code>main()</code></a>. Rust programmers use the
<ahref="https://doc.rust-lang.org/stable/std/boxed/struct.Box.html"target="_blank"rel="noopener noreferrer"><code>Box</code></a> type all the time, but there's a
rich history of the Rust language itself wrapped up in
<ahref="https://manishearth.github.io/blog/2017/01/10/rust-tidbits-box-is-special/"target="_blank"rel="noopener noreferrer">how special it is</a>.</p>
<p>In a similar vein, this series attempts to look at code and understand how memory is used; the
complex choreography of operating system, compiler, and program that frees you to focus on
functionality far-flung from frivolous book-keeping. The Rust compiler relieves a great deal of the
cognitive burden associated with memory management, but we're going to step into its world for a
while.</p>
<p>Let's learn a bit about memory in Rust.</p>
<hr>
<p>Rust's three defining features of
<ahref="https://www.rust-lang.org/"target="_blank"rel="noopener noreferrer">Performance, Reliability, and Productivity</a> are all driven to a great
degree by how the Rust compiler understands memory usage. Unlike managed memory languages (Java,
Python), Rust relies on the compiler and the ownership system to ensure you can't accidentally
corrupt memory. Reference counting is available when you need it; it's not as fast, but it
is important to have available.</p>
<p>That said, there are specific situations in Rust where you'd never need to worry about the
stack/heap distinction! If you:</p>
<ol>
<li>Never use <code>unsafe</code></li>
<li>Never use <code>#![feature(alloc)]</code> or the <ahref="https://doc.rust-lang.org/alloc/index.html"target="_blank"rel="noopener noreferrer"><code>alloc</code> crate</a></li>
</ol>
<p>...then it's not possible for you to use dynamic memory!</p>
<p>For some uses of Rust, typically embedded devices, these constraints are OK. They have very limited
memory, and the program binary size itself may significantly affect what's available! There's no
operating system able to manage this
<ahref="https://en.wikipedia.org/wiki/Virtual_memory"target="_blank"rel="noopener noreferrer">"virtual memory"</a> thing, but that's not an issue
because there's only one running application. The
<ahref="https://docs.rust-embedded.org/embedonomicon/preface.html"target="_blank"rel="noopener noreferrer">embedonomicon</a> is ever in mind, and
interacting with the "real world" through extra peripherals is accomplished by reading and writing
to <ahref="https://bob.cs.sonoma.edu/IntroCompOrg-RPi/sec-gpio-mem.html"target="_blank"rel="noopener noreferrer">specific memory addresses</a>.</p>
<p>Most Rust programs find these requirements overly burdensome though. C++ developers would struggle
without access to <ahref="https://en.cppreference.com/w/cpp/container/vector"target="_blank"rel="noopener noreferrer"><code>std::vector</code></a> (except those
hardcore no-STL people), and Rust developers would struggle without
<ahref="https://doc.rust-lang.org/std/vec/struct.Vec.html"target="_blank"rel="noopener noreferrer"><code>std::vec</code></a>. But with the constraints above,
<code>std::vec</code> is actually a part of the
<ahref="https://doc.rust-lang.org/alloc/vec/struct.Vec.html"target="_blank"rel="noopener noreferrer"><code>alloc</code> crate</a>, and thus off-limits. <code>Box</code>,
<code>Rc</code>, etc., are also unusable for the same reason.</p>
<p>Whether writing code for embedded devices or not, the important thing in both situations is how much
you know <em>before your application starts</em> about what its memory usage will look like. In embedded
devices, there's a small, fixed amount of memory to use. In a browser, you have no idea how large
<ahref="https://www.google.com/"target="_blank"rel="noopener noreferrer">google.com</a>'s home page is until you start trying to download it. The
compiler uses this knowledge (or lack thereof) to optimize how memory is used; put simply, your code
runs faster when the compiler can guarantee exactly how much memory your program needs while it's
running. This series is all about understanding how the compiler reasons about your program, with an
emphasis on the implications for performance.</p>
<p>Now let's address some conditions and caveats before going much further:</p>
<ul>
<li>We'll focus on "safe" Rust only; <code>unsafe</code> lets you use platform-specific allocation APIs
(<ahref="https://www.tutorialspoint.com/c_standard_library/c_function_malloc.htm"target="_blank"rel="noopener noreferrer"><code>malloc</code></a>) that we'll
ignore.</li>
<li>We'll assume a "debug" build of Rust code (what you get with <code>cargo run</code> and <code>cargo test</code>) and
address (pun intended) release mode at the end (<code>cargo run --release</code> and <code>cargo test --release</code>).</li>
<li>All content will be run using Rust 1.32, as that's the highest currently supported in the
<ahref="https://godbolt.org/"target="_blank"rel="noopener noreferrer">Compiler Exporer</a>. As such, we'll avoid upcoming innovations like
<ahref="https://github.com/rust-lang/rfcs/blob/master/text/0911-const-fn.md"target="_blank"rel="noopener noreferrer">compile-time evaluation of <code>static</code></a>
that are available in nightly.</li>
<li>Because of the nature of the content, being able to read assembly is helpful. We'll keep it
simple, but I <ahref="https://stackoverflow.com/a/4584131/1454178"target="_blank"rel="noopener noreferrer">found</a> a
<ahref="https://stackoverflow.com/a/26026278/1454178"target="_blank"rel="noopener noreferrer">refresher</a> on the <code>push</code> and <code>pop</code>
<ahref="http://www.cs.virginia.edu/~evans/cs216/guides/x86.html"target="_blank"rel="noopener noreferrer">instructions</a> was helpful while writing
this.</li>
<li>I've tried to be precise in saying only what I can prove using the tools (ASM, docs) that are
available, but if there's something said in error it will be corrected expeditiously. Please let
me know at <ahref="mailto:bradlee@speice.io"target="_blank"rel="noopener noreferrer">bradlee@speice.io</a></li>
</ul>
<p>Finally, I'll do what I can to flag potential future changes but the Rust docs have a notice worth
repeating:</p>
<blockquote>
<p>Rust does not currently have a rigorously and formally defined memory model.</p>
</blockquote>
<p>And <em>that's</em> the part I'm going to focus on.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="why-an-allocator">Why an Allocator?<ahref="https://speice.io/2018/12/allocation-safety#why-an-allocator"class="hash-link"aria-label="Direct link to Why an Allocator?"title="Direct link to Why an Allocator?"></a></h2>
<p>So why, after complaining about allocators, would I still want to write one? There are three reasons
for that:</p>
<ol>
<li>Allocation/dropping is slow</li>
<li>It's difficult to know exactly when Rust will allocate or drop, especially when using code that
you did not write</li>
<li>I want automated tools to verify behavior, instead of inspecting by hand</li>
</ol>
<p>When I say "slow," it's important to define the terms. If you're writing web applications, you'll
spend orders of magnitude more time waiting for the database than you will the allocator. However,
there's still plenty of code where micro- or nano-seconds matter; think
<ahref="https://polysync.io/blog/session-types-for-hearty-codecs/"target="_blank"rel="noopener noreferrer">self-driving cars</a>, and
<ahref="https://carllerche.github.io/bytes/bytes/index.html"target="_blank"rel="noopener noreferrer">networking</a>. In these situations it's simply
unacceptable for you to spend time doing things that are not your program, and waiting on the
allocator is not cool.</p>
<p>As I continue to learn Rust, it's difficult for me to predict where exactly allocations will happen.
So, I propose we play a quick trivia game: <strong>Does this code invoke the allocator?</strong></p>
<h3class="anchor anchorWithStickyNavbar_LWe7"id="example-1">Example 1<ahref="https://speice.io/2018/12/allocation-safety#example-1"class="hash-link"aria-label="Direct link to Example 1"title="Direct link to Example 1"></a></h3>
<p><strong>No</strong>: Rust <ahref="https://doc.rust-lang.org/std/mem/fn.size_of.html"target="_blank"rel="noopener noreferrer">knows how big</a> the <code>Vec</code> type is,
and reserves a fixed amount of memory on the stack for the <code>v</code> vector. However, if we wanted to
reserve extra space (using <code>Vec::with_capacity</code>) the allocator would get invoked.</p>
<h3class="anchor anchorWithStickyNavbar_LWe7"id="example-2">Example 2<ahref="https://speice.io/2018/12/allocation-safety#example-2"class="hash-link"aria-label="Direct link to Example 2"title="Direct link to Example 2"></a></h3>
<p><strong>Yes</strong>: Because <code>Box</code> allows us to work with things that are of unknown size, it has to allocate on
the heap. While the <code>Box</code> is unnecessary in this snippet (release builds will optimize out the
allocation), reserving heap space more generally is needed to pass a dynamically sized type to
another function.</p>
<h3class="anchor anchorWithStickyNavbar_LWe7"id="example-3">Example 3<ahref="https://speice.io/2018/12/allocation-safety#example-3"class="hash-link"aria-label="Direct link to Example 3"title="Direct link to Example 3"></a></h3>
<p><strong>Maybe</strong>: Depending on whether the Vector we were given has space available, we may or may not
allocate. Especially when dealing with code that you did not author, it's difficult to verify that
things behave as you expect them to.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="blowing-things-up">Blowing Things Up<ahref="https://speice.io/2018/12/allocation-safety#blowing-things-up"class="hash-link"aria-label="Direct link to Blowing Things Up"title="Direct link to Blowing Things Up"></a></h2>
<p>So, how exactly does QADAPT solve these problems? <strong>Whenever an allocation or drop occurs in code
marked allocation-safe, QADAPT triggers a thread panic.</strong> We don't want to let the program continue
as if nothing strange happened, <em>we want things to explode</em>.</p>
<p>However, you don't want code to panic in production because of circumstances you didn't predict.
Just like <ahref="https://doc.rust-lang.org/std/macro.debug_assert.html"target="_blank"rel="noopener noreferrer"><code>debug_assert!</code></a>, <strong>QADAPT will
strip out its own code when building in release mode to guarantee no panics and no performance
impact.</strong></p>
<p>Finally, there are three ways to have QADAPT check that your code will not invoke the allocator:</p>
<h3class="anchor anchorWithStickyNavbar_LWe7"id="using-a-procedural-macro">Using a procedural macro<ahref="https://speice.io/2018/12/allocation-safety#using-a-procedural-macro"class="hash-link"aria-label="Direct link to Using a procedural macro"title="Direct link to Using a procedural macro"></a></h3>
<p>The easiest method: watch an entire function for allocator invocations. A minimal sketch follows - check
the crate docs for the exact attribute and type names:</p>
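<pre><code class="language-rust">use qadapt::{no_alloc, QADAPT};

// QADAPT has to be installed as the global allocator to do its work
#[global_allocator]
static Q: QADAPT = QADAPT;

#[no_alloc]
fn push_vec(v: &amp;mut Vec&lt;u8&gt;) {
    // Panics in debug builds if this push has to reallocate
    v.push(5);
}
</code></pre>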
<h3class="anchor anchorWithStickyNavbar_LWe7"id="using-a-regular-macro">Using a regular macro<ahref="https://speice.io/2018/12/allocation-safety#using-a-regular-macro"class="hash-link"aria-label="Direct link to Using a regular macro"title="Direct link to Using a regular macro"></a></h3>
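<p>If you'd rather guard a single expression than a whole function, there's a macro form as well; another
sketch, with the same caveat about checking names against the crate docs:</p>
<pre><code class="language-rust">use qadapt::assert_no_alloc;

fn main() {
    // The wrapped expression must complete without touching the allocator
    let x = assert_no_alloc!(2 + 2);
    assert_eq!(x, 4);
}
</code></pre>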
<h3class="anchor anchorWithStickyNavbar_LWe7"id="using-function-calls">Using function calls<ahref="https://speice.io/2018/12/allocation-safety#using-function-calls"class="hash-link"aria-label="Direct link to Using function calls"title="Direct link to Using function calls"></a></h3>
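<p>And for regions that don't fit neatly into a function or expression, plain function calls can mark the
boundaries (sketch as before):</p>
<pre><code class="language-rust">use qadapt::{enter_protected, exit_protected};

fn main() {
    enter_protected();
    // Any allocation in here triggers a panic (debug builds only)
    let x = 0;
    exit_protected();

    // Allocations are fine again out here
    let v = vec![x];
    println!("{:?}", v);
}
</code></pre>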
<h3class="anchor anchorWithStickyNavbar_LWe7"id="caveats">Caveats<ahref="https://speice.io/2018/12/allocation-safety#caveats"class="hash-link"aria-label="Direct link to Caveats"title="Direct link to Caveats"></a></h3>
<p>It's important to point out that QADAPT code is synchronous, so please be careful when mixing in
asynchronous functions: the guards are tracked per thread, and they won't follow work that hops to another
thread.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="conclusion">Conclusion<ahref="https://speice.io/2018/12/allocation-safety#conclusion"class="hash-link"aria-label="Direct link to Conclusion"title="Direct link to Conclusion"></a></h2>
<p>While there's a lot more to writing high-performance code than managing your usage of the allocator,
it's critical that you do use the allocator correctly. QADAPT will verify that your code is doing
what you expect. It's usable even on stable Rust from version 1.31 onward, which isn't the case for
most allocators. Version 1.0 was released today, and you can check it out over at
<ahref="https://crates.io/crates/qadapt"target="_blank"rel="noopener noreferrer">crates.io</a> or on <ahref="https://github.com/bspeice/qadapt"target="_blank"rel="noopener noreferrer">github</a>.</p>
<p>I'm hoping to write more about high-performance Rust in the future, and I expect that QADAPT will
help guide that. If there are topics you're interested in, let me know in the comments below!</p>]]></content:encoded>
<description><![CDATA[I recently stumbled across a phenomenal small article entitled]]></description>
<content:encoded><![CDATA[<p>I recently stumbled across a phenomenal small article entitled
<ahref="https://angel.co/blog/what-startups-really-mean-by-why-should-we-hire-you"target="_blank"rel="noopener noreferrer">What Startups Really Mean By "Why Should We Hire You?"</a>.
Having interviewed with smaller companies (though not exactly startups), I can say the questions and
subtexts are the same. There's often a question behind the question that you're actually trying to
answer, and I wish I'd spotted the nuance earlier in my career.</p>
<p>Let me also make note of one more question/euphemism I've come across:</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="how-do-you-feel-about-production-support">How do you feel about production support?<ahref="https://speice.io/2018/12/what-small-business-really-means#how-do-you-feel-about-production-support"class="hash-link"aria-label="Direct link to How do you feel about production support?"title="Direct link to How do you feel about production support?"></a></h2>
<p><strong>Translation</strong>: <em>We're a fairly small team, and when things break on an evening/weekend/Christmas
Day, can we call on you to be there?</em></p>
<p>I've met decidedly few people in my life who truly enjoy the "ops" side of "devops". They're
incredibly good at taking an impossible problem and some pre-existing knowledge of arcane arts, and turning
them into a functioning system at the end. And if they all left for lunch, we probably wouldn't make
it out the door before the zombie apocalypse.</p>
<p>Larger organizations (in my experience, 500+ person organizations) have the luxury of hiring people
who either enjoy that, or play along nicely enough that our systems keep working.</p>
<p>Small teams have no such luck. If you're interviewing at a small company, especially as a "data
scientist" or other somesuch position, be aware that systems can and do spontaneously combust at the
most inopportune moments.</p>
<p><strong>Terrible-but-popular answers include</strong>: <em>It's a part of the job, and I'm happy to contribute.</em></p>]]></content:encoded>
</item>
<item>
<title><![CDATA[A case study in heaptrack]]></title>
<p>But the principle remains: be efficient with the resources you have, because
<ahref="http://exo-blog.blogspot.com/2007/09/what-intel-giveth-microsoft-taketh-away.html"target="_blank"rel="noopener noreferrer">what Intel giveth, Microsoft taketh away</a>.</p>
<p>My professional work is focused on this kind of efficiency; low-latency financial markets demand
that you understand at a deep level <em>exactly</em> what your code is doing. As I continue experimenting
with Rust for personal projects, it's exciting to bring a utilitarian mindset with me: there's
flexibility for the times I pretend to have a garbage collector, and flexibility for the times that
I really care about how memory is used.</p>
<p>This post is a (small) case study in how I went from the former to the latter. And ultimately, it's
intended to be a starting toolkit to empower analysis of your own code.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="curiosity">Curiosity<ahref="https://speice.io/2018/10/case-study-optimization#curiosity"class="hash-link"aria-label="Direct link to Curiosity"title="Direct link to Curiosity"></a></h2>
<p>When I first started building the <ahref="https://crates.io/crates/dtparse"target="_blank"rel="noopener noreferrer">dtparse</a> crate, my intention was to mirror as closely as possible
the equivalent <ahref="https://github.com/dateutil/dateutil"target="_blank"rel="noopener noreferrer">Python library</a>. Python, as you may know, is garbage collected. Very
rarely is memory usage considered in Python, and I likewise wasn't paying too much attention when
<code>dtparse</code> was first being built.</p>
<p>This lackadaisical approach to memory works well enough, and I'm not planning on making <code>dtparse</code>
hyper-efficient. But every so often, I've wondered: "what exactly is going on in memory?" With the
advent of Rust 1.28 and the
<ahref="https://doc.rust-lang.org/std/alloc/trait.GlobalAlloc.html"target="_blank"rel="noopener noreferrer">Global Allocator trait</a>, I had a really
great idea: <em>build a custom allocator that allows you to track your own allocations.</em> That way, you
can do things like writing tests for both correct results and correct memory usage. I gave it a
<ahref="https://crates.io/crates/qadapt"target="_blank"rel="noopener noreferrer">shot</a>, but learned very quickly: <strong>never write your own allocator</strong>. It went from "fun
weekend project" to "I have literally no idea what my computer is doing" at breakneck speed.</p>
<p>Instead, I'll highlight a separate path I took to make sense of my memory usage: <ahref="https://github.com/KDE/heaptrack"target="_blank"rel="noopener noreferrer">heaptrack</a>.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="turning-on-the-system-allocator">Turning on the System Allocator<ahref="https://speice.io/2018/10/case-study-optimization#turning-on-the-system-allocator"class="hash-link"aria-label="Direct link to Turning on the System Allocator"title="Direct link to Turning on the System Allocator"></a></h2>
<p>This is the hardest part of the post. Because Rust uses
<ahref="https://github.com/rust-lang/rust/pull/27400#issue-41256384"target="_blank"rel="noopener noreferrer">its own allocator</a> by default,
<code>heaptrack</code> is unable to properly record unmodified Rust code. To remedy this, we'll make use of the
<code>#[global_allocator]</code> attribute.</p>
<p>Specifically, in <code>lib.rs</code> or <code>main.rs</code>, add this:</p>
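<pre><code class="language-rust">use std::alloc::System;

// Route all allocations through the system allocator so that
// tools like heaptrack can see them
#[global_allocator]
static GLOBAL: System = System;
</code></pre>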
<p>...and that's it. Everything else comes essentially for free.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="running-heaptrack">Running heaptrack<ahref="https://speice.io/2018/10/case-study-optimization#running-heaptrack"class="hash-link"aria-label="Direct link to Running heaptrack"title="Direct link to Running heaptrack"></a></h2>
<p>Assuming you've installed heaptrack <small>(Homebrew on Mac, your package manager
on Linux, ??? on Windows)</small>, all that's left is to fire up your application under it - something like
<code>heaptrack target/release/my-app</code>, where the binary name is a stand-in for your own.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="reading-flamegraphs">Reading Flamegraphs<ahref="https://speice.io/2018/10/case-study-optimization#reading-flamegraphs"class="hash-link"aria-label="Direct link to Reading Flamegraphs"title="Direct link to Reading Flamegraphs"></a></h2>
<p>To make sense of our memory usage, we're going to focus on that last picture - it's called a
<ahref="http://www.brendangregg.com/flamegraphs.html"target="_blank"rel="noopener noreferrer">"flamegraph"</a>. These charts are typically used to
show how much time your program spends executing each function, but they're used here to show how
much memory was allocated during those functions instead.</p>
<p>For example, we can see that all allocations trace back to the <code>main</code> function:</p>
<p><imgdecoding="async"loading="lazy"alt="allocations in main"src="https://speice.io/assets/images/heaptrack-main-colorized-cfe5d7d345d32cfc1a0f297580619718.png"width="654"height="343"class="img_ev3q"></p>
<p>...and within that, all allocations happened during <code>dtparse::parse</code>:</p>
<p><imgdecoding="async"loading="lazy"alt="allocations in dtparse"src="https://speice.io/assets/images/heaptrack-dtparse-colorized-e6caf224f50df2dd56981f5b02970325.png"width="654"height="315"class="img_ev3q"></p>
<p>...and within <em>that</em>, allocations happened in two different places:</p>
<p><imgdecoding="async"loading="lazy"alt="allocations in parseinfo"src="https://speice.io/assets/images/heaptrack-parseinfo-colorized-a1898beaf28a3997ac86810f872539b7.png"width="654"height="372"class="img_ev3q"></p>
<p>Now I apologize that it's hard to see, but there's one area specifically that stuck out as an issue:
<strong>what the heck is the <code>Default</code> thing doing?</strong></p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="optimizing-dtparse">Optimizing dtparse<ahref="https://speice.io/2018/10/case-study-optimization#optimizing-dtparse"class="hash-link"aria-label="Direct link to Optimizing dtparse"title="Direct link to Optimizing dtparse"></a></h2>
<p>See, I knew that there were some allocations during calls to <code>dtparse::parse</code>, but I was totally
wrong about where the bulk of allocations occurred in my program. Let me sketch the shape of the code
(a condensed, hypothetical stand-in for the real <code>dtparse</code> internals) and see if you can spot the issue:</p>
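<pre><code class="language-rust">use std::collections::HashMap;

#[derive(Default)]
struct Parser {
    // The real Parser carries maps of token info; this stand-in
    // just makes the construction cost visible
    info: HashMap&lt;String, u32&gt;,
}

impl Parser {
    // parse() takes &amp;mut self, which is what forces a fresh Parser per call
    fn parse(&amp;mut self, timestr: &amp;str) -&gt; usize {
        self.info.insert(timestr.to_string(), 0);
        timestr.len()
    }
}

// The old hot path in spirit: every call builds (and allocates) a new Parser
pub fn parse(timestr: &amp;str) -&gt; usize {
    Parser::default().parse(timestr)
}

fn main() {
    println!("{}", parse("2018-10-01"));
}
</code></pre>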
<p>Because <code>Parser::parse</code> requires a mutable reference to itself, I have to create a new
<code>Parser::default</code> every time it receives a string. This is excessive! We'd rather have an immutable
parser that can be re-used, and avoid allocating memory in the first place.</p>
<p>Armed with that information, I put some time in to
<ahref="https://github.com/bspeice/dtparse/commit/741afa34517d6bc1155713bbc5d66905fea13fad#diff-b4aea3e418ccdb71239b96952d9cddb6"target="_blank"rel="noopener noreferrer">make the parser immutable</a>.
Now that I can re-use the same parser over and over, the allocations disappear.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="conclusion">Conclusion<ahref="https://speice.io/2018/10/case-study-optimization#conclusion"class="hash-link"aria-label="Direct link to Conclusion"title="Direct link to Conclusion"></a></h2>
<p>In the end, you don't need to write a custom allocator to be efficient with memory, great tools
already exist to help you understand what your program is doing.</p>
<p><strong>Use them.</strong></p>
<p>Given that <ahref="https://en.wikipedia.org/wiki/Moore%27s_law"target="_blank"rel="noopener noreferrer">Moore's Law</a> is
<ahref="https://www.technologyreview.com/s/601441/moores-law-is-dead-now-what/"target="_blank"rel="noopener noreferrer">dead</a>, we've all got to do
our part to take back what Microsoft stole.</p>]]></content:encoded>
</item>
<item>
<title><![CDATA[Isomorphic desktop apps with Rust]]></title>
<ahref="https://kangax.github.io/compat-table/es2016plus/"target="_blank"rel="noopener noreferrer">actual implementation</a>. The answer to this
conundrum is of course to recompile code from newer versions of the language to older versions <em>of
the same language</em> before running. At least <ahref="https://babeljs.io/"target="_blank"rel="noopener noreferrer">Babel</a> is a nice tongue-in-cheek reference.</p>
<p>Yet for as much hate as <ahref="https://electronjs.org/"target="_blank"rel="noopener noreferrer">Electron</a> receives, it does a stunningly good job at solving a really hard
problem: <em>how the hell do I put a button on the screen and react when the user clicks it</em>? GUI
programming is hard, straight up. But if browsers are already able to run everywhere, why don't we
take advantage of someone else solving the hard problems for us? I don't like that I have to use
Javascript for it, but I really don't feel inclined to whip out good ol' <ahref="https://wxwidgets.org/"target="_blank"rel="noopener noreferrer">wxWidgets</a>.</p>
<p>Now there are other native solutions (<ahref="https://github.com/LeoTindall/libui-rs/"target="_blank"rel="noopener noreferrer">libui-rs</a>, <ahref="https://github.com/PistonDevelopers/conrod"target="_blank"rel="noopener noreferrer">conrod</a>, <ahref="https://github.com/kenz-gelsoft/wxRust"target="_blank"rel="noopener noreferrer">oh hey wxWidgets again!</a>), but
those also have their own issues with distribution, styling, etc. With Electron, I can
<code>yarn create electron-app my-app</code> and just get going, knowing that packaging/upgrades/etc. are built
in.</p>
<p>My question is: given recent innovations with WASM, <em>are we Electron yet</em>?</p>
<p>No, not really.</p>
<p>Instead, <strong>what would it take to get to a point where we can skip Javascript in Electron apps?</strong></p>
<p>Truth is, WASM/Webassembly is a pretty new technology and I'm a total beginner in this area. There
may already be solutions to the issues I discuss, but I'm totally unaware of them, so I'm going to
try and organize what I did manage to discover.</p>
<p>I should also mention that the content and things I'm talking about here are not intended to be
prescriptive, but more "if someone else is interested, what do we already know doesn't work?" <em>I
expect everything in this post to be obsolete within two months.</em> Even over the course of writing
this, <ahref="https://mnt.io/2018/08/28/from-rust-to-beyond-the-asm-js-galaxy/"target="_blank"rel="noopener noreferrer">a separate blog post</a> had
to be modified because <ahref="https://github.com/WebAssembly/binaryen/pull/1642"target="_blank"rel="noopener noreferrer">upstream changes</a> broke a
<ahref="https://github.com/rustwasm/wasm-bindgen/pull/787"target="_blank"rel="noopener noreferrer">Rust tool</a> the post tried to use. The post
all this happened within the span of a week.</strong> Things are moving quickly.</p>
<p>I'll also note that we're going to skip <ahref="http://asmjs.org/"target="_blank"rel="noopener noreferrer">asm.js</a> and <ahref="https://kripken.github.io/emscripten-site/"target="_blank"rel="noopener noreferrer">emscripten</a>. Truth be told, I couldn't get
either of these to output anything, and so I'm just going to say
<ahref="https://en.wikipedia.org/wiki/Here_be_dragons"target="_blank"rel="noopener noreferrer">here be dragons.</a> Everything I'm discussing here
uses the <code>wasm32-unknown-unknown</code> target.</p>
<p>The code that I <em>did</em> get running is available
<ahref="https://github.com/speice-io/isomorphic-rust"target="_blank"rel="noopener noreferrer">over here</a>. Feel free to use it as a starting point,
but I'm mostly including the link as a reference for the things that were attempted.</p>
<h1>An Example Running Application</h1>
<p>So, I did <em>technically</em> get a running application:</p>
<p><imgdecoding="async"loading="lazy"alt="Electron app using WASM"src="https://speice.io/assets/images/electron-percy-wasm-9ccb2be15a9bed6da44486afc266bad5.png"width="800"height="319"class="img_ev3q"></p>
<p>...but I wouldn't really call it a "high quality" starting point to base future work on. It's mostly
there to prove this is possible in the first place. And that's something to be proud of! There's a
huge amount of engineering that went into showing a window with the text "It's alive!".</p>
<p>There are also a lot of usability issues that prevent me from recommending anyone try Electron and
WASM apps at the moment, and I think that's the more important thing to discuss.</p>
<h1>Issue the First: Complicated Toolchains</h1>
<p>I quickly established that <ahref="https://github.com/rustwasm/wasm-bindgen"target="_blank"rel="noopener noreferrer">wasm-bindgen</a> was necessary to "link" my Rust code to Javascript. At
that point you've got an Electron app that starts an HTML page which ultimately fetches your WASM
blob. To keep things simple, the goal was to package everything using <ahref="https://webpack.js.org/"target="_blank"rel="noopener noreferrer">webpack</a> so that I could just
load a <code>bundle.js</code> file on the page. That decision was to be the last thing that kinda worked in
this process.</p>
<p>The first issue
<ahref="https://www.reddit.com/r/rust/comments/98lpun/unable_to_load_wasm_for_electron_application/"target="_blank"rel="noopener noreferrer">I ran into</a>
while attempting to bundle everything via <code>webpack</code> is a detail in the WASM spec:</p>
<blockquote>
<p>This function accepts a Response object, or a promise for one, and ... <strong>[if it] does not match
the <code>application/wasm</code> MIME type</strong>, the returned promise will be rejected with a TypeError;</p>
<p><ahref="https://webassembly.org/docs/web/#additional-web-embedding-api"target="_blank"rel="noopener noreferrer">WebAssembly - Additional Web Embedding API</a></p>
</blockquote>
<p>Specifically, if you try and load a WASM blob without the MIME type set, you'll get an error. On the
web this isn't a huge issue, as the server can set MIME types when delivering the blob. With
Electron, you're resolving things with a <code>file://</code> URL and thus can't control the MIME type.</p>
<p>There are a couple of solutions depending on how far into the deep end you care to venture:</p>
<ul>
<li>Embed a static file server in your Electron application</li>
<li>Use a <ahref="https://electronjs.org/docs/api/protocol"target="_blank"rel="noopener noreferrer">custom protocol</a> and custom protocol handler</li>
<li>Host your WASM blob on a website that you resolve at runtime</li>
</ul>
<p>But all these are pretty bad solutions and defeat the purpose of using WASM in the first place.
Instead, my workaround was to
<ahref="https://github.com/webpack/webpack/issues/7918"target="_blank"rel="noopener noreferrer">open a PR with <code>webpack</code></a> and use regex to remove
<li><code>yarn start</code> triggers the <code>prestart</code> script</li>
<li><code>prestart</code> checks for missing tools (<code>wasm-bindgen-cli</code>, etc.) and then:<!---->
<ul>
<li>Uses <code>cargo</code> to compile the Rust code into WASM</li>
<li>Uses <code>wasm-bindgen</code> to link the WASM blob into a Javascript file with exported symbols</li>
<li>Uses <code>webpack</code> to bundle the page start script with the Javascript we just generated<!---->
<ul>
<li>Uses <code>babel</code> under the hood to compile the <code>wasm-bindgen</code> code down from ES6 into something
browser-compatible</li>
</ul>
</li>
</ul>
</li>
<li>The <code>start</code> script runs an Electron Forge handler to do some sanity checks</li>
<li>Electron actually starts</li>
</ul>
<p>...which is complicated. I think more work needs to be done to either build a high-quality starter
app that can manage these steps, or another tool that "just handles" the complexity of linking a
compiled WASM file into something the Electron browser can run.</p>
<h1>Issue the Second: WASM tools in Rust</h1>
<p>For as much as I didn't enjoy the Javascript tooling needed to interface with Rust, the Rust-only
bits aren't any better at the moment. I get it, a lot of projects are just starting off, and that
leads to a fragmented ecosystem. Here's what I can recommend as a starting point:</p>
<p>Don't check in your <code>Cargo.lock</code> files to version control. If there's a disagreement between the
version of <code>wasm-bindgen-cli</code> you have installed and the <code>wasm-bindgen</code> you're compiling with in
<code>Cargo.lock</code>, you get a nasty error:</p>
<divclass="codeBlockContainer_Ckt0 theme-code-block"style="--prism-background-color:hsl(230, 1%, 98%);--prism-color:hsl(230, 8%, 24%)"><divclass="codeBlockContent_biex"><pretabindex="0"class="prism-code language-text codeBlock_bY9V thin-scrollbar"style="background-color:hsl(230, 1%, 98%);color:hsl(230, 8%, 24%)"><codeclass="codeBlockLines_e6Vv"><spanclass="token-line"style="color:hsl(230, 8%, 24%)"><spanclass="token plain">it looks like the Rust project used to create this wasm file was linked against</span><br></span><spanclass="token-line"style="color:hsl(230, 8%, 24%)"><spanclass="token plain">a different version of wasm-bindgen than this binary:</span><br></span><spanclass="token-line"style="color:hsl(230, 8%, 24%)"><spanclass="token plain"style="display:inline-block"></span><br></span><spanclass="token-line"style="color:hsl(230, 8%, 24%)"><spanclass="token plain">rust wasm file: 0.2.21</span><br></span><spanclass="token-line"style="color:hsl(230, 8%, 24%)"><spanclass="token plain"> this binary: 0.2.17</span><br></span><spanclass="token-line"style="color:hsl(230, 8%, 24%)"><spanclass="token plain"style="display:inline-block"></span><br></span><spanclass="token-line"style="color:hsl(230, 8%, 24%)"><spanclass="token plain">Currently the bindgen format is unstable enough that these two version must</span><br></span><spanclass="token-line"style="color:hsl(230, 8%, 24%)"><spanclass="token plain">exactly match, so it's required that these two version are kept in sync by</span><br></span><spanclass="token-line"style="color:hsl(230, 8%, 24%)"><spanclass="token plain">either updating the wasm-bindgen dependency or this binary.</span><br></span></code></pre><divclass="buttonGroup__atx"><buttontype="button"aria-label="Copy code to clipboard"title="Copy"class="clean-btn"><spanclass="copyButtonIcons_eSgA"aria-hidden="true"><svgviewBox="0 0 24 24"class="copyButtonIcon_y97N"><pathfill="currentColor"d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svgviewBox="0 0 24 24"class="copyButtonSuccessIcon_LjdS"><pathfill="currentColor"d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div>
<p>Not that I ever managed to run into this myself (<em>coughs nervously</em>).</p>
<p>There are two projects attempting to be "application frameworks": <ahref="https://chinedufn.github.io/percy/"target="_blank"rel="noopener noreferrer">percy</a> and <ahref="https://github.com/DenisKolodin/yew"target="_blank"rel="noopener noreferrer">yew</a>. Between those,
I managed to get <ahref="https://github.com/speice-io/isomorphic-rust/tree/master/percy"target="_blank"rel="noopener noreferrer">two</a> examples running
with <code>percy</code>, but was unable to get an
<ahref="https://github.com/speice-io/isomorphic-rust/tree/master/yew"target="_blank"rel="noopener noreferrer">example</a> running with <code>yew</code> because
of issues with "missing modules" during the <code>webpack</code> step:</p>
<p>If you want to work with the browser APIs directly, your choices are <ahref="https://crates.io/crates/percy-webapis"target="_blank"rel="noopener noreferrer">percy-webapis</a> or <ahref="https://crates.io/crates/stdweb"target="_blank"rel="noopener noreferrer">stdweb</a> (or
eventually <ahref="https://crates.io/crates/web-sys"target="_blank"rel="noopener noreferrer">web-sys</a>). See above for my <code>percy</code> examples, but when I tried
<ahref="https://github.com/speice-io/isomorphic-rust/tree/master/stdweb"target="_blank"rel="noopener noreferrer">an example with <code>stdweb</code></a>, I was
<p>At this point I'm pretty convinced that <code>stdweb</code> is causing issues for <code>yew</code> as well, but can't
prove it.</p>
<p>I did also get a <ahref="https://github.com/speice-io/isomorphic-rust/tree/master/minimal"target="_blank"rel="noopener noreferrer">minimal example</a>
running that doesn't depend on any tools besides <code>wasm-bindgen</code>. However, it requires manually
writing "<code>extern C</code>" blocks for everything you need from the browser. Es no bueno.</p>
<p>Finally, from a tools and platform view, there are two up-and-coming packages that should be
mentioned: <ahref="https://crates.io/crates/js-sys"target="_blank"rel="noopener noreferrer">js-sys</a> and <ahref="https://crates.io/crates/web-sys"target="_blank"rel="noopener noreferrer">web-sys</a>. Their purpose is to be fundamental building blocks that exposes
the browser's APIs to Rust. If you're interested in building an app framework from scratch, these
should give you the most flexibility. I didn't touch either in my research, though I expect them to
be essential long-term.</p>
<p>So there's a lot in play from the Rust side of things, and it's just going to take some time to
figure out what works and what doesn't.</p>
<h1>Issue the Third: Known Unknowns</h1>
<p>Alright, so after I managed to get an application started, I stopped there. It was a good deal of
effort to chain together even a proof of concept, and at this point I'd rather learn <ahref="https://www.typescriptlang.org/"target="_blank"rel="noopener noreferrer">Typescript</a>
than keep trying to maintain an incredibly brittle pipeline. Blasphemy, I know...</p>
<p>The important point I want to make is that there's a lot unknown about how any of this holds up
outside proofs of concept. Things I didn't attempt:</p>
<ul>
<li>Testing</li>
<li>Packaging</li>
<li>Updates</li>
<li>Literally anything related to why I wanted to use Electron in the first place</li>
</ul>
<h1>What it Would Take</h1>
<p>Much as I don't like Javascript, the tools are too shaky for me to recommend mixing Electron and
WASM at the moment. There's a lot of innovation happening, so who knows? Someone might have an
application in production a couple months from now. But at the moment, I'm personally going to stay
away.</p>
<p>Let's finish with a wishlist then - here are the things that I think need to happen before
Electron/WASM/Rust can become a thing:</p>
<ul>
<li>Webpack still needs some updates. The necessary work is in progress, but hasn't landed yet</li>
<li>Browser API libraries (<code>web-sys</code> and <code>stdweb</code>) need to make sure they can support running in
Electron (see module error above)</li>
<li>Projects need to stabilize. There's talk of <code>stdweb</code> being turned into a Rust API
<ahref="https://github.com/rustwasm/team/issues/226#issuecomment-418475778"target="_blank"rel="noopener noreferrer">on top of web-sys</a>, and percy
<ahref="https://github.com/chinedufn/percy/issues/24"target="_blank"rel="noopener noreferrer">moving to web-sys</a>, both of which are big changes</li>
<li><code>wasm-bindgen</code> is great, but still in the "move fast and break things" phase</li>
<li>A good "boilerplate" app would dramatically simplify the start-up costs;
<ahref="https://github.com/chentsulin/electron-react-boilerplate"target="_blank"rel="noopener noreferrer">electron-react-boilerplate</a> comes to
mind as a good project to imitate</li>
<li>More blog posts/contributors! I think Electron + Rust could be cool, but I have no idea what I'm
doing</li>
</ul>]]></content:encoded>
</item>
<item>
<title><![CDATA[Primitives in Rust are weird (and cool)]]></title>
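<p>The program in question was a one-liner (reconstructed here from the punchline at the end of this post):</p>
<pre><code class="language-rust">fn main() {
    println!("{}", 8.to_string())
}
</code></pre>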
<p>And to my complete befuddlement, it compiled, ran, and produced a completely sensible output.</p>
<p>The reason I was so surprised has to do with how Rust treats a special category of things I'm going to
call <em>primitives</em>. In the current version of the Rust book, you'll see them referred to as
<ahref="https://doc.rust-lang.org/book/second-edition/ch03-02-data-types.html#scalar-types"target="_blank"rel="noopener noreferrer">scalars</a>, and in older versions they'll be called <ahref="https://doc.rust-lang.org/book/first-edition/primitive-types.html"target="_blank"rel="noopener noreferrer">primitives</a>, but
we're going to stick with the name <em>primitive</em> for the time being. Explaining why this program is so
cool requires talking about a number of other programming languages, and keeping a consistent
terminology makes things easier.</p>
<p><strong>You've been warned:</strong> this is going to be a tedious post about a relatively minor issue that
involves Java, Python, C, and x86 Assembly. And also me pretending like I know what I'm talking
about with assembly.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="defining-primitives-java">Defining primitives (Java)<ahref="https://speice.io/2018/09/primitives-in-rust-are-weird#defining-primitives-java"class="hash-link"aria-label="Direct link to Defining primitives (Java)"title="Direct link to Defining primitives (Java)"></a></h2>
<p>The reason I'm using the name <em>primitive</em> comes from how much of my life is Java right now. For the most part I like Java, but I digress. In Java, there's a special set of built-in types (<code>int</code>, <code>double</code>, <code>boolean</code>, and friends) that get treated differently from everything else.</p>
<p>They are referred to as <ahref="https://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html"target="_blank"rel="noopener noreferrer">primitives</a>. And relative to the other bits of Java,
they have two unique features. First, they don't have to worry about the indirection that comes with
objects: <code>Object</code> and things that inherit from it are pointers under the hood, and we have to dereference them before
the fields and methods they define can be used. Second, <em>primitive types are just values</em> -
there's nothing to be dereferenced. In memory, they're just a sequence of bits.</p>
<p>If we really want, we can turn the <code>int</code> into an
<ahref="https://docs.oracle.com/javase/10/docs/api/java/lang/Integer.html"target="_blank"rel="noopener noreferrer"><code>Integer</code></a> (a proper object that inherits <code>Object</code>) and then dereference it
at run time to locate its <code>toString()</code> function and call it. Rust obviously handles things a bit
differently, but we have to dig into the low-level details to see it in action.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="low-level-handling-of-primitives-c">Low Level Handling of Primitives (C)<ahref="https://speice.io/2018/09/primitives-in-rust-are-weird#low-level-handling-of-primitives-c"class="hash-link"aria-label="Direct link to Low Level Handling of Primitives (C)"title="Direct link to Low Level Handling of Primitives (C)"></a></h2>
<p>We first need to build a foundation for reading and understanding the assembly code the final answer
requires. Let's begin by showing how the <code>C</code> language (and your computer) thinks about "primitive"
values in memory.</p>
<p>At a really low level of memory, we're copying bits around using the <ahref="http://www.cs.virginia.edu/~evans/cs216/guides/x86.html"target="_blank"rel="noopener noreferrer"><code>mov</code></a> instruction;
nothing crazy. But to show how similar Rust is, let's look at the same program translated from C to Rust.</p>
<p>The generated Rust assembly is functionally pretty close to the C assembly: <em>When working with
primitives, we're just dealing with bits in memory</em>.</p>
<p>In Java we have to dereference a pointer to call its functions; in Rust, there's no pointer to
dereference. So what exactly is going on with this <code>.to_string()</code> function call?</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="impl-primitive-and-python">impl primitive (and Python)<ahref="https://speice.io/2018/09/primitives-in-rust-are-weird#impl-primitive-and-python"class="hash-link"aria-label="Direct link to impl primitive (and Python)"title="Direct link to impl primitive (and Python)"></a></h2>
<p>Now it's time to <strike>reveal my trap card</strike> show the revelation that tied all this
together: <em>Rust has implementations for its primitive types.</em> That's right, <code>impl</code> blocks aren't
only for <code>structs</code> and <code>traits</code>, primitives get them too. Don't believe me? Check out the standard library docs
for <ahref="https://doc.rust-lang.org/std/primitive.u32.html"target="_blank"rel="noopener noreferrer"><code>u32</code></a>, <ahref="https://doc.rust-lang.org/std/primitive.f64.html"target="_blank"rel="noopener noreferrer"><code>f64</code></a> and <ahref="https://doc.rust-lang.org/std/primitive.char.html"target="_blank"rel="noopener noreferrer"><code>char</code></a>.</p>
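<p>And it's not just documentation - those methods are ordinary, callable code:</p>
<pre><code class="language-rust">fn main() {
    let x: u32 = 8;
    // count_ones() is an inherent method from an `impl u32` block in the
    // standard library; to_string() arrives via the ToString trait
    println!("{}", x.count_ones());
    println!("{}", x.to_string());
}
</code></pre>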
<p>So while Python handles binding instance methods in a way similar to Rust, it's still not able to
run the example we started with.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="conclusion">Conclusion<ahref="https://speice.io/2018/09/primitives-in-rust-are-weird#conclusion"class="hash-link"aria-label="Direct link to Conclusion"title="Direct link to Conclusion"></a></h2>
<p>This was a super-roundabout way of demonstrating it, but the way Rust handles incredibly minor
details like primitives leads to really cool effects. Primitives are optimized like C in how they
have a space-efficient memory layout, yet the language still has a lot of features I enjoy in Python
(like both instance and late binding).</p>
<p>And when you put it together, there are areas where Rust does cool things nobody else can; as a
quirky feature of Rust's type system, <code>8.to_string()</code> is actually valid code.</p>
<p>Now go forth and fool your friends into thinking you know assembly. This is all I've got.</p>]]></content:encoded>
</item>
<item>
<title><![CDATA[What I learned porting dateutil to Rust]]></title>
<description><![CDATA[I've mostly been a lurker in Rust for a while, making a couple small contributions here and there.]]></description>
<content:encoded><![CDATA[<p>I've mostly been a lurker in Rust for a while, making a couple small contributions here and there.
So launching <ahref="https://github.com/bspeice/dtparse"target="_blank"rel="noopener noreferrer">dtparse</a> feels like nice step towards becoming a
functioning member of society. But not too much, because then you know people start asking you to
pay bills, and ain't nobody got time for that.</p>
<p>But I built dtparse, and you can read about my thoughts on the process. Or don't. I won't tell you
what to do with your life (but you should totally keep reading).</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="slow-down-what">Slow down, what?<ahref="https://speice.io/2018/06/dateutil-parser-to-rust#slow-down-what"class="hash-link"aria-label="Direct link to Slow down, what?"title="Direct link to Slow down, what?"></a></h2>
<p>OK, fine, I guess I should start with <em>why</em> someone would do this.</p>
<p><ahref="https://github.com/dateutil/dateutil"target="_blank"rel="noopener noreferrer">Dateutil</a> is a Python library for handling dates. The
standard library support for time in Python is kinda dope, but there are a lot of extras that go
into making it useful beyond just the <ahref="https://docs.python.org/3.6/library/datetime.html"target="_blank"rel="noopener noreferrer">datetime</a>
module. <code>dateutil.parser</code> specifically is code to take all the super-weird time formats people come
up with and turn them into something actually useful.</p>
<p>Date/time parsing, it turns out, is just like everything else involving
<ahref="https://infiniteundo.com/post/25326999628/falsehoods-programmers-believe-about-time"target="_blank"rel="noopener noreferrer">computers</a> and
<ahref="https://infiniteundo.com/post/25509354022/more-falsehoods-programmers-believe-about-time"target="_blank"rel="noopener noreferrer">time</a>: it
feels like it shouldn't be that difficult to do, until you try to do it, and you realize that people
suck and this is why
<ahref="https://zachholman.com/talk/utc-is-enough-for-everyone-right"target="_blank"rel="noopener noreferrer">we can't we have nice things</a>. But
alas, we'll try and make contemporary art out of the rubble and give it a pretentious name.</p>
<p>That's where <code>dateutil.parser</code> comes in:
it takes in the time as a string, and gives you back a reasonable "look, this is the best anyone can
possibly do to make sense of your input" value. It doesn't expect much of you.</p>
<p><ahref="https://github.com/bspeice/dtparse/blob/7d565d3a78876dbebd9711c9720364fe9eba7915/src/lib.rs#L1332"target="_blank"rel="noopener noreferrer">And now it's in Rust.</a></p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="lost-in-translation">Lost in Translation<ahref="https://speice.io/2018/06/dateutil-parser-to-rust#lost-in-translation"class="hash-link"aria-label="Direct link to Lost in Translation"title="Direct link to Lost in Translation"></a></h2>
<p>Having worked at a bulge-bracket bank watching Java programmers try to be Python programmers, I'm
admittedly hesitant to publish Python code that's trying to be Rust. Interestingly, Rust code can
actually do a great job of mimicking Python. It's certainly not idiomatic Rust, but I've had better
luck than others who attempted the same thing for D. These are the actual take-aways:</p>
<p>When transcribing code, <strong>stay as close to the original library as possible</strong>. I'm talking about
using the same variable names, same access patterns, the whole shebang. It's way too easy to make a
couple of typos, and all of a sudden your code blows up in new and exciting ways. Having a reference
manual for verbatim what your code should be means that you don't spend that long debugging
complicated logic, you're more looking for typos.</p>
<p>Also, <strong>don't use nice Rust things like enums</strong>. While
<ahref="https://github.com/bspeice/dtparse/blob/7d565d3a78876dbebd9711c9720364fe9eba7915/src/lib.rs#L88-L94"target="_blank"rel="noopener noreferrer">one time it worked out OK for me</a>,
I also managed to shoot myself in the foot a couple times because <code>dateutil</code> stores AM/PM as a
boolean and I mixed up which was true, and which was false (side note: AM is false, PM is true). In
general, writing nice code <em>should not be a first-pass priority</em> when you're just trying to recreate
the same functionality.</p>
<p><strong>Exceptions are a pain.</strong> Make peace with it. Python code is just allowed to skip stack frames. So
when a co-worker told me "Rust is getting try-catch syntax" I properly freaked out. Turns out
<ahref="https://github.com/rust-lang/rfcs/pull/243"target="_blank"rel="noopener noreferrer">he's not quite right</a>, and I'm OK with that. And while
<code>dateutil</code> is pretty well-behaved about not skipping multiple stack frames, translating its
exception handling into Rust's <code>Result</code> types still took some care.</p>
<p><strong>Whitespace matters.</strong> I used to think that Python's whitespace was just there to get you to format your code correctly. I
think that no longer. It's way too easy to close a block too early and have incredibly weird issues
in the logic. Make sure you use an editor that displays indentation levels so you can keep things
straight.</p>
<p><strong>Rust macros are not free.</strong> I originally had the
<ahref="https://github.com/bspeice/dtparse/blob/b0e737f088eca8e83ab4244c6621a2797d247697/tests/compat.rs#L63-L217"target="_blank"rel="noopener noreferrer">main test body</a>
wrapped up in a macro using <ahref="https://github.com/PyO3/PyO3"target="_blank"rel="noopener noreferrer">pyo3</a>. It took two minutes to compile.
After
<ahref="https://github.com/bspeice/dtparse/blob/e017018295c670e4b6c6ee1cfff00dbb233db47d/tests/compat.rs#L76-L205"target="_blank"rel="noopener noreferrer">moving things to a function</a>
compile times dropped down to ~5 seconds. Turns out 150 lines * 100 tests = a lot of redundant code
to be compiled. My new rule of thumb is that any macros longer than 10-15 lines are actually
functions that need to be liberated, man.</p>
<p>Finally, <strong>I really miss list comprehensions and dictionary comprehensions.</strong> As a quick comparison,
<ahref="https://github.com/bspeice/dtparse/blob/7d565d3a78876dbebd9711c9720364fe9eba7915/src/lib.rs#L619-L629"target="_blank"rel="noopener noreferrer">the implementation in Rust</a>.
I probably wrote it wrong, and I'm sorry. Ultimately though, I hope that these comprehensions can be
added through macros or syntax extensions. Either way, they're expressive, save typing, and are
super-readable. Let's get more of that.</p>
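<p>For a sense of the trade, here's a toy side-by-side (not the <code>dtparse</code> code itself), with the
Python comprehension in a comment:</p>
<pre><code class="language-rust">fn main() {
    let names = vec!["alice", "bob", "carol"];

    // Python: [n.upper() for n in names if n != "bob"]
    let shouty: Vec&lt;String&gt; = names
        .iter()
        .filter(|n| **n != "bob")
        .map(|n| n.to_uppercase())
        .collect();

    println!("{:?}", shouty);
}
</code></pre>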
<h2class="anchor anchorWithStickyNavbar_LWe7"id="using-a-young-language">Using a young language<ahref="https://speice.io/2018/06/dateutil-parser-to-rust#using-a-young-language"class="hash-link"aria-label="Direct link to Using a young language"title="Direct link to Using a young language"></a></h2>
<p>Now, Rust is exciting and new, which means that there's opportunity to make a substantive impact. On
more than one occasion though, I've had issues navigating the Rust ecosystem.</p>
<p>What I'll call the "canonical library" is still being built. In Python, if you need datetime
parsing, you use <code>dateutil</code>. If you want <code>decimal</code> types, it's already in the
<ahref="https://docs.python.org/3.6/library/decimal.html"target="_blank"rel="noopener noreferrer">standard library</a>. While I might've gotten away
with <code>f64</code>, <code>dateutil</code> uses decimals, and I wanted to follow the principle of <strong>staying as close to
the original library as possible</strong>. Thus began my quest to find a decimal library in Rust. What I
quickly found was summarized in a comment:</p>
<blockquote>
<p>Writing a BigDecimal is easy. Writing a <em>good</em> BigDecimal is hard.</p>
<ahref="https://github.com/rust-num/num/issues/8"target="_blank"rel="noopener noreferrer">threads</a> to figure out if the library I'm look at is dead
or just stable.</p>
<p>And even when the "canonical library" exists, there's no guarantees that it will be well-maintained.
<ahref="https://github.com/chronotope/chrono"target="_blank"rel="noopener noreferrer">Chrono</a> is the <em>de facto</em> date/time library in Rust, and just
released version 0.4.4 like two days ago. Meanwhile,
<ahref="https://github.com/chronotope/chrono-tz"target="_blank"rel="noopener noreferrer">chrono-tz</a> appears to be dead in the water even though
<ahref="https://github.com/chronotope/chrono-tz/issues/19"target="_blank"rel="noopener noreferrer">there are people happy to help maintain it</a>. I
know relatively little about it, but it appears that most of the release process is automated;
keeping that up to date should be a no-brainer.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="trial-maintenance-policy">Trial Maintenance Policy<ahref="https://speice.io/2018/06/dateutil-parser-to-rust#trial-maintenance-policy"class="hash-link"aria-label="Direct link to Trial Maintenance Policy"title="Direct link to Trial Maintenance Policy"></a></h2>
<p>To head off that same issue, I'm going to try out the following policy to keep things moving on <code>dtparse</code>:</p>
<ol>
<li>
<p>Issues/PRs needing <em>maintainer</em> feedback will be updated at least weekly. I want to make sure
nobody's blocking on me.</p>
</li>
<li>
<p>To keep issues/PRs needing <em>contributor</em> feedback moving, I'm going to (kindly) ask the
contributor to check in after two weeks, and close the issue without resolution if I hear nothing
back after a month.</p>
</li>
</ol>
<p>The second point I think has the potential to be a bit controversial, so I'm happy to receive
feedback on that. And if a contributor responds with "hey, still working on it, had a kid and I'm
running on 30 seconds of sleep a night," then first: congratulations on sustaining human life. And
second: I don't mind keeping those requests going indefinitely. I just want to try and balance
keeping things moving with giving people the necessary time they need.</p>
<p>I should also note that I'm still getting some best practices in place - CONTRIBUTING and
CONTRIBUTORS files need to be added, as well as issue/PR templates. In progress. None of us are
perfect.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id="roadmap-and-conclusion">Roadmap and Conclusion<ahref="https://speice.io/2018/06/dateutil-parser-to-rust#roadmap-and-conclusion"class="hash-link"aria-label="Direct link to Roadmap and Conclusion"title="Direct link to Roadmap and Conclusion"></a></h2>
<p>So if I've now built a <code>dateutil</code>-compatible parser, we're done, right? Of course not! That's not
nearly ambitious enough.</p>
<p>Ultimately, I'd love to have a library that's capable of parsing everything the Linux <code>date</code> command
can do (and not <code>date</code> on OSX, because seriously, BSD coreutils are the worst). I know Rust has a
coreutils rewrite going on, and <code>dtparse</code> would potentially be an interesting candidate since it
doesn't bring in a lot of extra dependencies. <ahref="https://crates.io/crates/humantime"target="_blank"rel="noopener noreferrer"><code>humantime</code></a>
could help pick up some of the (current) slack in dtparse, so maybe we can share and care with each
other?</p>
<p>All in all, I'm mostly hoping that nobody's already done this and I haven't spent a bit over a month
on redundant code. So if it exists, tell me. I need to know, but be nice about it, because I'm going
to take it hard.</p>
<p>And in the mean time, I'm looking forward to building more. Onwards.</p>]]></content:encoded>