<!doctype html><htmllang=endir=ltrclass="blog-wrapper blog-post-page plugin-blog plugin-id-default"data-has-hydrated=false><metacharset=UTF-8><metaname=generatorcontent="Docusaurus v3.7.0"><titledata-rh=true>Binary format shootout | The Old Speice Guy</title><metadata-rh=truename=viewportcontent="width=device-width, initial-scale=1.0"><metadata-rh=truename=twitter:cardcontent=summary_large_image><metadata-rh=trueproperty=og:urlcontent=https://speice.io/2019/09/binary-format-shootout/><metadata-rh=trueproperty=og:localecontent=en><metadata-rh=truename=docusaurus_localecontent=en><metadata-rh=truename=docusaurus_tagcontent=default><metadata-rh=truename=docsearch:languagecontent=en><metadata-rh=truename=docsearch:docusaurus_tagcontent=default><metadata-rh=trueproperty=og:titlecontent="Binary format shootout | The Old Speice Guy"><metadata-rh=truename=descriptioncontent="I've found that in many personal projects,"><metadata-rh=trueproperty=og:descriptioncontent="I've found that in many personal projects,"><metadata-rh=trueproperty=og:typecontent=article><metadata-rh=trueproperty=article:published_timecontent=2019-09-28T12:00:00.000Z><linkdata-rh=truerel=iconhref=/img/favicon.ico><linkdata-rh=truerel=canonicalhref=https://speice.io/2019/09/binary-format-shootout/><linkdata-rh=truerel=alternatehref=https://speice.io/2019/09/binary-format-shootout/hreflang=en><linkdata-rh=truerel=alternatehref=https://speice.io/2019/09/binary-format-shootout/hreflang=x-default><scriptdata-rh=truetype=application/ld+json>{"@context":"https://schema.org","@id":"https://speice.io/2019/09/binary-format-shootout","@type":"BlogPosting","author":{"@type":"Person","name":"Bradlee Speice"},"dateModified":"2024-11-10T03:06:23.000Z","datePublished":"2019-09-28T12:00:00.000Z","description":"I've found that in many personal projects,","headline":"Binary format 
shootout","isPartOf":{"@id":"https://speice.io/","@type":"Blog","name":"Blog"},"keywords":[],"mainEntityOfPage":"https://speice.io/2019/09/binary-format-shootout","name":"Binary format shootout","url":"https://speice.io/2019/09/binary-format-shootout"}</script><linkrel=alternatetype=application/rss+xmlhref=/rss.xmltitle="The Old Speice Guy RSS Feed"><linkrel=alternatetype=application/atom+xmlhref=/atom.xmltitle="The Old Speice Guy Atom Feed"><linkrel=stylesheethref=/katex/katex.min.csstype=text/css><linkrel=stylesheethref=/assets/css/styles.24ac2c37.css><scriptsrc=/assets/js/runtime~main.75ada3c5.jsdefer></script><scriptsrc=/assets/js/main.d0bb06d2.jsdefer></script><bodyclass=navigation-with-keyboard><script>!function(){vart,e=function(){try{returnnewURLSearchParams(window.location.search).get("docusaurus-theme")}catch(t){}}()||function(){try{returnwindow.localStorage.getItem("theme")}catch(t){}}();t=null!==e?e:"light",document.documentElement.setAttribute("data-theme",t)}(),function(){try{for(var[t,e]ofnewURLSearchParams(window.location.search).entries())if(t.startsWith("docusaurus-data-")){vara=t.replace("docusaurus-data-","data-");document.documentElement.setAttribute(a,e)}}catch(t){}}()</script><divid=__docusaurus><divrole=regionaria-label="Skip to main content"><aclass=skipToContent_fXgnhref=#__docusaurus_skipToContent_fallback>Skip to main content</a></div><navaria-label=Mainclass="navbar navbar--fixed-top"><divclass=navbar__inner><divclass=navbar__items><buttonaria-label="Toggle navigation bar"aria-expanded=falseclass="navbar__toggle clean-btn"type=button><svgwidth=30height=30viewBox="0 0 30 30"aria-hidden=true><pathstroke=currentColorstroke-linecap=roundstroke-miterlimit=10stroke-width=2d="M4 7h22M4 15h22M4 23h22"/></svg></button><aclass=navbar__brandhref=/><divclass=navbar__logo><imgsrc=/img/logo.svgalt="Sierpinski Gasket"class="themedComponent_mlkZ themedComponent--light_NVdE"><imgsrc=/img/logo-dark.svgalt="Sierpinski Gasket"class="themedComponent_mlkZ 
themedComponent--dark_xIcU"></div><bclass="navbar__title text--truncate">The Old Speice Guy</b></a></div><divclass="navba
<a href=https://en.wikipedia.org/wiki/Analysis_paralysis target=_blank rel="noopener noreferrer">analysis paralysis</a> is particularly deadly.
Making good decisions in the beginning avoids pain and suffering later; if extra research prevents
future problems, I'm happy to continue <del>procrastinating</del> researching indefinitely.</p>
<p>So let's say you're in need of a binary serialization format. Data will be going over the network,
not just in memory, so having a schema document and code generation is a must. Performance is
crucial, so formats that support zero-copy de/serialization are given priority. And the more
languages supported, the better; I use Rust, but can't predict what other languages this could
interact with.</p>
<p>Given these requirements, the candidates I could find were:</p>
<ol>
<li><a href=https://capnproto.org/ target=_blank rel="noopener noreferrer">Cap'n Proto</a> has been around the longest, and is the most established</li>
<li><a href=https://google.github.io/flatbuffers/ target=_blank rel="noopener noreferrer">Flatbuffers</a> is the newest, and claims to have a simpler
encoding</li>
<li><a href=https://github.com/real-logic/simple-binary-encoding target=_blank rel="noopener noreferrer">Simple Binary Encoding</a> has the simplest
encoding, but the Rust implementation is unmaintained</li>
</ol>
<p>Any one of these will satisfy the project requirements: easy to transmit over a network, reasonably
fast, and polyglot support. But how do you actually pick one? It's impossible to know what issues
will follow that choice, so I tend to avoid commitment until the last possible moment.</p>
<p>Still, a choice must be made. Instead of worrying about which is "the best," I decided to build a
small proof-of-concept system in each format and pit them against each other. All code can be found
in the <a href=https://github.com/speice-io/marketdata-shootout target=_blank rel="noopener noreferrer">repository</a> for this post.</p>
<p>We'll go into more detail later, but here is a quick preview of the results:</p>
<ul>
<li>Cap'n Proto: Theoretically performs incredibly well, but the implementation had issues</li>
<li>Flatbuffers: Has some quirks, but largely lived up to its "zero-copy" promises</li>
<li>SBE: Best median and worst-case performance, but the message structure has a limited feature set</li>
</ul>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id=prologue-binary-parsing-with-nom>Prologue: Binary Parsing with Nom<a href=#prologue-binary-parsing-with-nom class=hash-link aria-label="Direct link to Prologue: Binary Parsing with Nom" title="Direct link to Prologue: Binary Parsing with Nom"></a></h2>
<p>Our benchmark system will be a simple data processor; given depth-of-book market data from
<a href=https://iextrading.com/trading/market-data/#deep target=_blank rel="noopener noreferrer">IEX</a>, serialize each message into the schema
format, read it back, and calculate total size of stock traded and the lowest/highest quoted prices.
This test isn't complex, but is representative of the project I need a binary format for.</p>
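<p>To make the aggregation step concrete, here is a minimal sketch of what "total size traded and lowest/highest quoted prices" could look like. The <code>Message</code> and <code>Summary</code> types are hypothetical, simplified stand-ins, not the types used in the repository.</p>

```rust
/// Hypothetical, simplified stand-ins for the IEX message types.
#[derive(Debug, Clone, Copy)]
enum Message {
    /// A trade of `size` shares.
    Trade { size: u64 },
    /// A quote at `price`.
    Quote { price: f64 },
}

/// Running totals accumulated while reading messages back.
#[derive(Debug, PartialEq)]
struct Summary {
    volume: u64,
    low: f64,
    high: f64,
}

impl Summary {
    fn new() -> Self {
        // Start with "no price seen yet" sentinels so the first quote sets both bounds.
        Summary { volume: 0, low: f64::INFINITY, high: f64::NEG_INFINITY }
    }

    fn update(&mut self, msg: &Message) {
        match *msg {
            Message::Trade { size } => self.volume += size,
            Message::Quote { price } => {
                self.low = self.low.min(price);
                self.high = self.high.max(price);
            }
        }
    }
}

/// Fold a batch of messages into a single summary.
fn summarize(messages: &[Message]) -> Summary {
    let mut summary = Summary::new();
    for msg in messages {
        summary.update(msg);
    }
    summary
}
```

<p>Because the same fold runs regardless of which format produced the messages, it works as a fixed workload for comparing the formats themselves.</p>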
<p>But before we make it to that point, we have to actually read in the market data. To do so, I'm
using a library called <a href=https://github.com/Geal/nom target=_blank rel="noopener noreferrer"><code>nom</code></a>. Version 5.0 was recently released and
brought some big changes, so this was an opportunity to build a non-trivial program and get
familiar.</p>
<p>If you don't already know about <code>nom</code>, it's a parser combinator library. By combining different smaller
parsers, you can assemble a parser to handle complex structures without writing tedious code by
hand.</p>
<p>Ultimately, because the <code>nom</code> code in this shootout was the same for all formats, we're not too
interested in its performance. Still, it's worth mentioning that building the market data parser was
actually fun; I didn't have to write tons of boring code by hand.</p>
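<p>For readers who haven't seen the style, the idea of building a large parser out of small ones can be sketched without any dependencies. This is not <code>nom</code>'s API; the names <code>ParseResult</code>, <code>pair</code>, and <code>Header</code> are made up for illustration, and the header layout is an assumption.</p>

```rust
/// On success: the remaining input plus the parsed value. `None` means failure.
type ParseResult<'a, T> = Option<(&'a [u8], T)>;

/// Small building block: read a little-endian u16.
fn le_u16<'a>(input: &'a [u8]) -> ParseResult<'a, u16> {
    if input.len() < 2 {
        return None;
    }
    let (head, rest) = input.split_at(2);
    Some((rest, u16::from_le_bytes([head[0], head[1]])))
}

/// Small building block: read a little-endian u32.
fn le_u32<'a>(input: &'a [u8]) -> ParseResult<'a, u32> {
    if input.len() < 4 {
        return None;
    }
    let (head, rest) = input.split_at(4);
    Some((rest, u32::from_le_bytes([head[0], head[1], head[2], head[3]])))
}

/// Combinator: run two parsers back to back and return both results.
fn pair<'a, A, B>(
    first: impl Fn(&'a [u8]) -> ParseResult<'a, A>,
    second: impl Fn(&'a [u8]) -> ParseResult<'a, B>,
) -> impl Fn(&'a [u8]) -> ParseResult<'a, (A, B)> {
    move |input| {
        let (rest, a) = first(input)?;
        let (rest, b) = second(rest)?;
        Some((rest, (a, b)))
    }
}

/// Hypothetical record header: a message kind followed by a length.
#[derive(Debug, PartialEq)]
struct Header {
    kind: u16,
    length: u32,
}

/// A larger parser assembled from the pieces above.
fn header<'a>(input: &'a [u8]) -> ParseResult<'a, Header> {
    let (rest, (kind, length)) = pair(le_u16, le_u32)(input)?;
    Some((rest, Header { kind, length }))
}
```

<p>Each parser returns the unconsumed input, so parsers chain naturally and a failure anywhere in the chain short-circuits the whole parse.</p>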
<h2 class="anchor anchorWithStickyNavbar_LWe7" id=capn-proto>Cap'n Proto<a href=#capn-proto class=hash-link aria-label="Direct link to Cap'n Proto" title="Direct link to Cap'n Proto"></a></h2>
<p>Now it's time to get into the meaty part of the story. Cap'n Proto was the first format I tried
because of how long it has supported Rust (thanks to <a href=https://github.com/dwrensha target=_blank rel="noopener noreferrer">dwrensha</a> for
maintaining the Rust port since
<a href=https://github.com/capnproto/capnproto-rust/releases/tag/rustc-0.10 target=_blank rel="noopener noreferrer">2014!</a>). However, I had a ton
of performance concerns once I started using it.</p>
<p>To serialize new messages, Cap'n Proto uses a "builder" object. This builder allocates memory on the
heap to hold the message content, but because builders
<a href=https://github.com/capnproto/capnproto-rust/issues/111 target=_blank rel="noopener noreferrer">can't be re-used</a>, we have to allocate a
new buffer for every single message. I was able to work around this with a
<a href=https://doc.rust-lang.org/std/mem/fn.transmute.html target=_blank rel="noopener noreferrer"><code>std::mem::transmute</code></a> to bypass Rust's borrow
checker.</p>
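<p>The cost being avoided here is one heap allocation per message. As a generic illustration (this is plain <code>Vec</code> code, not the Cap'n Proto builder API, and the message layout is an assumption), reusing a single scratch buffer looks like this:</p>

```rust
/// Hypothetical encoder: writes a (price, size) pair as little-endian bytes.
fn encode_into(buf: &mut Vec<u8>, price: u64, size: u32) {
    // `clear` drops the old contents but keeps the existing allocation.
    buf.clear();
    buf.extend_from_slice(&price.to_le_bytes());
    buf.extend_from_slice(&size.to_le_bytes());
}

/// Encode every message through one scratch buffer.
/// Returns the total bytes produced and the buffer's final capacity.
fn encode_all(messages: &[(u64, u32)]) -> (usize, usize) {
    // Allocated once, up front.
    let mut scratch = Vec::with_capacity(64);
    let mut total = 0;
    for &(price, size) in messages {
        encode_into(&mut scratch, price, size);
        // A real system would write `scratch` to a socket or file here.
        total += scratch.len();
    }
    (total, scratch.capacity())
}
```

<p>As long as no message outgrows the initial capacity, the loop never touches the allocator after the first line of <code>encode_all</code>.</p>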
<p>The process of reading messages was better, but still had issues. Cap'n Proto has two message
encodings: a <a href=https://capnproto.org/encoding.html#packing target=_blank rel="noopener noreferrer">"packed"</a> representation, and an
"unpacked" version. When reading "packed" messages, we need a buffer to unpack the message into
before we can use it; Cap'n Proto allocates a new buffer for each message we unpack, and I wasn't
able to figure out a way around that. In contrast, the unpacked message format should be where Cap'n
Proto shines; its main selling point is that there's <a href=https://capnproto.org/ target=_blank rel="noopener noreferrer">no decoding step</a>.
However, accomplishing zero-copy deserialization required code in the private API
(<a href=https://github.com/capnproto/capnproto-rust/issues/148 target=_blank rel="noopener noreferrer">since fixed</a>), and we allocate a vector on
every read for the segment table.</p>
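<p>"No decoding step" means fields are read straight out of the received bytes instead of being copied into an owned struct first. A minimal sketch of that idea, using a hypothetical fixed layout rather than Cap'n Proto's actual wire format:</p>

```rust
/// Hypothetical fixed layout (an assumption for illustration):
/// bytes 0..8 hold the price as a little-endian u64,
/// bytes 8..12 hold the size as a little-endian u32.
struct TradeView<'a> {
    bytes: &'a [u8],
}

impl<'a> TradeView<'a> {
    const LEN: usize = 12;

    /// Wrap a buffer without copying it; only the length is checked.
    fn new(bytes: &'a [u8]) -> Option<Self> {
        if bytes.len() >= Self::LEN {
            Some(TradeView { bytes })
        } else {
            None
        }
    }

    /// Decode the price on demand from the borrowed bytes.
    fn price(&self) -> u64 {
        let mut raw = [0u8; 8];
        raw.copy_from_slice(&self.bytes[0..8]);
        u64::from_le_bytes(raw)
    }

    /// Decode the size on demand from the borrowed bytes.
    fn size(&self) -> u32 {
        let mut raw = [0u8; 4];
        raw.copy_from_slice(&self.bytes[8..12]);
        u32::from_le_bytes(raw)
    }
}
```

<p>Constructing the view allocates nothing, and fields that are never read are never decoded.</p>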
<p>In the end, I put in significant work to make Cap'n Proto as fast as possible, but there were too
many issues for me to feel comfortable using it long-term.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id=flatbuffers>Flatbuffers<a href=#flatbuffers class=hash-link aria-label="Direct link to Flatbuffers" title="Direct link to Flatbuffers"></a></h2>
<p>This is the new kid on the block. After a
<a href=https://github.com/google/flatbuffers/pull/3894 target=_blank rel="noopener noreferrer">first attempt</a> didn't pan out, official support
was <a href=https://github.com/google/flatbuffers/pull/4898 target=_blank rel="noopener noreferrer">recently launched</a>. Flatbuffers intends to
address the same problems as Cap'n Proto: high-performance, polyglot, binary messaging. The
difference is that Flatbuffers claims to have a simpler wire format and
in a <code>SmallVec</code> before building the final <code>MultiMessage</code>, but it was a painful process that I
believe contributed to poor serialization performance.</p>
<p>Second, streaming support in Flatbuffers seems to be something of an
<a href=https://github.com/google/flatbuffers/issues/3898 target=_blank rel="noopener noreferrer">afterthought</a>. Where Cap'n Proto in Rust handles
reading messages from a stream as part of the API, Flatbuffers just sticks a <code>u32</code> at the front of
each message to indicate the size. That isn't a problem in itself, but calculating message size without
that tag is nigh on impossible.</p>
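<p>The size-prefix approach described above is ordinary length-prefixed framing. A minimal sketch, assuming a little-endian <code>u32</code> prefix; <code>write_frame</code> and <code>read_frame</code> are illustrative names, not Flatbuffers functions:</p>

```rust
/// Append `payload` to `out`, prefixed with its length as a little-endian u32.
fn write_frame(out: &mut Vec<u8>, payload: &[u8]) {
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(payload);
}

/// Split the next frame off the front of `stream`.
/// Returns `(payload, remaining)`, or `None` if the frame is incomplete.
fn read_frame(stream: &[u8]) -> Option<(&[u8], &[u8])> {
    if stream.len() < 4 {
        return None;
    }
    let (prefix, rest) = stream.split_at(4);
    let len = u32::from_le_bytes([prefix[0], prefix[1], prefix[2], prefix[3]]) as usize;
    if rest.len() < len {
        return None;
    }
    Some(rest.split_at(len))
}
```

<p>Calling <code>read_frame</code> repeatedly on the remainder walks a stream one message at a time without parsing any message contents.</p>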
<p>Ultimately, I enjoyed using Flatbuffers, and had to do significantly less work to make it perform
well.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id=simple-binary-encoding>Simple Binary Encoding<a href=#simple-binary-encoding class=hash-link aria-label="Direct link to Simple Binary Encoding" title="Direct link to Simple Binary Encoding"></a></h2>
<p>Support for SBE was added by the author of one of my favorite
<a href=https://web.archive.org/web/20190427124806/https://polysync.io/blog/session-types-for-hearty-codecs/ target=_blank rel="noopener noreferrer">Rust blog posts</a>.
However, if you don't need union types, and can accept that schemas are XML documents, it's still
worth using. SBE's implementation had the best streaming support of all formats I tested, and
doesn't trigger allocation during de/serialization.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id=results>Results<a href=#results class=hash-link aria-label="Direct link to Results" title="Direct link to Results"></a></h2>
<a href=https://github.com/speice-io/marketdata-shootout/blob/master/src/sbe_runner.rs target=_blank rel="noopener noreferrer">format</a>, it was
time to actually take them for a spin. I used
<a href=https://github.com/speice-io/marketdata-shootout/blob/master/run_shootout.sh target=_blank rel="noopener noreferrer">this script</a> to run
the benchmarks, and the raw results are
<a href=https://github.com/speice-io/marketdata-shootout/blob/master/shootout.csv target=_blank rel="noopener noreferrer">here</a>. All data reported
below is the average of 10 runs on a single day of IEX data. Results were validated to make sure
that each format parsed the data correctly.</p>
<h3 class="anchor anchorWithStickyNavbar_LWe7" id=serialization>Serialization<a href=#serialization class=hash-link aria-label="Direct link to Serialization" title="Direct link to Serialization"></a></h3>
<p>This test measures how long it takes to serialize the IEX message into the desired format and write it to a pre-allocated
buffer.</p>
<table><thead>
<tr><th style=text-align:left>Schema</th><th style=text-align:left>Median</th><th style=text-align:left>99th Pctl</th><th style=text-align:left>99.9th Pctl</th><th style=text-align:left>Total</th></tr>
</thead><tbody>
<tr><td style=text-align:left>Cap'n Proto Packed</td><td style=text-align:left>413ns</td><td style=text-align:left>1751ns</td><td style=text-align:left>2943ns</td><td style=text-align:left>14.80s</td></tr>
<tr><td style=text-align:left>Cap'n Proto Unpacked</td><td style=text-align:left>273ns</td><td style=text-align:left>1828ns</td><td style=text-align:left>2836ns</td><td style=text-align:left>10.65s</td></tr>
<tr><td style=text-align:left>Flatbuffers</td><td style=text-align:left>355ns</td><td style=text-align:left>2185ns</td><td style=text-align:left>3497ns</td><td style=text-align:left>14.31s</td></tr>
<tr><td style=text-align:left>SBE</td><td style=text-align:left>91ns</td><td style=text-align:left>1535ns</td><td style=text-align:left>2423ns</td><td style=text-align:left>3.91s</td></tr>
</tbody></table>
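<p>For reference, median and percentile columns like the ones above can be derived from per-message timings roughly as follows. This is a sketch using the nearest-rank method, which is an assumption and not necessarily how the repository computes its numbers; <code>time_ns</code>, <code>percentile</code>, and <code>latency_summary</code> are illustrative names.</p>

```rust
use std::time::Instant;

/// Time a single operation in nanoseconds (hypothetical helper).
fn time_ns<F: FnMut()>(mut op: F) -> u64 {
    let start = Instant::now();
    op();
    start.elapsed().as_nanos() as u64
}

/// Nearest-rank percentile over a slice sorted in ascending order.
/// `per_mille` is the percentile in tenths of a percent
/// (500 = median, 990 = 99th, 999 = 99.9th).
fn percentile(sorted: &[u64], per_mille: usize) -> Option<u64> {
    if sorted.is_empty() || per_mille == 0 || per_mille > 1000 {
        return None;
    }
    // Integer ceiling of per_mille * n / 1000, avoiding floating-point rounding.
    let rank = (per_mille * sorted.len() + 999) / 1000;
    Some(sorted[rank - 1])
}

/// Returns (median, 99th percentile, 99.9th percentile) for a set of samples.
fn latency_summary(mut samples: Vec<u64>) -> Option<(u64, u64, u64)> {
    samples.sort_unstable();
    Some((
        percentile(&samples, 500)?,
        percentile(&samples, 990)?,
        percentile(&samples, 999)?,
    ))
}
```

<p>In use, each serialize call would be wrapped in <code>time_ns</code>, the results collected into a <code>Vec</code>, and that <code>Vec</code> passed to <code>latency_summary</code> once the run finishes.</p>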
<h3 class="anchor anchorWithStickyNavbar_LWe7" id=deserialization>Deserialization<a href=#deserialization class=hash-link aria-label="Direct link to Deserialization" title="Direct link to Deserialization"></a></h3>
<p>This test measures how long it takes to read the previously-serialized message and perform some basic aggregation. The
aggregation code is the same for each format, so any performance differences are due solely to the
format implementation.</p>
<table><thead>
<tr><th style=text-align:left>Schema</th><th style=text-align:left>Median</th><th style=text-align:left>99th Pctl</th><th style=text-align:left>99.9th Pctl</th><th style=text-align:left>Total</th></tr>
</thead><tbody>
<tr><td style=text-align:left>Cap'n Proto Packed</td><td style=text-align:left>539ns</td><td style=text-align:left>1216ns</td><td style=text-align:left>2599ns</td><td style=text-align:left>18.92s</td></tr>
<tr><td style=text-align:left>Cap'n Proto Unpacked</td><td style=text-align:left>366ns</td><td style=text-align:left>737ns</td><td style=text-align:left>1583ns</td><td style=text-align:left>12.32s</td></tr>
<tr><td style=text-align:left>Flatbuffers</td><td style=text-align:left>173ns</td><td style=text-align:left>421ns</td><td style=text-align:left>1007ns</td><td style=text-align:left>6.00s</td></tr>
<tr><td style=text-align:left>SBE</td><td style=text-align:left>116ns</td><td style=text-align:left>286ns</td><td style=text-align:left>659ns</td><td style=text-align:left>4.05s</td></tr>
</tbody></table>
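<p>One way to compare tail behaviour across formats is the ratio of the 99.9th percentile to the median. The sketch below uses the figures from the two tables above; the function names are illustrative.</p>

```rust
/// (format, median ns, 99.9th percentile ns), copied from the serialization table.
const SERIALIZATION: [(&str, u32, u32); 4] = [
    ("Cap'n Proto Packed", 413, 2943),
    ("Cap'n Proto Unpacked", 273, 2836),
    ("Flatbuffers", 355, 3497),
    ("SBE", 91, 2423),
];

/// (format, median ns, 99.9th percentile ns), copied from the deserialization table.
const DESERIALIZATION: [(&str, u32, u32); 4] = [
    ("Cap'n Proto Packed", 539, 2599),
    ("Cap'n Proto Unpacked", 366, 1583),
    ("Flatbuffers", 173, 1007),
    ("SBE", 116, 659),
];

/// How many times slower the 99.9th percentile is than the median.
fn tail_ratio(median_ns: u32, p999_ns: u32) -> f64 {
    f64::from(p999_ns) / f64::from(median_ns)
}

/// The format with the largest tail-to-median ratio, plus that ratio.
fn worst_tail(rows: &[(&'static str, u32, u32)]) -> Option<(&'static str, f64)> {
    rows.iter()
        .map(|&(name, median, p999)| (name, tail_ratio(median, p999)))
        .max_by(|a, b| a.1.total_cmp(&b.1))
}
```

<p>On the serialization figures SBE has by far the largest ratio (about 26.6, with Cap'n Proto Unpacked next at about 10.4). On the deserialization figures all four formats fall between roughly 4.3 and 5.8, with Flatbuffers marginally ahead of SBE.</p>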
<h2 class="anchor anchorWithStickyNavbar_LWe7" id=conclusion>Conclusion<a href=#conclusion class=hash-link aria-label="Direct link to Conclusion" title="Direct link to Conclusion"></a></h2>
<p>Building a benchmark turned out to be incredibly helpful in making a decision; because a "union"
type isn't important to me, I can be confident that SBE best addresses my needs.</p>
<p>While SBE was the fastest in terms of both median and worst-case performance, its worst-case
serialization time was proportionately far higher than any other format's: the 99.9th percentile is roughly 27x the median, versus 7x to 10x for the others. It seems that
de/serialization time scales with message size, but I'll need to do some more research to understand