<!doctype html><htmllang=endir=ltrclass="blog-wrapper blog-post-page plugin-blog plugin-id-default"data-has-hydrated=false><metacharset=UTF-8><metaname=generatorcontent="Docusaurus v3.6.1"><titledata-rh=true>Binary format shootout | The Old Speice Guy</title><metadata-rh=truename=viewportcontent="width=device-width,initial-scale=1.0"><metadata-rh=truename=twitter:cardcontent=summary_large_image><metadata-rh=trueproperty=og:urlcontent=https://speice.io/2019/09/binary-format-shootout><metadata-rh=trueproperty=og:localecontent=en><metadata-rh=truename=docusaurus_localecontent=en><metadata-rh=truename=docusaurus_tagcontent=default><metadata-rh=truename=docsearch:languagecontent=en><metadata-rh=truename=docsearch:docusaurus_tagcontent=default><metadata-rh=trueproperty=og:titlecontent="Binary format shootout | The Old Speice Guy"><metadata-rh=truename=descriptioncontent="I've found that in many personal projects,"><metadata-rh=trueproperty=og:descriptioncontent="I've found that in many personal projects,"><metadata-rh=trueproperty=og:typecontent=article><metadata-rh=trueproperty=article:published_timecontent=2019-09-28T12:00:00.000Z><linkdata-rh=truerel=iconhref=/img/favicon.ico><linkdata-rh=truerel=canonicalhref=https://speice.io/2019/09/binary-format-shootout><linkdata-rh=truerel=alternatehref=https://speice.io/2019/09/binary-format-shootouthreflang=en><linkdata-rh=truerel=alternatehref=https://speice.io/2019/09/binary-format-shootouthreflang=x-default><scriptdata-rh=truetype=application/ld+json>{"@context":"https://schema.org","@id":"https://speice.io/2019/09/binary-format-shootout","@type":"BlogPosting","author":{"@type":"Person","name":"Bradlee Speice"},"dateModified":"2024-11-10T03:06:23.000Z","datePublished":"2019-09-28T12:00:00.000Z","description":"I've found that in many personal projects,","headline":"Binary format 
shootout","isPartOf":{"@id":"https://speice.io/","@type":"Blog","name":"Blog"},"keywords":[],"mainEntityOfPage":"https://speice.io/2019/09/binary-format-shootout","name":"Binary format shootout","url":"https://speice.io/2019/09/binary-format-shootout"}</script><linkrel=alternatetype=application/rss+xmlhref=/rss.xmltitle="The Old Speice Guy RSS Feed"><linkrel=alternatetype=application/atom+xmlhref=/atom.xmltitle="The Old Speice Guy Atom Feed"><linkrel=stylesheethref=/katex/katex.min.css><linkrel=stylesheethref=/assets/css/styles.16c3428d.css><scriptsrc=/assets/js/runtime~main.9e43ec1c.jsdefer></script><scriptsrc=/assets/js/main.a9d33393.jsdefer></script><bodyclass=navigation-with-keyboard><script>!function(){vart,e=function(){try{returnnewURLSearchParams(window.location.search).get("docusaurus-theme")}catch(t){}}()||function(){try{returnwindow.localStorage.getItem("theme")}catch(t){}}();t=null!==e?e:"light",document.documentElement.setAttribute("data-theme",t)}(),function(){try{for(var[t,e]ofnewURLSearchParams(window.location.search).entries())if(t.startsWith("docusaurus-data-")){vara=t.replace("docusaurus-data-","data-");document.documentElement.setAttribute(a,e)}}catch(t){}}()</script><divid=__docusaurus><divrole=regionaria-label="Skip to main content"><aclass=skipToContent_fXgnhref=#__docusaurus_skipToContent_fallback>Skip to main content</a></div><navaria-label=Mainclass="navbar navbar--fixed-top"><divclass=navbar__inner><divclass=navbar__items><buttonaria-label="Toggle navigation bar"aria-expanded=falseclass="navbar__toggle clean-btn"type=button><svgwidth=30height=30viewBox="0 0 30 30"aria-hidden=true><pathstroke=currentColorstroke-linecap=roundstroke-miterlimit=10stroke-width=2d="M4 7h22M4 15h22M4 23h22"/></svg></button><aclass=navbar__brandhref=/><divclass=navbar__logo><imgsrc=/img/logo.svgalt="Sierpinski Gasket"class="themedComponent_mlkZ themedComponent--light_NVdE"><imgsrc=/img/logo-dark.svgalt="Sierpinski Gasket"class="themedComponent_mlkZ 
themedComponent--dark_xIcU"></div><bclass="navbar__title text--truncate">The Old Speice Guy</b></a></div><divclass="navbar__itemsnavbar__it
<ahref=https://en.wikipedia.org/wiki/Analysis_paralysistarget=_blankrel="noopener noreferrer">analysis paralysis</a> is particularly deadly.
Making good decisions in the beginning avoids pain and suffering later; if extra research prevents
future problems, I'm happy to continue <del>procrastinating</del> researching indefinitely.</p>
<p>So let's say you're in need of a binary serialization format. Data will be going over the network,
not just in memory, so having a schema document and code generation is a must. Performance is
crucial, so formats that support zero-copy de/serialization are given priority. And the more
languages supported, the better; I use Rust, but can't predict what other languages this could
interact with.</p>
<p>Given these requirements, the candidates I could find were:</p>
<ol>
<li><ahref=https://capnproto.org/target=_blankrel="noopener noreferrer">Cap'n Proto</a> has been around the longest, and is the most established</li>
<li><ahref=https://google.github.io/flatbuffers/target=_blankrel="noopener noreferrer">Flatbuffers</a> is the newest, and claims to have a simpler
encoding</li>
<li><ahref=https://github.com/real-logic/simple-binary-encodingtarget=_blankrel="noopener noreferrer">Simple Binary Encoding</a> has the simplest
encoding, but the Rust implementation is unmaintained</li>
</ol>
<p>Any one of these will satisfy the project requirements: easy to transmit over a network, reasonably
fast, and supported in multiple languages. But how do you actually pick one? It's impossible to know what issues
will follow that choice, so I tend to avoid commitment until the last possible moment.</p>
<p>Still, a choice must be made. Instead of worrying about which is "the best," I decided to build a
small proof-of-concept system in each format and pit them against each other. All code can be found
in the <ahref=https://github.com/speice-io/marketdata-shootouttarget=_blankrel="noopener noreferrer">repository</a> for this post.</p>
<p>We'll go through each format in more detail, but here's a quick preview of the results:</p>
<ul>
<li>Cap'n Proto: Theoretically performs incredibly well, but the implementation had issues</li>
<li>Flatbuffers: Has some quirks, but largely lived up to its "zero-copy" promises</li>
<li>SBE: Best median and worst-case performance, but the message structure has a limited feature set</li>
</ul>
<h2class="anchor anchorWithStickyNavbar_LWe7"id=prologue-binary-parsing-with-nom>Prologue: Binary Parsing with Nom<ahref=#prologue-binary-parsing-with-nomclass=hash-linkaria-label="Direct link to Prologue: Binary Parsing with Nom"title="Direct link to Prologue: Binary Parsing with Nom"></a></h2>
<p>Our benchmark system will be a simple data processor; given depth-of-book market data from
<ahref=https://iextrading.com/trading/market-data/#deeptarget=_blankrel="noopener noreferrer">IEX</a>, serialize each message into the schema
format, read it back, and calculate total size of stock traded and the lowest/highest quoted prices.
This test isn't complex, but is representative of the project I need a binary format for.</p>
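<p>To make the aggregation step concrete, here is a minimal sketch of that kind of summary. The
<code>Message</code>, <code>Summary</code>, and <code>summarize</code> names are hypothetical and much simpler than the real IEX
messages; this is not the code from the repository, just an illustration of the work done after each
message is read back.</p>

```rust
/// Hypothetical, simplified market data message. The real IEX feed carries
/// many more message types and fields than this sketch.
#[derive(Debug, Clone, Copy)]
enum Message {
    Trade { size: u32 },
    Quote { price: u64 },
}

/// Running totals computed while reading messages back.
#[derive(Debug, Default, PartialEq)]
struct Summary {
    total_traded: u64,
    low_quote: Option<u64>,
    high_quote: Option<u64>,
}

impl Summary {
    /// Fold a single message into the running totals.
    fn update(&mut self, msg: &Message) {
        match *msg {
            Message::Trade { size } => self.total_traded += u64::from(size),
            Message::Quote { price } => {
                self.low_quote = Some(self.low_quote.map_or(price, |p| p.min(price)));
                self.high_quote = Some(self.high_quote.map_or(price, |p| p.max(price)));
            }
        }
    }
}

/// Summarize a batch of messages in one pass.
fn summarize(messages: &[Message]) -> Summary {
    let mut summary = Summary::default();
    for msg in messages {
        summary.update(msg);
    }
    summary
}
```

<p>Because the same summary logic would run no matter how a message was decoded, it gives each format
identical work to do once the bytes have been read.</p>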
<p>But before we make it to that point, we have to actually read in the market data. To do so, I'm
using a library called <ahref=https://github.com/Geal/nomtarget=_blankrel="noopener noreferrer"><code>nom</code></a>. Version 5.0 was recently released and
brought some big changes, so this was an opportunity to build a non-trivial program and get
familiar.</p>
<p>If you don't already know about <code>nom</code>, it's a parser combinator library. By combining different smaller
parsers, you can assemble a parser to handle complex structures without writing tedious code by
hand.</p>
<p>Ultimately, because the <code>nom</code> code in this shootout was the same for all formats, we're not too
interested in its performance. Still, it's worth mentioning that building the market data parser was
actually fun; I didn't have to write tons of boring code by hand.</p>
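<p>The idea of combining small parsers can be sketched without <code>nom</code> itself. The example below uses
only the standard library; <code>ParseResult</code>, <code>pair</code>, <code>map</code>, and <code>Header</code> are hypothetical names
chosen for illustration and are not <code>nom</code>'s actual API, though <code>nom</code> offers combinators in the same
spirit.</p>

```rust
/// Result of a parse: the remaining input plus the parsed value, or an error.
type ParseResult<'a, T> = Result<(&'a [u8], T), &'static str>;

/// Primitive parser: read a little-endian u16 from the front of the input.
fn le_u16(input: &[u8]) -> ParseResult<'_, u16> {
    if input.len() < 2 {
        return Err("need 2 bytes");
    }
    let (head, rest) = input.split_at(2);
    Ok((rest, u16::from_le_bytes([head[0], head[1]])))
}

/// Primitive parser: read a little-endian u32 from the front of the input.
fn le_u32(input: &[u8]) -> ParseResult<'_, u32> {
    if input.len() < 4 {
        return Err("need 4 bytes");
    }
    let (head, rest) = input.split_at(4);
    Ok((rest, u32::from_le_bytes([head[0], head[1], head[2], head[3]])))
}

/// Combinator: run two parsers in sequence and return both outputs.
fn pair<'a, A, B>(
    first: impl Fn(&'a [u8]) -> ParseResult<'a, A>,
    second: impl Fn(&'a [u8]) -> ParseResult<'a, B>,
) -> impl Fn(&'a [u8]) -> ParseResult<'a, (A, B)> {
    move |input| {
        let (rest, a) = first(input)?;
        let (rest, b) = second(rest)?;
        Ok((rest, (a, b)))
    }
}

/// Combinator: transform the output of a parser.
fn map<'a, A, B>(
    parser: impl Fn(&'a [u8]) -> ParseResult<'a, A>,
    f: impl Fn(A) -> B,
) -> impl Fn(&'a [u8]) -> ParseResult<'a, B> {
    move |input| {
        let (rest, a) = parser(input)?;
        Ok((rest, f(a)))
    }
}

/// Hypothetical record header: a u16 length followed by a u32 sequence number.
#[derive(Debug, PartialEq)]
struct Header {
    length: u16,
    sequence: u32,
}

/// A larger parser assembled entirely from the smaller pieces above.
fn header(input: &[u8]) -> ParseResult<'_, Header> {
    map(pair(le_u16, le_u32), |(length, sequence)| Header { length, sequence })(input)
}
```

<p>Each parser returns the unconsumed input alongside its value, which is what lets them chain: <code>header</code>
never touches an index or a length check directly.</p>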
<h2class="anchor anchorWithStickyNavbar_LWe7"id=capn-proto>Cap'n Proto<ahref=#capn-protoclass=hash-linkaria-label="Direct link to Cap'n Proto"title="Direct link to Cap'n Proto"></a></h2>
<p>Now it's time to get into the meaty part of the story. Cap'n Proto was the first format I tried
because of how long it has supported Rust (thanks to <ahref=https://github.com/dwrenshatarget=_blankrel="noopener noreferrer">dwrensha</a> for
maintaining the Rust port since
<ahref=https://github.com/capnproto/capnproto-rust/releases/tag/rustc-0.10target=_blankrel="noopener noreferrer">2014!</a>). However, I had a ton
of performance concerns once I started using it.</p>
<p>To serialize new messages, Cap'n Proto uses a "builder" object. This builder allocates memory on the
heap to hold the message content, but because builders
<ahref=https://github.com/capnproto/capnproto-rust/issues/111target=_blankrel="noopener noreferrer">can't be re-used</a>, we have to allocate a
new buffer for every single message. I was able to work around this with a
<ahref=https://doc.rust-lang.org/std/mem/fn.transmute.htmltarget=_blankrel="noopener noreferrer"><code>std::mem::transmute</code></a> to bypass Rust's borrow
checker.</p>
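<p>For context, the allocation pattern being aimed for looks roughly like the sketch below: one scratch
buffer that is cleared and refilled for every message. <code>encode_into</code> and <code>encode_all</code> are hypothetical
stand-ins, not the Cap'n Proto API, and no <code>transmute</code> is involved here; the sketch only shows why
re-use matters.</p>

```rust
/// Hypothetical encoder: writes a little-endian u32 length and then the payload.
/// It stands in for any serializer that can write into a caller-supplied buffer.
fn encode_into(payload: &[u8], out: &mut Vec<u8>) {
    out.clear(); // drops the old contents but keeps the existing allocation
    // Sketch only: assumes the payload length fits in a u32.
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(payload);
}

/// Encode every payload through a single scratch buffer and return the total
/// number of bytes produced. No allocation happens inside the loop as long as
/// the scratch buffer's capacity is large enough for the biggest message.
fn encode_all(payloads: &[&[u8]], scratch: &mut Vec<u8>) -> usize {
    let mut total = 0;
    for payload in payloads {
        encode_into(payload, scratch);
        total += scratch.len();
    }
    total
}
```

<p>With a builder that can't be re-used, the equivalent of <code>scratch</code> has to be created inside the loop,
which puts the allocator on the hot path for every message.</p>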
<p>The process of reading messages was better, but still had issues. Cap'n Proto has two message
encodings: a <ahref=https://capnproto.org/encoding.html#packingtarget=_blankrel="noopener noreferrer">"packed"</a> representation, and an
"unpacked" version. When reading "packed" messages, we need a buffer to unpack the message into
before we can use it; Cap'n Proto allocates a new buffer for each message we unpack, and I wasn't
able to figure out a way around that. In contrast, the unpacked message format should be where Cap'n
Proto shines; its main selling point is that there's <ahref=https://capnproto.org/target=_blankrel="noopener noreferrer">no decoding step</a>.
However, accomplishing zero-copy deserialization required code in the private API
(<ahref=https://github.com/capnproto/capnproto-rust/issues/148target=_blankrel="noopener noreferrer">since fixed</a>), and we allocate a vector on
every read for the segment table.</p>
<p>In the end, I put in significant work to make Cap'n Proto as fast as possible, but there were too
many issues for me to feel comfortable using it long-term.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id=flatbuffers>Flatbuffers<ahref=#flatbuffersclass=hash-linkaria-label="Direct link to Flatbuffers"title="Direct link to Flatbuffers"></a></h2>
<p>This is the new kid on the block. After a
<ahref=https://github.com/google/flatbuffers/pull/3894target=_blankrel="noopener noreferrer">first attempt</a> at Rust support didn't pan out, official support
was <ahref=https://github.com/google/flatbuffers/pull/4898target=_blankrel="noopener noreferrer">recently launched</a>. Flatbuffers intends to
address the same problems as Cap'n Proto: high-performance, polyglot, binary messaging. The
difference is that Flatbuffers claims to have a simpler wire format and
in a <code>SmallVec</code> before building the final <code>MultiMessage</code>, but it was a painful process that I
believe contributed to poor serialization performance.</p>
<p>Second, streaming support in Flatbuffers seems to be something of an
<ahref=https://github.com/google/flatbuffers/issues/3898target=_blankrel="noopener noreferrer">afterthought</a>. Where Cap'n Proto in Rust handles
reading messages from a stream as part of the API, Flatbuffers just sticks a <code>u32</code> at the front of
each message to indicate the size. Not specifically a problem, but calculating message size without
that tag is nigh on impossible.</p>
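<p>Size-prefixed framing is simple enough to sketch with the standard library alone. The
<code>write_frame</code> and <code>read_frame</code> helpers below are hypothetical, not part of the Flatbuffers crate, and
they assume a little-endian <code>u32</code> prefix in front of an opaque message body.</p>

```rust
/// Append one frame: a little-endian u32 length followed by the message bytes.
fn write_frame(out: &mut Vec<u8>, message: &[u8]) {
    // Sketch only: assumes the message length fits in a u32.
    let len = message.len() as u32;
    out.extend_from_slice(&len.to_le_bytes());
    out.extend_from_slice(message);
}

/// Split the next frame off the front of `stream`.
/// Returns the message and the remaining bytes, or `None` if the stream is
/// empty or ends in the middle of a frame.
fn read_frame(stream: &[u8]) -> Option<(&[u8], &[u8])> {
    if stream.len() < 4 {
        return None;
    }
    let (prefix, rest) = stream.split_at(4);
    let len = u32::from_le_bytes([prefix[0], prefix[1], prefix[2], prefix[3]]) as usize;
    if rest.len() < len {
        return None;
    }
    Some(rest.split_at(len))
}
```

<p>The reader never has to understand the message body to find the next message, which is exactly what
the size tag buys you when many messages share one stream.</p>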
<p>Ultimately, I enjoyed using Flatbuffers, and had to do significantly less work to make it perform
well.</p>
<h2class="anchor anchorWithStickyNavbar_LWe7"id=simple-binary-encoding>Simple Binary Encoding<ahref=#simple-binary-encodingclass=hash-linkaria-label="Direct link to Simple Binary Encoding"title="Direct link to Simple Binary Encoding"></a></h2>
<p>Rust support for SBE was added by the author of one of my favorite
<ahref=https://web.archive.org/web/20190427124806/https://polysync.io/blog/session-types-for-hearty-codecs/target=_blankrel="noopener noreferrer">Rust blog posts</a>.
I've <ahref=/2019/06/high-performance-systems>talked previously</a> about how important
variance is in high-performance systems, so it was encouraging to read about a format that
<ahref=https://github.com/real-logic/simple-binary-encoding/wiki/Why-Low-Latencytarget=_blankrel="noopener noreferrer">directly addressed</a> my
concerns. SBE has by far the simplest binary format, but it does make some tradeoffs.</p>
<p>Both Cap'n Proto and Flatbuffers use <ahref=https://capnproto.org/encoding.html#structstarget=_blankrel="noopener noreferrer">message offsets</a>
to handle variable-length data, <ahref=https://capnproto.org/language.html#unionstarget=_blankrel="noopener noreferrer">unions</a>, and various
other features. In contrast, messages in SBE are essentially
However, if you don't need union types, and can accept that schemas are XML documents, it's still
worth using. SBE's implementation had the best streaming support of all formats I tested, and
doesn't trigger allocation during de/serialization.</p>
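<p>To illustrate why a flat layout avoids allocation, here is a minimal sketch of reading fields at fixed
offsets from a borrowed buffer. The layout, <code>TradeView</code>, and <code>encode_trade</code> are hypothetical and
hand-written; they are not output from the SBE code generator, and the little-endian byte order is an
assumption of the sketch.</p>

```rust
/// Hypothetical fixed layout for a trade message (hand-written for illustration):
///   bytes 0..8   price, little-endian u64
///   bytes 8..12  size,  little-endian u32
const TRADE_LEN: usize = 12;

/// Zero-copy view over an encoded trade. It only borrows the buffer, so
/// creating one never allocates.
struct TradeView<'a> {
    buf: &'a [u8],
}

impl<'a> TradeView<'a> {
    /// Wrap a buffer, checking once that it is long enough for every field.
    fn new(buf: &'a [u8]) -> Option<Self> {
        if buf.len() < TRADE_LEN {
            None
        } else {
            Some(TradeView { buf })
        }
    }

    /// Read the price from its fixed offset.
    fn price(&self) -> u64 {
        let mut bytes = [0u8; 8];
        bytes.copy_from_slice(&self.buf[0..8]);
        u64::from_le_bytes(bytes)
    }

    /// Read the size from its fixed offset.
    fn size(&self) -> u32 {
        let mut bytes = [0u8; 4];
        bytes.copy_from_slice(&self.buf[8..12]);
        u32::from_le_bytes(bytes)
    }
}

/// Write a trade into a caller-supplied, fixed-size buffer.
fn encode_trade(price: u64, size: u32, out: &mut [u8; TRADE_LEN]) {
    out[0..8].copy_from_slice(&price.to_le_bytes());
    out[8..12].copy_from_slice(&size.to_le_bytes());
}
```

<p>Every field lives at a known position, so there are no offsets to chase and nothing to allocate. The
flip side is that a layout like this has no natural place for a union.</p>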
<h2class="anchor anchorWithStickyNavbar_LWe7"id=results>Results<ahref=#resultsclass=hash-linkaria-label="Direct link to Results"title="Direct link to Results"></a></h2>
<ahref=https://github.com/speice-io/marketdata-shootout/blob/master/src/sbe_runner.rstarget=_blankrel="noopener noreferrer">format</a>, it was
time to actually take them for a spin. I used
<ahref=https://github.com/speice-io/marketdata-shootout/blob/master/run_shootout.shtarget=_blankrel="noopener noreferrer">this script</a> to run
the benchmarks, and the raw results are
<ahref=https://github.com/speice-io/marketdata-shootout/blob/master/shootout.csvtarget=_blankrel="noopener noreferrer">here</a>. All data reported
below is the average of 10 runs on a single day of IEX data. Results were validated to make sure
that each format parsed the data correctly.</p>
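<p>The tables below report a median, two tail percentiles, and a total for each format. As a rough
sketch of how such figures can be derived from per-message timings, the example below uses the
nearest-rank method on nanosecond samples. <code>percentile</code>, <code>LatencySummary</code>, and
<code>summarize_latencies</code> are hypothetical names, and this is not necessarily how the benchmark script
computes its numbers.</p>

```rust
/// Nearest-rank percentile of an already-sorted slice.
/// `per_mille` is the percentile in tenths of a percent (500 = median,
/// 990 = 99th percentile, 999 = 99.9th percentile). Integer math keeps the
/// rank calculation free of floating-point rounding.
fn percentile(sorted: &[u64], per_mille: u32) -> Option<u64> {
    if sorted.is_empty() {
        return None;
    }
    // Ceiling of (per_mille / 1000) * len, clamped to a valid 1-based rank.
    let rank = (per_mille as usize * sorted.len() + 999) / 1000;
    let index = rank.max(1).min(sorted.len()) - 1;
    Some(sorted[index])
}

/// Summary statistics for one benchmark run, all in nanoseconds.
struct LatencySummary {
    median: u64,
    p99: u64,
    p999: u64,
    total: u64,
}

/// Sort the samples in place and compute the summary.
/// Returns `None` when there are no samples.
fn summarize_latencies(samples: &mut [u64]) -> Option<LatencySummary> {
    samples.sort_unstable();
    Some(LatencySummary {
        median: percentile(samples, 500)?,
        p99: percentile(samples, 990)?,
        p999: percentile(samples, 999)?,
        total: samples.iter().sum(),
    })
}
```

<p>Reporting the tail percentiles alongside the median matters here because a format can look fast in
the typical case and still have expensive outliers.</p>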
<h3class="anchor anchorWithStickyNavbar_LWe7"id=serialization>Serialization<ahref=#serializationclass=hash-linkaria-label="Direct link to Serialization"title="Direct link to Serialization"></a></h3>
<p>This test measures how long it takes to serialize the IEX message into the desired format and write to a pre-allocated
buffer.</p>
<table><thead><tr><thstyle=text-align:left>Schema<thstyle=text-align:left>Median<thstyle=text-align:left>99th Pctl<thstyle=text-align:left>99.9th Pctl<thstyle=text-align:left>Total<tbody><tr><tdstyle=text-align:left>Cap'n Proto Packed<tdstyle=text-align:left>413ns<tdstyle=text-align:left>1751ns<tdstyle=text-align:left>2943ns<tdstyle=text-align:left>14.80s<tr><tdstyle=text-align:left>Cap'n Proto Unpacked<tdstyle=text-align:left>273ns<tdstyle=text-align:left>1828ns<tdstyle=text-align:left>2836ns<tdstyle=text-align:left>10.65s<tr><tdstyle=text-align:left>Flatbuffers<tdstyle=text-align:left>355ns<tdstyle=text-align:left>2185ns<tdstyle=text-align:left>3497ns<tdstyle=text-align:left>14.31s<tr><tdstyle=text-align:left>SBE<tdstyle=text-align:left>91ns<tdstyle=text-align:left>1535ns<tdstyle=text-align:left>2423ns<tdstyle=text-align:left>3.91s</table>
<h3class="anchor anchorWithStickyNavbar_LWe7"id=deserialization>Deserialization<ahref=#deserializationclass=hash-linkaria-label="Direct link to Deserialization"title="Direct link to Deserialization"></a></h3>
<p>This test measures how long it takes to read the previously-serialized message and perform some basic aggregation. The
aggregation code is the same for each format, so any performance differences are due solely to the
format implementation.</p>
<table><thead><tr><thstyle=text-align:left>Schema<thstyle=text-align:left>Median<thstyle=text-align:left>99th Pctl<thstyle=text-align:left>99.9th Pctl<thstyle=text-align:left>Total<tbody><tr><tdstyle=text-align:left>Cap'n Proto Packed<tdstyle=text-align:left>539ns<tdstyle=text-align:left>1216ns<tdstyle=text-align:left>2599ns<tdstyle=text-align:left>18.92s<tr><tdstyle=text-align:left>Cap'n Proto Unpacked<tdstyle=text-align:left>366ns<tdstyle=text-align:left>737ns<tdstyle=text-align:left>1583ns<tdstyle=text-align:left>12.32s<tr><tdstyle=text-align:left>Flatbuffers<tdstyle=text-align:left>173ns<tdstyle=text-align:left>421ns<tdstyle=text-align:left>1007ns<tdstyle=text-align:left>6.00s<tr><tdstyle=text-align:left>SBE<tdstyle=text-align:left>116ns<tdstyle=text-align:left>286ns<tdstyle=text-align:left>659ns<tdstyle=text-align:left>4.05s</table>
<h2class="anchor anchorWithStickyNavbar_LWe7"id=conclusion>Conclusion<ahref=#conclusionclass=hash-linkaria-label="Direct link to Conclusion"title="Direct link to Conclusion"></a></h2>
<p>Building a benchmark turned out to be incredibly helpful in making a decision; because a "union"
type isn't important to me, I can be confident that SBE best addresses my needs.</p>
<p>While SBE was the fastest in terms of both median and worst-case performance, its worst-case
performance was proportionately far higher than that of any other format. It seems that
de/serialization time scales with message size, but I'll need to do some more research to understand