---
layout: post
title: "Binary Format Shootout"
description: "Cap'n Proto vs. Flatbuffers vs. SBE"
category:
tags: [rust]
---

I've found that in many personal projects,
[analysis paralysis](https://en.wikipedia.org/wiki/Analysis_paralysis) is particularly deadly.
Making good decisions in the beginning avoids pain and suffering later; if extra research prevents
future problems, I'm happy to continue ~~procrastinating~~ researching indefinitely.

So let's say you're in need of a binary serialization format. Data will be going over the network,
not just in memory, so having a schema document and code generation is a must. Performance is
crucial, so formats that support zero-copy de/serialization are given priority. And the more
languages supported, the better; I use Rust, but can't predict what other languages this could
interact with.

Given these requirements, the candidates I could find were:

1. [Cap'n Proto](https://capnproto.org/) has been around the longest, and is the most established
2. [Flatbuffers](https://google.github.io/flatbuffers/) is the newest, and claims to have a simpler
   encoding
3. [Simple Binary Encoding](https://github.com/real-logic/simple-binary-encoding) has the simplest
   encoding, but the Rust implementation is unmaintained

Any one of these will satisfy the project requirements: easy to transmit over a network, reasonably
fast, and polyglot support. But how do you actually pick one? It's impossible to know what issues
will follow that choice, so I tend to avoid commitment until the last possible moment.

Still, a choice must be made. Instead of worrying about which is "the best," I decided to build a
small proof-of-concept system in each format and pit them against each other. All code can be found
in the [repository](https://github.com/speice-io/marketdata-shootout) for this post.

We'll go into more detail below, but here's a quick preview of the results:

- Cap'n Proto: Theoretically performs incredibly well, but the implementation had issues
- Flatbuffers: Has some quirks, but largely lived up to its "zero-copy" promises
- SBE: Best median and worst-case performance, but the message structure has a limited feature set

# Prologue: Binary Parsing with Nom

Our benchmark system will be a simple data processor; given depth-of-book market data from
[IEX](https://iextrading.com/trading/market-data/#deep), serialize each message into the schema
format, read it back, and calculate total size of stock traded and the lowest/highest quoted prices.
This test isn't complex, but is representative of the project I need a binary format for.
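
To make the aggregation step concrete, here is a minimal sketch of the kind of running summary described above. The `Message` and `Summary` types are hypothetical stand-ins (prices are assumed to be fixed-point integers); the actual harness in the repository works on each format's generated types instead.

```rust
/// Simplified, hypothetical view of the two message kinds the benchmark cares about.
/// Prices are assumed to be fixed-point integers.
pub enum Message {
    Trade { size: u32 },
    Quote { price: u64 },
}

/// Running totals: volume traded plus the lowest and highest quoted price seen so far.
#[derive(Debug, Default, PartialEq)]
pub struct Summary {
    pub volume: u64,
    pub low: Option<u64>,
    pub high: Option<u64>,
}

impl Summary {
    /// Folds one decoded message into the summary.
    pub fn update(&mut self, msg: &Message) {
        match *msg {
            Message::Trade { size } => self.volume += u64::from(size),
            Message::Quote { price } => {
                // `None` means no quote has been seen yet, so the first price wins
                self.low = Some(self.low.map_or(price, |low| low.min(price)));
                self.high = Some(self.high.map_or(price, |high| high.max(price)));
            }
        }
    }
}
```

Because the summary only needs a few integer fields per message, most of the measured time should come from the format's decoding rather than from the aggregation itself.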

But before we make it to that point, we have to actually read in the market data. To do so, I'm
using a library called [`nom`](https://github.com/Geal/nom). Version 5.0 was recently released and
brought some big changes, so this was an opportunity to build a non-trivial program and get
familiar.

If you don't already know about `nom`, it's a "parser combinator" library. By combining smaller
parsers, you can assemble a parser to handle complex structures without writing tedious code by
hand. For example, when parsing
[PCAP-NG files](https://www.winpcap.org/ntar/draft/PCAP-DumpFileFormat.html#rfc.section.3.3):

```
   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +---------------------------------------------------------------+
 0 |                    Block Type = 0x00000006                    |
   +---------------------------------------------------------------+
 4 |                      Block Total Length                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 8 |                         Interface ID                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
12 |                        Timestamp (High)                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
16 |                        Timestamp (Low)                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
20 |                         Captured Len                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
24 |                          Packet Len                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Packet Data                          |
   |                              ...                              |
```

...you can build a parser in `nom` that looks like
[this](https://github.com/speice-io/marketdata-shootout/blob/369613843d39cfdc728e1003123bf87f79422497/src/parsers.rs#L59-L93):

```rust
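use nom::bytes::complete::{tag, take};
use nom::number::complete::le_u32;
use nom::sequence::tuple;
use nom::IResult;

// Block type 0x00000006 as it appears on the wire in a little-endian capture
const ENHANCED_PACKET: [u8; 4] = [0x06, 0x00, 0x00, 0x00];

/// Parses the fixed header of an enhanced packet block and returns the captured packet data.
pub fn enhanced_packet_block(input: &[u8]) -> IResult<&[u8], &[u8]> {
    // Header fields are read in wire order; only `captured_len` is used below
    let (
        remaining,
        (
            block_type,
            block_len,
            interface_id,
            timestamp_high,
            timestamp_low,
            captured_len,
            packet_len,
        ),
    ) = tuple((
        tag(ENHANCED_PACKET),
        le_u32,
        le_u32,
        le_u32,
        le_u32,
        le_u32,
        le_u32,
    ))(input)?;

    // `captured_len` is the number of packet bytes actually stored in the file
    let (remaining, packet_data) = take(captured_len)(remaining)?;
    Ok((remaining, packet_data))
}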
```

While this example isn't too interesting, more complex formats (like IEX market data) are where
[`nom` really shines](https://github.com/speice-io/marketdata-shootout/blob/369613843d39cfdc728e1003123bf87f79422497/src/iex.rs).

Ultimately, because the `nom` code in this shootout was the same for all formats, we're not too
interested in its performance. Still, it's worth mentioning that building the market data parser was
actually fun; I didn't have to write tons of boring code by hand.

# Part 1: Cap'n Proto

Now it's time to get into the meaty part of the story. Cap'n Proto was the first format I tried
because of how long it has supported Rust (thanks to [dwrensha](https://github.com/dwrensha) for
maintaining the Rust port since
[2014!](https://github.com/capnproto/capnproto-rust/releases/tag/rustc-0.10)). However, I had a ton
of performance concerns once I started using it.

To serialize new messages, Cap'n Proto uses a "builder" object. This builder allocates memory on the
heap to hold the message content, but because builders
[can't be re-used](https://github.com/capnproto/capnproto-rust/issues/111), we have to allocate a
new buffer for every single message. I was able to work around this with a
[special builder](https://github.com/speice-io/marketdata-shootout/blob/369613843d39cfdc728e1003123bf87f79422497/src/capnp_runner.rs#L17-L51)
that could re-use the buffer, but it required reading through Cap'n Proto's
[benchmarks](https://github.com/capnproto/capnproto-rust/blob/master/benchmark/benchmark.rs#L124-L156)
to find an example, and used
[`std::mem::transmute`](https://doc.rust-lang.org/std/mem/fn.transmute.html) to bypass Rust's borrow
checker.
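
The underlying idea of reusing one buffer across messages can be sketched without any Cap'n Proto types. The snippet below is plain `std` code with a made-up wire layout and hypothetical function names, not the builder API or the workaround from the repository; it only illustrates why clearing a `Vec` is cheaper than allocating a fresh one per message.

```rust
/// Hypothetical encoder that writes one message into a caller-owned scratch buffer.
/// The layout (8-byte little-endian price followed by the symbol bytes) is made up.
pub fn encode_into(scratch: &mut Vec<u8>, symbol: &str, price: u64) {
    // `clear` drops the old contents but keeps the existing allocation
    scratch.clear();
    scratch.extend_from_slice(&price.to_le_bytes());
    scratch.extend_from_slice(symbol.as_bytes());
}

/// Encodes every (symbol, price) pair through the same buffer and
/// returns the total number of bytes produced.
pub fn encode_all(messages: &[(&str, u64)], scratch: &mut Vec<u8>) -> usize {
    let mut total = 0;
    for (symbol, price) in messages {
        encode_into(scratch, symbol, *price);
        // A real system would hand `scratch` to the network or a file here
        total += scratch.len();
    }
    total
}
```

As long as the buffer's capacity covers the largest message, the loop never touches the allocator after the first call.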

The process of reading messages was better, but still had issues. Cap'n Proto has two message
encodings: a ["packed"](https://capnproto.org/encoding.html#packing) representation, and an
"unpacked" version. When reading "packed" messages, we need a buffer to unpack the message into
before we can use it; Cap'n Proto allocates a new buffer for each message we unpack, and I wasn't
able to figure out a way around that. In contrast, the unpacked message format should be where Cap'n
Proto shines; its main selling point is that there's [no decoding step](https://capnproto.org/).
However, accomplishing zero-copy deserialization required code in the private API
([since fixed](https://github.com/capnproto/capnproto-rust/issues/148)), and we allocate a vector on
every read for the segment table.

In the end, I put in significant work to make Cap'n Proto as fast as possible, but there were too
many issues for me to feel comfortable using it long-term.

# Part 2: Flatbuffers

This is the new kid on the block. After a
[first attempt](https://github.com/google/flatbuffers/pull/3894) didn't pan out, official support
was [recently launched](https://github.com/google/flatbuffers/pull/4898). Flatbuffers intends to
address the same problems as Cap'n Proto: high-performance, polyglot, binary messaging. The
difference is that Flatbuffers claims to have a simpler wire format and
[more flexibility](https://google.github.io/flatbuffers/flatbuffers_benchmarks.html).

On the whole, I enjoyed using Flatbuffers; the [tooling](https://crates.io/crates/flatc-rust) is
nice, and unlike Cap'n Proto, parsing messages was actually zero-copy and zero-allocation. However,
there were still some issues.

First, Flatbuffers (at least in Rust) can't handle nested vectors. This is a problem for formats
like the following:

```
table Message {
  symbol: string;
}
table MultiMessage {
  messages:[Message];
}
```

We want to create a `MultiMessage` which contains a vector of `Message`, and each `Message` itself
contains a vector (the `string` type). I was able to work around this by
[caching `Message` elements](https://github.com/speice-io/marketdata-shootout/blob/e9d07d148bf36a211a6f86802b313c4918377d1b/src/flatbuffers_runner.rs#L83)
in a `SmallVec` before building the final `MultiMessage`, but it was a painful process that I
believe contributed to poor serialization performance.

Second, streaming support in Flatbuffers seems to be something of an
[afterthought](https://github.com/google/flatbuffers/issues/3898). Where Cap'n Proto in Rust handles
reading messages from a stream as part of the API, Flatbuffers just sticks a `u32` at the front of
each message to indicate the size. That isn't a problem in itself, but calculating message size
without that tag is nigh on impossible.
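
Size-prefixed framing like this is simple to consume by hand. As a minimal sketch, assuming each message is preceded by a little-endian `u32` holding the length of the bytes that follow, a stream can be split into frames like so (`split_frame` is a hypothetical helper, not part of the Flatbuffers crate):

```rust
/// Splits the first size-prefixed frame off the front of `stream`.
/// Returns `(frame, rest)`, or `None` if the stream doesn't yet hold a complete frame.
pub fn split_frame(stream: &[u8]) -> Option<(&[u8], &[u8])> {
    if stream.len() < 4 {
        return None;
    }
    let (prefix, rest) = stream.split_at(4);

    // Assumption: the prefix is a little-endian u32 counting the bytes after it
    let mut len_bytes = [0u8; 4];
    len_bytes.copy_from_slice(prefix);
    let len = u32::from_le_bytes(len_bytes) as usize;

    if rest.len() < len {
        return None;
    }
    Some(rest.split_at(len))
}
```

Calling `split_frame` repeatedly on the returned remainder walks the stream one message at a time without copying any payload bytes.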

Ultimately, I enjoyed using Flatbuffers, and had to do significantly less work to make it perform
well.

# Part 3: Simple Binary Encoding

Support for SBE was added by the author of one of my favorite
[Rust blog posts](https://web.archive.org/web/20190427124806/https://polysync.io/blog/session-types-for-hearty-codecs/).
I've [talked previously]({% post_url 2019-06-31-high-performance-systems %}) about how important
variance is in high-performance systems, so it was encouraging to read about a format that
[directly addressed](https://github.com/real-logic/simple-binary-encoding/wiki/Why-Low-Latency) my
concerns. SBE has by far the simplest binary format, but it does make some tradeoffs.

Both Cap'n Proto and Flatbuffers use [message offsets](https://capnproto.org/encoding.html#structs)
to handle variable-length data, [unions](https://capnproto.org/language.html#unions), and various
other features. In contrast, messages in SBE are essentially
[just structs](https://github.com/real-logic/simple-binary-encoding/blob/master/sbe-samples/src/main/resources/example-schema.xml);
variable-length data is supported, but there's no union type.
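
To illustrate what "just structs" means in practice, here is a rough sketch of a zero-copy view over a fixed-layout message. The layout, field names, and `TradeView` type are all hypothetical; real SBE codecs are generated from the XML schema and also carry a message header, which this sketch leaves out.

```rust
/// Zero-copy view over a hypothetical fixed-layout trade message.
/// Assumed layout (little-endian): price u64 @ 0, size u32 @ 8, symbol [u8; 8] @ 12.
pub struct TradeView<'a> {
    buf: &'a [u8],
}

impl<'a> TradeView<'a> {
    pub const ENCODED_LEN: usize = 20;

    /// Wraps `buf` without copying; fails if it is too short to hold one message.
    pub fn new(buf: &'a [u8]) -> Option<Self> {
        if buf.len() < Self::ENCODED_LEN {
            return None;
        }
        Some(TradeView { buf })
    }

    /// Reads the price directly from its fixed offset.
    pub fn price(&self) -> u64 {
        let mut raw = [0u8; 8];
        raw.copy_from_slice(&self.buf[0..8]);
        u64::from_le_bytes(raw)
    }

    /// Reads the trade size directly from its fixed offset.
    pub fn size(&self) -> u32 {
        let mut raw = [0u8; 4];
        raw.copy_from_slice(&self.buf[8..12]);
        u32::from_le_bytes(raw)
    }

    /// Symbol bytes borrowed from the buffer, with trailing space padding removed.
    pub fn symbol(&self) -> &'a [u8] {
        let buf: &'a [u8] = self.buf;
        let raw = &buf[12..20];
        let end = raw.iter().rposition(|b| *b != b' ').map_or(0, |i| i + 1);
        &raw[..end]
    }
}
```

Every field lives at a known offset, so reading one is a bounds check and a load; there are no offset tables to chase, which is also why a union type doesn't fit naturally into this model.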

As mentioned in the beginning, the Rust port of SBE works well, but is
[essentially unmaintained](https://users.rust-lang.org/t/zero-cost-abstraction-frontier-no-copy-low-allocation-ordered-decoding/11515/9).
However, if you don't need union types, and can accept that schemas are XML documents, it's still
worth using. SBE's implementation had the best streaming support of all formats I tested, and
doesn't trigger allocation during de/serialization.

# Results

After building a test harness
[for](https://github.com/speice-io/marketdata-shootout/blob/master/src/capnp_runner.rs)
[each](https://github.com/speice-io/marketdata-shootout/blob/master/src/flatbuffers_runner.rs)
[format](https://github.com/speice-io/marketdata-shootout/blob/master/src/sbe_runner.rs), it was
time to actually take them for a spin. I used
[this script](https://github.com/speice-io/marketdata-shootout/blob/master/run_shootout.sh) to run
the benchmarks, and the raw results are
[here](https://github.com/speice-io/marketdata-shootout/blob/master/shootout.csv). All data reported
below is the average of 10 runs on a single day of IEX data. Results were validated to make sure
that each format parsed the data correctly.

## Serialization

This test measures, on a
[per-message basis](https://github.com/speice-io/marketdata-shootout/blob/master/src/main.rs#L268-L272),
how long it takes to serialize the IEX message into the desired format and write to a pre-allocated
buffer.

| Schema               | Median | 99th Pctl | 99.9th Pctl | Total  |
| :------------------- | :----- | :-------- | :---------- | :----- |
| Cap'n Proto Packed   | 413ns  | 1751ns    | 2943ns      | 14.80s |
| Cap'n Proto Unpacked | 273ns  | 1828ns    | 2836ns      | 10.65s |
| Flatbuffers          | 355ns  | 2185ns    | 3497ns      | 14.31s |
| SBE                  | 91ns   | 1535ns    | 2423ns      | 3.91s  |
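
For reference, the median and percentile columns can be derived from a list of per-message timings along these lines. This is a minimal nearest-rank sketch with hypothetical function names, and not necessarily how the harness in the repository computes its statistics. Percentiles are passed in tenths of a percent so the rank calculation stays in integer arithmetic.

```rust
/// Nearest-rank percentile of an already sorted slice.
/// `per_mille` is the percentile in tenths of a percent:
/// 500 = median, 990 = 99th percentile, 999 = 99.9th percentile.
pub fn percentile(sorted: &[u64], per_mille: usize) -> Option<u64> {
    if sorted.is_empty() || per_mille > 1000 {
        return None;
    }
    // Integer form of ceil(per_mille * n / 1000)
    let rank = (per_mille * sorted.len() + 999) / 1000;
    // Ranks are 1-based; clamp so that percentile 0 maps to the smallest sample
    Some(sorted[rank.max(1) - 1])
}

/// Sorts per-message latencies (in nanoseconds) in place and
/// returns `(median, 99th percentile, 99.9th percentile)`.
pub fn summarize(samples_ns: &mut [u64]) -> Option<(u64, u64, u64)> {
    samples_ns.sort_unstable();
    Some((
        percentile(samples_ns, 500)?,
        percentile(samples_ns, 990)?,
        percentile(samples_ns, 999)?,
    ))
}
```

Sorting every sample is fine for a sketch; with tens of millions of messages, a histogram-based approach would use far less memory.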

## Deserialization

This test measures, on a
[per-message basis](https://github.com/speice-io/marketdata-shootout/blob/master/src/main.rs#L294-L298),
how long it takes to read the previously-serialized message and perform some basic aggregation. The
aggregation code is the same for each format, so any performance differences are due solely to the
format implementation.

| Schema               | Median | 99th Pctl | 99.9th Pctl | Total  |
| :------------------- | :----- | :-------- | :---------- | :----- |
| Cap'n Proto Packed   | 539ns  | 1216ns    | 2599ns      | 18.92s |
| Cap'n Proto Unpacked | 366ns  | 737ns     | 1583ns      | 12.32s |
| Flatbuffers          | 173ns  | 421ns     | 1007ns      | 6.00s  |
| SBE                  | 116ns  | 286ns     | 659ns       | 4.05s  |

# Conclusion

Building a benchmark turned out to be incredibly helpful in making a decision; because a "union"
type isn't important to me, I can be confident that SBE best addresses my needs.

While SBE was the fastest in terms of both median and worst-case performance, its worst-case
serialization time was proportionately far higher than any other format's: a 99.9th percentile of
2423ns against a 91ns median. It seems that de/serialization time scales with message size, but
I'll need to do some more research to understand what exactly is going on.