From c7f94b742600e000cfddc063b6f9eed3b9eb7d62 Mon Sep 17 00:00:00 2001
From: Bradlee Speice
Date: Sun, 1 Sep 2019 23:56:43 -0400
Subject: [PATCH] Start work on binary format shootout

---
 _pages/about.md                             |  2 -
 _posts/2019-09-01-binary-shootout-part-0.md | 44 +++++++++++++++++++
 _posts/2019-09-01-binary-shootout-part-1.md | 47 +++++++++++++++++++++
 3 files changed, 91 insertions(+), 2 deletions(-)
 create mode 100644 _posts/2019-09-01-binary-shootout-part-0.md
 create mode 100644 _posts/2019-09-01-binary-shootout-part-1.md

diff --git a/_pages/about.md b/_pages/about.md
index 782427d..699dc7d 100644
--- a/_pages/about.md
+++ b/_pages/about.md
@@ -10,5 +10,3 @@ Best ways to get in contact:
 
 - Email: [bradlee@speice.io](mailto:bradlee@speice.io)
 - LinkedIn: [bradleespeice](https://www.linkedin.com/in/bradleespeice/)
-- Matrix (Chat): [@bspeice:matrix.com](https://matrix.to/#/@bspeice:matrix.com)
-- Gitter (Chat): [bspeice](https://gitter.im/bspeice/Lobby)
diff --git a/_posts/2019-09-01-binary-shootout-part-0.md b/_posts/2019-09-01-binary-shootout-part-0.md
new file mode 100644
index 0000000..5e75b5a
--- /dev/null
+++ b/_posts/2019-09-01-binary-shootout-part-0.md
@@ -0,0 +1,44 @@
+---
+layout: post
+title: "Binary Format Shootout - Prologue: Nom"
+description: "Making sense of binary streams"
+category:
+tags: [rust, binary-shootout]
+---
+
+I've been interested in using a binary protocol library for personal projects recently,
+and found myself with a strong case of decision paralysis. Do I use
+[Cap'n Proto](https://capnproto.org/), which has supported Rust the longest?
+[FlatBuffers](https://google.github.io/flatbuffers) recently added support,
+or I could take a look at [SBE](https://github.com/real-logic/simple-binary-encoding).
+Or what about building something myself? A lot of these seem unnecessarily
+complicated, when my personal use case is just providing views on top of
+buffers with a relatively simple structure.
+
+Even in my personal projects, I want the choices to be the best possible;
+I hate the feeling of looking back at anything I've built and saying "I regret
+that decision and I could have done better." So after agonizing over the choice
+of protocol library for too long, I decided it would be worth building a test
+to get a feel for each. It would give me a way to build a proof-of-concept,
+become familiar with how each library works and what its performance
+characteristics are, and evaluate whether it's worth putting
+in the effort of building yet another binary protocol library myself.
+
+To that end, this is a summary of my research into the binary protocol
+systems that currently support Rust. The goal isn't to recommend "the best,"
+but to understand each well enough to make an informed decision.
+
+My use case is as follows: ingest binary market data from
+[IEX](https://iextrading.com/trading/market-data/) and turn it into
+a format understandable by each library being tested. We'll later
+write a simple program to analyze the data.
+
+Note: Market data is the use case here
+simply because IEX makes the data freely available; no code or analysis
+in this blog is related to my past or present work.
+
+But before we can run any analysis, we need to read in the files
+supplied by IEX. To do that, we'll use a Rust library
+called [`nom`](https://docs.rs/nom/5.0.1/nom/).
+
+# Ingesting Market Data
diff --git a/_posts/2019-09-01-binary-shootout-part-1.md b/_posts/2019-09-01-binary-shootout-part-1.md
new file mode 100644
index 0000000..c62eb04
--- /dev/null
+++ b/_posts/2019-09-01-binary-shootout-part-1.md
@@ -0,0 +1,47 @@
+---
+layout: post
+title: "new post"
+description: ""
+category:
+tags: []
+---
+
+# Designing the Test
+
+My use case is as follows: ingest binary market data from
+[IEX](https://iextrading.com/trading/market-data/) and turn it into
+a format understandable by each library being tested. Then we'll
+write a simple program to find the total trade volume per ticker,
+and the highest and lowest bid/ask price per ticker as well.
+
+Note: Market data is the use case here
+simply because IEX makes the data freely available; no code or analysis
+in this blog is related to my past or present work.
+
+Now, the basic criteria used to evaluate each library:
+
+1) The library must have cross-language support, and treat Rust as a
+first-class citizen.
+
+2) The schema must be able to evolve and add new fields. The information
+I'm gathering now is fairly simple, but will likely evolve in the future.
+
+3) Performance is a priority; material performance differences
+(time to de/serialize, memory usage) matter.
+
+Under those three criteria, we're excluding a lot of systems that
+may make sense in other contexts:
+
+- [Bincode](https://github.com/servo/bincode) has great Rust support
+and a simple wire format (message structure) but isn't usable from
+other languages and doesn't deal well with schema evolution.
+
+- [Protocol Buffers](https://developers.google.com/protocol-buffers/) have
+great cross-language support, but material performance issues compared
+to other systems like FlatBuffers.
+
+- JSON/MessagePack are schema-less; while the wire format is simple,
+having code generated from a schema is too nice to pass up.
+
+While each of these has a niche it performs well in, they're not
+suited for the system under consideration.
\ No newline at end of file