Next draft

Bradlee Speice 2018-06-19 22:50:08 -04:00
parent a69ed9f690
commit 632501ee1b
2 changed files with 44 additions and 38 deletions

@@ -21,29 +21,37 @@ what to do with your life (but you should totally keep reading).
OK, fine, I guess I should start with *why* someone would do this.
[Dateutil](https://github.com/dateutil/dateutil) is a Python library for handling dates.
While the standard library support for time in Python is kinda dope, there's a lot of pieces The standard library support for time in Python is kinda dope, but there are a lot of extras
that go into making it useful beyond just the [datetime](https://docs.python.org/3.6/library/datetime.html) that go into making it useful beyond just the [datetime](https://docs.python.org/3.6/library/datetime.html)
module. module. `dateutil.parser` specifically is code to take all the super-weird time formats people
come up with and turn them into something actually useful.
Specifically, `dateutil.parser` is code to take all the super-weird time formats people Date/time parsing, it turns out, is just like [everything else](https://zachholman.com/talk/utc-is-enough-for-everyone-right)
come up with and turn them into something actually useful. Just like [everything else](https://zachholman.com/talk/utc-is-enough-for-everyone-right) [involving](https://i.redd.it/syw7q6gc77f01.jpg) [computers](https://infiniteundo.com/post/25326999628/falsehoods-programmers-believe-about-time)
[involving](https://i.redd.it/syw7q6gc77f01.jpg) [computers](https://infiniteundo.com/post/25326999628/falsehoods-programmers-believe-about-time) and [time](https://infiniteundo.com/post/25509354022/more-falsehoods-programmers-believe-about-time):
and [time](https://infiniteundo.com/post/25509354022/more-falsehoods-programmers-believe-about-time),
it feels like it shouldn't be that difficult to do, until you try to do it,
and you realize that people just suck and this is why we can't have nice things. and you realize that people suck and this is why [we can't have nice things](https://zachholman.com/talk/utc-is-enough-for-everyone-right).
But alas, we can still try and make contemporary art out of the rubble. But alas, we'll try and make contemporary art out of the rubble and give it a
pretentious name like *Time*.
What makes `dateutil.parser` great is that there's a single super-important function: `parse(time_string)`. ![A gravel mound](/assets/images/2018-06-25-gravel-mound.jpg)
> [Time](https://www.goodfreephotos.com/united-states/montana/elkhorn/remains-of-the-mining-operation-elkhorn.jpg.php)
What makes `dateutil.parser` great is that there's a single function with a single argument that drives
what programmers interact with: [`parse(timestr)`](https://github.com/dateutil/dateutil/blob/6dde5d6298cfb81a4c594a38439462799ed2aef2/dateutil/parser/_parser.py#L1258).
It takes in the time as a string, and gives you back a reasonable "look, this is the best
anyone can possibly do to make sense of your input" value. It doesn't expect much of you.
Which is great. And now it's in Rust.
[And now it's in Rust.](https://github.com/bspeice/dtparse/blob/7d565d3a78876dbebd9711c9720364fe9eba7915/src/lib.rs#L1332)
# Lost in Translation
Having worked at Bank of America and seeing Java programmers try to be Python programmers, Having worked at a bulge-bracket bank watching Java programmers try to be Python programmers,
I'm admittedly hesitant to publish Python code that's pretending to be Rust. I'm admittedly hesitant to publish Python code that's trying to be Rust.
Interestingly, Rust code can actually do a great job of mimicking Python.
It's certainly not idiomatic Rust, but [the Iterator pattern is the same](https://webcache.googleusercontent.com/search?q=cache:wkYMpktJtnUJ:https://jackstouffer.com/blog/porting_dateutil.html+&cd=3&hl=en&ct=clnk&gl=us). It's certainly not idiomatic Rust, but I've had better experiences
than [this guy](https://webcache.googleusercontent.com/search?q=cache:wkYMpktJtnUJ:https://jackstouffer.com/blog/porting_dateutil.html+&cd=3&hl=en&ct=clnk&gl=us)
who attempted the same thing for D. These are the actual take-aways:
When transcribing code, **stay as close to the original library as possible**. I'm talking
about using the same variable names, same access patterns, the whole shebang.
@@ -67,7 +75,7 @@ And while `dateutil` is pretty well-behaved about not skipping multiple stack fr
take a while to verify.
As another Python quirk, **be very careful about [long nested if-elif-else blocks](https://github.com/dateutil/dateutil/blob/16561fc99361979e88cccbd135393b06b1af7e90/dateutil/parser/_parser.py#L494-L568)**.
I used to think that [Python's whitespace](https://www.xkcd.com/353/) was just there I used to think that Python's whitespace was just there
to get you to format your code correctly. I think that no longer. It's way too easy
to close an extra block and have incredibly weird issues in the logic.
@@ -82,8 +90,9 @@ Finally, **I really miss list comprehensions and dictionary comprehensions.**
As a quick comparison, see
[this dateutil code](https://github.com/dateutil/dateutil/blob/16561fc99361979e88cccbd135393b06b1af7e90/dateutil/parser/_parser.py#L476)
and [the implementation in Rust](https://github.com/bspeice/dtparse/blob/7d565d3a78876dbebd9711c9720364fe9eba7915/src/lib.rs#L619-L629).
Ultimately, I hope that these can be added through macros, but I have a feeling that they'd actually I probably wrote it wrong, and I'm sorry. Ultimately, I hope that these comprehensions can be added through macros,
need to be syntax extensions. Either way, they're expressive, save typing, and super-readable. Let's get more of that. but I have a feeling that they'd actually need to be syntax extensions. Either way, they're expressive, save typing,
and are super-readable. Let's get more of that.
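
To make the comprehension gripe a bit more concrete, here's a rough sketch of what a Python list comprehension and a dictionary comprehension tend to turn into as Rust iterator chains. This is not the actual `dateutil` or `dtparse` code; the function names and inputs are made up purely for illustration, and only the standard library is assumed.

```rust
use std::collections::HashMap;

// Hypothetical example, roughly equivalent to the Python:
//     [x * x for x in xs if x % 2 == 0]
fn squares_of_evens(xs: &[i32]) -> Vec<i32> {
    xs.iter()
        .filter(|&&x| x % 2 == 0) // the `if` clause of the comprehension
        .map(|&x| x * x)          // the expression at the front
        .collect()
}

// Hypothetical example, roughly equivalent to the Python:
//     {name: i for i, name in enumerate(names)}
fn index_by_name<'a>(names: &[&'a str]) -> HashMap<&'a str, usize> {
    names
        .iter()
        .enumerate()                   // yields (index, &name) pairs
        .map(|(i, name)| (*name, i))   // flip into (key, value) order
        .collect()                     // HashMap can be built from pairs
}
```

Calling `squares_of_evens(&[1, 2, 3, 4])` gives back `[4, 16]`. It works, but the intent is more spread out than in the one-line Python versions.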
# Using a young language
@@ -91,11 +100,10 @@ Now, Rust is exciting and new, which means that there's opportunity to make a su
On more than one occasion I've had issues navigating the Rust ecosystem though.
What I'll call the "canonical library" is still being built. In Python, if you need datetime parsing,
you use `dateutil`. If you want [Decimal](https://docs.python.org/3.6/library/decimal.html) types, you use `dateutil`. If you want `decimal` types, it's already in the
it's already in the standard library. It's probably [standard library](https://docs.python.org/3.6/library/decimal.html). The way `dateutil` uses decimals
[not strictly necessary in `dateutil`](https://github.com/dateutil/dateutil/blob/16561fc99361979e88cccbd135393b06b1af7e90/dateutil/parser/_parser.py#L1242), probably isn't strictly necessary, but I wanted to follow the principle of **stay as close to the original library as possible**.
but I wanted to follow the principle of **stay as close to the original library as possible** Thus began my quest to find a decimal library in Rust. What I quickly found was summarized
and thus began my quest to find a decimal library in Rust. What I quickly found was summarized
in a comment:
> Writing a BigDecimal is easy. Writing a *good* BigDecimal is hard.
@@ -105,15 +113,14 @@ in a comment:
In practice, this means that there are at least [4](https://crates.io/crates/bigdecimal)
[different](https://crates.io/crates/rust_decimal) [implementations](https://crates.io/crates/decimal)
[available](https://crates.io/crates/decimate). And that's a lot of decisions to worry about
when all I'm thinking about is "I just want a reasonable Decimal library" and I'm forced to dig through a when all I'm thinking about is "why can't [calendar reform](https://en.wikipedia.org/wiki/Calendar_reform) be a thing"
[couple](https://github.com/rust-lang/rust/issues/8937#issuecomment-31661916) and I'm forced to dig through a [couple](https://github.com/rust-lang/rust/issues/8937#issuecomment-31661916)
[different](https://github.com/rust-lang/rfcs/issues/334) [different](https://github.com/rust-lang/rfcs/issues/334) [threads](https://github.com/rust-num/num/issues/8)
[threads](https://github.com/rust-num/num/issues/8) to figure out if the library I'm looking at is DOA stable. to figure out if the library I'm looking at is dead or just stable.
And even when the "canonical library" exists for something like timezones ([`pytz`](https://pythonhosted.org/pytz/) and And even when the "canonical library" exists, there's no guarantees that it will be well-maintained.
more recently [`dateutil.tz`](https://dateutil.readthedocs.io/en/stable/tz.html) in Python), there's no guarantees [Chrono](https://github.com/chronotope/chrono) is the *de facto* date/time library in Rust,
that it will be well-maintained. [Chrono](https://github.com/chronotope/chrono) is currently the canonical datetime and just released version 0.4.3 like a week ago. Meanwhile, [chrono-tz](https://github.com/chronotope/chrono-tz)
library in Rust, and just released version 0.4.3 like a week ago. Meanwhile, [chrono-tz](https://github.com/chronotope/chrono-tz)
appears to be dead in the water even though [there are people happy to help maintain it](https://github.com/chronotope/chrono-tz/issues/19).
I know relatively little about it, but it appears that most of the release process is automated; keeping
that up to date should be a no-brainer.
@@ -134,21 +141,20 @@ then first congratulations on sustaining human life, and second I don't mind kee
I just want to try and balance keeping things moving with giving people the necessary time.
I should also note that I'm still getting some best practices in place - CONTRIBUTING and CONTRIBUTORS files
need to be added, as well as issue/PR templates. In progress. need to be added, as well as issue/PR templates. In progress. None of us are perfect.
# Roadmap and Conclusion
So if I've now built a `dateutil`-compatible parser, we're done, right? Of course not! That's not
nearly ambitious enough.
Ultimately, I'd love to have a library that's capable of essentially everything the Linux `date` Ultimately, I'd love to have a library that's capable of parsing essentially everything the Linux `date`
command can do (and not `date` on OSX, because seriously, it's the worst). I know Rust has a command can do (and not `date` on OSX, because seriously, BSD coreutils are the worst). I know Rust has a
coreutils rewrite going on, and this would be potentially an interesting candidate since it coreutils rewrite going on, and `dtparse` would be potentially an interesting candidate since it
doesn't bring in a lot of extra dependencies for the functionality it provides. doesn't bring in a lot of extra dependencies. [`humantime`](https://crates.io/crates/humantime)
[`humantime`](https://crates.io/crates/humantime) also is able to parse durations, could help pick up some of the (current) slack in dtparse, so maybe we can share and care with each other?
so maybe we negotiate something to integrate it all together?
All in all, I'm really hoping that nobody's already done this and I've spent a bit over a month All in all, I'm mostly hoping that nobody's already done this and I've spent a bit over a month
on redundant code. So if it exists, tell me because I need to know, but be nice about it. on redundant code. So if it exists, tell me. I need to know, but be nice about it because I'm going to take it hard.
And in the meantime, I'm looking forward to building more. Onwards.

Binary file not shown (165 KiB).