Mirror of https://github.com/bspeice/dtparse, synced 2025-07-03 23:05:18 -04:00
Compare commits: v0.9.1...6a5ec31d8e (51 commits)

Commits in this range, newest first:
6a5ec31d8e, 23f50fb62b, f1ca602e9f, 899cd88280, a08bb2d9d7, af6c3238c4,
b098f54f8b, 61022c323e, 4079b3ce2f, 3e03b188b4, 7147677926, 22b6a321e6,
9edc2a3102, 245f746c8c, 5782a573bc, e895fbd9f3, 2a2f1e7fbd, e9c994a755,
d6fc72459e, d7ff381d7f, d5e0a5d46a, 9f1b8d4971, 0f7ac8538c, b81a8d9541,
030ca4fced, 142712900f, c310cbaa0d, ef3ea38834, 741afa3451, 4d7c5dd995,
afb7747cdf, 22e0300275, 0ef35527d9, b5fa1d89ef, 246b389ac9, 4d48885f4b,
48705339e6, 01ebec84bb, 28b7bec91d, b966c02d04, 4133343e93, 0d3b646749,
50fe2c01d4, 893cf6d40c, 8f8ba7887a, 256f937742, 91a3a4a481, c884bc5842,
44e37b364c, c6feaebe71, 0d18eb524b
.gitignore (2 changes)

@@ -4,3 +4,5 @@
 Cargo.lock
 .vscode
 *.pyc
+.idea/
+*.swp
.travis.yml (43 changes)

@@ -1,5 +1,40 @@
 language: rust
-rust:
-- stable
-- beta
-- nightly
+jobs:
+  include:
+    - rust: stable
+      os: linux
+    - rust: 1.28.0
+      os: linux
+      env: DISABLE_TOOLS=true
+    - rust: stable
+      os: osx
+    - rust: stable-msvc
+      os: windows
+    - rust: stable
+      os: windows
+
+cache:
+  - cargo
+
+before_script:
+  - rustup show
+  # CMake doesn't like the `sh.exe` provided by Git being in PATH
+  - if [[ "$TRAVIS_OS_NAME" == "windows" ]]; then rm "C:/Program Files/Git/usr/bin/sh.exe"; fi
+  - if [[ "$DISABLE_TOOLS" == "" ]]; then rustup component add clippy; rustup component add rustfmt; fi
+
+script:
+  - if [[ "$DISABLE_TOOLS" == "" ]]; then cargo clippy --all && cargo fmt --all -- --check; fi
+
+  # For default build, split up compilation and tests so we can track build times
+  - cargo test --no-run
+  - cargo test
+  - cargo test --release --no-run
+  - cargo test --release
+
+branches:
+  only:
+    - master
+    - staging
+    - trying
CHANGELOG.md (new file, 34 lines)

Version 1.0.3 (2018-09-18)
==========================

Misc
----

- Changed the default `parse` function to use a static parser

Version 1.0.2 (2018-08-14)
==========================

Misc
----

- Add tests for WASM

Version 1.0.1 (2018-08-11)
==========================

Bugfixes
--------

- Fixed an issue with "GMT+3" not being handled correctly

Misc
----

- Upgrade `lazy_static` and `rust_decimal` dependencies

Version 1.0.0 (2018-08-03)
==========================

Initial release. Passes all relevant unit tests from Python's
`dateutil` project.
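The 1.0.3 entry's "static parser" refers to building one default `Parser` up front and reusing it across calls to the free `parse` function, rather than constructing a fresh parser (and its lookup tables) on every call. A minimal sketch of that pattern, assuming the `lazy_static` dependency from Cargo.toml and the `Parser::parse` signature shown in the README diff below; `parse_with_static` is a hypothetical name, not dtparse's internal one:

```rust
extern crate chrono;
extern crate dtparse;
#[macro_use]
extern crate lazy_static;

use chrono::NaiveDateTime;
use dtparse::Parser;
use std::collections::HashMap;

lazy_static! {
    // Built once on first use, then shared by every call below.
    static ref DEFAULT_PARSER: Parser = Parser::default();
}

fn parse_with_static(timestr: &str) -> Option<NaiveDateTime> {
    DEFAULT_PARSER
        .parse(timestr, None, None, false, false, None, false, &HashMap::new())
        .ok()
        .map(|(dt, _offset, _tokens)| dt)
}

fn main() {
    assert!(parse_with_static("2003-09-25T10:49:41").is_some());
}
```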
CONTRIBUTING.md (new file, 44 lines)

# Contributing

The `dtparse` crate is better for the contributions made by members of the open source community,
and seeks to make it easy to contribute back to the community it comes from. The goals are
fairly straight-forward, but here are the ways that would be most beneficial:

## Bug Reports

The testing suite for `dtparse` is built using tests derived from the [`dateutil`](https://github.com/dateutil/dateutil)
package in Python. Some Rust-specific behavior may show up though, for example in how
Rust handles nanoseconds where Python's standard library will only go to microseconds.

If you believe that behavior is improper, you are encouraged to file an issue; there are no dumb
issues or suggestions, and the world is a better place for having your input.

## Testing/Fuzzing

`dtparse`'s history as a port of Python software has led to some behavior being shown in Rust
that would not otherwise be an issue in Python. Testing for these issues to prevent panics
is greatly appreciated, and some great work has already happened surrounding fuzzing.

New test cases built either by fuzzers or humans are welcome.

## Feature Requests

Handling weird date formats and quirks is the name of the game. Any ideas on how to improve that,
or utilities useful in handling the mapping of human time to computers, are appreciated.

Writing code to implement the feature is never mandatory (though always appreciated); if there's
something you believe `dtparse` should do that it doesn't currently support, let's make that happen.

# Development Setup

The setup requirements for `dtparse` should be fairly straightforward - the project can be built
and deployed using only the `cargo` tool in Rust.

Much of the test code is generated from Python code, and then the generated versions are stored
in version control. This is to ensure that all users can run the tests even without
installing Python or the other necessary packages.

To regenerate the tests, please use Python 3.6 with the `dateutil` package installed, and run:

- `python build_pycompat.py`
- `python build_pycompat_tokenizer.py`
CONTRIBUTORS.md (new file, 7 lines)

This project benefits from the Rust and open source communities, but most specifically from these people:

# Contributors:

- [@messense](https://github.com/messense)
- [@mjmeehan](https://github.com/mjmeehan)
- [@neosilky](https://github.com/neosilky)
Cargo.toml

@@ -1,6 +1,6 @@
 [package]
 name = "dtparse"
-version = "0.9.1"
+version = "1.1.0"
 authors = ["Bradlee Speice <bradlee@speice.io>"]
 description = "A dateutil-compatible timestamp parser for Rust"
 repository = "https://github.com/bspeice/dtparse.git"
@@ -10,12 +10,14 @@ license = "Apache-2.0"
 
 [badges]
 travis-ci = { repository = "bspeice/dtparse" }
+maintenance = { status = "passively-maintained" }
 
 [lib]
 name = "dtparse"
 
 [dependencies]
 chrono = "0.4"
-lazy_static = "1.0"
+chrono-tz = "0.5"
+lazy_static = "1.1"
 num-traits = "0.2"
-rust_decimal = "0.9"
+rust_decimal = "^0.10.1"
README.md (81 changes)

@@ -1,27 +1,74 @@
 # dtparse
-A [dateutil](https://github.com/dateutil/dateutil)-compatible timestamp parser for Rust
 
-## Where it stands
+[Travis CI](https://travis-ci.org/bspeice/dtparse)
+[crates.io](https://crates.io/crates/dtparse)
+[docs.rs](https://docs.rs/dtparse/)
 
-The library works really well at the moment, and passes the vast majority of `dateutil`s parser
-test suite. This isn't mission-critical ready, but is more than ready for hobbyist projects.
-
-The issues to be resolved before version 1.0:
-
-**Functionality**:
-
-1. ~~We don't support weekday parsing. In the Python side this is accomplished via `dateutil.relativedelta`~~
-Supported in v0.8
-
-2. Named timezones aren't supported very well. [chrono_tz](https://github.com/chronotope/chrono-tz)
-theoretically would provide support, but I'd also like some helper things available (e.g. "EST" is not a named zone in `chrono-tz`).
-Explicit time zones (i.e. "00:00:00 -0300") are working as expected.
-
-3. ~~"Fuzzy" and "Fuzzy with tokens" modes haven't been tested. The code should work, but I need to get the
-test cases added to the auto-generation suite~~
-
-**Non-functional**: This library is intended to be a direct port from Python, and thus the code
-looks a lot more like Python than it does Rust. There are a ton of `TODO` comments in the code
-that need cleaned up, things that could be converted to enums, etc.
-
-In addition, some more documentation would be incredibly helpful. It's, uh, sparse at the moment.
+The fully-featured "even I couldn't understand that" time parser.
+Designed to take in strings and give back sensible dates and times.
+
+dtparse has its foundations in the [`dateutil`](dateutil) library for
+Python, which excels at taking "interesting" strings and trying to make
+sense of the dates and times they contain. A couple of quick examples
+from the test cases should give some context:
+
+```rust
+extern crate chrono;
+extern crate dtparse;
+use chrono::prelude::*;
+use dtparse::parse;
+
+assert_eq!(
+    parse("2008.12.30"),
+    Ok((NaiveDate::from_ymd(2008, 12, 30).and_hms(0, 0, 0), None))
+);
+
+// It can even handle timezones!
+assert_eq!(
+    parse("January 4, 2024; 18:30:04 +02:00"),
+    Ok((
+        NaiveDate::from_ymd(2024, 1, 4).and_hms(18, 30, 4),
+        Some(FixedOffset::east(7200))
+    ))
+);
+```
+
+And we can even handle fuzzy strings where dates/times aren't the
+only content if we dig into the implementation a bit!
+
+```rust
+extern crate chrono;
+extern crate dtparse;
+use chrono::prelude::*;
+use dtparse::Parser;
+use std::collections::HashMap;
+
+let mut p = Parser::default();
+assert_eq!(
+    p.parse(
+        "I first released this library on the 17th of June, 2018.",
+        None, None,
+        true /* turns on fuzzy mode */,
+        true /* gives us the tokens that weren't recognized */,
+        None, false, &HashMap::new()
+    ),
+    Ok((
+        NaiveDate::from_ymd(2018, 6, 17).and_hms(0, 0, 0),
+        None,
+        Some(vec!["I first released this library on the ",
+                  " of ", ", "].iter().map(|&s| s.into()).collect())
+    ))
+);
+```
+
+Further examples can be found in the [examples](examples) directory on international usage.
+
+# Usage
+
+`dtparse` requires a minimum Rust version of 1.28 to build, but is tested on Windows, OSX,
+BSD, Linux, and WASM. The build is also compiled against the iOS and Android SDK's, but is not
+tested against them.
+
+[dateutil]: https://github.com/dateutil/dateutil
+[examples]: https://github.com/bspeice/dtparse/tree/master/examples
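One thing the new README leaves implicit is what a caller does with the returned offset. A small consumer-side sketch (not from the repository) that re-attaches the detected `FixedOffset` using chrono's `TimeZone` API, with an input string drawn from the dateutil-derived test data further down this page:

```rust
extern crate chrono;
extern crate dtparse;

use chrono::prelude::*;
use dtparse::parse;

fn main() {
    // `parse` returns the naive timestamp plus the offset it detected, if any.
    let (naive, offset) = parse("Thu, 25 Sep 2003 10:49:41 -0300").expect("parse failed");

    match offset {
        // Re-attach the detected UTC offset to get a timezone-aware DateTime.
        Some(off) => {
            let aware: DateTime<FixedOffset> = off.from_local_datetime(&naive).unwrap();
            println!("aware: {}", aware);
        }
        None => println!("naive: {}", naive),
    }
}
```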
bors.toml (new file, 4 lines)

status = [
  "continuous-integration/travis-ci/push",
]
delete_merged_branches = true
build_pycompat.py (44 changes, mode changed from normal file to executable file)

@@ -1,4 +1,6 @@
+#!/usr/bin/python3
 from dateutil.parser import parse
+from dateutil.tz import tzutc
 from datetime import datetime
 
 tests = {
@@ -48,7 +50,9 @@ tests = {
     'test_parse_offset': [
         'Thu, 25 Sep 2003 10:49:41 -0300', '2003-09-25T10:49:41.5-03:00',
         '2003-09-25T10:49:41-03:00', '20030925T104941.5-0300',
-        '20030925T104941-0300'
+        '20030925T104941-0300',
+        # dtparse-specific
+        "2018-08-10 10:00:00 UTC+3", "2018-08-10 03:36:47 PM GMT-4", "2018-08-10 04:15:00 AM Z-02:00"
     ],
     'test_parse_dayfirst': [
         '10-09-2003', '10.09.2003', '10/09/2003', '10 09 2003',
@@ -77,7 +81,7 @@ tests = {
         'Thu Sep 25 10:36:28 BRST 2003', '1996.07.10 AD at 15:08:56 PDT',
         'Tuesday, April 12, 1952 AD 3:30:42pm PST',
         'November 5, 1994, 8:15:30 am EST', '1994-11-05T08:15:30-05:00',
-        '1994-11-05T08:15:30Z', '1976-07-04T00:01:02Z',
+        '1994-11-05T08:15:30Z', '1976-07-04T00:01:02Z', '1986-07-05T08:15:30z',
         'Tue Apr 4 00:22:12 PDT 1995'
     ],
     'test_fuzzy_tzinfo': [
@@ -189,6 +193,10 @@ def test_fuzzy_simple(i, s):
 
 # Here lies all the ugly junk.
 TEST_HEADER = '''
+//! This code has been generated by running the `build_pycompat.py` script
+//! in the repository root. Please do not edit it, as your edits will be destroyed
+//! upon re-running code generation.
+
 extern crate chrono;
 
 use chrono::Datelike;
@@ -222,10 +230,10 @@ fn parse_and_assert(
     fuzzy_with_tokens: bool,
     default: Option<&NaiveDateTime>,
     ignoretz: bool,
-    tzinfos: HashMap<String, i32>,
+    tzinfos: &HashMap<String, i32>,
 ) {
 
-    let mut parser = Parser::new(info);
+    let parser = Parser::new(info);
     let rs_parsed = parser.parse(
         s,
         dayfirst,
@@ -272,10 +280,10 @@ fn parse_fuzzy_and_assert(
     fuzzy_with_tokens: bool,
     default: Option<&NaiveDateTime>,
     ignoretz: bool,
-    tzinfos: HashMap<String, i32>,
+    tzinfos: &HashMap<String, i32>,
 ) {
 
-    let mut parser = Parser::new(info);
+    let parser = Parser::new(info);
     let rs_parsed = parser.parse(
         s,
         dayfirst,
@@ -316,7 +324,7 @@ fn test_parse_default{i}() {{
         micros: {d.microsecond}, tzo: None
     }};
     parse_and_assert(pdt, info, "{s}", None, None, false, false,
-                     Some(default_rsdate), false, HashMap::new());
+                     Some(default_rsdate), false, &HashMap::new());
 }}\n'''
 
 TEST_PARSE_SIMPLE = '''
@@ -340,7 +348,7 @@ fn test_parse_tzinfo{i}() {{
         micros: {d.microsecond}, tzo: Some({offset}),
     }};
     parse_and_assert(pdt, info, "{s}", None, None, false, false,
-                     None, false, rs_tzinfo_map!());
+                     None, false, &rs_tzinfo_map!());
 }}\n'''
 
 TEST_PARSE_OFFSET = '''
@@ -353,7 +361,7 @@ fn test_parse_offset{i}() {{
         micros: {d.microsecond}, tzo: Some({offset}),
     }};
     parse_and_assert(pdt, info, "{s}", None, None, false, false,
-                     None, false, HashMap::new());
+                     None, false, &HashMap::new());
 }}\n'''
 
 TEST_PARSE_DAYFIRST = '''
@@ -366,7 +374,7 @@ fn test_parse_dayfirst{i}() {{
         micros: {d.microsecond}, tzo: None,
     }};
     parse_and_assert(pdt, info, "{s}", Some(true), None, false, false,
-                     None, false, HashMap::new());
+                     None, false, &HashMap::new());
 }}\n'''
 
 TEST_PARSE_YEARFIRST = '''
@@ -379,7 +387,7 @@ fn test_parse_yearfirst{i}() {{
         micros: {d.microsecond}, tzo: None,
     }};
     parse_and_assert(pdt, info, "{s}", None, Some(true), false, false,
-                     None, false, HashMap::new());
+                     None, false, &HashMap::new());
 }}\n'''
 
 TEST_PARSE_DFYF = '''
@@ -392,7 +400,7 @@ fn test_parse_dfyf{i}() {{
         micros: {d.microsecond}, tzo: None,
     }};
     parse_and_assert(pdt, info, "{s}", Some(true), Some(true), false, false,
-                     None, false, HashMap::new());
+                     None, false, &HashMap::new());
 }}\n'''
 
 TEST_UNSPECIFIED_FALLBACK = '''
@@ -406,7 +414,7 @@ fn test_unspecified_fallback{i}() {{
         micros: {d.microsecond}, tzo: None
     }};
     parse_and_assert(pdt, info, "{s}", None, None, false, false,
-                     Some(default_rsdate), false, HashMap::new());
+                     Some(default_rsdate), false, &HashMap::new());
 }}\n'''
 
 TEST_PARSE_IGNORETZ = '''
@@ -419,7 +427,7 @@ fn test_parse_ignoretz{i}() {{
         micros: {d.microsecond}, tzo: None
     }};
     parse_and_assert(pdt, info, "{s}", None, None, false, false,
-                     None, true, HashMap::new());
+                     None, true, &HashMap::new());
 }}\n'''
 
 TEST_PARSE_DEFAULT_IGNORE = '''
@@ -434,7 +442,7 @@ fn test_parse_default_ignore{i}() {{
         micros: {d.microsecond}, tzo: None
     }};
     parse_and_assert(pdt, info, "{s}", None, None, false, false,
-                     Some(default_rsdate), false, HashMap::new());
+                     Some(default_rsdate), false, &HashMap::new());
 }}\n'''
 
 TEST_FUZZY_TZINFO = '''
@@ -447,7 +455,7 @@ fn test_fuzzy_tzinfo{i}() {{
         micros: {d.microsecond}, tzo: Some({offset})
     }};
     parse_fuzzy_and_assert(pdt, None, info, "{s}", None, None, true, false,
-                           None, false, HashMap::new());
+                           None, false, &HashMap::new());
 }}\n'''
 
 TEST_FUZZY_TOKENS_TZINFO = '''
@@ -461,7 +469,7 @@ fn test_fuzzy_tokens_tzinfo{i}() {{
     }};
     let tokens = vec![{tokens}];
     parse_fuzzy_and_assert(pdt, Some(tokens), info, "{s}", None, None, true, true,
-                           None, false, HashMap::new());
+                           None, false, &HashMap::new());
 }}\n'''
 
 TEST_FUZZY_SIMPLE = '''
@@ -474,7 +482,7 @@ fn test_fuzzy_simple{i}() {{
         micros: {d.microsecond}, tzo: None
     }};
     parse_fuzzy_and_assert(pdt, None, info, "{s}", None, None, true, false,
-                           None, false, HashMap::new());
+                           None, false, &HashMap::new());
 }}\n'''
build_pycompat_tokenizer.py (5 changes, mode changed from normal file to executable file)

@@ -1,3 +1,4 @@
+#!/usr/bin/python3
 from dateutil.parser import _timelex
 
 from build_pycompat import tests
@@ -24,6 +25,10 @@ fn test_tokenize{i}() {{
 
 
 TEST_HEADER = '''
+//! This code has been generated by running the `build_pycompat_tokenizer.py` script
+//! in the repository root. Please do not edit it, as your edits will be destroyed
+//! upon re-running code generation.
+
 use tokenize::Tokenizer;
 
 fn tokenize_assert(test_str: &str, comparison: Vec<&str>) {
ci/install.sh (new executable file, 47 lines)

set -ex

main() {
    local target=
    if [ $TRAVIS_OS_NAME = linux ]; then
        target=x86_64-unknown-linux-musl
        sort=sort
    else
        target=x86_64-apple-darwin
        sort=gsort  # for `sort --sort-version`, from brew's coreutils.
    fi

    # Builds for iOS are done on OSX, but require the specific target to be
    # installed.
    case $TARGET in
        aarch64-apple-ios)
            rustup target install aarch64-apple-ios
            ;;
        armv7-apple-ios)
            rustup target install armv7-apple-ios
            ;;
        armv7s-apple-ios)
            rustup target install armv7s-apple-ios
            ;;
        i386-apple-ios)
            rustup target install i386-apple-ios
            ;;
        x86_64-apple-ios)
            rustup target install x86_64-apple-ios
            ;;
    esac

    # This fetches latest stable release
    local tag=$(git ls-remote --tags --refs --exit-code https://github.com/japaric/cross \
                       | cut -d/ -f3 \
                       | grep -E '^v[0.1.0-9.]+$' \
                       | $sort --version-sort \
                       | tail -n1)
    curl -LSfs https://japaric.github.io/trust/install.sh | \
        sh -s -- \
           --force \
           --git japaric/cross \
           --tag $tag \
           --target $target
}

main
ci/script.sh (new file, 40 lines)

# This script takes care of testing your crate

set -ex

main() {
    cross build --target $TARGET
    cross build --target $TARGET --release

    if [ ! -z $DISABLE_TESTS ]; then
        return
    fi

    cross test --target $TARGET
    cross test --target $TARGET --release
}

main_web() {
    CARGO_WEB_RELEASE="$(curl -L -s -H 'Accept: application/json' https://github.com/koute/cargo-web/releases/latest)"
    CARGO_WEB_VERSION="$(echo $CARGO_WEB_RELEASE | sed -e 's/.*"tag_name":"\([^"]*\)".*/\1/')"
    CARGO_WEB_URL="https://github.com/koute/cargo-web/releases/download/$CARGO_WEB_VERSION/cargo-web-x86_64-unknown-linux-gnu.gz"

    echo "Downloading cargo-web from: $CARGO_WEB_URL"
    curl -L "$CARGO_WEB_URL" | gzip -d > cargo-web
    chmod +x cargo-web

    mkdir -p ~/.cargo/bin
    mv cargo-web ~/.cargo/bin

    cargo web build --target $TARGET
    cargo web test --target $TARGET --release
}

# we don't run the "test phase" when doing deploys
if [ -z $TRAVIS_TAG ]; then
    if [ -z "$USE_CARGO_WEB" ]; then
        main
    else
        main_web
    fi
fi
examples/russian.rs (new file, 48 lines)

extern crate chrono;
extern crate dtparse;

use chrono::NaiveDate;
use dtparse::parse_info;
use dtparse::Parser;
use dtparse::ParserInfo;
use std::collections::HashMap;

fn main() {
    // In this example, we'll just swap the default "months" parameter
    // with a version in Russian. Lovingly taken from:
    // https://github.com/dateutil/dateutil/blob/99f5770e7c63aa049b28abe465d7f1cc25b63fd2/dateutil/test/test_parser.py#L244

    let mut info = ParserInfo::default();
    info.months = parse_info(vec![
        vec!["янв", "Январь"],
        vec!["фев", "Февраль"],
        vec!["мар", "Март"],
        vec!["апр", "Апрель"],
        vec!["май", "Май"],
        vec!["июн", "Июнь"],
        vec!["июл", "Июль"],
        vec!["авг", "Август"],
        vec!["сен", "Сентябрь"],
        vec!["окт", "Октябрь"],
        vec!["ноя", "Ноябрь"],
        vec!["дек", "Декабрь"],
    ]);

    let p = Parser::new(info);

    assert_eq!(
        p.parse(
            "10 Сентябрь 2015 10:20",
            None,
            None,
            false,
            false,
            None,
            false,
            &HashMap::new()
        )
        .unwrap()
        .0,
        NaiveDate::from_ymd(2015, 9, 10).and_hms(10, 20, 0)
    );
}
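The Russian example generalizes to any locale: `parse_info` turns a list of synonym groups into the lookup table `ParserInfo` consults, with each group's position supplying the month number. A variation on the same pattern, using only the API calls shown above; the German month table here is illustrative, not something shipped with the crate:

```rust
extern crate chrono;
extern crate dtparse;

use chrono::NaiveDate;
use dtparse::{parse_info, Parser, ParserInfo};
use std::collections::HashMap;

fn main() {
    // Each inner vec lists the spellings that should map to one month.
    let mut info = ParserInfo::default();
    info.months = parse_info(vec![
        vec!["Jan", "Januar"],
        vec!["Feb", "Februar"],
        vec!["Mär", "März"],
        vec!["Apr", "April"],
        vec!["Mai", "Mai"],
        vec!["Jun", "Juni"],
        vec!["Jul", "Juli"],
        vec!["Aug", "August"],
        vec!["Sep", "September"],
        vec!["Okt", "Oktober"],
        vec!["Nov", "November"],
        vec!["Dez", "Dezember"],
    ]);

    let p = Parser::new(info);
    let (dt, _offset, _tokens) = p
        .parse("10 Dezember 2015 10:20", None, None, false, false, None, false, &HashMap::new())
        .unwrap();
    assert_eq!(dt, NaiveDate::from_ymd(2015, 12, 10).and_hms(10, 20, 0));
}
```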
fuzz/.gitignore (deleted, 5 lines)

@@ -1,5 +0,0 @@
-target
-libfuzzer
-corpus
-artifacts
fuzz/Cargo.toml (deleted, 22 lines)

@@ -1,22 +0,0 @@
-[package]
-name = "dtparse-fuzz"
-version = "0.0.1"
-authors = ["Automatically generated"]
-publish = false
-
-[package.metadata]
-cargo-fuzz = true
-
-[dependencies.dtparse]
-path = ".."
-[dependencies.libfuzzer-sys]
-git = "https://github.com/rust-fuzz/libfuzzer-sys.git"
-
-# Prevent this from interfering with workspaces
-[workspace]
-members = ["."]
-
-[[bin]]
-name = "fuzzer_script_1"
-path = "fuzzers/fuzzer_script_1.rs"
fuzz/fuzzers/fuzzer_script_1.rs (deleted, 10 lines)

@@ -1,10 +0,0 @@
-#![no_main]
-extern crate libfuzzer_sys;
-extern crate dtparse;
-use dtparse::parse;
-#[export_name="rust_fuzzer_test_input"]
-pub extern fn go(data: &[u8]) {
-    if let Ok(s) = std::str::from_utf8(data) {
-        parse(s);
-    }
-}
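The deleted harness predates libfuzzer-sys's `fuzz_target!` macro and wires the entry point up by export name. For comparison only, the same harness in the macro style would look roughly like this; the tooling detail is an assumption about cargo-fuzz, not part of this commit range:

```rust
// Sketch of the equivalent harness using libfuzzer-sys's fuzz_target! macro.
#![no_main]
#[macro_use]
extern crate libfuzzer_sys;
extern crate dtparse;

fuzz_target!(|data: &[u8]| {
    // Feed every UTF-8-decodable input into the parser and make sure it
    // returns a Result (Ok or Err) instead of panicking.
    if let Ok(s) = std::str::from_utf8(data) {
        let _ = dtparse::parse(s);
    }
});
```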
src/lib.rs (676 changes)

File diff suppressed because it is too large.
Fuzz regression tests:

@@ -3,22 +3,48 @@ use std::collections::HashMap;
 
 use parse;
 use ParseError;
-use ParseInternalError;
 use Parser;
 
 #[test]
 fn test_fuzz() {
-    assert_eq!(parse("\x2D\x38\x31\x39\x34\x38\x34"), Err(ParseError::InvalidMonth));
+    assert_eq!(
+        parse("\x2D\x38\x31\x39\x34\x38\x34"),
+        Err(ParseError::ImpossibleTimestamp("Invalid month"))
+    );
+
     // Garbage in the third delimited field
-    assert_eq!(parse("2..\x00\x000d\x00+\x010d\x01\x00\x00\x00+"),
-        Err(ParseError::InternalError(ParseInternalError::ValueError("Unknown string format".to_owned()))));
-    // OverflowError: Python int too large to convert to C long
-    // assert_eq!(parse("8888884444444888444444444881"), Err(ParseError::AmPmWithoutHour));
-    let default = NaiveDate::from_ymd(2016, 6, 29).and_hms(0, 0, 0);
-    let mut p = Parser::default();
-    let res = p.parse("\x0D\x31", None, None, false, false, Some(&default), false, HashMap::new()).unwrap();
-    assert_eq!(res.0, default);
+    assert_eq!(
+        parse("2..\x00\x000d\x00+\x010d\x01\x00\x00\x00+"),
+        Err(ParseError::UnrecognizedFormat)
+    );
 
-    assert_eq!(parse("\x2D\x2D\x32\x31\x38\x6D"), Err(ParseError::ImpossibleTimestamp("Invalid minute")));
+    let default = NaiveDate::from_ymd(2016, 6, 29).and_hms(0, 0, 0);
+    let p = Parser::default();
+    let res = p.parse(
+        "\x0D\x31",
+        None,
+        None,
+        false,
+        false,
+        Some(&default),
+        false,
+        &HashMap::new(),
+    );
+    assert_eq!(res, Err(ParseError::NoDate));
+
+    assert_eq!(
+        parse("\x2D\x2D\x32\x31\x38\x6D"),
+        Err(ParseError::ImpossibleTimestamp("Invalid minute"))
+    );
+}
+
+#[test]
+fn large_int() {
+    let parse_result = parse("1412409095009.jpg");
+    assert!(parse_result.is_err());
+}
+
+#[test]
+fn empty_string() {
+    assert_eq!(parse(""), Err(ParseError::NoDate))
 }
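These test changes track a reshaped error enum: `InvalidMonth` became `ImpossibleTimestamp("Invalid month")`, and the `InternalError`/`ValueError` nesting collapsed into `UnrecognizedFormat`. A consumer-side sketch of matching against the variants these tests exercise; the catch-all arm is defensive, since the enum has variants not shown here:

```rust
extern crate dtparse;

use dtparse::{parse, ParseError};

fn describe(input: &str) -> String {
    match parse(input) {
        Ok((dt, _offset)) => format!("parsed: {}", dt),
        // Variants below are the ones exercised by the tests in this diff.
        Err(ParseError::ImpossibleTimestamp(why)) => format!("impossible timestamp: {}", why),
        Err(ParseError::UnrecognizedFormat) => "unrecognized format".to_string(),
        Err(ParseError::NoDate) => "no date found".to_string(),
        Err(other) => format!("other error: {:?}", other),
    }
}

fn main() {
    println!("{}", describe("2003-09-25T10:49:41"));
    println!("{}", describe(""));
}
```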
File diff suppressed because it is too large.
Tokenizer compatibility tests:

@@ -1,3 +1,6 @@
+//! This code has been generated by running the `build_pycompat_tokenizer.py` script
+//! in the repository root. Please do not edit it, as your edits will be destroyed
+//! upon re-running code generation.
+
 use tokenize::Tokenizer;
 
@@ -8,7 +11,9 @@ fn tokenize_assert(test_str: &str, comparison: Vec<&str>) {
 
 #[test]
 fn test_tokenize0() {
-    let comp = vec!["Thu", " ", "Sep", " ", "25", " ", "10", ":", "36", ":", "28"];
+    let comp = vec![
+        "Thu", " ", "Sep", " ", "25", " ", "10", ":", "36", ":", "28",
+    ];
     tokenize_assert("Thu Sep 25 10:36:28", comp);
 }
 
@@ -290,7 +295,9 @@ fn test_tokenize46() {
 
 #[test]
 fn test_tokenize47() {
-    let comp = vec!["Thu", " ", "Sep", " ", "25", " ", "10", ":", "36", ":", "28", " ", "2003"];
+    let comp = vec![
+        "Thu", " ", "Sep", " ", "25", " ", "10", ":", "36", ":", "28", " ", "2003",
+    ];
     tokenize_assert("Thu Sep 25 10:36:28 2003", comp);
 }
 
@@ -302,7 +309,9 @@ fn test_tokenize48() {
 
 #[test]
 fn test_tokenize49() {
-    let comp = vec!["2003", "-", "09", "-", "25", "T", "10", ":", "49", ":", "41"];
+    let comp = vec![
+        "2003", "-", "09", "-", "25", "T", "10", ":", "49", ":", "41",
+    ];
     tokenize_assert("2003-09-25T10:49:41", comp);
 }
 
@@ -350,7 +359,9 @@ fn test_tokenize56() {
 
 #[test]
 fn test_tokenize57() {
-    let comp = vec!["2003", "-", "09", "-", "25", " ", "10", ":", "49", ":", "41.502"];
+    let comp = vec![
+        "2003", "-", "09", "-", "25", " ", "10", ":", "49", ":", "41.502",
+    ];
     tokenize_assert("2003-09-25 10:49:41,502", comp);
 }
 
@@ -506,7 +517,10 @@ fn test_tokenize82() {
 
 #[test]
 fn test_tokenize83() {
-    let comp = vec![" ", " ", "July", " ", " ", " ", "4", " ", ",", " ", " ", "1976", " ", " ", " ", "12", ":", "01", ":", "02", " ", " ", " ", "am", " ", " "];
+    let comp = vec![
+        " ", " ", "July", " ", " ", " ", "4", " ", ",", " ", " ", "1976", " ", " ", " ", "12", ":",
+        "01", ":", "02", " ", " ", " ", "am", " ", " ",
+    ];
     tokenize_assert("  July   4 ,  1976   12:01:02   am  ", comp);
 }
 
@@ -518,7 +532,9 @@ fn test_tokenize84() {
 
 #[test]
 fn test_tokenize85() {
-    let comp = vec!["1996", ".", "July", ".", "10", " ", "AD", " ", "12", ":", "08", " ", "PM"];
+    let comp = vec![
+        "1996", ".", "July", ".", "10", " ", "AD", " ", "12", ":", "08", " ", "PM",
+    ];
     tokenize_assert("1996.July.10 AD 12:08 PM", comp);
 }
 
@@ -554,25 +570,33 @@ fn test_tokenize90() {
 
 #[test]
 fn test_tokenize91() {
-    let comp = vec!["0", ":", "01", ":", "02", " ", "on", " ", "July", " ", "4", ",", " ", "1976"];
+    let comp = vec![
+        "0", ":", "01", ":", "02", " ", "on", " ", "July", " ", "4", ",", " ", "1976",
+    ];
     tokenize_assert("0:01:02 on July 4, 1976", comp);
 }
 
 #[test]
 fn test_tokenize92() {
-    let comp = vec!["0", ":", "01", ":", "02", " ", "on", " ", "July", " ", "4", ",", " ", "1976"];
+    let comp = vec![
+        "0", ":", "01", ":", "02", " ", "on", " ", "July", " ", "4", ",", " ", "1976",
+    ];
     tokenize_assert("0:01:02 on July 4, 1976", comp);
 }
 
 #[test]
 fn test_tokenize93() {
-    let comp = vec!["July", " ", "4", ",", " ", "1976", " ", "12", ":", "01", ":", "02", " ", "am"];
+    let comp = vec![
+        "July", " ", "4", ",", " ", "1976", " ", "12", ":", "01", ":", "02", " ", "am",
+    ];
     tokenize_assert("July 4, 1976 12:01:02 am", comp);
 }
 
 #[test]
 fn test_tokenize94() {
-    let comp = vec!["Mon", " ", "Jan", " ", " ", "2", " ", "04", ":", "24", ":", "27", " ", "1995"];
+    let comp = vec![
+        "Mon", " ", "Jan", " ", " ", "2", " ", "04", ":", "24", ":", "27", " ", "1995",
+    ];
     tokenize_assert("Mon Jan  2 04:24:27 1995", comp);
 }
 
@@ -584,7 +608,9 @@ fn test_tokenize95() {
 
 #[test]
 fn test_tokenize96() {
-    let comp = vec!["Jan", " ", "1", " ", "1999", " ", "11", ":", "23", ":", "34.578"];
+    let comp = vec![
+        "Jan", " ", "1", " ", "1999", " ", "11", ":", "23", ":", "34.578",
+    ];
     tokenize_assert("Jan 1 1999 11:23:34.578", comp);
 }
 
@@ -614,13 +640,17 @@ fn test_tokenize100() {
 
 #[test]
 fn test_tokenize101() {
-    let comp = vec!["0099", "-", "01", "-", "01", "T", "00", ":", "00", ":", "00"];
+    let comp = vec![
+        "0099", "-", "01", "-", "01", "T", "00", ":", "00", ":", "00",
+    ];
     tokenize_assert("0099-01-01T00:00:00", comp);
 }
 
 #[test]
 fn test_tokenize102() {
-    let comp = vec!["0031", "-", "01", "-", "01", "T", "00", ":", "00", ":", "00"];
+    let comp = vec![
+        "0031", "-", "01", "-", "01", "T", "00", ":", "00", ":", "00",
+    ];
     tokenize_assert("0031-01-01T00:00:00", comp);
 }
 
@@ -662,31 +692,42 @@ fn test_tokenize108() {
 
 #[test]
 fn test_tokenize109() {
-    let comp = vec!["Thu", " ", "Sep", " ", "25", " ", "10", ":", "36", ":", "28", " ", "BRST", " ", "2003"];
+    let comp = vec![
+        "Thu", " ", "Sep", " ", "25", " ", "10", ":", "36", ":", "28", " ", "BRST", " ", "2003",
+    ];
     tokenize_assert("Thu Sep 25 10:36:28 BRST 2003", comp);
 }
 
 #[test]
 fn test_tokenize110() {
-    let comp = vec!["2003", " ", "10", ":", "36", ":", "28", " ", "BRST", " ", "25", " ", "Sep", " ", "Thu"];
+    let comp = vec![
+        "2003", " ", "10", ":", "36", ":", "28", " ", "BRST", " ", "25", " ", "Sep", " ", "Thu",
+    ];
     tokenize_assert("2003 10:36:28 BRST 25 Sep Thu", comp);
 }
 
 #[test]
 fn test_tokenize111() {
-    let comp = vec!["Thu", ",", " ", "25", " ", "Sep", " ", "2003", " ", "10", ":", "49", ":", "41", " ", "-", "0300"];
+    let comp = vec![
+        "Thu", ",", " ", "25", " ", "Sep", " ", "2003", " ", "10", ":", "49", ":", "41", " ", "-",
+        "0300",
+    ];
     tokenize_assert("Thu, 25 Sep 2003 10:49:41 -0300", comp);
 }
 
 #[test]
 fn test_tokenize112() {
-    let comp = vec!["2003", "-", "09", "-", "25", "T", "10", ":", "49", ":", "41.5", "-", "03", ":", "00"];
+    let comp = vec![
+        "2003", "-", "09", "-", "25", "T", "10", ":", "49", ":", "41.5", "-", "03", ":", "00",
+    ];
     tokenize_assert("2003-09-25T10:49:41.5-03:00", comp);
 }
 
 #[test]
 fn test_tokenize113() {
-    let comp = vec!["2003", "-", "09", "-", "25", "T", "10", ":", "49", ":", "41", "-", "03", ":", "00"];
+    let comp = vec![
+        "2003", "-", "09", "-", "25", "T", "10", ":", "49", ":", "41", "-", "03", ":", "00",
+    ];
     tokenize_assert("2003-09-25T10:49:41-03:00", comp);
 }
 
@@ -704,198 +745,346 @@ fn test_tokenize115() {
 
 #[test]
 fn test_tokenize116() {
+    let comp = vec![
+        "2018", "-", "08", "-", "10", " ", "10", ":", "00", ":", "00", " ", "UTC", "+", "3",
+    ];
+    tokenize_assert("2018-08-10 10:00:00 UTC+3", comp);
+}
+
+#[test]
+fn test_tokenize117() {
+    let comp = vec![
+        "2018", "-", "08", "-", "10", " ", "03", ":", "36", ":", "47", " ", "PM", " ", "GMT", "-",
+        "4",
+    ];
+    tokenize_assert("2018-08-10 03:36:47 PM GMT-4", comp);
+}
+
+#[test]
+fn test_tokenize118() {
+    let comp = vec![
+        "2018", "-", "08", "-", "10", " ", "04", ":", "15", ":", "00", " ", "AM", " ", "Z", "-",
+        "02", ":", "00",
+    ];
+    tokenize_assert("2018-08-10 04:15:00 AM Z-02:00", comp);
+}
+
+#[test]
+fn test_tokenize119() {
     let comp = vec!["10", "-", "09", "-", "2003"];
     tokenize_assert("10-09-2003", comp);
 }
 
 #[test]
-fn test_tokenize117() {
+fn test_tokenize120() {
     let comp = vec!["10", ".", "09", ".", "2003"];
     tokenize_assert("10.09.2003", comp);
 }
 
 #[test]
-fn test_tokenize118() {
+fn test_tokenize121() {
     let comp = vec!["10", "/", "09", "/", "2003"];
     tokenize_assert("10/09/2003", comp);
 }
 
 #[test]
-fn test_tokenize119() {
+fn test_tokenize122() {
     let comp = vec!["10", " ", "09", " ", "2003"];
     tokenize_assert("10 09 2003", comp);
 }
 
 #[test]
-fn test_tokenize120() {
+fn test_tokenize123() {
     let comp = vec!["090107"];
     tokenize_assert("090107", comp);
 }
 
 #[test]
-fn test_tokenize121() {
+fn test_tokenize124() {
     let comp = vec!["2015", " ", "09", " ", "25"];
     tokenize_assert("2015 09 25", comp);
 }
 
 #[test]
-fn test_tokenize122() {
+fn test_tokenize125() {
     let comp = vec!["10", "-", "09", "-", "03"];
     tokenize_assert("10-09-03", comp);
 }
 
 #[test]
-fn test_tokenize123() {
+fn test_tokenize126() {
     let comp = vec!["10", ".", "09", ".", "03"];
     tokenize_assert("10.09.03", comp);
 }
 
 #[test]
-fn test_tokenize124() {
+fn test_tokenize127() {
     let comp = vec!["10", "/", "09", "/", "03"];
     tokenize_assert("10/09/03", comp);
 }
 
 #[test]
-fn test_tokenize125() {
+fn test_tokenize128() {
     let comp = vec!["10", " ", "09", " ", "03"];
     tokenize_assert("10 09 03", comp);
 }
 
 #[test]
-fn test_tokenize126() {
-    let comp = vec!["090107"];
-    tokenize_assert("090107", comp);
-}
-
-#[test]
-fn test_tokenize127() {
-    let comp = vec!["2015", " ", "09", " ", "25"];
-    tokenize_assert("2015 09 25", comp);
-}
-
-#[test]
-fn test_tokenize128() {
-    let comp = vec!["090107"];
-    tokenize_assert("090107", comp);
-}
-
-#[test]
 fn test_tokenize129() {
-    let comp = vec!["2015", " ", "09", " ", "25"];
-    tokenize_assert("2015 09 25", comp);
+    let comp = vec!["090107"];
+    tokenize_assert("090107", comp);
 }
 
 #[test]
 fn test_tokenize130() {
+    let comp = vec!["2015", " ", "09", " ", "25"];
+    tokenize_assert("2015 09 25", comp);
+}
+
+#[test]
+fn test_tokenize131() {
+    let comp = vec!["090107"];
+    tokenize_assert("090107", comp);
+}
+
+#[test]
+fn test_tokenize132() {
+    let comp = vec!["2015", " ", "09", " ", "25"];
+    tokenize_assert("2015 09 25", comp);
+}
+
+#[test]
+fn test_tokenize133() {
     let comp = vec!["April", " ", "2009"];
     tokenize_assert("April 2009", comp);
 }
 
 #[test]
-fn test_tokenize131() {
+fn test_tokenize134() {
     let comp = vec!["Feb", " ", "2007"];
     tokenize_assert("Feb 2007", comp);
 }
 
 #[test]
-fn test_tokenize132() {
+fn test_tokenize135() {
     let comp = vec!["Feb", " ", "2008"];
     tokenize_assert("Feb 2008", comp);
 }
 
 #[test]
-fn test_tokenize133() {
-    let comp = vec!["Thu", " ", "Sep", " ", "25", " ", "10", ":", "36", ":", "28", " ", "BRST", " ", "2003"];
+fn test_tokenize136() {
+    let comp = vec![
+        "Thu", " ", "Sep", " ", "25", " ", "10", ":", "36", ":", "28", " ", "BRST", " ", "2003",
+    ];
     tokenize_assert("Thu Sep 25 10:36:28 BRST 2003", comp);
 }
 
 #[test]
-fn test_tokenize134() {
-    let comp = vec!["1996", ".", "07", ".", "10", " ", "AD", " ", "at", " ", "15", ":", "08", ":", "56", " ", "PDT"];
+fn test_tokenize137() {
+    let comp = vec![
+        "1996", ".", "07", ".", "10", " ", "AD", " ", "at", " ", "15", ":", "08", ":", "56", " ",
+        "PDT",
+    ];
     tokenize_assert("1996.07.10 AD at 15:08:56 PDT", comp);
 }
 
 #[test]
-fn test_tokenize135() {
-    let comp = vec!["Tuesday", ",", " ", "April", " ", "12", ",", " ", "1952", " ", "AD", " ", "3", ":", "30", ":", "42", "pm", " ", "PST"];
+fn test_tokenize138() {
+    let comp = vec![
+        "Tuesday", ",", " ", "April", " ", "12", ",", " ", "1952", " ", "AD", " ", "3", ":", "30",
+        ":", "42", "pm", " ", "PST",
+    ];
     tokenize_assert("Tuesday, April 12, 1952 AD 3:30:42pm PST", comp);
 }
 
 #[test]
-fn test_tokenize136() {
-    let comp = vec!["November", " ", "5", ",", " ", "1994", ",", " ", "8", ":", "15", ":", "30", " ", "am", " ", "EST"];
+fn test_tokenize139() {
+    let comp = vec![
+        "November", " ", "5", ",", " ", "1994", ",", " ", "8", ":", "15", ":", "30", " ", "am",
+        " ", "EST",
+    ];
     tokenize_assert("November 5, 1994, 8:15:30 am EST", comp);
 }
 
 #[test]
-fn test_tokenize137() {
-    let comp = vec!["1994", "-", "11", "-", "05", "T", "08", ":", "15", ":", "30", "-", "05", ":", "00"];
+fn test_tokenize140() {
+    let comp = vec![
+        "1994", "-", "11", "-", "05", "T", "08", ":", "15", ":", "30", "-", "05", ":", "00",
+    ];
     tokenize_assert("1994-11-05T08:15:30-05:00", comp);
 }
 
 #[test]
-fn test_tokenize138() {
-    let comp = vec!["1994", "-", "11", "-", "05", "T", "08", ":", "15", ":", "30", "Z"];
+fn test_tokenize141() {
+    let comp = vec![
+        "1994", "-", "11", "-", "05", "T", "08", ":", "15", ":", "30", "Z",
+    ];
     tokenize_assert("1994-11-05T08:15:30Z", comp);
 }
 
 #[test]
-fn test_tokenize139() {
-    let comp = vec!["1976", "-", "07", "-", "04", "T", "00", ":", "01", ":", "02", "Z"];
+fn test_tokenize142() {
+    let comp = vec![
+        "1976", "-", "07", "-", "04", "T", "00", ":", "01", ":", "02", "Z",
+    ];
     tokenize_assert("1976-07-04T00:01:02Z", comp);
 }
 
 #[test]
-fn test_tokenize140() {
-    let comp = vec!["Tue", " ", "Apr", " ", "4", " ", "00", ":", "22", ":", "12", " ", "PDT", " ", "1995"];
+fn test_tokenize143() {
+    let comp = vec![
+        "Tue", " ", "Apr", " ", "4", " ", "00", ":", "22", ":", "12", " ", "PDT", " ", "1995",
+    ];
     tokenize_assert("Tue Apr 4 00:22:12 PDT 1995", comp);
 }
 
 #[test]
-fn test_tokenize141() {
-    let comp = vec!["Today", " ", "is", " ", "25", " ", "of", " ", "September", " ", "of", " ", "2003", ",", " ", "exactly", " ", "at", " ", "10", ":", "49", ":", "41", " ", "with", " ", "timezone", " ", "-", "03", ":", "00", "."];
-    tokenize_assert("Today is 25 of September of 2003, exactly at 10:49:41 with timezone -03:00.", comp);
-}
-
-#[test]
-fn test_tokenize142() {
-    let comp = vec!["Today", " ", "is", " ", "25", " ", "of", " ", "September", " ", "of", " ", "2003", ",", " ", "exactly", " ", "at", " ", "10", ":", "49", ":", "41", " ", "with", " ", "timezone", " ", "-", "03", ":", "00", "."];
-    tokenize_assert("Today is 25 of September of 2003, exactly at 10:49:41 with timezone -03:00.", comp);
-}
-
-#[test]
-fn test_tokenize143() {
-    let comp = vec!["I", " ", "have", " ", "a", " ", "meeting", " ", "on", " ", "March", " ", "1", ",", " ", "1974"];
-    tokenize_assert("I have a meeting on March 1, 1974", comp);
-}
-
-#[test]
 fn test_tokenize144() {
-    let comp = vec!["On", " ", "June", " ", "8", "th", ",", " ", "2020", ",", " ", "I", " ", "am", " ", "going", " ", "to", " ", "be", " ", "the", " ", "first", " ", "man", " ", "on", " ", "Mars"];
-    tokenize_assert("On June 8th, 2020, I am going to be the first man on Mars", comp);
+    let comp = vec![
+        "Today",
+        " ",
+        "is",
+        " ",
+        "25",
+        " ",
+        "of",
+        " ",
+        "September",
+        " ",
+        "of",
+        " ",
+        "2003",
+        ",",
+        " ",
+        "exactly",
+        " ",
+        "at",
+        " ",
+        "10",
+        ":",
+        "49",
+        ":",
+        "41",
+        " ",
+        "with",
+        " ",
+        "timezone",
+        " ",
+        "-",
+        "03",
+        ":",
+        "00",
+        ".",
+    ];
+    tokenize_assert(
+        "Today is 25 of September of 2003, exactly at 10:49:41 with timezone -03:00.",
+        comp,
+    );
 }
 
 #[test]
 fn test_tokenize145() {
-    let comp = vec!["Meet", " ", "me", " ", "at", " ", "the", " ", "AM", "/", "PM", " ", "on", " ", "Sunset", " ", "at", " ", "3", ":", "00", " ", "AM", " ", "on", " ", "December", " ", "3", "rd", ",", " ", "2003"];
-    tokenize_assert("Meet me at the AM/PM on Sunset at 3:00 AM on December 3rd, 2003", comp);
+    let comp = vec![
+        "Today",
+        " ",
+        "is",
+        " ",
+        "25",
+        " ",
+        "of",
+        " ",
+        "September",
+        " ",
+        "of",
+        " ",
+        "2003",
+        ",",
+        " ",
+        "exactly",
+        " ",
+        "at",
+        " ",
+        "10",
+        ":",
+        "49",
+        ":",
+        "41",
+        " ",
+        "with",
+        " ",
+        "timezone",
+        " ",
+        "-",
+        "03",
+        ":",
+        "00",
+        ".",
+    ];
+    tokenize_assert(
+        "Today is 25 of September of 2003, exactly at 10:49:41 with timezone -03:00.",
+        comp,
+    );
 }
 
 #[test]
 fn test_tokenize146() {
-    let comp = vec!["Meet", " ", "me", " ", "at", " ", "3", ":", "00", " ", "AM", " ", "on", " ", "December", " ", "3", "rd", ",", " ", "2003", " ", "at", " ", "the", " ", "AM", "/", "PM", " ", "on", " ", "Sunset"];
-    tokenize_assert("Meet me at 3:00 AM on December 3rd, 2003 at the AM/PM on Sunset", comp);
+    let comp = vec![
+        "I", " ", "have", " ", "a", " ", "meeting", " ", "on", " ", "March", " ", "1", ",", " ",
+        "1974",
+    ];
+    tokenize_assert("I have a meeting on March 1, 1974", comp);
 }
 
 #[test]
 fn test_tokenize147() {
-    let comp = vec!["Jan", " ", "29", ",", " ", "1945", " ", "14", ":", "45", " ", "AM", " ", "I", " ", "going", " ", "to", " ", "see", " ", "you", " ", "there", "?"];
-    tokenize_assert("Jan 29, 1945 14:45 AM I going to see you there?", comp);
+    let comp = vec![
+        "On", " ", "June", " ", "8", "th", ",", " ", "2020", ",", " ", "I", " ", "am", " ",
+        "going", " ", "to", " ", "be", " ", "the", " ", "first", " ", "man", " ", "on", " ",
+        "Mars",
+    ];
+    tokenize_assert(
+        "On June 8th, 2020, I am going to be the first man on Mars",
+        comp,
+    );
 }
 
 #[test]
 fn test_tokenize148() {
+    let comp = vec![
+        "Meet", " ", "me", " ", "at", " ", "the", " ", "AM", "/", "PM", " ", "on", " ", "Sunset",
+        " ", "at", " ", "3", ":", "00", " ", "AM", " ", "on", " ", "December", " ", "3", "rd", ",",
+        " ", "2003",
+    ];
+    tokenize_assert(
+        "Meet me at the AM/PM on Sunset at 3:00 AM on December 3rd, 2003",
+        comp,
+    );
+}
+
+#[test]
+fn test_tokenize149() {
+    let comp = vec![
+        "Meet", " ", "me", " ", "at", " ", "3", ":", "00", " ", "AM", " ", "on", " ", "December",
+        " ", "3", "rd", ",", " ", "2003", " ", "at", " ", "the", " ", "AM", "/", "PM", " ", "on",
+        " ", "Sunset",
+    ];
    tokenize_assert(
+        "Meet me at 3:00 AM on December 3rd, 2003 at the AM/PM on Sunset",
+        comp,
+    );
+}
+
+#[test]
+fn test_tokenize150() {
+    let comp = vec![
+        "Jan", " ", "29", ",", " ", "1945", " ", "14", ":", "45", " ", "AM", " ", "I", " ",
+        "going", " ", "to", " ", "see", " ", "you", " ", "there", "?",
+    ];
+    tokenize_assert("Jan 29, 1945 14:45 AM I going to see you there?", comp);
+}
+
+#[test]
+fn test_tokenize151() {
     let comp = vec!["2017", "-", "07", "-", "17", " ", "06", ":", "15", ":"];
     tokenize_assert("2017-07-17 06:15:", comp);
 }
@@ -14,7 +14,6 @@ pub(crate) enum ParseState
 }

 impl Tokenizer {
-
     pub(crate) fn new(parse_string: &str) -> Self {
         Tokenizer {
             token_stack: vec![],
@@ -92,7 +91,7 @@ impl Iterator for Tokenizer {
                     } else {
                         break;
                     }
-                },
+                }
                 ParseState::Alpha => {
                     seenletters = true;
                     if self.isword(nextchar) {
@@ -105,19 +104,21 @@ impl Iterator for Tokenizer {
                         self.parse_string.push(nextchar);
                         break;
                     }
-                },
+                }
                 ParseState::Numeric => {
                     if self.isnum(nextchar) {
                         // UNWRAP: Because we're in non-empty parse state, we're guaranteed to have a token
                         token.as_mut().unwrap().push(nextchar);
-                    } else if nextchar == '.' || (nextchar == ',' && token.as_ref().unwrap().len() >= 2) {
+                    } else if nextchar == '.'
+                        || (nextchar == ',' && token.as_ref().unwrap().len() >= 2)
+                    {
                         token.as_mut().unwrap().push(nextchar);
                         state = ParseState::NumericDecimal;
                     } else {
                         self.parse_string.push(nextchar);
                         break;
                     }
-                },
+                }
                 ParseState::AlphaDecimal => {
                     seenletters = true;
                     if nextchar == '.' || self.isword(nextchar) {
@@ -130,7 +131,7 @@ impl Iterator for Tokenizer {
                         self.parse_string.push(nextchar);
                         break;
                     }
-                },
+                }
                 ParseState::NumericDecimal => {
                     if nextchar == '.' || self.isnum(nextchar) {
                         // UNWRAP: Because we're in non-empty parse state, we're guaranteed to have a token
@@ -150,7 +151,12 @@ impl Iterator for Tokenizer {
         // We do something slightly different to express the same logic
         if state == ParseState::AlphaDecimal || state == ParseState::NumericDecimal {
             // UNWRAP: The state check guarantees that we have a value
-            let dot_count = token.as_ref().unwrap().chars().filter(|c| *c == '.').count();
+            let dot_count = token
+                .as_ref()
+                .unwrap()
+                .chars()
+                .filter(|c| *c == '.')
+                .count();
             let last_char = token.as_ref().unwrap().chars().last();
             let last_splittable = last_char == Some('.') || last_char == Some(',');

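The `dot_count` / `last_splittable` bookkeeping above is what keeps trailing punctuation from sticking to numeric tokens: a '.' (or a ',' preceded by at least two digits) moves the tokenizer into `NumericDecimal`, so `"29,"` in `"Jan 29, 1945"` briefly forms a single token, and the final check splits it apart again, which is why the tests expect `"29"` and `","` as separate tokens. A minimal sketch of that trailing split (hedged: this is an illustration, not the crate's code; the real iterator pushes the pieces onto its `token_stack` rather than returning them):

    // Hypothetical illustration of the split implied by `last_splittable`.
    fn split_trailing(token: String) -> Vec<String> {
        let last_char = token.chars().last();
        if (last_char == Some('.') || last_char == Some(',')) && token.len() > 1 {
            // Tokens here are ASCII, so splitting one byte before the end is safe.
            let (number, punct) = token.split_at(token.len() - 1);
            vec![number.to_string(), punct.to_string()]
        } else {
            vec![token]
        }
    }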
@@ -1,5 +1,5 @@
-use ParseResult;
 use ParseError;
+use ParseResult;

 #[derive(Debug, PartialEq)]
 pub enum DayOfWeek {
@@ -9,13 +9,12 @@ pub enum DayOfWeek {
     Wednesday,
     Thursday,
     Friday,
-    Saturday
+    Saturday,
 }

 impl DayOfWeek {
-
     pub fn to_numeral(&self) -> u32 {
-        match self {
+        match *self {
             DayOfWeek::Sunday => 0,
             DayOfWeek::Monday => 1,
             DayOfWeek::Tuesday => 2,
@@ -35,12 +34,12 @@ impl DayOfWeek {
             4 => DayOfWeek::Thursday,
             5 => DayOfWeek::Friday,
             6 => DayOfWeek::Saturday,
-            _ => panic!("Unreachable.")
+            _ => panic!("Unreachable."),
         }
     }

     /// Given the current day of the week, how many days until the next day?
-    pub fn difference(&self, other: DayOfWeek) -> u32 {
+    pub fn difference(&self, other: &DayOfWeek) -> u32 {
         // Have to use i32 because of wraparound issues
         let s_num = self.to_numeral() as i32;
         let o_num = other.to_numeral() as i32;
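The hunk cuts off right after `o_num`, but the reason for the `i32` casts is already visible: `other - self` can go negative before it is normalized into the 0..=6 range. A sketch of the arithmetic the tests further down pin down (hedged: the remainder of the function body is not shown in this diff):

    // Assumed shape of the rest of difference(), checked against the
    // weekday_difference vectors below, e.g. Saturday(6) -> Sunday(0)
    // gives (0 - 6 + 7) % 7 == 1.
    fn difference(s_num: i32, o_num: i32) -> u32 {
        ((o_num - s_num + 7) % 7) as u32
    }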
@@ -59,12 +58,12 @@ pub fn day_of_week(year: u32, month: u32, day: u32) -> ParseResult<DayOfWeek> {
         3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 => {
             let c = year / 100;
             (c, year - 100 * c)
-        },
+        }
         1 | 2 => {
             let c = (year - 1) / 100;
             (c, year - 1 - 100 * c)
-        },
-        _ => return Err(ParseError::InvalidMonth)
+        }
+        _ => return Err(ParseError::ImpossibleTimestamp("Invalid month")),
     };

     let e = match month {
@@ -75,7 +74,7 @@ pub fn day_of_week(year: u32, month: u32, day: u32) -> ParseResult<DayOfWeek> {
         8 => 1,
         9 | 12 => 4,
         10 => 6,
-        _ => panic!("Unreachable.")
+        _ => panic!("Unreachable."),
     };

     // This implementation is Gregorian-only.
@@ -84,7 +83,7 @@ pub fn day_of_week(year: u32, month: u32, day: u32) -> ParseResult<DayOfWeek> {
         1 => 5,
         2 => 3,
         3 => 1,
-        _ => panic!("Unreachable.")
+        _ => panic!("Unreachable."),
     };

     match (day + e + f + g + g / 4) % 7 {
@@ -95,7 +94,7 @@ pub fn day_of_week(year: u32, month: u32, day: u32) -> ParseResult<DayOfWeek> {
         4 => Ok(DayOfWeek::Thursday),
         5 => Ok(DayOfWeek::Friday),
         6 => Ok(DayOfWeek::Saturday),
-        _ => panic!("Unreachable.")
+        _ => panic!("Unreachable."),
     }
 }

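The `(day + e + f + g + g / 4) % 7` congruence is a Zeller-style weekday formula: reading the visible arms, `g` must be the year within the century (it is the term that gets the `/ 4` leap-day correction), `f` the century correction from the `c % 4` table, and `e` the per-month offset. A worked example using only arms visible in these hunks (the final `1 => Ok(DayOfWeek::Monday)` arm is inferred from the `to_numeral` mapping above):

    // day_of_week(1945, 12, 3): December takes the 3..=12 branch, so
    // c = 1945 / 100 = 19 and g = 1945 - 19 * 100 = 45.
    // e = 4 (the `9 | 12 => 4` arm); c % 4 == 3, so f = 1 (the `3 => 1` arm).
    // (3 + 4 + 1 + 45 + 45 / 4) % 7 = (53 + 11) % 7 = 64 % 7 = 1 => Monday.
    assert_eq!(day_of_week(1945, 12, 3).unwrap(), DayOfWeek::Monday);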
@@ -114,19 +113,18 @@ mod test {

     #[test]
     fn weekday_difference() {
-        assert_eq!(DayOfWeek::Sunday.difference(DayOfWeek::Sunday), 0);
-        assert_eq!(DayOfWeek::Sunday.difference(DayOfWeek::Monday), 1);
-        assert_eq!(DayOfWeek::Sunday.difference(DayOfWeek::Tuesday), 2);
-        assert_eq!(DayOfWeek::Sunday.difference(DayOfWeek::Wednesday), 3);
-        assert_eq!(DayOfWeek::Sunday.difference(DayOfWeek::Thursday), 4);
-        assert_eq!(DayOfWeek::Sunday.difference(DayOfWeek::Friday), 5);
-        assert_eq!(DayOfWeek::Sunday.difference(DayOfWeek::Saturday), 6);
-        assert_eq!(DayOfWeek::Monday.difference(DayOfWeek::Sunday), 6);
-        assert_eq!(DayOfWeek::Tuesday.difference(DayOfWeek::Sunday), 5);
-        assert_eq!(DayOfWeek::Wednesday.difference(DayOfWeek::Sunday), 4);
-        assert_eq!(DayOfWeek::Thursday.difference(DayOfWeek::Sunday), 3);
-        assert_eq!(DayOfWeek::Friday.difference(DayOfWeek::Sunday), 2);
-        assert_eq!(DayOfWeek::Saturday.difference(DayOfWeek::Sunday), 1);
-
+        assert_eq!(DayOfWeek::Sunday.difference(&DayOfWeek::Sunday), 0);
+        assert_eq!(DayOfWeek::Sunday.difference(&DayOfWeek::Monday), 1);
+        assert_eq!(DayOfWeek::Sunday.difference(&DayOfWeek::Tuesday), 2);
+        assert_eq!(DayOfWeek::Sunday.difference(&DayOfWeek::Wednesday), 3);
+        assert_eq!(DayOfWeek::Sunday.difference(&DayOfWeek::Thursday), 4);
+        assert_eq!(DayOfWeek::Sunday.difference(&DayOfWeek::Friday), 5);
+        assert_eq!(DayOfWeek::Sunday.difference(&DayOfWeek::Saturday), 6);
+        assert_eq!(DayOfWeek::Monday.difference(&DayOfWeek::Sunday), 6);
+        assert_eq!(DayOfWeek::Tuesday.difference(&DayOfWeek::Sunday), 5);
+        assert_eq!(DayOfWeek::Wednesday.difference(&DayOfWeek::Sunday), 4);
+        assert_eq!(DayOfWeek::Thursday.difference(&DayOfWeek::Sunday), 3);
+        assert_eq!(DayOfWeek::Friday.difference(&DayOfWeek::Sunday), 2);
+        assert_eq!(DayOfWeek::Saturday.difference(&DayOfWeek::Sunday), 1);
     }
 }