test(format): expand date-time suite to 132 exhaustive RFC 3339 cases #891
AcEKaycgR wants to merge 1 commit into
|
Please respect the existing formatting of these tests and avoid making unnecessary whitespace changes. |
c20e88b to 339a28e
|
Thanks for the review, @karenetheridge. All three points have been addressed in the latest commit:
|
|
Let me try to post some comments here, as GitHub is refusing to let me comment on specific diffs (internal errors)
|
|
Other than that, I can't find any case that is not RFC 3339 compliant. It's an interesting one, as we check RFC 3339, which is stricter than ISO 8601, potentially tripping up some implementations. |
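One way to see the gap between the two standards is to hand strings that RFC 3339 rejects to a parser that follows looser ISO 8601-style rules. The sketch below uses Python's `datetime.fromisoformat` purely as a stand-in for such a parser (it is not mentioned anywhere in this PR), and `lenient_parser_accepts` is a hypothetical helper name.

```python
from datetime import datetime


def lenient_parser_accepts(value: str) -> bool:
    """Return True if Python's ISO-style parser accepts the string."""
    try:
        datetime.fromisoformat(value)
    except ValueError:
        return False
    return True


# Neither string is an RFC 3339 date-time: the grammar requires a full
# time and a time-offset. The ISO-style parser accepts both anyway.
for candidate in ("1985-04-12", "1985-04-12T23:20:50"):
    print(candidate, lenient_parser_accepts(candidate))
```

An implementation that validates `date-time` by delegating to a parser like this would pass both strings, which is the kind of divergence the stricter cases are meant to surface.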
339a28e to 9e3f971
|
Thanks for the review, @jviotti. I’ve pushed the updates to address your feedback:
I'll also keep the note on RFC 3339 vs. ISO 8601 strictness in mind; keeping the tests strict will definitely help surface implementation gaps. |
jdesrosiers
left a comment
This is the third PR I've looked at today that's too large to review properly. I ran the tests against my implementation and didn't find any issues, but in cases like this I question the value of these tests. Are all of these tests really providing value? It's hard to say.
I'm approving because I didn't find anything wrong, but I'm anxious about pressing the merge button.
I think there is bias in that. We from the TSC read these things more carefully than other people and our implementations are likely correct or close to correct. But not every implementor is in the TSC. For
If anything is off, we will realise when running more implementations on it. If anything, at least we get to discuss these things. The current suites are not very thorough and I bet there is already divergence on these things that we would never know about. |
|
Lots of tests are great and we should have tests to make sure implementations aren't just using ISO validators. My concern is the quality of the tests. Are they really covering unique and valuable cases? I don't have the bandwidth to review that many tests in detail to verify quality. It could be the case that 100 of them are testing a slightly different variation of the same edge case and we could get the same value out of 32 new tests. I don't know because I don't have all day to sit with the spec and go through every test in detail. Every test we add incurs a maintenance cost and I don't want the test suite to get unmaintainable because we keep adding a bunch of tests without sufficient care. |
|
Yeah, I get your point too. We probably need to decide on a good strategy for the GSoC Format Assertion project in general. Some thoughts and looking forward to your input:
Any thoughts? |
👍
That's the part that's hard to do at this scale. Ideally, it's trimmed down before it gets to us. Maybe we can use AI to flag issues. I don't imagine it's going to do very well at that task, but it might help a little.
I hope it doesn't come to that, but that might be what we have to do for the format tests.
I was thinking more like only allowing one category of tests to be added in each PR. For example, this PR indicated seven categories of tests they added. That could be seven separate PRs, which would each be easier to review not just because they're smaller, but because they're related. Yes, it could increase git conflicts. But, I think it's worth a try and we can always try something else if it gets to be too big a problem. I'm pretty comfortable with dealing with merge conflicts, but I know not everyone is. |
I like this approach! cc @AcEKaycgR @vtushar06 |
|
@jdesrosiers @jviotti I understand and completely agree with the category-based approach as well. It will make the reviews much more focused and help ensure every case provides unique value. I’ll follow this for all future formats and submit them in smaller, related batches. |
Addresses #965. This PR expands the `date-time` format test suite from 29 to 132 cases across all drafts. The expansion is designed to provide complete coverage of RFC 3339 §5.6 ABNF and §5.7 prose constraints, serving as a standalone compliance target for implementors.

Technical Coverage Details

The new cases cover every structural dimension of the grammar:

- `T` absent, space/tab/duplicate instead of `T`; colons absent/replaced/duplicated in time; hyphens absent/replaced in date.
- `Z`-only, numeric offsets (`+/-HH:MM`), missing offset, offset without colon (`+0000`), `Z` followed by numeric offset, offset sign variants (`±`, `++`, `+-`, `-+`), non-padded and 3-digit offset fields, and non-ASCII digits.
- […] (`/` and `:`), non-ASCII Bengali digits, negative prefixes, and year `0000` (valid) and `9999` (valid) limits.
- […] `00` rejection.
- `secfrac` validation covering 1-digit, 19-digit (arbitrary precision), all-zeros, and non-ASCII mid-frac characters.

Standards & Traceability

Following the style established in the `ipv4.json` suite, this PR adds a top-level `comment` field citing the full RFC 3339 §5.6 ABNF and §5.7 prose rules. Additionally, it corrects one existing description: "an invalid date-time past leap second, UTC" is updated to "second 61 is above the absolute maximum of 60".

Triangulation Results
The suite was validated against the following implementations:
These mismatches represent implementation differences from strict RFC 3339 compliance rather than errors in the test vectors.
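A triangulation pass like this can be sketched as a small harness that runs each validator over the suite and records every disagreement with the expected `valid` flag. The group/`tests`/`data`/`valid` layout follows the test suite's file format. The function name `find_mismatches`, the sample cases, and the `accepts_everything` validator (a stand-in for an implementation that does not assert formats at all) are illustrative assumptions, not taken from this PR.

```python
import json
from typing import Callable, Dict, List, Tuple

# (validator name, test description, expected, actual)
Mismatch = Tuple[str, str, bool, bool]


def find_mismatches(
    suite_json: str, validators: Dict[str, Callable[[str], bool]]
) -> List[Mismatch]:
    """Report every test where a validator disagrees with the suite."""
    mismatches: List[Mismatch] = []
    for group in json.loads(suite_json):
        for test in group["tests"]:
            data = test["data"]
            if not isinstance(data, str):
                continue  # format assertions only apply to strings
            for name, validate in validators.items():
                actual = validate(data)
                if actual != test["valid"]:
                    mismatches.append(
                        (name, test["description"], test["valid"], actual)
                    )
    return mismatches


# Illustrative two-case suite; descriptions are made up for this sketch.
SAMPLE_SUITE = json.dumps([
    {
        "description": "sample date-time group",
        "schema": {"format": "date-time"},
        "tests": [
            {"description": "sample: valid with Z offset",
             "data": "1985-04-12T23:20:50.52Z", "valid": True},
            {"description": "sample: second 61",
             "data": "1990-12-31T23:59:61Z", "valid": False},
        ],
    }
])


def accepts_everything(value: str) -> bool:
    """Hypothetical validator that never rejects a string."""
    return True


MISMATCHES = find_mismatches(SAMPLE_SUITE, {"accepts_everything": accepts_everything})
print(MISMATCHES)
```

Keeping the expected and actual values side by side makes it easier to decide, case by case, whether a mismatch points at the implementation or at the test vector.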
Feedback on the coverage matrix and test structure is appreciated @jviotti @jdesrosiers
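For readers who want something concrete to run the cases against, here is a minimal sketch of a strict checker for the structural dimensions listed under Technical Coverage Details. It is not the reference implementation of any validator. The name `is_rfc3339_date_time` is hypothetical, lowercase `t`/`z` are accepted because RFC 3339 permits them, and the leap-second rule (second 60 only at 23:59 UTC, with no check that the date is a month end) is a simplifying assumption.

```python
import calendar
import re

# Shape of RFC 3339 section 5.6 date-time. re.ASCII keeps \d from
# matching non-ASCII digits such as Bengali numerals.
_DATE_TIME = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})"          # full-date
    r"[Tt]"                              # date/time separator
    r"(\d{2}):(\d{2}):(\d{2})"          # hour, minute, second
    r"(?:\.\d+)?"                        # time-secfrac: one or more digits
    r"(?:[Zz]|([+-])(\d{2}):(\d{2}))",   # time-offset
    re.ASCII,
)
_DAYS = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)


def is_rfc3339_date_time(value: str) -> bool:
    match = _DATE_TIME.fullmatch(value)
    if match is None:
        return False
    year, month, day, hour, minute, second = map(int, match.groups()[:6])
    sign, off_hour, off_minute = match.groups()[6:]

    if not 1 <= month <= 12:
        return False
    month_days = _DAYS[month - 1] + (month == 2 and calendar.isleap(year))
    if not 1 <= day <= month_days:
        return False
    if hour > 23 or minute > 59 or second > 60:
        return False

    offset = 0  # offset from UTC in minutes
    if sign is not None:
        if int(off_hour) > 23 or int(off_minute) > 59:
            return False
        offset = (int(off_hour) * 60 + int(off_minute)) * (1 if sign == "+" else -1)

    if second == 60:
        # Assumption: a leap second is only accepted at 23:59 UTC.
        return (hour * 60 + minute - offset) % 1440 == 23 * 60 + 59
    return True


print(is_rfc3339_date_time("1990-12-31T15:59:60-08:00"))
print(is_rfc3339_date_time("1990-12-31T23:59:61Z"))
```

Splitting the check into a shape match followed by range checks mirrors the split between the §5.6 ABNF and the §5.7 prose constraints, so a failing case can be traced back to one or the other.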