fix: strptime boost format rejects non-month words #113
      # default C++ boost timestamp is effectively %Y-%b-%d %H:%M:%S.%f
      # details: https://svn.boost.org/trac/boost/ticket/8839
    - if ($dtstr =~ s/\s(\d{4})([-:]?)(\w{3,})\2(\d\d?)(?:[-Tt ](\d\d?)(?:([-:]?)(\d\d?)(?:\6(\d\d?)(?:[.,](\d+))?)?)?)?(?=\D)/ /) {
    + if ($dtstr =~ s/\s(\d{4})([-:]?)($monpat)\2(\d\d?)(?:[-Tt ](\d\d?)(?:([-:]?)(\d\d?)(?:\6(\d\d?)(?:[.,](\d+))?)?)?)?(?=\D)/ /o) {
It should probably accept the month name case-insensitively.
@Koan-Bot rebase
      "boost format: non-month word 'foo' is rejected");

      # Valid boost-format dates must still parse correctly
      my $t = str2time("2024-Jan-15 12:00:00 UTC");
let's also add other tests for JAN, jan, JaN
Greptile Summary
This PR fixes a silent wrong-date bug in the [...]
Confidence Score: 5/5
Safe to merge: minimal, targeted fix with no side-effects and adequate regression coverage. The change is a one-line regex substitution that aligns the boost-format branch with every other branch in the same function. No files require special attention.
| Filename | Overview |
|---|---|
| lib/Date/Parse.pm | Single-line regex fix: replaces \w{3,} with $monpat in the boost-timestamp branch and adds the /o flag; consistent with all other date-pattern branches in the same closure. |
| t/cpanrt-parse.t | Adds 4 targeted regression tests for the boost-format fix (two rejections, one acceptance, one day-value assertion); test count updated from 26 to 30. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["str2time(input)"] --> B["strptime: lc(input)"]
B --> C{"ISO compact\n\\d{4}\\d\\d\\d\\d ?"}
C -- matches --> D["Extract year/month/day\n(numeric)"]
C -- no match --> E{"YYYY[-:]MM[-:]DD ?"}
E -- matches --> F["Extract year/month/day\n(numeric, 0-indexed month)"]
E -- no match --> G{"boost format\n\\d{4}([-:]?)($monpat)\\2\\d\\d? ?"}
G -- matches with valid month --> H["Extract year/month{$3}/day\n(month name → index)"]
G -- 'abc','foo', etc. → NO MATCH --> I["No date extracted\n→ return undef"]
D --> J["Parse time/zone fields"]
F --> J
H --> J
J --> K["Return epoch or undef"]
style G fill:#d4edda,stroke:#28a745
style I fill:#f8d7da,stroke:#dc3545
Reviews (1): Last reviewed commit: "fix: strptime boost format rejects non-m..."
The C++ boost timestamp pattern (%Y-%b-%d ...) used \w{3,} to match
the month field, accepting any 3+ character word. When a non-month
word was matched (e.g. "2024-abc-15"), the month lookup returned
undef and str2time silently defaulted to the current month —
producing a wrong date instead of rejecting the input.
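
To make the failure mode concrete, here is a minimal sketch of how a failed month lookup can turn into a plausible but wrong date. The `%month_index` table and the `resolve_month` helper are hypothetical names invented for illustration; they are not the actual `Date::Parse` internals, only an assumed simplification of the behavior described above.

```perl
use strict;
use warnings;

# Hypothetical month-name lookup table (0-indexed months), shortened.
my %month_index = (jan => 0, feb => 1, mar => 2);

# Hypothetical helper: look up the month name and, if the lookup fails,
# fall back to the caller-supplied current month. This kind of default
# is what hides the failed lookup from the caller.
sub resolve_month {
    my ($name, $now_month) = @_;
    my $month = $month_index{lc $name};          # undef for 'abc'
    $month = $now_month unless defined $month;   # silent fallback
    return $month;
}

my $now_month = (localtime)[4];
print "jan => ", resolve_month('jan', $now_month), "\n";
print "abc => ", resolve_month('abc', $now_month),
      " (current month, no error reported)\n";
```

Under these assumptions, rejecting the word at the regex stage is the simplest fix, because the fallback never gets a chance to run on garbage input.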
Replace \w{3,} with $monpat (the pre-built alternation of valid
month names), consistent with every other date pattern in strptime.
Add /o flag since $monpat is constant after closure creation.
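
The effect of the substitution can be sketched in isolation. The example below compares a simplified date-only version of the old and new patterns. The `$monpat` built here is a hypothetical stand-in (full month names plus three-letter abbreviations); the real value is constructed inside `Date::Parse` and may differ. The `matches` helper is likewise invented for the illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $monpat: each full month name followed by its
# three-letter abbreviation, so the longer alternative is tried first.
my @months = qw(january february march april may june july
                august september october november december);
my $monpat = join '|', map { ($_, substr($_, 0, 3)) } @months;

# Simplified (date-only) versions of the old and new patterns.
my $old_re = qr/^(\d{4})([-:]?)(\w{3,})\2(\d\d?)$/;
my $new_re = qr/^(\d{4})([-:]?)($monpat)\2(\d\d?)$/;

# Lowercase the input first, mirroring what the description says strptime does.
sub matches {
    my ($re, $s) = @_;
    return lc($s) =~ $re ? 1 : 0;
}

for my $input (qw(2024-Jan-15 2024-January-15 2024-abc-15 2024-foo-01)) {
    printf "%-16s old=%d new=%d\n", $input,
        matches($old_re, $input), matches($new_re, $input);
}
```

In this sketch the old pattern accepts all four inputs, while the new one accepts only the two with real month names.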
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Here's what I changed and why:
- **Added `/i` flag to boost format regex** in `lib/Date/Parse.pm:125`: per reviewer request, the month match is now explicitly case-insensitive. While `strptime` already lowercases input via `lc`, the `/i` flag makes the intent clear and ensures robustness if the function is ever called through a different code path.
- **Added 6 tests for case variations** (`JAN`, `jan`, `JaN`) in `t/cpanrt-parse.t`: per reviewer request, each case variant is tested for both successful parsing and correct day extraction. Test count updated from 30 to 36.
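
As a rough illustration of the case-insensitive match, the sketch below applies `/i` to a simplified date-only pattern and extracts the day for each case variant. The shortened `$monpat` and the `parse_day` helper are assumptions made for this example; they are not taken from `lib/Date/Parse.pm` or `t/cpanrt-parse.t`.

```perl
use strict;
use warnings;

# Hypothetical, shortened month alternation for illustration only.
my $monpat = 'january|jan|february|feb';

# Date portion only; /i makes the month match independent of input case,
# so no prior lc() of the input is needed here.
my $re = qr/^(\d{4})([-:]?)($monpat)\2(\d\d?)$/i;

# Hypothetical helper: return the numeric day, or undef if nothing matched.
sub parse_day {
    my ($s) = @_;
    return $s =~ $re ? $4 + 0 : undef;
}

for my $s ("2024-JAN-15", "2024-jan-15", "2024-JaN-15", "2024-foo-15") {
    my $day = parse_day($s);
    printf "%-12s => %s\n", $s, defined $day ? $day : 'undef';
}
```

The three case variants all yield day 15 in this sketch, and the non-month word yields `undef`.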
Rebase with requested adjustments
CI status: CI will be checked asynchronously.
Automated by Kōan
046a65f to ea226ec
What
The `strptime` boost-timestamp pattern (`%Y-%b-%d`) now only matches valid month names instead of any 3+ character word.

Why
The regex used `\w{3,}` for the month field. When a non-month word appeared in that position (e.g. `"2024-abc-15"`), the month lookup returned `undef` and `str2time` silently defaulted to the current month, returning a wrong date instead of `undef`.

Every other date pattern in `strptime` already uses `$monpat` (the pre-built alternation of valid month names). The boost format was the only one using `\w{3,}`.

How
Replace `\w{3,}` with `$monpat` in the boost format regex (line 103). Add `/o` flag for consistency with the other patterns; `$monpat` is constant after closure creation.

Testing
- `"2024-abc-15"` and `"2024-foo-01"` now correctly return `undef`
- `"2024-Jan-15"` and `"2024-January-15"` still parse correctly
- [...] `gen_parser`) also benefit since they share the same code

🤖 Generated with Claude Code
Quality Report
Changes: 2 files changed, 17 insertions(+), 2 deletions(-)
Code scan: clean
Tests: skipped
Branch hygiene: clean
Generated by Kōan post-mission quality pipeline