fix: strptime boost format rejects non-month words #113

Merged
atoomic merged 3 commits into cpan-authors:main from Koan-Bot:koan.atoomic/fix-boost-format-month-match
Apr 27, 2026

@Koan-Bot Koan-Bot commented Apr 17, 2026

What

The strptime boost-timestamp pattern (%Y-%b-%d) now only matches valid month names instead of any 3+ character word.

Why

The regex used \w{3,} for the month field. When a non-month word appeared in that position (e.g. "2024-abc-15"), the month lookup returned undef and str2time silently defaulted to the current month — returning a wrong date instead of undef.

Every other date pattern in strptime already uses $monpat (the pre-built alternation of valid month names). The boost format was the only one using \w{3,}.
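
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the code from lib/Date/Parse.pm: the `%month` table is trimmed, the regex is reduced to the date part of the boost pattern, and `old_boost_match` and `fill_month` are hypothetical helpers. The fallback to the current month is modelled on the behaviour described in this PR rather than copied from `str2time`.

```perl
use strict;
use warnings;

# Trimmed, hypothetical stand-in for the month lookup table.
my %month = (jan => 0, feb => 1, mar => 2);

# Old-style boost pattern, reduced to the date part for illustration.
sub old_boost_match {
    my ($input) = @_;
    return unless $input =~ /\s(\d{4})([-:]?)(\w{3,})\2(\d\d?)(?=\D)/;
    return +{ year => $1, word => $3, day => $4, month => $month{$3} };
}

# Assumed defaulting step: a missing month falls back to the current one.
sub fill_month {
    my ($month) = @_;
    return defined $month ? $month : (localtime)[4];
}

my $hit = old_boost_match(' 2024-abc-15 ');
if ($hit) {
    # The regex matched, the lookup failed, and a month is used anyway.
    printf "word=%s lookup=%s month used=%d\n",
        $hit->{word},
        defined $hit->{month} ? $hit->{month} : 'undef',
        fill_month($hit->{month});
}
```

The point of the sketch is that the regex match succeeds for "abc", so nothing downstream has a chance to reject the input before the default is applied.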

How

Replace \w{3,} with $monpat in the boost format regex (line 103). Add /o flag for consistency with the other patterns — $monpat is constant after closure creation.
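
As a rough illustration of the replacement, the sketch below builds a month alternation and uses it in a reduced version of the boost pattern. The way `%month` and `$monpat` are constructed here is an assumption made for the example, and `boost_date` is a hypothetical helper, not a function in Date::Parse. The `/o` flag is left out because the sketch does not depend on it.

```perl
use strict;
use warnings;

# Illustrative month table (assumed shape): short and full names, 0-based index.
my @names = qw(january february march april may june july
               august september october november december);
my %month;
for my $i (0 .. $#names) {
    $month{ $names[$i] }               = $i;
    $month{ substr($names[$i], 0, 3) } = $i;
}

# Reverse sort places "january" ahead of "jan" in the alternation.
my $monpat = join '|', reverse sort keys %month;

# Returns (year, month index, day), or an empty list when nothing matches.
sub boost_date {
    my ($input) = @_;
    my $s = ' ' . lc($input) . ' ';
    return unless $s =~ /\s(\d{4})([-:]?)($monpat)\2(\d\d?)(?=\D)/;
    return ($1, $month{$3}, $4);
}

for my $in ('2024-Jan-15', '2024-January-15', '2024-abc-15') {
    my @d = boost_date($in);
    print "$in => ", (@d ? "year=$d[0] month=$d[1] day=$d[2]" : 'no match'), "\n";
}
```

Because the month field can only match a key of `%month`, the lookup can no longer return undef after a successful match.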

Testing

  • "2024-abc-15" and "2024-foo-01" now correctly return undef
  • "2024-Jan-15" and "2024-January-15" still parse correctly
  • Language-specific parsers (via gen_parser) also benefit since they share the same code
  • Full test suite passes (592 tests)

🤖 Generated with Claude Code


Quality Report

Changes: 2 files changed, 17 insertions(+), 2 deletions(-)

Code scan: clean

Tests: skipped

Branch hygiene: clean

Generated by Kōan post-mission quality pipeline

Comment thread lib/Date/Parse.pm Outdated
# default C++ boost timestamp is effectively %Y-%b-%d %H:%M:%S.%f
# details: https://svn.boost.org/trac/boost/ticket/8839
- if ($dtstr =~ s/\s(\d{4})([-:]?)(\w{3,})\2(\d\d?)(?:[-Tt ](\d\d?)(?:([-:]?)(\d\d?)(?:\6(\d\d?)(?:[.,](\d+))?)?)?)?(?=\D)/ /) {
+ if ($dtstr =~ s/\s(\d{4})([-:]?)($monpat)\2(\d\d?)(?:[-Tt ](\d\d?)(?:([-:]?)(\d\d?)(?:\6(\d\d?)(?:[.,](\d+))?)?)?)?(?=\D)/ /o) {
Collaborator

it should probably accept the month name case insensitive

Collaborator

atoomic commented Apr 26, 2026

@Koan-Bot rebase

Comment thread t/cpanrt-parse.t
"boost format: non-month word 'foo' is rejected");

# Valid boost-format dates must still parse correctly
my $t = str2time("2024-Jan-15 12:00:00 UTC");
Collaborator

let's also add other tests for JAN, jan, JaN

@atoomic atoomic marked this pull request as ready for review April 26, 2026 13:00

greptile-apps Bot commented Apr 26, 2026

Greptile Summary

This PR fixes a silent wrong-date bug in the strptime boost-timestamp pattern (%Y-%b-%d). The old regex used \w{3,} for the month field, which matched any 3+ character word. When a non-month word matched, $month{$3} returned undef, and str2time silently substituted the current month. The fix replaces \w{3,} with $monpat (the pre-built alternation of valid month names), consistent with every other date pattern in the function. Four regression tests are added and the /o compile-once flag is included for consistency.

Confidence Score: 5/5

Safe to merge — minimal, targeted fix with no side-effects and adequate regression coverage.

The change is a one-line regex substitution that aligns the boost-format branch with every other branch in the same function. $monpat is already trusted throughout the closure, the /o flag is appropriate since the pattern is constant, and the four new tests directly exercise the previously-broken code path. No P0/P1 findings.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| lib/Date/Parse.pm | Single-line regex fix: replaces \w{3,} with $monpat in the boost-timestamp branch and adds the /o flag; consistent with all other date-pattern branches in the same closure. |
| t/cpanrt-parse.t | Adds 4 targeted regression tests for the boost-format fix (two rejections, one acceptance, one day-value assertion); test count updated from 26 to 30. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["str2time(input)"] --> B["strptime: lc(input)"]
    B --> C{"ISO compact\n\\d{4}\\d\\d\\d\\d ?"}
    C -- matches --> D["Extract year/month/day\n(numeric)"]
    C -- no match --> E{"YYYY[-:]MM[-:]DD ?"}
    E -- matches --> F["Extract year/month/day\n(numeric, 0-indexed month)"]
    E -- no match --> G{"boost format\n\\d{4}([-:]?)($monpat)\\2\\d\\d? ?"}
    G -- matches with valid month --> H["Extract year/month{$3}/day\n(month name → index)"]
    G -- 'abc','foo', etc. → NO MATCH --> I["No date extracted\n→ return undef"]
    D --> J["Parse time/zone fields"]
    F --> J
    H --> J
    J --> K["Return epoch or undef"]

    style G fill:#d4edda,stroke:#28a745
    style I fill:#f8d7da,stroke:#dc3545

Reviews (1): Last reviewed commit: "fix: strptime boost format rejects non-m..."

Koan-Bot and others added 2 commits April 26, 2026 18:21
The C++ boost timestamp pattern (%Y-%b-%d ...) used \w{3,} to match
the month field, accepting any 3+ character word. When a non-month
word was matched (e.g. "2024-abc-15"), the month lookup returned
undef and str2time silently defaulted to the current month —
producing a wrong date instead of rejecting the input.

Replace \w{3,} with $monpat (the pre-built alternation of valid
month names), consistent with every other date pattern in strptime.
Add /o flag since $monpat is constant after closure creation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Here's what I changed and why:

- **Added `/i` flag to boost format regex** in `lib/Date/Parse.pm:125` — per reviewer request, the month match is now explicitly case-insensitive. While `strptime` already lowercases input via `lc`, the `/i` flag makes the intent clear and ensures robustness if the function is ever called through a different code path.
- **Added 6 tests for case variations** (`JAN`, `jan`, `JaN`) in `t/cpanrt-parse.t` — per reviewer request, each case variant is tested for both successful parsing and correct day extraction. Test count updated from 30 to 36.
@Koan-Bot
Author

Rebase with requested adjustments

Branch koan.atoomic/fix-boost-format-month-match was rebased onto main and review feedback was applied.

Changes applied

  • Added /i flag to boost format regex in lib/Date/Parse.pm:125 — per reviewer request, the month match is now explicitly case-insensitive. While strptime already lowercases input via lc, the /i flag makes the intent clear and ensures robustness if the function is ever called through a different code path.
  • Added 6 tests for case variations (JAN, jan, JaN) in t/cpanrt-parse.t — per reviewer request, each case variant is tested for both successful parsing and correct day extraction. Test count updated from 30 to 36.
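
To illustrate the "different code path" case mentioned above, here is a small sketch in which the input is not lowercased before matching. The table, `$monpat`, and the helper `month_index` are hypothetical and trimmed for the example. Under that assumption, `/i` is what lets `JAN` and `JaN` match, and the captured text still has to be lowercased before it is used as a lookup key.

```perl
use strict;
use warnings;

my %month  = (jan => 0, january => 0);            # trimmed, hypothetical table
my $monpat = join '|', reverse sort keys %month;

# Returns the 0-based month index, or undef when the month field is not a month name.
sub month_index {
    my ($raw) = @_;                               # note: no lc() on the input here
    return unless " $raw " =~ /\s\d{4}([-:]?)($monpat)\1\d\d?(?=\D)/i;
    return $month{ lc $2 };                       # normalise the capture before the lookup
}

for my $in (qw(2024-JAN-15 2024-jan-15 2024-JaN-15 2024-abc-15)) {
    my $m = month_index($in);
    print "$in => ", defined $m ? "month index $m" : 'rejected', "\n";
}
```

When the input has already been lowercased, as the comment above says strptime does, the flag changes nothing about which strings match.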

Stats

2 files changed, 33 insertions(+), 2 deletions(-)
Actions performed
  • Already-solved check: skipped (JSON parse error)
  • Rebased koan.atoomic/fix-boost-format-month-match onto upstream/main
  • Applied review feedback
  • Pre-push CI check: previous run passed
  • Force-pushed koan.atoomic/fix-boost-format-month-match to origin
  • CI check enqueued (async)

CI status

CI will be checked asynchronously.


Automated by Kōan

@Koan-Bot Koan-Bot force-pushed the koan.atoomic/fix-boost-format-month-match branch from 046a65f to ea226ec Compare April 27, 2026 00:22
@atoomic atoomic merged commit 0ca347f into cpan-authors:main Apr 27, 2026
22 checks passed
@Koan-Bot Koan-Bot deleted the koan.atoomic/fix-boost-format-month-match branch April 27, 2026 09:14
2 participants