fix: normalize rollout timestamps before deriving started_at/ended_at by eric-tramel · Pull Request #556 · NVIDIA-NeMo/DataDesigner

eric-tramel · 2026-04-17T14:13:42Z

📋 Summary

Parse ISO 8601 timestamps before comparing them in the Claude Code, Codex, and ATIF rollout ingesters. Raw string min()/max() can order chronologically-earlier timestamps as later when offsets or precisions differ (e.g. 2025-01-01T00:30:00+01:00 vs 2025-01-01T00:00:00Z), which could produce incorrect started_at / ended_at on otherwise valid rollouts.

🔗 Related Issue

Fixes #497

🔄 Changes

Add min_max_timestamps() + parse_iso8601() helpers in agent_rollout/utils.py. Naive timestamps are treated as UTC; unparseable values are silently skipped (matching the existing tolerance of coerce_optional_str). Winners are returned in their original string form, so the output contract is unchanged.
Swap min(timestamps) / max(timestamps) for the helper in claude_code.py, codex.py, and atif.py. Codex's session_meta["timestamp"] short-circuit for started_at is preserved.
Add a parameterized unit test covering the four interesting cases: mixed offsets, mixed precisions, naive-vs-aware, and unparseable-skipping.

🧪 Testing

make test passes (engine suite, 1867 tests; pre-existing interface failures unrelated to this change)
Unit tests added/updated
E2E tests added/updated (if applicable)

✅ Checklist

Follows commit message conventions
Commits are signed off (DCO)
Architecture docs updated (if applicable)

Claude Code, Codex, and ATIF ingesters called min()/max() on raw timestamp strings. That assumes lexicographic order matches chronological order, which breaks for ISO 8601 timestamps with mixed UTC offsets or precisions (e.g. 2025-01-01T00:30:00+01:00 is earlier than 2025-01-01T00:00:00Z but sorts later as a string). Introduce min_max_timestamps() in the shared rollout utils that parses each value as ISO 8601 (naive values treated as UTC, unparseable values skipped) and picks the chronological extremes, returning them in their original string form. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Eric W. Tramel <eric.tramel@gmail.com>

github-actions · 2026-04-17T14:15:23Z

Code Review: PR #556 — fix: normalize rollout timestamps before deriving started_at/ended_at

Summary

This PR fixes a correctness bug in the agent rollout ingesters (Claude Code, Codex, ATIF) where started_at and ended_at were derived by calling min()/max() on raw ISO 8601 timestamp strings. Lexicographic comparison produces wrong results when timestamps have differing UTC offsets or sub-second precisions — e.g. "2025-01-01T00:30:00+01:00" sorts after "2025-01-01T00:00:00Z" as a string, but is chronologically 30 minutes earlier.

The fix introduces two helpers in utils.py — parse_iso8601() and min_max_timestamps() — that parse timestamps into datetime objects before comparison, then return the winners in their original string form so the output contract is unchanged. All three ingesters are updated to use the new helper. A parameterized test covers the key edge cases.

Files changed: 5 (3 ingesters, 1 utility module, 1 new test file)
Lines: +88 / -7

Findings

Correctness

Fix is sound. Parsing to datetime before comparison correctly handles mixed offsets, mixed precisions, and naive-vs-aware timestamps. The Z → +00:00 replacement before fromisoformat() is the standard workaround for Python < 3.11 compatibility and is safe here since Z only appears as a trailing timezone designator in ISO 8601.
Output contract preserved. min_max_timestamps() returns the original string values, not stringified datetime objects, so downstream consumers are unaffected.
Codex short-circuit preserved. In codex.py, the session_meta.get("timestamp") fallback for started_at is correctly maintained (started_at=coerce_optional_str(session_meta.get("timestamp")) or earliest).

Design

Centralised helper is the right call. The same parsing logic was duplicated across three files; moving it to utils.py eliminates the duplication cleanly.
Silent skip of unparseable values is consistent. The PR description notes this mirrors the existing tolerance of coerce_optional_str. If all timestamps are unparseable, (None, None) is returned, which matches the previous behavior of an empty list producing None. No log warning is emitted for skipped values — this is a minor observation, not a blocker, since the ingestion layer already tolerates missing timestamps.

Style & Conventions

from __future__ import annotations present in both new/modified files — compliant.
SPDX license header present in the new test file.
Absolute imports only — no relative imports introduced.
Type annotations on all new functions and variables.
Naming: earliest/latest in codex.py avoids shadowing and reads naturally.

Tests

Coverage is good. The parameterized test (test_utils.py) covers 6 cases:
1. Empty list → (None, None)
2. Mixed offsets where lexicographic ordering disagrees with chronological
3. Mixed sub-second precision
4. Naive timestamp treated as UTC compared against aware
5. Unparseable values skipped, valid ones retained
6. Only unparseable values → (None, None)
No negative-offset test case. A case like "2025-01-01T00:00:00-05:00" vs "2025-01-01T04:00:00Z" (same instant, different representation) would exercise negative offsets, but this is not a gap since fromisoformat handles negative offsets identically to positive ones — nice-to-have at most.

Potential Concerns (non-blocking)

value.replace("Z", "+00:00") — This replaces all occurrences of the character Z in the string, not just a trailing one. In practice, ISO 8601 timestamps won't contain a capital Z anywhere except as the UTC designator at the end, so this is safe. A more defensive approach would be value[:-1] + "+00:00" if value.endswith("Z") else value, but the current approach matches widespread convention and is fine.
No logging on unparseable skip — If a rollout file contains garbled timestamps, the helper silently discards them. For debugging production ingestion issues, a logger.debug on skip could be useful, but this is a future enhancement, not a requirement for this fix.

Verdict

Approve. This is a clean, well-scoped bug fix with good test coverage. The implementation is correct, follows project conventions, and doesn't introduce unnecessary complexity. No blocking issues found.

greptile-apps · 2026-04-17T14:16:16Z

Greptile Summary

This PR replaces raw string min()/max() comparisons on ISO 8601 timestamp lists with a proper chronological sort in the Claude Code, Codex, and ATIF rollout ingesters. The new min_max_timestamps() / parse_iso8601() helpers in utils.py parse timestamps before comparing, treat naive timestamps as UTC, skip unparseable values, and return the winning entries in their original string form — preserving the downstream output contract while fixing the ordering bug.

Confidence Score: 5/5

Safe to merge — the fix is correct, well-tested, and output contract is unchanged.

All three ingesters are updated consistently, parse_iso8601 correctly normalises naive/aware datetimes, the Z-suffix workaround is cross-version compatible, and the test suite covers all advertised edge cases. No P0/P1 findings.

No files require special attention.

Important Files Changed

Filename	Overview
packages/data-designer-engine/src/data_designer/engine/resources/agent_rollout/utils.py	Adds `parse_iso8601()` and `min_max_timestamps()` helpers; Z→+00:00 normalisation is correct, naive timestamps are coerced to UTC, unparseable values return None.
packages/data-designer-engine/src/data_designer/engine/resources/agent_rollout/atif.py	Replaces `min`/`max` string comparisons with `min_max_timestamps()`; change is minimal and correct.
packages/data-designer-engine/src/data_designer/engine/resources/agent_rollout/claude_code.py	Replaces `min`/`max` string comparisons with `min_max_timestamps()`; change is minimal and correct.
packages/data-designer-engine/src/data_designer/engine/resources/agent_rollout/codex.py	Replaces `min`/`max` string comparisons with `min_max_timestamps()`; `session_meta["timestamp"]` short-circuit for `started_at` is correctly preserved.
packages/data-designer-engine/tests/engine/resources/agent_rollout/test_utils.py	New parameterised test covering empty list, mixed UTC offsets, mixed precision, naive-vs-aware, unparseable-skip, and all-unparseable cases — all correct.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["timestamps list[str]"] --> B{for each timestamp}
    B --> C["parse_iso8601(value)"]
    C --> D{ValueError?}
    D -- Yes --> E["skip - return None"]
    D -- No --> F{tzinfo present?}
    F -- No --> G["add UTC tzinfo"]
    F -- Yes --> H["keep as-is"]
    G --> I["append datetime + original string"]
    H --> I
    I --> B
    B --> J{list empty?}
    J -- Yes --> K["return None, None"]
    J -- No --> L["min by datetime - earliest string"]
    J -- No --> M["max by datetime - latest string"]
    L --> N["return earliest, latest"]
    M --> N
    N --> O["started_at / ended_at in NormalizedAgentRolloutRecord"]

_{Reviews (1): Last reviewed commit: "fix: normalize rollout timestamps before..." | Re-trigger Greptile}

andreatgretel · 2026-04-17T14:54:59Z

packages/data-designer-engine/src/data_designer/engine/resources/agent_rollout/pi_coding_agent.py:203-204
started_at=min(timestamps) if timestamps else None,
ended_at=max(timestamps) if timestamps else None,

looks like this file has the same min(timestamps) / max(timestamps) pattern that was fixed in the other three ingesters - worth including in this PR since the helper already exists?

eric-tramel requested a review from a team as a code owner April 17, 2026 14:13

eric-tramel temporarily deployed to agentic-ci April 17, 2026 14:13 — with GitHub Actions Inactive

github-actions Bot mentioned this pull request Apr 20, 2026

Agentic CI: Issue & PR Triage Tracker #562

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: normalize rollout timestamps before deriving started_at/ended_at#556

fix: normalize rollout timestamps before deriving started_at/ended_at#556
eric-tramel wants to merge 1 commit intomainfrom
worktree-issue-497-timestamp-normalization

eric-tramel commented Apr 17, 2026

Uh oh!

github-actions Bot commented Apr 17, 2026

Uh oh!

greptile-apps Bot commented Apr 17, 2026

Confidence Score: 5/5

Flowchart

Uh oh!

andreatgretel commented Apr 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

eric-tramel commented Apr 17, 2026

📋 Summary

🔗 Related Issue

🔄 Changes

🧪 Testing

✅ Checklist

Uh oh!

github-actions Bot commented Apr 17, 2026

Code Review: PR #556 — fix: normalize rollout timestamps before deriving started_at/ended_at

Summary

Findings

Correctness

Design

Style & Conventions

Tests

Potential Concerns (non-blocking)

Verdict

Uh oh!

greptile-apps Bot commented Apr 17, 2026

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Flowchart

Uh oh!

andreatgretel commented Apr 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants