Skip to content

Conversation

@asddongmen
Copy link
Collaborator

@asddongmen asddongmen commented Jan 29, 2026

What problem does this PR solve?

Issue Number: close #4084

What is changed and how it works?

This PR fixes a case where the logpuller’s resolved-ts could stop advancing when there are large amount (eg. 400k) regions in a single span(or table). handleResolvedTs now advances:

  • immediately when advanceInterval == 0 (heap-based min, updated per region update)
  • periodically when advanceInterval > 0 (compute min via RangeLock.ResolvedTs())

It also changes the default KVClient advance-interval-in-ms from 0 to 100 to throttle resolved-ts advancement and reduce per-event overhead.

Check List

Tests

  • Unit test
  • Manual test (add detailed scripts or steps below)

Questions

Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?

Release note

None

Summary by CodeRabbit

  • Chores

    • Made timestamp advancement conditional with a zero-interval shortcut to avoid unnecessary work and reduce advancement-related logging to only when progress occurs.
    • Updated default KV client advance interval from 0ms to 100ms for a moderate default cadence.
  • Tests

    • Added a test validating throttled/resolved-timestamp advancement behavior with locked ranges.

✏️ Tip: You can customize this high-level summary in your review settings.

@ti-chi-bot ti-chi-bot bot added the release-note-none Denotes a PR that doesn't merit a release note. label Jan 29, 2026
@gemini-code-assist
Copy link

Summary of Changes

Hello @asddongmen, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical issue where the resolvedTs (resolved timestamp) in the log puller could become stuck, hindering the progress of change data capture. The changes refactor the resolvedTs advancement mechanism to ensure it correctly updates based on a configured interval or immediately when no interval is set, and establishes a default advancement interval to prevent the system from entering a stuck state.

Highlights

  • Resolved Timestamp Advancement Logic Refinement: The logic for advancing the resolvedTs in the handleResolvedTs function has been refactored. It now conditionally updates the rangeLock and determines the resolvedTs based on the advanceInterval.
  • Immediate vs. Interval-Based Advancement: Distinct logic paths are introduced: if advanceInterval is 0, the rangeLock is immediately updated, and GetHeapMinTs() is used for advancement. If advanceInterval is greater than 0, a time-based check is performed, and ResolvedTs() is used for advancement.
  • Default Advance Interval Configuration: The default value for AdvanceIntervalInMs in the KVClientConfig has been changed from 0 to 100 milliseconds, ensuring that the time-based advancement mechanism is active by default.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@ti-chi-bot ti-chi-bot bot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Jan 29, 2026
@coderabbitai
Copy link

coderabbitai bot commented Jan 29, 2026

📝 Walkthrough

Walkthrough

Refactors handleResolvedTs to use a shouldAdvance guard: zero-interval forces heap-min TS and immediate advancement; non-zero intervals use time-based checks and RangeLock.ResolvedTs() as candidate. Also changes default KV client AdvanceIntervalInMs from 0 to 100 and adds a test for throttled behavior.

Changes

Cohort / File(s) Summary
Event Handler Logic
logservice/logpuller/region_event_handler.go
Reworked handleResolvedTs to compute shouldAdvance: if advanceInterval==0 fetch heap minimum TS and force advance; otherwise use time-based check with lastAdvance and rangeLock.ResolvedTs() as candidate. Subsequent logging, push/update, lag calc, and resolved-state updates run only when advancing.
Default Configuration
pkg/config/kvclient.go
Changed default AdvanceIntervalInMs in NewDefaultKVClientConfig from 0 to 100, adjusting default throttling cadence for resolved TS advancement.
Tests
logservice/logpuller/region_event_handler_test.go
Added TestHandleResolvedTsThrottled to validate throttled advancement with a region lock heap; updated imports and deterministic heap ordering for assertions.

Sequence Diagram(s)

sequenceDiagram
    participant TiKV
    participant LogPuller
    participant RangeLock
    participant Heap
    participant KVClient

    TiKV->>LogPuller: send batched resolvedTs events
    LogPuller->>RangeLock: update locked-range state
    alt advanceInterval == 0
        LogPuller->>Heap: GetHeapMinTs()
        Heap-->>LogPuller: heapMinTs
        LogPuller->>LogPuller: force advancement (use heapMinTs)
        LogPuller->>KVClient: push new resolved TS / update state
    else advanceInterval > 0
        LogPuller->>LogPuller: check time since lastAdvance
        LogPuller->>RangeLock: ResolvedTs()
        RangeLock-->>LogPuller: candidateResolvedTs
        alt shouldAdvance
            LogPuller->>KVClient: push new resolved TS / update state
        end
    end
    LogPuller-->>TiKV: continue processing / ack
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped through spans at break of day,
I peeked the heap when waits gave way.
A hundred millis now pace my dance,
No frantic btree, just measured prance.
Hop—resolved timestamps find their chance.

🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically identifies the main fix: addressing resolvedTs advancement issues in the log puller, which is the core objective of this PR.
Description check ✅ Passed The PR description includes the required issue reference (close #4084), explains the problem and solution with specific implementation details, identifies tests added (unit and manual), and provides a release note.
Linked Issues check ✅ Passed The code changes directly address the root cause identified in issue #4084: the handleResolvedTs optimization with conditional logic prevents excessive GetHeapMinTs calls, and the default interval increase throttles advancement to reduce processing overhead.
Out of Scope Changes check ✅ Passed All three modified files are directly related to the stated objective: fixing resolvedTs advancement in logpuller. Changes include core logic updates, configuration defaults, and corresponding unit tests.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses an issue where the logpuller's resolved timestamp could get stuck. The fix involves changing the default AdvanceIntervalInMs to 100ms and modifying the logic in handleResolvedTs. The new logic correctly uses rangeLock.ResolvedTs() when advanceInterval is greater than 0, which avoids dependency on a potentially partially updated heap and correctly calculates the minimum resolved timestamp. This is a good fix for the default case.

However, I've noticed that the code path for advanceInterval == 0 still uses GetHeapMinTs(), which can lead to the same 'stuck resolved-ts' issue under certain conditions. I've left a specific comment with a suggestion to make this path robust as well.

Signed-off-by: dongmen <[email protected]>
@asddongmen
Copy link
Collaborator Author

/test all

Signed-off-by: dongmen <[email protected]>
@ti-chi-bot ti-chi-bot bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jan 29, 2026
@asddongmen
Copy link
Collaborator Author

/test all

@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Jan 29, 2026
@ti-chi-bot ti-chi-bot bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Jan 29, 2026
@ti-chi-bot
Copy link

ti-chi-bot bot commented Jan 29, 2026

[LGTM Timeline notifier]

Timeline:

  • 2026-01-29 09:15:05.662491435 +0000 UTC m=+1262933.276448291: ☑️ agreed by wk989898.
  • 2026-01-29 14:10:49.994911893 +0000 UTC m=+1280677.608868739: ☑️ agreed by lidezhu.

@ti-chi-bot
Copy link

ti-chi-bot bot commented Jan 29, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: flowbehappy, lidezhu, wk989898

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the approved label Jan 29, 2026
@ti-chi-bot ti-chi-bot bot merged commit df1a8fd into pingcap:master Jan 29, 2026
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved lgtm release-note-none Denotes a PR that doesn't merit a release note. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Log Puller Memory Quota Exhaustion Leads to All Changefeed Getting Stuck

4 participants