Skip to content

Conversation

@wk989898
Copy link
Collaborator

@wk989898 wk989898 commented Jan 28, 2026

What problem does this PR solve?

Issue Number: ref #3888

What is changed and how it works?

Before this PR, using the latest cdc binary to handle the old version of the CDC would result in an error: call cdc api failed. This is because the API is considered to be compatible with the old architecture.

  • API Compatibility Testing: A new integration test suite, http_api_compatibility, has been introduced to specifically verify the API's behavior and ensure backward compatibility with the 'old architecture'.
  • Enhanced Table Information in API: The Tables API model has been extended to include an AllTables field, providing a comprehensive list of all tables in addition to the existing eligible and ineligible tables.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Questions

Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?

Release note

Please refer to [Release Notes Language Style Guide](https://pingcap.github.io/tidb-dev-guide/contribute-to-tidb/release-notes-style-guide.html) to write a quality release note.

If you don't think this PR needs a release note then fill it with `None`.

Summary by CodeRabbit

  • New Features

    • Changefeed API v2 exposes AllTables alongside eligible and ineligible tables.
    • CLI shows improved, human-friendly table name lists.
  • Bug Fixes

    • Resume/update flows only show ineligible-table warnings when table data is successfully retrieved.
  • Tests

    • Added HTTP API compatibility wrapper and expanded light-integration tests.
    • Updated tests to use nested replica_config.mounter.worker_num and to send an empty JSON body when resuming.

✏️ Tip: You can customize this high-level summary in your review settings.

Signed-off-by: wk989898 <[email protected]>
@ti-chi-bot ti-chi-bot bot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Jan 28, 2026
@ti-chi-bot
Copy link

ti-chi-bot bot commented Jan 28, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign charlescheung96 for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jan 28, 2026
@gemini-code-assist
Copy link

Summary of Changes

Hello @wk989898, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request primarily focuses on enhancing the API's compatibility with older architectures and improving the clarity of table information provided through the API. It introduces new testing infrastructure to validate this compatibility and refactors existing utility functions for better code structure. The changes aim to ensure stability across different architectural versions and provide more complete data representations.

Highlights

  • API Compatibility Testing: A new integration test suite, http_api_compatibility, has been introduced to specifically verify the API's behavior and ensure backward compatibility with the 'old architecture'.
  • Enhanced Table Information in API: The Tables API model has been extended to include an AllTables field, providing a comprehensive list of all tables in addition to the existing eligible and ineligible tables.
  • Code Refactoring for CLI Utilities: Table name formatting utility functions (formatTableNames, formatTableName) and their corresponding tests have been refactored and moved from cmd/cdc/cli/helper.go to cmd/cdc/cli/cli_changefeed_helper.go, improving code organization within the CLI package.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@wk989898
Copy link
Collaborator Author

/test light

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request aims to make the API compatible with an older architecture. The changes include adding an AllTables field to the Tables API response struct, which is a good enhancement for providing more complete information. The PR also refactors some helper functions for better code organization and adds a new integration test to ensure compatibility.

My review focuses on improving code consistency. I've pointed out an incomplete comment in an interface definition and an inconsistent logging message style. Addressing these points will improve the maintainability and readability of the code.

wk989898 and others added 3 commits January 28, 2026 06:17
Signed-off-by: wk989898 <[email protected]>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@wk989898
Copy link
Collaborator Author

/test light

Signed-off-by: wk989898 <[email protected]>
@coderabbitai
Copy link

coderabbitai bot commented Jan 28, 2026

📝 Walkthrough

Walkthrough

Localized table-name formatting helpers added and old helpers removed; ineligible-table warnings in changefeed resume/update are now conditional on successful GetAllTables; Tables model extended with AllTables; integration test scripts and HTTP API test payloads updated for compatibility.

Changes

Cohort / File(s) Summary
Table formatting helpers
cmd/cdc/cli/cli_changefeed_helper.go, cmd/cdc/cli/cli_changefeed_helper_test.go, cmd/cdc/cli/helper.go, cmd/cdc/cli/helper_test.go
New unexported helpers formatTableNames / formatTableName introduced in cli_changefeed_helper.go with tests; the prior helper.go and its test were removed.
Changefeed resume/update error handling
cmd/cdc/cli/cli_changefeed_resume.go, cmd/cdc/cli/cli_changefeed_update.go
GetAllTables error captured into a separate err1; ineligible-table warning/confirmation logic now runs only when err1 == nil; err1 used in final warning output.
API and model changes
pkg/api/v2/changefeed.go, tests/integration_tests/api_v2/model.go
Doc comment for GetAllTables updated; Tables struct gained AllTables []TableName json:"all_tables,omitempty"``.
Integration test compatibility & payloads
tests/integration_tests/http_api_compatibility/run.sh, tests/integration_tests/run_light_it_in_ci.sh, tests/integration_tests/http_api*/util/test_case.py
Added compatibility wrapper script and included it in G06; updated HTTP API tests to use nested replica_config.mounter.worker_num instead of flat mounter_worker_num; resume requests now send an empty JSON body with Content-Type header.

Sequence Diagram(s)

mermaid
sequenceDiagram
participant CLI
participant APIv2 as API_v2
participant User
CLI->>APIv2: GetAllTables(changefeed)
alt API returns tables successfully
APIv2-->>CLI: eligible, ineligible, all tables
CLI->>User: show ineligible tables & prompt for confirmation
alt User confirms
User-->>CLI: confirm
CLI->>APIv2: proceed with update/resume (with ignoreIneligibleTables as set)
else User cancels
User-->>CLI: cancel
CLI-->>User: abort
end
else API returns error
APIv2-->>CLI: error (err1)
CLI-->>User: continue without ineligible-table warning (use err1 in summary)
end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐇 I hopped through names both new and old,

I tucked helpers neat and made warnings hold.
Tables counted, tests refreshed with care,
Scripts and payloads dancing in the air.
A munch, a twitch — changes done with flair!

🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 9.09% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check ❓ Inconclusive PR description includes issue reference and high-level changes, but lacks concrete implementation details and specific technical explanations. Provide specific details on code changes: which files were modified, how table formatting helpers were relocated, and how API compatibility is being verified through integration tests.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title directly relates to the main changeset objective—making the API compatible with the old architecture—as evidenced by API changes, helper functions reorganization, and compatibility test additions.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
cmd/cdc/cli/cli_changefeed_update.go (1)

135-183: Nil pointer dereference if GetAllTables fails.

When GetAllTables returns an error, tables will be nil. Lines 178 and 180-183 access tables.IneligibleTables, tables.EligibleTables, and tables.AllTables unconditionally, causing a panic.

Initialize tables to an empty struct and preserve it when the API call fails:

🐛 Proposed fix
+	tables := &v2.Tables{}
-	tables, err1 := o.apiV2Client.Changefeeds().GetAllTables(ctx, &v2.VerifyTableConfig{
+	tablesResult, err1 := o.apiV2Client.Changefeeds().GetAllTables(ctx, &v2.VerifyTableConfig{
 		ReplicaConfig: changefeedConfig.ReplicaConfig,
 		StartTs:       newInfo.CheckpointTs,
 	}, o.keyspace)
 	if err1 == nil {
+		tables = tablesResult
 		ignoreIneligibleTables := false

Alternatively, guard all tables accesses with an err1 == nil check, or move the success message inside the success path.

cmd/cdc/cli/cli_changefeed_resume.go (1)

210-249: Nil pointer dereference if GetAllTables fails.

Although tables is initialized to &v2.Tables{} at line 211, it gets reassigned at line 219. If GetAllTables returns an error, tables will be nil, causing a panic at lines 244 and 246-249 when accessing its fields.

Preserve the empty struct when the API call fails:

🐛 Proposed fix
 	var (
 		tables = &v2.Tables{}
 		err1   error
 	)
 	if o.checkpointTs != 0 {
 		cf, err := o.apiClient.Changefeeds().Get(ctx, o.keyspace, o.changefeedID)
 		if err != nil {
 			return err
 		}
-		tables, err1 = o.apiClient.Changefeeds().GetAllTables(ctx, &v2.VerifyTableConfig{
+		tablesResult, err1 := o.apiClient.Changefeeds().GetAllTables(ctx, &v2.VerifyTableConfig{
 			ReplicaConfig: cf.Config,
 			StartTs:       cf.CheckpointTs,
 		}, o.keyspace)
-		if err1 == nil {
+		if err1 == nil && tablesResult != nil {
+			tables = tablesResult
 			if len(tables.IneligibleTables) != 0 {

This ensures tables remains a valid empty struct when the API call fails.

🧹 Nitpick comments (3)
cmd/cdc/cli/cli_changefeed_helper_test.go (1)

91-109: Good test coverage, consider adding edge case.

The tests cover nil, empty, single, and multiple entries well. The IsPartition: true case correctly verifies the partition flag doesn't affect formatting.

Consider adding a test case for when Schema is empty to exercise the else branch in formatTableName:

📝 Suggested additional test case
 	require.Equal(t,
 		"[test.t1, test.t2]",
 		formatTableNames([]v2.TableName{
 			{Schema: "test", Table: "t1", TableID: 110},
 			{Schema: "test", Table: "t2", TableID: 111, IsPartition: true},
 		}),
 	)
+
+	// Test with empty schema
+	require.Equal(t,
+		"[t1]",
+		formatTableNames([]v2.TableName{{Schema: "", Table: "t1", TableID: 112}}),
+	)
 }
cmd/cdc/cli/cli_changefeed_update.go (1)

177-178: Consider improving user-facing warning output.

When err1 is nil, the output will show Warning: <nil>, which is confusing. When non-nil, a raw Go error string is displayed. Consider conditionally formatting the warning:

♻️ Suggested improvement
+	warningMsg := ""
+	if err1 != nil {
+		warningMsg = err1.Error()
+	}
 	cmd.Printf("Update changefeed config successfully! "+
-		"\nID: %s\nInfo: %s\nIneligibleTablesCount: %d\nEligibleTablesCount: %d\nAllTablesCount: %d\nWarning: %s\n", info.ID, infoStr, len(tables.IneligibleTables), len(tables.EligibleTables), len(tables.AllTables), err1)
+		"\nID: %s\nInfo: %s\nIneligibleTablesCount: %d\nEligibleTablesCount: %d\nAllTablesCount: %d\nWarning: %s\n", info.ID, infoStr, len(tables.IneligibleTables), len(tables.EligibleTables), len(tables.AllTables), warningMsg)
cmd/cdc/cli/cli_changefeed_resume.go (1)

243-244: Consider improving user-facing warning output.

Same issue as in cli_changefeed_update.go: when err1 is nil, the output shows Warning: <nil>. Consider conditionally formatting the warning message for better UX.

Signed-off-by: wk989898 <[email protected]>
@ti-chi-bot ti-chi-bot bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 28, 2026
@wk989898
Copy link
Collaborator Author

/test light

Signed-off-by: wk989898 <[email protected]>
@wk989898
Copy link
Collaborator Author

/test light

Signed-off-by: wk989898 <[email protected]>
@wk989898
Copy link
Collaborator Author

/test light

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py`:
- Around line 211-221: The update_changefeed function's rq.put calls can hang
indefinitely; update every rq.put within update_changefeed to include a per-call
timeout parameter (timeout=5) so each PUT (the ones that pass data, auth=Auth,
headers, cert=CERT, verify=VERIFY) will fail fast on stalls—locate the rq.put
invocations in update_changefeed and add timeout=5 to each call.
- Around line 275-279: The resume error test omits the JSON body and
Content-Type header so the handler's WithBody(cfg) path is exercised
differently; update the error-case POST to include the same data and headers
used in the success case (use the existing data variable json.dumps({}) and
headers = {"Content-Type": "application/json"}) and pass them into rq.post
(alongside auth, cert, verify) so the resume endpoint receives an empty JSON
body consistent with the success test.

In `@tests/integration_tests/http_api_tls/util/test_case.py`:
- Around line 272-275: The failure path for the "changefeed not exists" request
is missing the JSON body, Content-Type header, and a timeout, causing
inconsistent behavior; update the second rq.post call (the changefeed not-exists
request) to pass the same data=json.dumps({}) and headers={"Content-Type":
"application/json"} and add timeout=5, and also add timeout=5 to the earlier
success-path rq.post call that uses data, headers, CERT, VERIFY so both requests
use the same payload/headers and a 5s timeout (refer to the variables data,
headers, rq.post, CERT, VERIFY).
🧹 Nitpick comments (1)
tests/integration_tests/http_api_tls/util/test_case.py (1)

211-221: Add per-call timeouts to the update_changefeed PUT requests.

Both rq.put() calls at lines 220 and 234 lack timeout parameters and can block indefinitely. Add timeout=5 to prevent integration tests from hanging.

⚙️ Suggested patch
-    resp = rq.put(url, data=data, headers=headers, cert=CERT, verify=VERIFY)
+    resp = rq.put(url, data=data, headers=headers, cert=CERT, verify=VERIFY, timeout=5)

Also applies to: line 234

Comment on lines +211 to 221
data = json.dumps({
"replica_config": {
"ignore_ineligible_table": True,
"mounter": {
"worker_num": 32
}
}
})
headers = {"Content-Type": "application/json"}
resp = rq.put(url, data=data, auth=Auth,
headers=headers, cert=CERT, verify=VERIFY)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, verify the file exists and check its structure
find . -type f -name "test_case.py" | grep -E "http_api_tls_with_user_auth.*util" | head -5

Repository: pingcap/ticdc

Length of output: 128


🏁 Script executed:

# Get total line count and examine the area around lines 211-236
wc -l tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py

Repository: pingcap/ticdc

Length of output: 130


🏁 Script executed:

# Read the specific lines mentioned (211-236) to verify the code
sed -n '200,250p' tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py | cat -n

Repository: pingcap/ticdc

Length of output: 2126


🏁 Script executed:

# Search for all rq.put and rq.post calls in the file to see timeout patterns
rg -n "rq\.(put|post|get)" tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py

Repository: pingcap/ticdc

Length of output: 1838


🏁 Script executed:

# Check more carefully around the lines mentioned for exact context
sed -n '205,240p' tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py | cat -n

Repository: pingcap/ticdc

Length of output: 1392


🏁 Script executed:

# Search for ANY timeout usage in the entire file
rg "timeout" tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py

Repository: pingcap/ticdc

Length of output: 120


🏁 Script executed:

# Get a more complete view of all request calls with context to see timeout patterns
sed -n '1,100p' tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py | rg -A2 "rq\.(get|post|put)"

Repository: pingcap/ticdc

Length of output: 527


🏁 Script executed:

# Find which line has the timeout=5 that was found
rg -n "timeout" tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py -B2 -A2

Repository: pingcap/ticdc

Length of output: 329


🏁 Script executed:

# Read lines 220-240 to see both PUT calls mentioned in the review
sed -n '210,245p' tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py | cat -n

Repository: pingcap/ticdc

Length of output: 1573


🏁 Script executed:

# Check if there's a session or config that sets default timeout
head -50 tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py | cat -n

Repository: pingcap/ticdc

Length of output: 1547


🏁 Script executed:

# Count how many PUT calls are in update_changefeed function
sed -n '208,300p' tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py | rg -n "rq\.put"

Repository: pingcap/ticdc

Length of output: 248


🏁 Script executed:

# Get exact line numbers of all PUT calls in the update_changefeed function
rg -n "def update_changefeed" tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py

Repository: pingcap/ticdc

Length of output: 85


🏁 Script executed:

# Now find the next function to know where update_changefeed ends
sed -n '208,350p' tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py | cat -n

Repository: pingcap/ticdc

Length of output: 5790


Add per-call timeouts to all update_changefeed PUT requests.
These PUTs can hang CI if the endpoint stalls; a small timeout keeps the test suite bounded.

The update_changefeed function has four PUT calls (lines 220, 235, 244, and 265) that lack timeout parameters. Apply timeout=5 to all four to prevent indefinite hangs during test execution.

⚙️ Suggested patch
    resp = rq.put(url, data=data, auth=Auth,
-                  headers=headers, cert=CERT, verify=VERIFY)
+                  headers=headers, cert=CERT, verify=VERIFY, timeout=5)

Apply to all four PUT calls at lines 220, 235, 244, and 265.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
data = json.dumps({
"replica_config": {
"ignore_ineligible_table": True,
"mounter": {
"worker_num": 32
}
}
})
headers = {"Content-Type": "application/json"}
resp = rq.put(url, data=data, auth=Auth,
headers=headers, cert=CERT, verify=VERIFY)
data = json.dumps({
"replica_config": {
"ignore_ineligible_table": True,
"mounter": {
"worker_num": 32
}
}
})
headers = {"Content-Type": "application/json"}
resp = rq.put(url, data=data, auth=Auth,
headers=headers, cert=CERT, verify=VERIFY, timeout=5)
🧰 Tools
🪛 Ruff (0.14.14)

220-220: Probable use of requests call without timeout

(S113)

🤖 Prompt for AI Agents
In `@tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py` around
lines 211 - 221, The update_changefeed function's rq.put calls can hang
indefinitely; update every rq.put within update_changefeed to include a per-call
timeout parameter (timeout=5) so each PUT (the ones that pass data, auth=Auth,
headers, cert=CERT, verify=VERIFY) will fail fast on stalls—locate the rq.put
invocations in update_changefeed and add timeout=5 to each call.

Comment on lines +275 to 279
data = json.dumps({})
headers = {"Content-Type": "application/json"}
resp = rq.post(url, data=data, auth=Auth, headers=headers,
cert=CERT, verify=VERIFY)
assert resp.status_code == rq.codes.ok
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, let's find and examine the test file
git ls-files | grep -E "test_case\.py"

Repository: pingcap/ticdc

Length of output: 232


🏁 Script executed:

#!/bin/bash
# Read the specific lines from the test file to understand the context
sed -n '270,300p' tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py

Repository: pingcap/ticdc

Length of output: 1178


🏁 Script executed:

#!/bin/bash
# Get more context around the mentioned lines to understand what endpoints are being called
sed -n '260,310p' tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py

Repository: pingcap/ticdc

Length of output: 1960


🏁 Script executed:

#!/bin/bash
# Search for references to the resume endpoint and what it expects
rg -B 5 -A 5 "resume" tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py | head -100

Repository: pingcap/ticdc

Length of output: 1881


🏁 Script executed:

#!/bin/bash
# Check if timeout is used elsewhere in this test file or similar tests
rg "timeout=" tests/integration_tests/http_api_tls_with_user_auth/ | head -20

Repository: pingcap/ticdc

Length of output: 190


🏁 Script executed:

#!/bin/bash
# Check timeout patterns across test files
rg "timeout=" tests/integration_tests/http_api* | head -30

Repository: pingcap/ticdc

Length of output: 309


🏁 Script executed:

#!/bin/bash
# Look for API documentation or handler for the resume endpoint
rg -B 5 -A 10 "resume" cdc/ | grep -A 10 -B 5 "json" | head -60

Repository: pingcap/ticdc

Length of output: 101


🏁 Script executed:

#!/bin/bash
# Check if there's documentation about POST endpoint requirements
find . -name "*.go" -o -name "*.md" | xargs grep -l "resume.*changefeed" 2>/dev/null | head -10

Repository: pingcap/ticdc

Length of output: 391


🏁 Script executed:

#!/bin/bash
# Let's get the exact line numbers where timeout=5 appears in the tls_with_user_auth test
rg -n "timeout=5" tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py

Repository: pingcap/ticdc

Length of output: 124


🏁 Script executed:

#!/bin/bash
# Let's examine the actual API handler for the resume endpoint
sed -n '1,100p' pkg/api/v2/changefeed.go | head -50

Repository: pingcap/ticdc

Length of output: 2109


🏁 Script executed:

#!/antml:bash
# Search for the resume handler specifically
rg -B 5 -A 20 "func.*[Rr]esume" pkg/api/v2/changefeed.go

Repository: pingcap/ticdc

Length of output: 682


🏁 Script executed:

#!/bin/bash
# Check what's at line 383 (where timeout=5 appears)
sed -n '375,390p' tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py

Repository: pingcap/ticdc

Length of output: 629


🏁 Script executed:

#!/an/bash
# Let's check what the ResumeChangefeedConfig looks like
rg -B 2 -A 10 "type ResumeChangefeedConfig"

Repository: pingcap/ticdc

Length of output: 1641


🏁 Script executed:

#!/an/bash
# Check the entire resume_changefeed function with exact line numbers
sed -n '269,310p' tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py | cat -n

Repository: pingcap/ticdc

Length of output: 1985


🏁 Script executed:

#!/bin/bash
# Let me verify the exact line numbers by looking at the raw file with line numbers
nl -ba tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py | sed -n '270,300p'

Repository: pingcap/ticdc

Length of output: 97


🏁 Script executed:

#!/an/bash
# Check how the API server actually handles requests without body
rg -B 10 -A 5 "WithBody" pkg/api/v2/changefeed.go

Repository: pingcap/ticdc

Length of output: 1784


🏁 Script executed:

#!/an/bash
# Let me look at the exact lines in the file to get precise line numbers
cat -n tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py | sed -n '269,310p'

Repository: pingcap/ticdc

Length of output: 1985


🏁 Script executed:

#!/an/bash
# Let me check if there are other places where timeout is consistently added for POST requests
rg -B 3 -A 1 'rq\.post.*timeout' tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py | head -40

Repository: pingcap/ticdc

Length of output: 39


🏁 Script executed:

#!/an/bash
# Check the pattern in the other test files (http_api_tls without user auth)
rg -B 3 -A 1 'rq\.post.*timeout' tests/integration_tests/http_api_tls/util/test_case.py | head -40

Repository: pingcap/ticdc

Length of output: 39


🏁 Script executed:

#!/an/bash
# Look for the actual error when JSON body is missing - check if there's any server-side validation
rg -B 5 -A 10 "Resume.*error\|ErrChangeFeedNotExists" api/v2/ pkg/api/

Repository: pingcap/ticdc

Length of output: 39


🏁 Script executed:

#!/bash
# Check if there are consistent patterns - look at all POST calls in the resume_changefeed and surrounding functions
sed -n '269,320p' tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py | grep -A 2 "rq.post\|rq.put\|rq.delete"

Repository: pingcap/ticdc

Length of output: 573


🏁 Script executed:

#!/bash
# Let me check the http_api (non-tls) test to see if there's a similar pattern or if it was already fixed
sed -n '269,310p' tests/integration_tests/http_api/util/test_case.py | cat -n

Repository: pingcap/ticdc

Length of output: 1751


🏁 Script executed:

#!/bash
# Find the resume_changefeed function in the http_api test file
rg -n "def resume_changefeed" tests/integration_tests/http_api/util/test_case.py

Repository: pingcap/ticdc

Length of output: 85


🏁 Script executed:

#!/bash
# Get the resume_changefeed function from http_api test file
sed -n '203,250p' tests/integration_tests/http_api/util/test_case.py

Repository: pingcap/ticdc

Length of output: 1654


🏁 Script executed:

#!/bash
# Check what line the resume_changefeed starts at in http_api test
grep -n "def resume_changefeed" tests/integration_tests/http_api/util/test_case.py

Repository: pingcap/ticdc

Length of output: 85


🏁 Script executed:

#!/bash
# Get the actual resume_changefeed function from http_api test file
sed -n '324,360p' tests/integration_tests/http_api/util/test_case.py

Repository: pingcap/ticdc

Length of output: 1465


🏁 Script executed:

#!/bash
# Let me check if there's a recent commit or comment explaining the change to the resume endpoint
rg -B 3 -A 3 "resume.*json\|json.*resume" api/v2/changefeed.go pkg/api/v2/

Repository: pingcap/ticdc

Length of output: 39


🏁 Script executed:

#!/bash
# Check how other error cases in the same test file handle the JSON body for consistency
sed -n '200,340p' tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py | grep -B 5 "error\|bad_request\|ErrChangeFeed"

Repository: pingcap/ticdc

Length of output: 1106


🏁 Script executed:

#!/bash
# Let me check the pause_changefeed function to see the pattern for other error cases
sed -n '230,270p' tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py

Repository: pingcap/ticdc

Length of output: 1466


🏁 Script executed:

#!/bash
# Check if pause endpoint also has similar pattern (success vs error case)
grep -B 10 "pause.*not-exists" tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py

Repository: pingcap/ticdc

Length of output: 39


🏁 Script executed:

#!/bash
# Check if there are any recent changes to how the resume endpoint is being tested
git log --oneline -20 tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py

Repository: pingcap/ticdc

Length of output: 69


Add JSON body and headers to the resume error test case for consistency.

The resume endpoint handler calls WithBody(cfg), expecting a JSON body. The error test case at lines 293-295 omits the JSON body, unlike the success case at lines 275-279. This inconsistency can cause different server behavior and mask the intended error. Add the empty JSON body and headers to match the success case.

⚙️ Suggested patch
     # test resume changefeed failed
     url = BASE_URL0 + "/changefeeds/changefeed-not-exists/resume?keyspace=keyspace1"
+    data = json.dumps({})
+    headers = {"Content-Type": "application/json"}
-    resp = rq.post(url, auth=Auth, cert=CERT, verify=VERIFY)
+    resp = rq.post(url, data=data, auth=Auth, headers=headers,
+                   cert=CERT, verify=VERIFY)

Also applies to: 293-295

🧰 Tools
🪛 Ruff (0.14.14)

277-277: Probable use of requests call without timeout

(S113)

🤖 Prompt for AI Agents
In `@tests/integration_tests/http_api_tls_with_user_auth/util/test_case.py` around
lines 275 - 279, The resume error test omits the JSON body and Content-Type
header so the handler's WithBody(cfg) path is exercised differently; update the
error-case POST to include the same data and headers used in the success case
(use the existing data variable json.dumps({}) and headers = {"Content-Type":
"application/json"}) and pass them into rq.post (alongside auth, cert, verify)
so the resume endpoint receives an empty JSON body consistent with the success
test.

Comment on lines +272 to 275
data = json.dumps({})
headers = {"Content-Type": "application/json"}
resp = rq.post(url, data=data, headers=headers, cert=CERT, verify=VERIFY)
assert resp.status_code == rq.codes.ok
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

cat -n tests/integration_tests/http_api_tls/util/test_case.py | sed -n '265,300p'

Repository: pingcap/ticdc

Length of output: 1564


🌐 Web query:

Python requests library timeout parameter best practices POST

💡 Result:

Requests timeout best practices for POST

  1. Always set a timeout (don’t rely on the default). By default, Requests won’t time out unless you pass timeout=..., so calls can hang for a long time on bad networks/hosts. [1]

  2. Prefer a 2-tuple timeout for production: use timeout=(connect_timeout, read_timeout) so “can’t connect” and “server is slow to respond” are controlled separately. [1][2]

    • Connect timeout: time to establish the TCP connection. Requests explicitly recommends setting it to slightly larger than a multiple of 3 (TCP retransmission behavior). [1]
    • Read timeout: time between bytes from the server (usually “time to first byte”). [1]
  3. Don’t treat timeout as a total wall-clock limit. Connect/read timeouts are not guaranteed to match real elapsed time, and read timeouts measure gaps between bytes, not total download time. [1][5]

    • If you need a hard overall deadline, add an outer watchdog (process/thread/async timeout) rather than relying on timeout= alone. [5]
  4. Use retries deliberately—especially for POST.

    • Requests’ HTTPAdapter(max_retries=...) is backed by urllib3 and (by default) Requests does not retry failed connections unless you configure it. [3]
    • urllib3 retries are method-aware: by default it retries only idempotent methods (e.g., GET/PUT/DELETE), not POST. [4]
    • For POST, only retry when it’s truly safe (e.g., you use an idempotency key, or the operation is effectively idempotent, or you only retry connection failures where the server likely didn’t process the request). [4]

Practical pattern for POST

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()

retry = Retry(
    total=3,
    connect=3,
    read=0,
    status=0,
    backoff_factor=0.5,
    allowed_methods=False,  # be explicit about POST behavior
)

session.mount("https://", HTTPAdapter(max_retries=retry))

resp = session.post(
    url,
    json=payload,
    timeout=(3.05, 30),  # (connect, read)
)
resp.raise_for_status()

Notes:

  • timeout=(3.05, 30) follows Requests’ documented tuple form and example pattern. [1][2]
  • If you do want retries on specific HTTP statuses (e.g., 429/503), configure status_forcelist and consider honoring Retry-After. [4]

Sources: Requests “Advanced Usage / Timeouts” [1], Requests API (timeout tuple definition) [2], Requests HTTPAdapter(max_retries=...) behavior [3], urllib3 Retry docs (allowed methods/idempotency) [4], urllib3 Timeout semantics (not full wall-clock) [5].


🏁 Script executed:

rg "rq\.(post|get|put|delete)" tests/integration_tests/http_api_tls/util/test_case.py -n | head -30

Repository: pingcap/ticdc

Length of output: 1944


🏁 Script executed:

rg "timeout=" tests/integration_tests/http_api_tls/util/test_case.py

Repository: pingcap/ticdc

Length of output: 120


🏁 Script executed:

rg "timeout=" tests/integration_tests/http_api_tls/util/test_case.py -B 3 -A 1

Repository: pingcap/ticdc

Length of output: 321


🏁 Script executed:

grep -n "timeout=" tests/integration_tests/http_api_tls/util/test_case.py

Repository: pingcap/ticdc

Length of output: 124


🏁 Script executed:

cat -n tests/integration_tests/http_api_tls/util/test_case.py | sed -n '370,385p'

Repository: pingcap/ticdc

Length of output: 713


🏁 Script executed:

cat -n tests/integration_tests/http_api_tls/util/test_case.py | sed -n '80,100p'

Repository: pingcap/ticdc

Length of output: 1037


🏁 Script executed:

cat -n tests/integration_tests/http_api_tls/util/test_case.py | sed -n '258,267p'

Repository: pingcap/ticdc

Length of output: 409


Add JSON body and timeout to the changefeed not-exists call for consistency and reliability.
The success path (line 272) sends a JSON body with Content-Type header, but the failure path (line 289) omits both, which may cause the server to return a format error instead of the expected "changefeed not exists" error. Additionally, add timeout=5 to both requests to prevent indefinite hangs.

⚙️ Suggested patch
     # test resume changefeed failed
     url = BASE_URL0 + "/changefeeds/changefeed-not-exists/resume?keyspace=keyspace1"
-    resp = rq.post(url, cert=CERT, verify=VERIFY)
+    data = json.dumps({})
+    headers = {"Content-Type": "application/json"}
+    resp = rq.post(url, data=data, headers=headers, cert=CERT, verify=VERIFY, timeout=5)
     assert resp.status_code == rq.codes.bad_request
🧰 Tools
🪛 Ruff (0.14.14)

274-274: Probable use of requests call without timeout

(S113)

🤖 Prompt for AI Agents
In `@tests/integration_tests/http_api_tls/util/test_case.py` around lines 272 -
275, The failure path for the "changefeed not exists" request is missing the
JSON body, Content-Type header, and a timeout, causing inconsistent behavior;
update the second rq.post call (the changefeed not-exists request) to pass the
same data=json.dumps({}) and headers={"Content-Type": "application/json"} and
add timeout=5, and also add timeout=5 to the earlier success-path rq.post call
that uses data, headers, CERT, VERIFY so both requests use the same
payload/headers and a 5s timeout (refer to the variables data, headers, rq.post,
CERT, VERIFY).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant