fix(agentd): use custom idle timeout annotation for garbage collection #343

Open
sicaario wants to merge 1 commit into volcano-sh:main from sicaario:fix-agentd-timeout

@sicaario
Contributor

What type of PR is this?

/kind bug

What this PR does / why we need it:
This PR fixes a bug where agentd ignores the custom runtime.agentcube.io/idle-timeout annotation set on a Sandbox. Previously, the garbage collection reconciliation loop used a hardcoded 15 * time.Minute value for every sandbox.

With this change, agentd parses the Sandbox's idle timeout from its metadata annotations and honors both shorter and longer custom session durations. It falls back to the default 15 * time.Minute if the annotation is missing or malformed.

It also introduces new unit test coverage inside agentd_test.go to explicitly verify custom short timeouts, custom long timeouts, and invalid timeout fallback paths.

Which issue(s) this PR fixes:
Fixes #342

Special notes for your reviewer:

  • All changes covered by unit tests.

Does this PR introduce a user-facing change?:

Fixes an issue where custom Sandbox idle timeouts (`runtime.agentcube.io/idle-timeout`) were ignored by the garbage collector and always fell back to 15 minutes.

Copilot AI review requested due to automatic review settings May 17, 2026 15:09
@volcano-sh-bot volcano-sh-bot added the kind/bug Something isn't working label May 17, 2026
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces support for custom idle timeouts in the agent reconciler via sandbox annotations and includes comprehensive tests for this new functionality. It also updates RSA key comparisons in tests to handle precomputed values and corrects a status code comment in the server middleware. Feedback was provided to improve the robustness of the RSA key verification by comparing prime factors and using more idiomatic big integer comparisons.

Comment thread pkg/router/jwt_test.go Outdated
Contributor

Copilot AI left a comment

Pull request overview

Fixes agentd sandbox garbage collection to honor per-Sandbox idle timeout overrides (runtime.agentcube.io/idle-timeout) instead of always using the hardcoded 15-minute default, aligning behavior with how the Workload Manager annotates sandboxes for custom session durations (Issue #342).

Changes:

  • Parse runtime.agentcube.io/idle-timeout from Sandbox annotations in agentd and use it to compute GC expiration (fallback to default on missing/invalid values).
  • Add unit tests covering short/long custom timeouts and invalid-timeout fallback behavior in agentd.
  • Minor Router/JWT test adjustments (comment correction + RSA key equality comparison changes).

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| pkg/agentd/agentd.go | Uses Sandbox runtime.agentcube.io/idle-timeout annotation to compute expiration time for GC. |
| pkg/agentd/agentd_test.go | Adds unit test coverage for custom/invalid idle-timeout handling. |
| pkg/router/server.go | Updates middleware comment to match the 429 response code used. |
| pkg/router/jwt_test.go | Makes private key PEM test compare RSA components instead of struct equality. |

Comment thread pkg/agentd/agentd.go Outdated
Comment thread pkg/router/server.go
@codecov-commenter

codecov-commenter commented May 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 49.17%. Comparing base (524e55e) to head (e9eb09e).
⚠️ Report is 54 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #343      +/-   ##
==========================================
+ Coverage   47.57%   49.17%   +1.60%     
==========================================
  Files          30       30              
  Lines        2819     2861      +42     
==========================================
+ Hits         1341     1407      +66     
+ Misses       1338     1301      -37     
- Partials      140      153      +13     
Flag Coverage Δ
unittests 49.17% <100.00%> (+1.60%) ⬆️

@sicaario sicaario force-pushed the fix-agentd-timeout branch from 0e138fe to e9eb09e Compare May 18, 2026 02:50
@volcano-sh-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign lizhencheng9527 for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sicaario
Contributor Author

Hi @acsoto, @YaoZengzeng

I have updated this PR to completely streamline its scope and address the automated review feedback.
I would really appreciate it if you could take a look and share your thoughts. Thank you for your guidance and time!

Labels

kind/bug Something isn't working size/M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

agentd: custom idle timeout annotation is ignored by garbage collector

4 participants