-
Notifications
You must be signed in to change notification settings - Fork 70
feat: remove default 8h max TTL for AgentRuntime sandboxes #360
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
360fd5a
ca81bed
71a3bdf
19ea49b
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -53,8 +53,9 @@ type AgentRuntimeSpec struct { | |
|
|
||
| // MaxSessionDuration describes the maximum duration for a session. | ||
| // After this duration, the session will be terminated no matter active or inactive. | ||
| // +kubebuilder:validation:Required | ||
| // +kubebuilder:default="8h" | ||
| // When omitted, sessions have no maximum duration and will only be cleaned up | ||
| // by idle timeout (SessionTimeout). | ||
| // +optional | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This needs the generated CRD artifact update as well. Right now the installed AgentRuntime CRD still has |
||
| MaxSessionDuration *metav1.Duration `json:"maxSessionDuration,omitempty" protobuf:"bytes,3,opt,name=maxSessionDuration"` | ||
| } | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -155,16 +155,16 @@ func (rs *redisStore) StoreSandbox(ctx context.Context, sandboxRedis *types.Sand | |
| return fmt.Errorf("StoreSandbox: marshal sandbox failed: %w", err) | ||
| } | ||
|
|
||
| if sandboxRedis.ExpiresAt.IsZero() { | ||
| return fmt.Errorf("StoreSandbox: sandbox expired at is zero") | ||
| } | ||
|
|
||
| pipe := rs.cli.Pipeline() | ||
| pipe.SetNX(ctx, sessionKey, b, 0) | ||
| pipe.ZAdd(ctx, rs.expiryIndexKey, redisv9.Z{ | ||
| Score: float64(sandboxRedis.ExpiresAt.Unix()), | ||
| Member: sandboxRedis.SessionID, | ||
| }) | ||
| // Only add to expiry index when ExpiresAt is set. | ||
| // Zero ExpiresAt means the sandbox has no maximum lifetime. | ||
| if !sandboxRedis.ExpiresAt.IsZero() { | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we add store tests for the zero
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yes ofc we can i was fixing the failure test cases till now i would add store test for the zero expireat now |
||
| pipe.ZAdd(ctx, rs.expiryIndexKey, redisv9.Z{ | ||
|
Comment on lines
159
to
+163
|
||
| Score: float64(sandboxRedis.ExpiresAt.Unix()), | ||
| Member: sandboxRedis.SessionID, | ||
| }) | ||
| } | ||
|
Comment on lines
158
to
+167
|
||
| pipe.ZAdd(ctx, rs.lastActivityIndexKey, redisv9.Z{ | ||
| Score: float64(time.Now().Unix()), | ||
| Member: sandboxRedis.SessionID, | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -174,16 +174,28 @@ func (vs *valkeyStore) StoreSandbox(ctx context.Context, sandboxStore *types.San | |
| return errors.New("StoreSandbox: sandbox is nil") | ||
| } | ||
|
|
||
| if sandboxStore.ExpiresAt.IsZero() { | ||
| return fmt.Errorf("StoreSandbox: sandbox expires time is zero") | ||
| } | ||
|
|
||
| sessionKey := vs.sessionKey(sandboxStore.SessionID) | ||
| b, err := json.Marshal(sandboxStore) | ||
| if err != nil { | ||
| return fmt.Errorf("StoreSandbox: marshal sandbox: %w", err) | ||
| } | ||
|
|
||
| if sandboxStore.ExpiresAt.IsZero() { | ||
| // Zero ExpiresAt means no maximum lifetime — skip expiry index. | ||
| // The sandbox will still be tracked in the activity index for idle timeout. | ||
| commands := make(valkey.Commands, 0, 3) | ||
| commands = append(commands, vs.cli.B().Setnx().Key(sessionKey).Value(string(b)).Build()) | ||
|
Comment on lines
+183
to
+187
|
||
| commands = append(commands, vs.cli.B().Zadd().Key(vs.lastActivityIndexKey).ScoreMember(). | ||
| ScoreMember(float64(time.Now().Unix()), sandboxStore.SessionID).Build()) | ||
|
|
||
| for i, resp := range vs.cli.DoMulti(ctx, commands...) { | ||
| if err = resp.Error(); err != nil { | ||
| return fmt.Errorf("StoreSandbox: DoMulti failed: %w, command index: %v", err, i) | ||
| } | ||
|
Comment on lines
+183
to
+194
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am seeing this is a little different from redis, is that because valkey sdk does not align with redis?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. the structural difference is purely due to the Valkey SDK's API design, not a logic difference. Redis (go-redis/v9) has a Pipeline() abstraction that lets you queue a variable number of commands and skip some conditionally before calling Exec(): Valkey (valkey-go) uses DoMulti() which takes a fixed slice of commands built upfront. Since Go slices can't have "holes", the cleanest pattern with this SDK is to branch the two cases (zero vs non-zero ExpiresAt) into separate command slices with an early return, rather than trying to conditionally append inside a single block: The end result is identical behaviour — zero ExpiresAt skips the expiry index in both stores. The branching is an SDK-driven implementation detail, not a semantic difference. |
||
| } | ||
| return nil | ||
| } | ||
|
Comment on lines
+183
to
+197
|
||
|
|
||
| commands := make(valkey.Commands, 0, 5) | ||
| commands = append(commands, vs.cli.B().Setnx().Key(sessionKey).Value(string(b)).Build()) | ||
| commands = append(commands, vs.cli.B().Zadd().Key(vs.expiryIndexKey).ScoreMember(). | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -40,7 +40,7 @@ func TestBuildSandboxPlaceHolder_TableDriven(t *testing.T) { | |
| validate func(t *testing.T, result *types.SandboxInfo) | ||
| }{ | ||
| { | ||
| name: "no ShutdownTime falls back to DefaultSandboxTTL", | ||
| name: "no ShutdownTime means no expiry (sandbox runs indefinitely)", | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. why not remove all DefaultSandboxTTL in ut test like in workload_builder_test.go L44 since we are not using it anymore |
||
| setupSandbox: func() *sandboxv1alpha1.Sandbox { | ||
| return &sandboxv1alpha1.Sandbox{ | ||
| ObjectMeta: metav1.ObjectMeta{ | ||
|
|
@@ -54,8 +54,8 @@ func TestBuildSandboxPlaceHolder_TableDriven(t *testing.T) { | |
| SessionID: "session-123", | ||
| }, | ||
| validate: func(t *testing.T, result *types.SandboxInfo) { | ||
| expected := now.Add(DefaultSandboxTTL) | ||
| assert.WithinDuration(t, expected, result.ExpiresAt, 2*time.Second) | ||
| assert.True(t, result.ExpiresAt.IsZero(), | ||
| "ExpiresAt should be zero when no MaxSessionDuration is set") | ||
| assert.Equal(t, "creating", result.Status) | ||
| assert.Equal(t, "session-123", result.SessionID) | ||
| }, | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -153,15 +153,10 @@ type buildSandboxClaimParams struct { | |
|
|
||
| // buildSandboxObject builds a Sandbox object from parameters | ||
| func buildSandboxObject(params *buildSandboxParams) *sandboxv1alpha1.Sandbox { | ||
| if params.ttl == 0 { | ||
| params.ttl = DefaultSandboxTTL | ||
| } | ||
| if params.idleTimeout == 0 { | ||
| params.idleTimeout = DefaultSandboxIdleTimeout | ||
| } | ||
|
|
||
| shutdownTime := metav1.NewTime(time.Now().Add(params.ttl)) | ||
|
|
||
| // Allocate fresh maps for copied metadata so we never mutate informer-cached input. | ||
| // Annotations are only copied when params.podAnnotations is non-nil. | ||
| podLabels := make(map[string]string, len(params.podLabels)+2) | ||
|
|
@@ -198,12 +193,18 @@ func buildSandboxObject(params *buildSandboxParams) *sandboxv1alpha1.Sandbox { | |
| Annotations: podAnnotations, | ||
| }, | ||
| }, | ||
| Lifecycle: sandboxv1alpha1.Lifecycle{ | ||
| ShutdownTime: &shutdownTime, | ||
| }, | ||
| Replicas: ptr.To[int32](1), | ||
| }, | ||
| } | ||
|
|
||
| // Only set ShutdownTime when MaxSessionDuration is explicitly configured. | ||
| // When omitted (ttl == 0), the sandbox runs indefinitely and is only | ||
| // cleaned up by idle timeout (SessionTimeout). | ||
| if params.ttl > 0 { | ||
| shutdownTime := metav1.NewTime(time.Now().Add(params.ttl)) | ||
| sandbox.Spec.Lifecycle.ShutdownTime = &shutdownTime | ||
| } | ||
|
Comment on lines
+203
to
+206
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. By removing the fallback to To maintain the 8h default guarantee for
Comment on lines
+200
to
+206
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This makes
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. +1 It's better to align with CodeInterpreter.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @HarshitPal25 ptal |
||
|
|
||
| return sandbox | ||
| } | ||
|
|
||
|
|
@@ -400,10 +401,15 @@ func buildSandboxByCodeInterpreter(namespace string, codeInterpreterName string, | |
| }, | ||
| }, | ||
| } | ||
| // CodeInterpreter always has a hard cap: use explicit MaxSessionDuration or | ||
| // fall back to DefaultSandboxTTL (8h). This differs from AgentRuntime where | ||
| // omitting MaxSessionDuration means no hard expiry. | ||
| ttl := DefaultSandboxTTL | ||
| if codeInterpreterObj.Spec.MaxSessionDuration != nil { | ||
| shutdownTime := metav1.NewTime(time.Now().Add(codeInterpreterObj.Spec.MaxSessionDuration.Duration)) | ||
| simpleSandbox.Spec.Lifecycle.ShutdownTime = &shutdownTime | ||
| ttl = codeInterpreterObj.Spec.MaxSessionDuration.Duration | ||
| } | ||
| shutdownTime := metav1.NewTime(time.Now().Add(ttl)) | ||
| simpleSandbox.Spec.Lifecycle.ShutdownTime = &shutdownTime | ||
| sandboxEntry.Kind = types.SandboxClaimsKind | ||
| return simpleSandbox, sandboxClaim, sandboxEntry, nil | ||
| } | ||
|
|
@@ -443,8 +449,13 @@ func buildSandboxByCodeInterpreter(namespace string, codeInterpreterName string, | |
| idleTimeout: idleTimeout, | ||
| } | ||
|
|
||
| // CodeInterpreter always has a hard cap: use explicit MaxSessionDuration or | ||
| // fall back to DefaultSandboxTTL (8h). This differs from AgentRuntime where | ||
| // omitting MaxSessionDuration means no hard expiry. | ||
| if codeInterpreterObj.Spec.MaxSessionDuration != nil { | ||
| buildParams.ttl = codeInterpreterObj.Spec.MaxSessionDuration.Duration | ||
| } else { | ||
| buildParams.ttl = DefaultSandboxTTL | ||
| } | ||
| sandbox := buildSandboxObject(buildParams) | ||
| return sandbox, nil, sandboxEntry, nil | ||
|
|
||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. never used? |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -100,3 +100,4 @@ spec: | |
| sessionTimeout: "15m" | ||
| maxSessionDuration: "8h" | ||
| status: {} | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also upd example/agent-runtime/agent-runtime.yaml