OpenClaw skilify worker re-spawns on every agent_end (cumulative ~2-5 wasted spawns per long session) #100

@efenocchi

Summary

After PR #98 wired the skilify worker into OpenClaw's agent_end hook (commit 7900eb7), a long OpenClaw session (e.g. 50 turns) ends up spawning the worker 2-5 times instead of just once. Each spawn beyond the first is functionally a no-op (the worker's watermark advance prevents re-mining), but each still costs:

  • ~50ms Node cold-start
  • ~200ms SQL roundtrip for the watermark check query
  • One child_process fork

Multiplied across the 2-5 spawns of a long session, the cumulative wasted overhead is ~1-2s per session, plus process-tree churn.

Current behavior

The agent_end handler in openclaw/src/index.ts calls spawnOpenclawSkilifyWorker unconditionally on every turn. The spawn function uses a per-projectKey filesystem lock (~/.deeplake/state/skilify/<projectKey>.worker.lock) to prevent overlapping workers, but the lock is released when each worker exits, so subsequent agent_end events within the same session can re-acquire it and spawn fresh workers.

In a 50-turn session:

  1. Turn 1 → lock acquired → real worker spawn (~30s actual mining)
  2. Turns 2-X (during the ~30s) → lock held → cheap skip
  3. Worker finishes → lock released
  4. Turn X+1 → lock re-acquired → fresh worker spawn → 1 SQL query → exit (no work to do)
  5. Steps 3-4 repeat every time the previous worker finishes
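
The lock lifecycle behind this timeline can be sketched as follows. This is a minimal illustration, not the code in spawnOpenclawSkilifyWorker: the helper names tryAcquireLock and releaseLock are hypothetical, the lock is assumed to be an exclusive-create file, and a temp directory stands in for ~/.deeplake/state/skilify/.

```typescript
import { closeSync, mkdtempSync, openSync, unlinkSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: returns true if this caller created the lock file.
function tryAcquireLock(lockPath: string): boolean {
  try {
    // The "wx" flag creates the file and fails with EEXIST if it already exists.
    closeSync(openSync(lockPath, "wx"));
    return true;
  } catch (e) {
    if ((e as { code?: string }).code === "EEXIST") return false;
    throw e;
  }
}

// Hypothetical helper: called when the worker exits.
function releaseLock(lockPath: string): void {
  unlinkSync(lockPath);
}

// Stand-in for ~/.deeplake/state/skilify/<projectKey>.worker.lock
const lockDir = mkdtempSync(join(tmpdir(), "skilify-lock-"));
const lockPath = join(lockDir, "demo.worker.lock");

const turn1 = tryAcquireLock(lockPath);    // lock acquired: first worker spawns
const turn2 = tryAcquireLock(lockPath);    // lock held: cheap skip
releaseLock(lockPath);                     // worker exits, lock released
const turnNext = tryAcquireLock(lockPath); // lock re-acquired: redundant worker spawns
releaseLock(lockPath);

console.log({ turn1, turn2, turnNext });
```

The lock only answers "is a worker running right now?", so it cannot tell a later agent_end that this session has already been handled.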

Proposed fix (option B from PR #98 review)

Add a per-runtime in-memory Set<string> of session IDs that have already triggered a spawn:

// alongside the existing capturedCounts Map
const skilifySpawnedFor = new Set<string>();

// in agent_end handler, replace unconditional spawn with:
if (!skilifySpawnedFor.has(sid)) {
  skilifySpawnedFor.add(sid);
  try {
    spawnOpenclawSkilifyWorker({ ... });
  } catch (e: any) { ... }
}

~5 lines. One spawn per session lifetime per OpenClaw runtime instance. Memory cost: ~32 bytes per session ID, negligible over any realistic OpenClaw uptime.
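
A self-contained version of the same idea, runnable outside OpenClaw, might look like the sketch below. The factory makeAgentEndHandler and the SpawnFn type are hypothetical names introduced for illustration, and the spawn callback is a stub that only records calls in place of spawnOpenclawSkilifyWorker.

```typescript
// Hypothetical stand-in for spawnOpenclawSkilifyWorker; the stub below only records calls.
type SpawnFn = (sessionId: string) => void;

function makeAgentEndHandler(spawn: SpawnFn): (sid: string) => void {
  // One entry per session that has already triggered a spawn in this runtime.
  const skilifySpawnedFor = new Set<string>();
  return (sid: string): void => {
    if (skilifySpawnedFor.has(sid)) return; // already spawned for this session
    skilifySpawnedFor.add(sid); // marked before spawning, so a throwing spawn is not retried
    try {
      spawn(sid);
    } catch {
      // This sketch swallows spawn errors; real code would likely log them.
    }
  };
}

const spawnedSessions: string[] = [];
const onAgentEnd = makeAgentEndHandler((sid) => {
  spawnedSessions.push(sid);
});

for (let turn = 0; turn < 50; turn++) onAgentEnd("session-a"); // a 50-turn session
onAgentEnd("session-b"); // a later session in the same runtime

console.log(spawnedSessions.length); // 2
```

Because the session ID is added before the spawn call, a spawn that throws is not retried for that session. Adding the ID only after a successful spawn would retry on the next agent_end instead; which behavior is preferable is a judgment call for the implementer.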

Trade-offs considered

  • Option A (counter file ~/.deeplake/state/skilify/<key>.openclaw.counter bumped on every agent_end, fire when counter % N === 0): rejected — ~30 lines of file I/O on every turn, persistent state across runtime restarts that nobody asked for, and the cadence is arbitrary on a single-session-at-a-time runtime
  • Option B (in-memory Set, recommended): ~5 lines, no file I/O, robust under runtime restart (the worker's watermark state persists in ~/.deeplake/state/skilify/<projectKey>.json, so a re-spawn after restart costs ~250ms and doesn't redo any actual mining)
  • Option C (hybrid Set + interval): overkill for openclaw's "one session at a time" assumption
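
For contrast, the rejected Option A would look roughly like the sketch below. It is an assumption-laden illustration: bumpCounterAndShouldFire is a hypothetical helper, N = 10 is an arbitrary cadence, and a temp directory stands in for ~/.deeplake/state/skilify/. The point is that every turn pays for a file read and a file write.

```typescript
import { existsSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: bumps the on-disk counter and reports whether to spawn.
function bumpCounterAndShouldFire(counterPath: string, everyN: number): boolean {
  const previous = existsSync(counterPath)
    ? Number.parseInt(readFileSync(counterPath, "utf-8"), 10) || 0
    : 0;
  const next = previous + 1;
  writeFileSync(counterPath, String(next)); // file I/O on every agent_end
  return next % everyN === 0;
}

// Stand-in for ~/.deeplake/state/skilify/<key>.openclaw.counter
const counterDir = mkdtempSync(join(tmpdir(), "skilify-counter-"));
const counterPath = join(counterDir, "demo.openclaw.counter");

let fired = 0;
for (let turn = 1; turn <= 50; turn++) {
  if (bumpCounterAndShouldFire(counterPath, 10)) fired++; // assumed N = 10
}

console.log(fired); // 5 (turns 10, 20, 30, 40 and 50)
```

The counter also survives a runtime restart, which is the persistent state the trade-off list describes as unrequested.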

Bundle-scan test to add

it("openclaw deduplicates skilify spawns by session_id (no re-fire within same session)", () => {
  const text = readFileSync(resolve(BUNDLE_ROOT, "openclaw", "src", "index.ts"), "utf-8");
  expect(text).toMatch(/skilifySpawnedFor\s*=\s*new Set/);
  expect(text).toMatch(/skilifySpawnedFor\.has\(sid\)/);
  expect(text).toMatch(/skilifySpawnedFor\.add\(sid\)/);
});

Background

Surfaced during PR #98 reviews while validating per-agent skilify mining parity across CC / Codex / Cursor / Hermes / Pi / OpenClaw. The "fire on every agent_end with lock" pattern is functionally correct but suboptimal: the lock prevents overlapping workers, but doesn't prevent serially redundant ones within the same session. Pi uses a comparable hook (session_shutdown), but that hook fires only once per session, at session end, so Pi doesn't need this fix.

Acceptance criteria

  • One real worker spawn per OpenClaw session (verified by bundle scan + ideally a runtime test mocking spawn)
  • Watermark advance still works correctly across runtime restarts (re-spawn after restart is acceptable, redundant mining is not)
  • No regression on the existing per-projectKey lock (still useful for cross-process safety)
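
The first two criteria could be exercised with a runtime check along these lines. This is a sketch under stated assumptions: WatermarkState, runWorker and makeRuntime are hypothetical, the watermark is modeled as a simple message index held in an object that stands in for the persisted ~/.deeplake/state/skilify/<projectKey>.json, and the worker is a synchronous stub.

```typescript
// Hypothetical shape of the persisted watermark state.
interface WatermarkState {
  lastMinedIndex: number;
}

// Hypothetical worker stub: mines only messages past the watermark, then advances it.
function runWorker(messageCount: number, state: WatermarkState): number {
  const mined = Math.max(0, messageCount - state.lastMinedIndex);
  state.lastMinedIndex = Math.max(state.lastMinedIndex, messageCount);
  return mined;
}

// Each call models one OpenClaw runtime instance with its own in-memory Set.
function makeRuntime(state: WatermarkState) {
  const skilifySpawnedFor = new Set<string>();
  const minedPerSpawn: number[] = [];
  return {
    minedPerSpawn,
    agentEnd(sid: string, messageCount: number): void {
      if (skilifySpawnedFor.has(sid)) return; // deduplicated within this runtime
      skilifySpawnedFor.add(sid);
      minedPerSpawn.push(runWorker(messageCount, state));
    },
  };
}

const persisted: WatermarkState = { lastMinedIndex: 0 }; // survives simulated restarts

const runtime1 = makeRuntime(persisted);
runtime1.agentEnd("s1", 10); // first spawn mines 10 messages
runtime1.agentEnd("s1", 10); // same session: no spawn

const runtime2 = makeRuntime(persisted); // simulated restart: the Set is empty again
runtime2.agentEnd("s1", 10); // re-spawns, but the watermark leaves nothing to mine

console.log(runtime1.minedPerSpawn, runtime2.minedPerSpawn);
```

Here the first runtime records a single spawn that mines 10 messages, and the restarted runtime records a single spawn that mines 0, which matches "re-spawn after restart is acceptable, redundant mining is not".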
