Skip to content

Fix PR 513 review follow-ups#526

Merged
i386 merged 2 commits into
mainfrom
followup/pr513-open-comments
May 14, 2026
Merged

Fix PR 513 review follow-ups#526
i386 merged 2 commits into
mainfrom
followup/pr513-open-comments

Conversation

@i386
Copy link
Copy Markdown
Collaborator

@i386 i386 commented May 12, 2026

This is a follow-up to PR #513 after it landed on main. It keeps the original context-smart package work as merged and narrows this PR to the remaining review comments that were still actionable after the stale threads were closed.

The main theme is making split orchestration more deterministic under bad or stale peer metadata: coordinator terms should not be invented from shutdown generations, rejected claims should be inspectable by the caller, malformed coordinator IDs should be visible in logs, and UI participation checks should not infer identity from string prefixes.

Review Comment Coverage

Coordinator and Claim State

  • Stop requests now send coordinator_term: 0 instead of reusing shutdown_generation. These are separate concepts: shutdown generation is a local lifecycle counter, while coordinator term belongs to the coordinator claim protocol.
  • Rejected stage claims now echo the attempted claim in the ACK. That makes a rejection self-describing and avoids returning an empty placeholder or unrelated current claim to the caller.
  • Invalid StagePreparationStatus.coordinator_id values are logged with the parse error before being dropped. The previous behavior made bad gossip indistinguishable from an intentionally missing coordinator ID.

Split Planning

  • Runtime model byte planning now reports join/planning failures and falls back explicitly to filesystem sizing. The old unwrap_or(Ok(0)).unwrap_or(0) path could silently turn planning failures into zero-byte estimates.

Coordinator Topology Search

  • Node subset exploration is now lazy. plan_topology visits each subset through a callback instead of materializing every combination into a Vec<Vec<_>> first, which removes avoidable memory pressure on larger peer sets while preserving the existing ordering and selection behavior.

UI Status Handling

  • Runtime stage node_id is optional in the API/UI types.
  • Split participant detection now requires an exact node ID match. Prefix matching could misclassify nodes when IDs share a prefix, and optional node IDs should not participate in that check.

OpenAI Admission Coverage

  • Generation admission logic was extracted into a helper so the queue behavior can be tested directly.
  • Added tests for immediate lane acquisition, full queue 429 responses, timeout cleanup of queue depth, and wait-then-acquire behavior when a lane becomes available. The 429 cases also assert the OpenAI error code and Retry-After header.

Compatibility

This does not introduce a protocol shape change. The coordinator ID parse handling remains tolerant of absent or invalid IDs, and the stop request term is set to the neutral value rather than introducing a new field or semantic requirement.

Validation

  • cargo fmt --all -- --check
  • cargo check -p mesh-llm-host-runtime
  • cargo test -p skippy-coordinator --lib
  • LLAMA_STAGE_BUILD_DIR=.deps/llama-build/build-stage-abi-metal cargo test -p skippy-server --lib
  • just build

cc @michaelneale

Base automatically changed from micn/context-smarts to main May 12, 2026 08:04
@i386 i386 force-pushed the followup/pr513-open-comments branch from 0296f31 to 2e97902 Compare May 12, 2026 09:32
@i386 i386 marked this pull request as ready for review May 13, 2026 01:54
@i386 i386 requested a review from michaelneale May 13, 2026 01:54
…ments

# Conflicts:
#	crates/skippy-server/src/frontend/tests.rs
#	crates/skippy-server/src/openai.rs
Copy link
Copy Markdown
Collaborator

@michaelneale michaelneale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok I think - one small thing:

falls back explicitly to filesystem sizing

do we want it to fall back or fail? are files a reasonable proxy?

@i386
Copy link
Copy Markdown
Collaborator Author

i386 commented May 13, 2026

Good call. I dropped the fallback instead of keeping filesystem sizing.

For this runtime-load path, guessing is worse than failing: layer-package refs need the package identity/manifest source bytes, and a zero or file-size estimate can make capacity planning lie. Direct GGUFs still get their normal total_model_bytes() path when planning succeeds, but if byte planning itself fails we should surface that as a load failure rather than stumble forward with a bogus estimate.

@i386 i386 merged commit 2c33448 into main May 14, 2026
21 checks passed
@i386 i386 deleted the followup/pr513-open-comments branch May 14, 2026 02:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants