Fix PR 513 review follow-ups by i386 · Pull Request #526 · Mesh-LLM/mesh-llm

i386 · 2026-05-12T06:36:40Z

This is a follow-up to PR #513 after it landed on main. It keeps the original context-smart package work as merged and narrows this PR to the remaining review comments that were still actionable after the stale threads were closed.

The main theme is making split orchestration more deterministic under bad or stale peer metadata: coordinator terms should not be invented from shutdown generations, rejected claims should be inspectable by the caller, malformed coordinator IDs should be visible in logs, and UI participation checks should not infer identity from string prefixes.

Review Comment Coverage

Coordinator and Claim State

Stop requests now send coordinator_term: 0 instead of reusing shutdown_generation. These are separate concepts: shutdown generation is a local lifecycle counter, while coordinator term belongs to the coordinator claim protocol.
Rejected stage claims now echo the attempted claim in the ACK. That makes a rejection self-describing and avoids returning an empty placeholder or unrelated current claim to the caller.
Invalid StagePreparationStatus.coordinator_id values are logged with the parse error before being dropped. The previous behavior made bad gossip indistinguishable from an intentionally missing coordinator ID.

Split Planning

Runtime model byte planning now reports join/planning failures and falls back explicitly to filesystem sizing. The old unwrap_or(Ok(0)).unwrap_or(0) path could silently turn planning failures into zero-byte estimates.

Coordinator Topology Search

Node subset exploration is now lazy. plan_topology visits each subset through a callback instead of materializing every combination into a Vec<Vec<_>> first, which removes avoidable memory pressure on larger peer sets while preserving the existing ordering and selection behavior.

UI Status Handling

Runtime stage node_id is optional in the API/UI types.
Split participant detection now requires an exact node ID match. Prefix matching could misclassify nodes when IDs share a prefix, and optional node IDs should not participate in that check.

OpenAI Admission Coverage

Generation admission logic was extracted into a helper so the queue behavior can be tested directly.
Added tests for immediate lane acquisition, full queue 429 responses, timeout cleanup of queue depth, and wait-then-acquire behavior when a lane becomes available. The 429 cases also assert the OpenAI error code and Retry-After header.

Compatibility

This does not introduce a protocol shape change. The coordinator ID parse handling remains tolerant of absent or invalid IDs, and the stop request term is set to the neutral value rather than introducing a new field or semantic requirement.

Validation

cargo fmt --all -- --check
cargo check -p mesh-llm-host-runtime
cargo test -p skippy-coordinator --lib
LLAMA_STAGE_BUILD_DIR=.deps/llama-build/build-stage-abi-metal cargo test -p skippy-server --lib
just build

cc @michaelneale

…ments # Conflicts: # crates/skippy-server/src/frontend/tests.rs # crates/skippy-server/src/openai.rs

michaelneale

ok I think - one small thing:

falls back explicitly to filesystem sizing

do we want it to fall back or fail? are files a reasonable proxy?

i386 · 2026-05-13T02:42:16Z

Good call. I dropped the fallback instead of keeping filesystem sizing.

For this runtime-load path, guessing is worse than failing: layer-package refs need the package identity/manifest source bytes, and a zero or file-size estimate can make capacity planning lie. Direct GGUFs still get their normal total_model_bytes() path when planning succeeds, but if byte planning itself fails we should surface that as a load failure rather than stumble forward with a bogus estimate.

i386 mentioned this pull request May 12, 2026

Improve context planning, admission, and coordinator fencing #513

Merged

Base automatically changed from micn/context-smarts to main May 12, 2026 08:04

fix: address PR 513 follow-up comments

2e97902

i386 force-pushed the followup/pr513-open-comments branch from 0296f31 to 2e97902 Compare May 12, 2026 09:32

ndizazzo assigned i386 May 12, 2026

i386 marked this pull request as ready for review May 13, 2026 01:54

i386 requested a review from michaelneale May 13, 2026 01:54

Merge remote-tracking branch 'mesh/main' into followup/pr513-open-com…

03c2eb3

…ments # Conflicts: # crates/skippy-server/src/frontend/tests.rs # crates/skippy-server/src/openai.rs

ndizazzo approved these changes May 13, 2026

View reviewed changes

michaelneale approved these changes May 13, 2026

View reviewed changes

i386 merged commit 2c33448 into main May 14, 2026
21 checks passed

i386 deleted the followup/pr513-open-comments branch May 14, 2026 02:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix PR 513 review follow-ups#526

Fix PR 513 review follow-ups#526
i386 merged 2 commits into
mainfrom
followup/pr513-open-comments

i386 commented May 12, 2026 •

edited

Loading

Uh oh!

michaelneale left a comment

Uh oh!

i386 commented May 13, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

i386 commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review Comment Coverage

Coordinator and Claim State

Split Planning

Coordinator Topology Search

UI Status Handling

OpenAI Admission Coverage

Compatibility

Validation

Uh oh!

michaelneale left a comment

Choose a reason for hiding this comment

Uh oh!

i386 commented May 13, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

i386 commented May 12, 2026 •

edited

Loading