Dynamic peer discovery — Kubernetes headless service (Option B from #5) #63

@kitplummer

Description

Scoped follow-up to #5. Covers Option B only (Kubernetes peer discovery for in-cluster deployments). Option A (mDNS LAN) is tracked separately in #62; Option C (gossip) remains parked under #5.

Pivot from #5's original spec: #5 proposed headless Service + DNS poll + BootstrapInfo handshake. peat-mesh's existing KubernetesDiscovery takes a different (better) approach — EndpointSlice watching + annotation-carried metadata. This issue follows the peat-mesh design rather than reinventing.

Context

In-cluster deployments today must pre-share Iroh endpoint IDs at sidecar startup (--peer endpoint_id@host:port, src/main.rs:60). For autoscaling deployments — replicaset scales 3 → 5, new pods need to join the mesh — there's no mechanism for them to find existing peers or for existing peers to learn about them short of an external orchestrator calling ConnectPeer per new pod.

Prior art in peat-mesh =0.9.0-rc.9 (most of the work is already done — but feature-gated)

The Kubernetes implementation lives in peat-mesh behind the kubernetes feature flag (not currently enabled in peat-node's dep). Reusable today:

  • peat_mesh::discovery::KubernetesDiscovery — watches EndpointSlice resources via the kube API
  • peat_mesh::discovery::KubernetesDiscoveryConfig:
    • namespace: Option<String> (defaults to the namespace from the service-account mount, falling back to default)
    • label_selector: String (defaults to app=peat-mesh)
    • annotation_prefix: String (defaults to peat.)
    • poll_interval: Duration (defaults to 30s)
  • extract_peers_from_endpoint_slice(...):
    • node_id ← endpoint.target_ref.name (the pod name)
    • addresses ← endpoint.addresses
    • port ← EndpointSlice port (defaults 8080)
    • custom metadata (e.g. relay_url, and presumably iroh endpoint ID) ← EndpointSlice annotations with the configured prefix
  • Reference wiring: peat-mesh/src/bin/peat-mesh-node.rs:152-156 (the kubernetes / k8s mode branch)

AutomergeBackend::with_iroh does not fold discovery into its config — peat-mesh keeps the two concerns parallel, leaving the consumer to construct discovery alongside the backend. peat-node follows the same pattern.
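
To make the extraction rules listed above concrete, here is a minimal sketch using std only. The types (DiscoveryConfig, EndpointSlice, Endpoint, Peer) and the function extract_peers are hypothetical stand-ins, not the peat-mesh or kube API types; only the field names and defaults come from this issue. Two behaviors are assumptions rather than confirmed peat-mesh behavior: the prefix is stripped from annotation keys, and endpoints without a target_ref are skipped.

```rust
use std::collections::BTreeMap;
use std::time::Duration;

/// Hypothetical stand-in for KubernetesDiscoveryConfig (defaults as listed in this issue).
#[derive(Debug, Clone)]
pub struct DiscoveryConfig {
    pub namespace: Option<String>,
    pub label_selector: String,
    pub annotation_prefix: String,
    pub poll_interval: Duration,
}

impl Default for DiscoveryConfig {
    fn default() -> Self {
        Self {
            namespace: None, // resolved later from the service-account mount
            label_selector: "app=peat-mesh".to_string(),
            annotation_prefix: "peat.".to_string(),
            poll_interval: Duration::from_secs(30),
        }
    }
}

/// Simplified model of one endpoint inside an EndpointSlice.
pub struct Endpoint {
    pub target_name: Option<String>, // target_ref.name, i.e. the pod name
    pub addresses: Vec<String>,
}

/// Simplified model of an EndpointSlice: slice-level annotations, one port, many endpoints.
pub struct EndpointSlice {
    pub annotations: BTreeMap<String, String>,
    pub port: Option<u16>,
    pub endpoints: Vec<Endpoint>,
}

#[derive(Debug, PartialEq)]
pub struct Peer {
    pub node_id: String,
    pub addresses: Vec<String>,
    pub port: u16,
    pub metadata: BTreeMap<String, String>,
}

pub fn extract_peers(slice: &EndpointSlice, cfg: &DiscoveryConfig) -> Vec<Peer> {
    // Keep only annotations under the configured prefix (assumption: prefix is stripped).
    let metadata: BTreeMap<String, String> = slice
        .annotations
        .iter()
        .filter_map(|(key, value)| {
            key.strip_prefix(cfg.annotation_prefix.as_str())
                .map(|short| (short.to_string(), value.clone()))
        })
        .collect();
    let port = slice.port.unwrap_or(8080); // default port per this issue

    slice
        .endpoints
        .iter()
        .filter_map(|ep| {
            // Assumption: an endpoint with no target_ref cannot be named, so it is skipped.
            let node_id = ep.target_name.clone()?;
            Some(Peer {
                node_id,
                addresses: ep.addresses.clone(),
                port,
                metadata: metadata.clone(),
            })
        })
        .collect()
}
```

Note that in this model every peer in a slice receives the same metadata map, because the annotations live on the slice and not on the individual endpoint. That is the root of gap 4 below: a per-instance value such as an Iroh endpoint ID has no natural per-endpoint home.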

Concrete gap list (what needs to be built in peat-node)

  1. Enable the kubernetes feature on the peat-mesh dep. Cargo.toml:16 currently reads features = ["automerge-backend"]; add "kubernetes".
  2. Construct + spawn discovery in src/node.rs. Instantiate KubernetesDiscovery::new(...), take its event stream, call start, call advertise(node_id, sidecar_port), and spawn an event-consumer task that maps PeerInfo events → node.connect_peer(endpoint_id, &addresses, ""). Mirrors the mDNS wiring in #62.
  3. CLI flags in src/main.rs:
    --discovery-mode <none|mdns|kubernetes>   / PEAT_NODE_DISCOVERY_MODE (default: none)
    --discovery-namespace <ns>                / PEAT_NODE_DISCOVERY_NAMESPACE (auto from SA mount)
    --discovery-label-selector <sel>          / PEAT_NODE_DISCOVERY_LABEL_SELECTOR (default: app=peat-node)
    --discovery-annotation-prefix <prefix>    / PEAT_NODE_DISCOVERY_ANNOTATION_PREFIX (default: peat.)
    --discovery-interval <seconds>            / PEAT_NODE_DISCOVERY_INTERVAL (default: 30)
    
    Share --discovery-mode and --discovery-interval with #62.
  4. Iroh endpoint ID propagation into the EndpointSlice annotation — the key technical unknown. peat-mesh's extractor reads annotations off the EndpointSlice resource, not off pods. EndpointSlices are auto-managed by the kube-controller-manager and don't inherit per-pod annotations for free. Options to investigate during design:
    • a) Helm chart sets a static endpoint_id annotation on the Service → EndpointSlice mirroring carries it through. Only works if all replicas share an endpoint_id (they don't — endpoint_id is per-instance).
    • b) Each pod self-patches its own EndpointSlice annotation on startup via the kube API. Requires patch RBAC on endpointslices — viable but adds a sidecar-startup dependency on the API server.
    • c) Each pod self-patches its own pod annotation (cheaper RBAC), and we extend peat-mesh's extractor to look at the pod via target_ref.name → pod GET → annotations. Adds one API call per peer discovery cycle.
    • d) Use a deterministic keypair seeded from (formation_id, pod_name) so endpoint_id is computable from target_ref.name without any annotation lookup. Cleanest but requires a deterministic-keypair option in AutomergeBackendConfig (peat-mesh additive change).
    First spelunk, before choosing among a) through d): does peat-mesh's reference binary or operator solve this today, and how? That answer dictates everything else.
  5. RBAC manifests in chart/peat-node/templates/: new ServiceAccount, Role (or ClusterRole if cross-namespace) with get/list/watch on endpointslices.discovery.k8s.io, and RoleBinding. Optional additional patch on endpointslices (option 4b) or pods (option 4c) depending on gap 4's resolution.
  6. Service / Deployment labels. app=peat-node label on the Deployment so the default label_selector works without forcing operators to think about it.
  7. Self-filter + dedup in the event consumer. Skip own pod, skip already-connected peers.
  8. Graceful degradation. kube API unreachable (e.g., running outside a cluster) → log + continue, do not fail startup. The --discovery-mode=none path must remain the safe default.
  9. Helm chart values (chart/peat-node/values.yaml): discovery.mode, discovery.namespace, discovery.labelSelector, discovery.annotationPrefix, discovery.interval. Default mode: none to stay backward-compatible.
  10. Docs: README.md config table + docs/CONFIGURATION.md deployment example for in-cluster discovery, including a complete Deployment + Service + RBAC manifest example.
  11. Tests:
    • Unit: extract_peers_from_endpoint_slice is already covered in peat-mesh; peat-node tests cover the event-consumer dedup/self-filter.
    • Integration (extend test/cross-cluster-sync.sh or new k3d test): 3 sidecar replicas in a Deployment with no static --peer flags converge to a fully-connected mesh; scale to 5 and verify new pods join within --discovery-interval.
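
The self-filter and dedup step from gap 7 can be kept as pure logic, which makes it easy to cover in peat-node unit tests without a cluster. The sketch below is one possible shape, not existing peat-node code: DiscoveredPeer, Decision, and PeerFilter are hypothetical names, and it assumes the Iroh endpoint ID arrives as optional metadata on each discovered peer (which depends on how gap 4 is resolved).

```rust
use std::collections::HashSet;

/// Hypothetical view of one discovery event, reduced to what the filter needs.
#[derive(Debug, Clone)]
pub struct DiscoveredPeer {
    pub node_id: String,             // pod name from target_ref.name
    pub endpoint_id: Option<String>, // Iroh endpoint ID, if discovery supplied one
    pub addresses: Vec<String>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    Connect,
    SkipSelf,
    SkipAlreadyConnected,
    SkipMissingEndpointId,
}

pub struct PeerFilter {
    own_node_id: String,
    connected: HashSet<String>, // endpoint IDs with a successful connect
}

impl PeerFilter {
    pub fn new(own_node_id: &str) -> Self {
        Self {
            own_node_id: own_node_id.to_string(),
            connected: HashSet::new(),
        }
    }

    /// Decide what to do with a discovered peer without mutating any state.
    pub fn decide(&self, peer: &DiscoveredPeer) -> Decision {
        if peer.node_id == self.own_node_id {
            return Decision::SkipSelf;
        }
        match &peer.endpoint_id {
            None => Decision::SkipMissingEndpointId,
            Some(id) if self.connected.contains(id) => Decision::SkipAlreadyConnected,
            Some(_) => Decision::Connect,
        }
    }

    /// Record a peer only after the connect call has succeeded,
    /// so a failed attempt is retried on the next discovery cycle.
    pub fn mark_connected(&mut self, endpoint_id: &str) {
        self.connected.insert(endpoint_id.to_string());
    }

    /// Forget a peer that went away so a replacement pod can reconnect.
    pub fn mark_disconnected(&mut self, endpoint_id: &str) {
        self.connected.remove(endpoint_id);
    }
}
```

Keeping decide read-only and recording success separately means a transient connect failure does not permanently suppress a peer; the event-consumer task would call mark_connected only when connect_peer returns successfully.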

Acceptance

  • cargo build and cargo test green
  • helm template chart/peat-node renders cleanly with discovery.mode=kubernetes and produces the RBAC + Service + Deployment manifests
  • Existing static --peer flow continues to work when --discovery-mode=none (default)
  • k3d integration test: 3-replica Deployment converges to a fully-connected mesh under the new discovery mode; scaling up adds the new pod within discovery.interval
  • README API/config table and docs/CONFIGURATION.md updated; sample Deployment + RBAC manifest included
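
The "none by default" requirement can be pinned down in a small, testable piece of parsing logic. The following is a sketch under assumptions: DiscoveryMode and resolve_mode are hypothetical names, the actual CLI wiring in src/main.rs is not shown, and exact (case-sensitive) matching of the three values is a choice made here, not something this issue specifies.

```rust
use std::str::FromStr;

/// Hypothetical enum for --discovery-mode / PEAT_NODE_DISCOVERY_MODE.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum DiscoveryMode {
    /// Spelled "none" on the command line; static --peer flags only.
    #[default]
    Disabled,
    Mdns,
    Kubernetes,
}

impl FromStr for DiscoveryMode {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "none" => Ok(Self::Disabled),
            "mdns" => Ok(Self::Mdns),
            "kubernetes" => Ok(Self::Kubernetes),
            other => Err(format!(
                "unknown discovery mode '{other}' (expected none, mdns, or kubernetes)"
            )),
        }
    }
}

/// The flag wins over the environment variable; with neither set, discovery stays off.
pub fn resolve_mode(flag: Option<&str>, env: Option<&str>) -> Result<DiscoveryMode, String> {
    match flag.or(env) {
        Some(raw) => raw.parse(),
        None => Ok(DiscoveryMode::default()),
    }
}
```

With neither the flag nor the environment variable set, resolve_mode returns Disabled, so an existing deployment that only passes static --peer values keeps its current behavior.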

Constraints

  • Proto-first per SKILL.md: if any discovery state is exposed via gRPC (e.g., "list discovered peers"), it goes in proto/sidecar.proto first. Optional for the first cut.
  • Discovery off by default. Failure to reach the kube API must log + degrade gracefully, never panic.
  • Don't break the endpoint_id@host:port parsing in main.rs — kubernetes discovery is additive.
  • Cross-namespace discovery is out of scope for the first cut; namespace defaults to the pod's own SA-mount namespace.
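
The namespace default and the "never panic" constraint can be illustrated together. This is a sketch, not the peat-mesh implementation: resolve_namespace is a hypothetical helper, and logging via eprintln! stands in for whatever logging peat-node actually uses. The file path is the standard location where Kubernetes mounts the service-account namespace inside a pod.

```rust
use std::fs;
use std::path::Path;

/// Standard in-pod location of the service-account namespace file.
pub const SA_NAMESPACE_PATH: &str =
    "/var/run/secrets/kubernetes.io/serviceaccount/namespace";

/// Resolution order: explicit --discovery-namespace value, then the
/// service-account mount, then the literal "default". Never panics.
pub fn resolve_namespace(explicit: Option<&str>, sa_namespace_file: &Path) -> String {
    if let Some(ns) = explicit.map(str::trim).filter(|ns| !ns.is_empty()) {
        return ns.to_string();
    }
    match fs::read_to_string(sa_namespace_file) {
        Ok(contents) if !contents.trim().is_empty() => contents.trim().to_string(),
        Ok(_) => "default".to_string(),
        Err(err) => {
            // Typical when running outside a cluster: log and keep going.
            eprintln!(
                "discovery: cannot read {}: {err}; using namespace \"default\"",
                sa_namespace_file.display()
            );
            "default".to_string()
        }
    }
}
```

Passing the file path as a parameter, with SA_NAMESPACE_PATH as the production value, lets a unit test point the function at a temporary file or a missing path instead of requiring a real pod.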

Dependencies

  • Probably requires a small peat-mesh PR, depending on gap 4's resolution. If KubernetesDiscovery doesn't already have a story for per-pod endpoint_id propagation, the cleanest fix is option 4d (deterministic per-pod keypair seeded from (formation_id, pod_name)), which needs additive surface in AutomergeBackendConfig.
  • Helm chart values addition; chart version bump per repo convention.

Effort estimate

Medium. Larger than #62 because of the RBAC manifests and the gap-4 question. If gap 4 resolves to options (a–c) it's plumbing + chart work (~300 lines + manifests + docs). If it needs option (d), add a small peat-mesh PR first. The peat-mesh KubernetesDiscovery itself is done — peat-node is the wiring + RBAC + the endpoint_id propagation answer.
