Replace kelindar/search with hugot for static binary compatibility #50

Merged
samcm merged 1 commit into master from fix/static-binary-hugot-embedding on Mar 11, 2026

@samcm commented Mar 11, 2026

Summary

Swap out kelindar/search (GGUF + llama.cpp via purego/fakecgo) for knights-analytics/hugot with pure Go ONNX inference. kelindar/search uses ebitengine/purego, which calls dlopen at runtime and produces dynamically linked binaries even with CGO_ENABLED=0. On Alpine this crashes with "exec: no such file or directory" because the binary needs glibc's dynamic linker, which Alpine does not ship.
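One way to confirm whether a built binary is affected is to look for a PT_INTERP segment, which records the dynamic loader the kernel must find before it can start the program. The following is a minimal sketch using the standard library's debug/elf package, assuming a Linux ELF binary; the `Interpreter` helper is a hypothetical name for illustration and is not part of this PR.

```go
package main

import (
	"bytes"
	"debug/elf"
	"fmt"
	"io"
	"os"
)

// Interpreter returns the dynamic loader path recorded in the ELF file at
// path, or "" when the file has no PT_INTERP segment (the usual sign of a
// statically linked binary). Hypothetical helper, not code from this PR.
func Interpreter(path string) (string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	for _, p := range f.Progs {
		if p.Type != elf.PT_INTERP {
			continue
		}
		// The segment body is the NUL-terminated loader path.
		raw, err := io.ReadAll(p.Open())
		if err != nil {
			return "", err
		}
		return string(bytes.TrimRight(raw, "\x00")), nil
	}
	return "", nil
}

func main() {
	// Inspect the binary given as the first argument, or this program itself.
	path := os.Args[0]
	if len(os.Args) > 1 {
		path = os.Args[1]
	}

	interp, err := Interpreter(path)
	switch {
	case err != nil:
		fmt.Fprintln(os.Stderr, "cannot inspect binary:", err)
	case interp == "":
		fmt.Println("no PT_INTERP segment: binary does not request a dynamic loader")
	default:
		fmt.Println("binary requests dynamic loader:", interp)
	}
}
```

If the reported loader path does not exist inside the container image, the kernel fails the exec with a "no such file or directory" error even though the binary itself is present.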

hugot's NewGoSession() gives us genuinely static binaries that run on Alpine without issues. The model format changes from a single GGUF file to an ONNX directory (model.onnx + tokenizer.json), downloaded from sentence-transformers/all-MiniLM-L6-v2 on HuggingFace. Vector search switches from kelindar's index to a brute-force dot product over L2-normalized embeddings, which is sufficient for our corpus size (thousands of documents).
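The brute-force search can be sketched in a few lines of standard-library Go. For unit-length vectors the dot product equals cosine similarity, so ranking by dot product is enough. The `Doc`, `Hit`, `Normalize`, `Dot`, and `Search` names below are assumptions made for this illustration and do not necessarily match the code in pkg/resource.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Doc is a hypothetical indexed item with its embedding stored inline.
type Doc struct {
	ID     string
	Vector []float32
}

// Hit pairs a document ID with its similarity score.
type Hit struct {
	ID    string
	Score float32
}

// Normalize scales v to unit L2 length in place. Zero vectors are left as-is.
func Normalize(v []float32) {
	var sum float64
	for _, x := range v {
		sum += float64(x) * float64(x)
	}
	if sum == 0 {
		return
	}
	inv := float32(1 / math.Sqrt(sum))
	for i := range v {
		v[i] *= inv
	}
}

// Dot returns the dot product of a and b over their common length.
func Dot(a, b []float32) float32 {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	var s float32
	for i := 0; i < n; i++ {
		s += a[i] * b[i]
	}
	return s
}

// Search scores every document against the query and returns the k best hits,
// highest score first. It is O(len(docs) * dim) per query.
func Search(docs []Doc, query []float32, k int) []Hit {
	hits := make([]Hit, 0, len(docs))
	for _, d := range docs {
		hits = append(hits, Hit{ID: d.ID, Score: Dot(query, d.Vector)})
	}
	sort.SliceStable(hits, func(i, j int) bool { return hits[i].Score > hits[j].Score })
	if k >= 0 && k < len(hits) {
		hits = hits[:k]
	}
	return hits
}

func main() {
	docs := []Doc{
		{ID: "a", Vector: []float32{1, 0}},
		{ID: "b", Vector: []float32{0, 1}},
		{ID: "c", Vector: []float32{1, 1}},
	}
	for i := range docs {
		Normalize(docs[i].Vector)
	}

	query := []float32{2, 1}
	Normalize(query)

	for _, h := range Search(docs, query, 2) {
		fmt.Printf("%s %.3f\n", h.ID, h.Score)
	}
}
```

A full scan costs one dot product per document per query, which stays cheap at a few thousand short vectors and avoids carrying an index dependency.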

Changes:

  • pkg/embedding/embedder.go — hugot session + FeatureExtractionPipeline with normalization
  • pkg/resource/example_index.go, runbook_index.go — inline vectors + dot product search
  • pkg/searchruntime/runtime.go — ONNX directory resolution instead of single file lookup
  • pkg/config/config.go — drop GPULayers field, defer model path resolution to searchruntime
  • Dockerfile — remove llama-builder stage and libgomp1 dep, download ONNX model
  • goreleaser.server.Dockerfile — add model-downloader stage for ONNX files
  • Makefile — replace cmake/g++ build with curl downloads
  • Bump Go to 1.25.0 (required by hugot)
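The move from a single-file lookup to directory resolution means a candidate path only counts if it contains both model.onnx and tokenizer.json. A minimal sketch of that check follows; `ResolveModelDir`, the `MODEL_DIR` environment variable, and the fallback path are hypothetical and chosen for illustration, not taken from pkg/searchruntime.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// requiredModelFiles lists the files the ONNX model directory must contain,
// as named in the PR description.
var requiredModelFiles = []string{"model.onnx", "tokenizer.json"}

// ResolveModelDir returns the first candidate directory that contains every
// required model file. Empty candidates are skipped. Hypothetical helper.
func ResolveModelDir(candidates ...string) (string, error) {
	for _, dir := range candidates {
		if dir == "" {
			continue
		}
		if hasAllFiles(dir, requiredModelFiles) {
			return dir, nil
		}
	}
	return "", fmt.Errorf("no candidate directory contains %v", requiredModelFiles)
}

// hasAllFiles reports whether every name exists in dir as a non-directory.
func hasAllFiles(dir string, names []string) bool {
	for _, name := range names {
		info, err := os.Stat(filepath.Join(dir, name))
		if err != nil || info.IsDir() {
			return false
		}
	}
	return true
}

func main() {
	// MODEL_DIR and the fallback path are assumptions made for this sketch.
	dir, err := ResolveModelDir(
		os.Getenv("MODEL_DIR"),
		filepath.Join("models", "all-MiniLM-L6-v2"),
	)
	if err != nil {
		fmt.Println("model not found:", err)
		return
	}
	fmt.Println("using model directory:", dir)
}
```

Checking for both files up front turns a partially downloaded model into a clear startup error instead of a failure deep inside pipeline creation.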

…y compatibility

kelindar/search uses ebitengine/purego which relies on fakecgo + dlopen, making
binaries dynamically linked even with CGO_ENABLED=0. This caused the panda-server
Docker image to crash on Alpine ("exec: no such file or directory") because the
binary needed glibc's dynamic linker.

Switch to knights-analytics/hugot with pure Go ONNX inference (NewGoSession) for
embedding generation. Replace kelindar's vector index with simple brute-force dot
product search over L2-normalized embeddings, which is sufficient for our corpus
size (thousands of documents).

- Swap GGUF model format for ONNX directory (model.onnx + tokenizer.json)
- Remove libllama build stage from Dockerfile
- Simplify Makefile (no more C++ compilation or LD_LIBRARY_PATH wrappers)
- Update goreleaser Dockerfile to download ONNX model from HuggingFace
- Bump Go to 1.25.0 (required by hugot)

@samcm force-pushed the fix/static-binary-hugot-embedding branch from 288a809 to fcc35c8 on March 11, 2026 09:32
@samcm merged commit 8770a19 into master on Mar 11, 2026
7 checks passed