Add INT8 ONNX quantization pipeline for edge deployment (Raspberry Pi / low-spec CPUs)#250
Open
shahwork005-oss wants to merge 3 commits into
Open
Conversation
- quantize/export_onnx.py: export GreenFormer to ONNX FP32 (512x512), handles pos_embed interpolation from 2048->512 and disables FlashAttention for CPU tracing - quantize/calibrate_int8.py: static INT8 PTQ with auto-generated HSV green hint masks from calibration frames - quantize/create_dummy_model.py: minimal ONNX model for pipeline smoke-test without the real checkpoint - camera/infer_pi.py: 4-channel RGBA input (RGB + hint mask), correct ONNX input/output names, returns alpha + fg - camera/camera_capture.py: live camera pipeline with auto compositing - requirements-edge.txt: Pi/Jetson deps (onnxruntime only, no PyTorch) - requirements-export.txt: export machine deps + timm fork instructions Results on CorridorKey_v1.0 checkpoint at 512x512: FP32 ONNX: 275.7 MB INT8 ONNX: 70.8 MB (3.9x smaller, same accuracy) Tested: export -> calibrate (1793 frames) -> infer_pi -> live camera (84 frames) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- ruff format: auto-formatted all 5 files to project style (120 char line length) - Remove unused generate_green_hint_mask import in camera_capture.py - Sort imports in calibrate_int8.py (I001) - Remove bare f-prefix on string without placeholders in create_dummy_model.py (F541) - Add 'raise ... from e' chaining in export_onnx.py (B904) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR adds a complete pipeline to quantize CorridorKey's GreenFormer model to INT8 ONNX format, enabling deployment on edge devices (Raspberry Pi, NVIDIA Jetson, USB-attached cameras) without requiring PyTorch at inference time.
Results on
CorridorKey_v1.0checkpoint.safetensorsonnxruntime(5 packages total)New files
How to use
Export (run once on your main machine):
pip install -r requirements-export.txt python quantize/export_onnx.py \ --checkpoint models/CorridorKey_v1.0.safetensors \ --output models/corridorkey_fp32.onnx \ --img-size 512Calibrate (needs 100-200 green screen frames in
calibration_frames/):python quantize/calibrate_int8.py \ --fp32-model models/corridorkey_fp32.onnx \ --int8-model models/corridorkey_int8.onnx \ --frames-dir calibration_frames/Deploy on Pi / edge device (no PyTorch needed):
pip install -r requirements-edge.txt python camera/infer_pi.py --model corridorkey_int8.onnx --image frame.jpg --output result.png python camera/camera_capture.py --model corridorkey_int8.onnx # live cameraTechnical notes
export_onnx.pybicubic-interpolates positional embeddings to the export resolution (default 512x512), matchingCorridorKeyEngine._load_model()behaviourfused_attn=False) for ONNX compatibility; standardscaled_dot_product_attentionis usedinfer_pi.pyauto-generates a green hint mask via HSV thresholding when no external masking pipeline (GVM/BiRefNet) is available — sufficient for well-lit green screensTest plan
python quantize/create_dummy_model.py-- pipeline smoke-test, no checkpoint neededpython quantize/export_onnx.py-- FP32 ONNX export (275.7 MB)python quantize/calibrate_int8.py-- INT8 calibrated on 1793 real green screen frames (70.8 MB)python camera/infer_pi.py-- single-image inference on calibration frames, keying confirmedpython camera/camera_capture.py-- live camera, 84 frames at 0.2 FPS on x86 CPUMotivated by the goal of running CorridorKey on Raspberry Pi cameras and USB-stick camera setups for real-time green screen removal on low-spec hardware.
Co-authored-by: Claude Sonnet 4.6 noreply@anthropic.com