
Add gemma3-opus-distill post #3377

Open
vicArc wants to merge 1 commit into huggingface:main from vicArc:Viesar/gemma3-opus-distill

Conversation


@vicArc vicArc commented May 7, 2026

A community/guest post: a negative-results postmortem on QLoRA-distilling Claude Opus 4.6 reasoning traces into Gemma 3 4B on a single RTX 3050. The fine-tune trains cleanly (loss 2.67 → 1.01) but regresses relative to the base model on MATH-500 (-5.0pp) and GSM8K (-15.0pp). The post explains the two contributing causes (eval format mismatch and capability narrowing from a small 1.9k-example dataset), what I'd do differently, and why publishing measured negative results is more useful than the usual celebratory fine-tune writeup.

Preparing the Article

  • Add an entry to _blog.yml
  • Add a thumbnail (assets/gemma3-opus-distill/thumbnail.jpg)
  • Short title and blog path (gemma3-opus-distill)
  • Frontmatter with author (Viesar, guest)
  • Publication date is correct (May 7, 2026)
  • Previewed at https://huggingface.co/new-blog (not published)

Notes

  • All linked models/datasets are already public on the Hub.
  • No additional assets beyond the thumbnail; large images aren't needed for this post.
  • External/community contribution — happy to take review feedback on tone or length.

