Backend developer focused on AI Agent application development. Deeply interested in LLMs and reinforcement learning, and currently moving from the engineering side toward the algorithm side.
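Areas of interest:
- AI Agent: tool calling, multi-agent collaboration, RAG (retrieval-augmented generation), memory systems, Skill, Harness engineering
- Data engineering: MinHash deduplication, PPL, Self-Instruct / Evol-Instruct data synthesis, data flywheel
- LLM training: the full Pretrain → SFT → RLHF pipeline, LoRA / QLoRA parameter-efficient fine-tuning
- Reinforcement learning: GRPO, process reward shaping, on-policy self-distillation (OPD)
- Backend engineering: Spring Boot, MySQL, Redis, high-availability service design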