Add SocialiToM generation and evaluation pipeline #3
Open
lishitian15-ops wants to merge 1 commit into
Summary
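Overview of this change
Adds the complete SocialiToM generation pipeline, which generates stories, questions, CoT, distractors, and gold answers from social-scenario contexts.
Adds core modules for the context pool, story filling, output adaptation, question generation, and social-consequence rules, strengthening generation across the different task types.
Adds a quality-judging module that filters generated samples using rules and/or an LLM.
Adds batch generation and evaluation scripts that support local vLLM / OpenAI-compatible endpoints, making large-scale synthesis and evaluation easier.
Adds a Qwen pass@5 evaluation flow that outputs finer-grained metrics such as thinking, acc@1, and majority_vote_acc.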
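For reviewers, here is a minimal sketch of how the per-question metrics named above could be computed from five sampled answers. It is not taken from this PR's code: the function names `score_question` and `aggregate` are hypothetical, and it assumes that acc@1 scores only the first sample, that pass@k counts a question as solved if any sample matches the gold answer, and that majority-vote ties go to the answer seen first.

```python
from collections import Counter


def score_question(samples, gold):
    """Score one question from k sampled answer labels (hypothetical helper).

    Assumptions (not confirmed by the PR): acc@1 looks only at the first
    sample, and pass@k is 1 if any sample equals the gold label.
    """
    if not samples:
        raise ValueError("need at least one sampled answer")
    # Counter.most_common orders equal counts by first occurrence,
    # so ties resolve to the earliest-seen answer.
    majority = Counter(samples).most_common(1)[0][0]
    return {
        "acc@1": float(samples[0] == gold),
        "pass@k": float(gold in samples),
        "majority_vote_acc": float(majority == gold),
    }


def aggregate(per_question):
    """Average each metric over all questions."""
    if not per_question:
        raise ValueError("need at least one scored question")
    n = len(per_question)
    return {key: sum(q[key] for q in per_question) / n for key in per_question[0]}


if __name__ == "__main__":
    scored = [
        score_question(["A", "B", "B", "C", "B"], gold="B"),
        score_question(["A", "A", "C", "D", "A"], gold="C"),
    ]
    print(aggregate(scored))
```

If the evaluation script defines acc@1 as the mean correctness over all five samples instead, only the `"acc@1"` line would change.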