Merged
11 changes: 11 additions & 0 deletions .changeset/fix-c2-brand-demo-domain-instruction.md
@@ -0,0 +1,11 @@
---
---

Fix C2 sandbox brand demo: Sage was passing demo.example.com as a brand_id to
get_brand_identity (REFERENCE_NOT_FOUND), because the system prompt emitted
"Use brand domain demo.example.com for the account" unconditionally for all
active modules. Removed the unconditional instruction; made the per-lesson demo
block emit tool-specific guidance (buyer domain only for acquire_rights/sync_accounts,
brand_id reminder only for get_brand_identity). Also de-hardcoded the get_products
reference in the generic teaching rules so non-sales-track modules don't get
wrong tool guidance.
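The conditional guidance described in this changeset can be sketched as a standalone function (the function name and its isolation here are hypothetical — in the actual diffs below the same branching sits inline in `buildCertificationContext` and `createCertificationToolHandlers`):

```typescript
// Hypothetical extraction of the per-lesson demo guidance added by this PR.
// Each hint is emitted only when the relevant tool appears in the lesson's
// demo scenarios, so get_brand_identity lessons never see the buyer-domain
// instruction that Sage previously misused as a brand_id.
function buildDemoGuidance(scenarioTools: string[]): string[] {
  const lines: string[] = [];
  // Buyer-domain hint only for buyer-side tools.
  if (scenarioTools.some(t => ['acquire_rights', 'sync_accounts'].includes(t))) {
    lines.push('For acquire_rights / sync_accounts buyer.domain: use "demo.example.com".');
  }
  // brand_id reminder only when get_brand_identity is actually demoed.
  if (scenarioTools.includes('get_brand_identity')) {
    lines.push('For get_brand_identity: pass a brand_id from the tool\'s "Available brands" list — not a domain name.');
  }
  return lines;
}

// A get_brand_identity-only lesson gets exactly one guidance line and no
// buyer-domain instruction.
console.log(buildDemoGuidance(['get_brand_identity']).length); // 1
```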
2 changes: 1 addition & 1 deletion server/src/addie/config-version.ts
@@ -30,7 +30,7 @@ import { loadRules, loadResponseStyle } from './rules/index.js';
* Format: YYYY.MM.N where N is incremented for multiple changes in a month
* Example: 2025.01.1, 2025.01.2, 2025.02.1
*/
-export const CODE_VERSION = '2026.04.6';
+export const CODE_VERSION = '2026.05.1';

// Types
export interface ConfigVersion {
20 changes: 16 additions & 4 deletions server/src/addie/mcp/certification-tools.ts
@@ -507,7 +507,7 @@ export async function buildCertificationContext(
lines.push('- First turn: greet the learner and ask about their background. Never run tools on the first turn.');
lines.push('- NEVER re-ask something the learner already told you. If they said "I work at an audio SSP" do NOT later ask "are you on the buy side or sell side?" — they already told you (sell side, SSP). If they said "I run programmatic at an agency" do NOT ask "what is your role?" This is the #1 complaint from learners. Before asking ANY question about the learner, check: did they already answer this? If yes, use what they said.');
lines.push('- If you research the learner\'s company, USE that knowledge — never ask them to explain what their company does. Instead, weave it into your teaching: "Given that Acme is an audio SSP, how would you..." Asking someone about their own company after you already looked it up feels like surveillance, not personalization.');
-lines.push('- Run ONE live demo (get_products against the sandbox training agent) on turn 2-3. Do not wait for the learner to ask. Show, then discuss. After the initial demo, do NOT keep running demos every turn — use the demo result as a reference point for teaching.');
+lines.push('- If the module has sandbox demo scenarios listed below: run ONE live demo using the first scenario\'s tool on turn 2-3. Do not wait for the learner to ask. Show, then discuss. After the initial demo, do NOT keep running demos every turn — use the demo result as a reference point for teaching.');
lines.push('- Use concrete, specific language. Never use abstract terms without grounding them. Say "evaluate whether a placement fits" not "reason about impressions."');
lines.push('- Only assess what you actually taught in the conversation. Never test doc-only details or claim "we covered this" if you didn\'t.');
lines.push('- If a demo fails, pivot immediately. Never offer the same failed demo twice.');
@@ -517,7 +517,7 @@
lines.push('');
lines.push('**Mastery model**: There is no failing — teach until the learner masters every objective, then complete the module. Never share scores or percentages with the learner. Internal scores are for admin analytics only.');
lines.push('');
-lines.push('**Mastery fast-track (CHECK EVERY TURN after turn 3)**: Teaching and assessment serve different purposes. Teaching is for the learner; assessment is for the credential. After each learner response, ask: "Has this learner given correct, detailed answers to 3+ concepts without needing correction?" If YES: (1) STOP running demos — no more get_products calls, (2) SAY SO: "You clearly know this material — I\'m going to skip the tutorial and have you demonstrate the remaining concepts directly," (3) for each remaining concept, ask ONE targeted demonstration question (scenario-based, teach-back, or "walk me through") that produces auditable evidence of competency. The conversation transcript is the audit trail — the learner\'s own words showing they understand each dimension. Same scoring rubric, same dimension requirements, same minimum engagement — just no unnecessary instruction. Continuing to teach or demo after someone has demonstrated mastery is the #1 learner complaint.');
+lines.push('**Mastery fast-track (CHECK EVERY TURN after turn 3)**: Teaching and assessment serve different purposes. Teaching is for the learner; assessment is for the credential. After each learner response, ask: "Has this learner given correct, detailed answers to 3+ concepts without needing correction?" If YES: (1) STOP running demos — no more sandbox tool calls, (2) SAY SO: "You clearly know this material — I\'m going to skip the tutorial and have you demonstrate the remaining concepts directly," (3) for each remaining concept, ask ONE targeted demonstration question (scenario-based, teach-back, or "walk me through") that produces auditable evidence of competency. The conversation transcript is the audit trail — the learner\'s own words showing they understand each dimension. Same scoring rubric, same dimension requirements, same minimum engagement — just no unnecessary instruction. Continuing to teach or demo after someone has demonstrated mastery is the #1 learner complaint.');

lines.push('**Protocol accuracy (non-negotiable)**: When a learner asks about protocol details (field definitions, message flows, terminology, agent roles), use search_docs or search_repos to verify before answering. Never construct protocol answers from general knowledge. If you cannot verify, say "I need to check that" and search. Teaching mode does not override accuracy — a wrong answer during certification is worse than saying "let me look that up."');
lines.push('');
@@ -554,7 +554,6 @@
lines.push('');
lines.push('**Sandbox training agent**:');
lines.push(formatTenantBlock(tenants));
-lines.push('Use brand domain "demo.example.com" for the account.');

// Inject cross-module learner profile from completed modules
if (userId) {
@@ -1434,6 +1433,13 @@ export function createCertificationToolHandlers(
lines.push(`Expected outcome: ${ds.expected_outcome}`);
lines.push('');
});
+const scenarioTools = lp.demo_scenarios.flatMap(ds => ds.tools);
+if (scenarioTools.some(t => ['acquire_rights', 'sync_accounts'].includes(t))) {
+  lines.push('For acquire_rights / sync_accounts buyer.domain: use "demo.example.com".');
+}
+if (scenarioTools.includes('get_brand_identity')) {
+  lines.push('For get_brand_identity: pass a brand_id from the tool\'s "Available brands" list — not a domain name.');
+}
}
}

@@ -1555,7 +1561,13 @@
lp.demo_scenarios.forEach(ds => {
lines.push(`- ${ds.description} (tools: ${ds.tools.join(', ')})`);
});
-lines.push('Use brand domain "demo.example.com" for the account.');
+const scenarioTools = lp.demo_scenarios.flatMap(ds => ds.tools);
+if (scenarioTools.some(t => ['acquire_rights', 'sync_accounts'].includes(t))) {
+  lines.push('For acquire_rights / sync_accounts buyer.domain: use "demo.example.com".');
+}
+if (scenarioTools.includes('get_brand_identity')) {
+  lines.push('For get_brand_identity: pass a brand_id from the tool\'s "Available brands" list — not a domain name.');
+}
lines.push('');
}
}