Collaborator

@zainhas zainhas commented Nov 17, 2025

Note

Adds a new LLM judge optimization notebook and a DPO training dataset.

  • Evals:
    • Add notebook Evals/Optimizing_LLM_Judges.ipynb for LLM judge optimization experiments.
    • Add DPO training dataset Evals/judge_dpo_data/rewardbench2_dpo_train.jsonl.

Written by Cursor Bugbot for commit 0e33a16. This will update automatically on new commits.
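
For anyone reviewing the added JSONL file, a minimal sketch of a sanity check is given here. It is not part of the PR. The field names `prompt`, `chosen`, and `rejected` are an assumption based on common DPO data layouts, not the confirmed schema of `rewardbench2_dpo_train.jsonl`, and the helper names `load_jsonl` and `missing_fields` are hypothetical.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Read a JSONL file into a list of dicts, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as exc:
                # Report the offending line so a bad row is easy to locate.
                raise ValueError(f"{path}:{line_no}: invalid JSON ({exc})") from exc
    return records


def missing_fields(records, required=("prompt", "chosen", "rejected")):
    """Return (index, missing_keys) for every record lacking a required key.

    The default field names are an assumption, not the dataset's confirmed schema.
    """
    problems = []
    for i, rec in enumerate(records):
        missing = [key for key in required if key not in rec]
        if missing:
            problems.append((i, missing))
    return problems
```

Calling `missing_fields(load_jsonl(path))` on the dataset path would return an empty list if every row has the assumed keys. If the file uses different keys, pass them through `required`.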

Contributor

VProv commented Nov 21, 2025

The cell that starts with

# pip install together
import json
import os
from together import Together

client = Together()

# Create dataset comparing two model responses
compare_data = [
    {
        "prompt": "Explain photosynthesis",
        "response_a": "Photosynthesis is how plants make food using sunlight.",
        "response_b": "Photosynthesis is the process by which plants convert light energy into chemical energy, using chlorophyll to transform CO2 and water into glucose and oxygen."
    },
]

seems to be redundant
