@mikeldking
Collaborator

@mikeldking mikeldking commented Sep 25, 2025

This is the feature branch for the upcoming version 13.

@mikeldking mikeldking requested review from a team as code owners September 25, 2025 20:54
@github-project-automation github-project-automation bot moved this to 📘 Todo in phoenix Sep 25, 2025
@dosubot dosubot bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Sep 25, 2025
@mikeldking mikeldking changed the base branch from main to feat/version-12 September 25, 2025 20:56
@dosubot dosubot bot added size:XS This PR changes 0-9 lines, ignoring generated files. and removed size:XXL This PR changes 1000+ lines, ignoring generated files. labels Sep 25, 2025
@mikeldking mikeldking changed the title version 13 feat!: version 13 - dataset evaluators Sep 25, 2025
@pkg-pr-new

pkg-pr-new bot commented Sep 25, 2025

npm i https://pkg.pr.new/Arize-ai/phoenix/@arizeai/phoenix-client@9642
npm i https://pkg.pr.new/Arize-ai/phoenix/@arizeai/phoenix-mcp@9642

commit: 7a0c5f7

@mikeldking mikeldking added the feature branch a feature branch that consolidates multiple features into a single commit on main label Sep 25, 2025
@mikeldking mikeldking marked this pull request as draft September 25, 2025 21:22
Contributor

@RogerHYang RogerHYang left a comment

blocking feature branch

@github-project-automation github-project-automation bot moved this from 📘 Todo to 🔍. Needs Review in phoenix Sep 25, 2025
Base automatically changed from feat/version-12 to main September 29, 2025 18:13
An error occurred while trying to automatically change base from feat/version-12 to main September 29, 2025 18:13
@RogerHYang RogerHYang removed this from phoenix Oct 6, 2025
@RogerHYang RogerHYang force-pushed the version-13 branch 4 times, most recently from fc49ed1 to 2b19c56 Compare October 24, 2025 15:47
@RogerHYang RogerHYang force-pushed the version-13 branch 3 times, most recently from f4ab1f0 to 2ef94fd Compare October 29, 2025 15:31
@mikeldking mikeldking closed this Nov 4, 2025
@mikeldking mikeldking reopened this Nov 4, 2025
RogerHYang and others added 23 commits November 8, 2025 13:29
* feat(evaluators): mutations for playground evaluator selector

* add test

* use upsert update

* clean up

* clean up

* clean up

---------

Co-authored-by: Roger Yang <[email protected]>
* backend

* fix tests

* fix tests

* types

* types

* update frontend

* clean

* fix
* add minimal evaluators menu

* handle selection

* styling

* add menu footer

* set menu max width

* replace footer button with link
* feat: Add name field to evaluators form

* Reorganize choices and their default state

* Disable prompt save, tools, response format

* Move input mapping field from select to combobox

* Update arrow icon

* Persist input mapping fields across labels

* Do not render response format or tools if they are saved as provider default
* Add dummy evaluation payloads to single playground run

* Implement for chat mutations and subscription over dataset

* Ruff 🐶 and update graphql schema

* compile relay

* frontend

* Add dataset example id and repetition number

* Address feedback

* Update input typing

* Update relay

* Load and display real global evaluators

---------

Co-authored-by: Alexander Song <[email protected]>
Co-authored-by: Tony Powell <[email protected]>
* Add filter and sort capabilities to evaluators

* Improve clarity of allowed sort columns
* add EvaluatorSelect to dataset page

* stub out evaluator config dialog and rework data fetching

* add readonly prompt messages to eval config modal

* add output config to modal

* add dataset example preview and input mapping section to modal

* wire up add evaluator mutation

* add suspense boundaries

* Refactor promptVersionToInstance to depend on inline fragment

* remove unnecessary type annotations

---------

Co-authored-by: Tony Powell <[email protected]>
* evaluator crud

* clean

* patch mutation

* update

* types

* Revert "types"

This reverts commit 25579b5.

* type ignore

* plural delete

* clean

* decorator

* fix metadata

* clean

* clean

* already exists

* test

* simplify

* test

* simplify
* add annotation name to eval select

* address feedback
…s useful (#10187)

* feat(evaluators): provide a useful correctness pre-built evaluator

* feat(evaluators): provide a useful correctness pre-built evaluator

* simplify
mikeldking and others added 6 commits November 8, 2025 13:52
* evaluator prompt validation

* cursor tests

* clean

* condense

* test

* clean

* clean

* test

* parse pydantic errors

* clean

* validate mutations

* fix tests

* validate choices

* test with form

* test

* type check

* clean
* include only dataset-specific evaluators in playground eval selector

* fix dataset page tab selection

* add aria label to dialog

* add annotation names to playground select

* handle long annotation names

* separate components for DatasetEvaluatorSelect and PlaygroundEvaluatorSelect

* remove extra opacity css var

* updates to Menu

* updates to evaluator menus

* fix menu item flicker

* wip: enable mapping evaluator from playground

* formatting

Labels

feature branch: a feature branch that consolidates multiple features into a single commit on main
size:XS: This PR changes 0-9 lines, ignoring generated files.

7 participants