NU · World Model & Embodied AI

We build and benchmark world models, embodied agents, and world–action models — systems that simulate, predict, and act in the physical world. Our public releases include video generation benchmarks, robot policies, action models, and the datasets and judges that hold them to a physical standard.


Current releases

PhyGround — the ruler

PhyGround is a benchmark for the physical plausibility of text-and-image-to-video (ti2v) generations: 250 curated prompts, a taxonomy of 13 physical laws across solid-body, fluid, and optical domains, and a quality-controlled human study (459 annotators, 37K fine-grained labels). We also release PhyJudge-9B, an open VLM judge fine-tuned on the human ratings.

Artifact | Link
🌐 Project page | phyground.github.io
💻 Code | PhyGround
📦 Dataset | phyground
🧑‍⚖️ Judge model | phyjudge-9B

PhyWorld — a model trained against the ruler

PhyWorld is a video-generation world model post-trained from Wan2.2-I2V-A14B in two stages: flow-matching fine-tuning for temporal coherence, followed by Direct Preference Optimization (DPO) over physics preference pairs sourced from the PhyGround human-annotation pool. It reaches 3.09 on PhyGround (vs. 2.99 for the strongest open baseline) and 0.769 on VBench (vs. 0.756 or below for SOTA baselines).

Artifact | Link
🌐 Project page | nu-world-model-embodied-ai.github.io/PhyWorld
💻 Code | PhyWorld
🤖 Model | phyworld
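
For readers unfamiliar with the second training stage, here is a minimal sketch of the standard DPO objective for a single preference pair. It is not taken from the PhyWorld code. The function name `dpo_loss`, the default `beta`, and the scalar log-probability inputs are all illustrative assumptions, and how PhyWorld actually scores a video under the policy and reference models is not described here.

```python
import math


def dpo_loss(policy_logp_preferred, policy_logp_rejected,
             ref_logp_preferred, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one (preferred, rejected) pair.

    All four inputs are log-probabilities (or any log-likelihood proxy)
    of the two samples under the trainable policy and the frozen reference
    model. These names and the default beta are assumptions for
    illustration, not PhyWorld's actual settings.
    """
    # How much more the policy favors each sample than the reference does.
    preferred_gain = policy_logp_preferred - ref_logp_preferred
    rejected_gain = policy_logp_rejected - ref_logp_rejected

    # Scaled margin: positive when the policy has shifted toward the
    # preferred sample relative to the reference.
    z = beta * (preferred_gain - rejected_gain)

    # -log(sigmoid(z)) == log(1 + exp(-z)), computed in a numerically
    # stable way so large |z| does not overflow.
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))
```

When the policy and the reference agree on both samples the margin is zero and the loss equals log 2. The loss falls as the policy assigns relatively more probability to the physically preferred video.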

The two projects share a loop: the same human annotations that score every model on PhyGround supply the preference pairs that train PhyWorld. More releases — robot policies, world–action models, and additional benchmarks — are on the way.
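
One way to picture this loop is the sketch below, which turns per-video human ratings into preference pairs. It is an assumption-laden illustration, not the released pipeline: the record fields (`prompt_id`, `video_id`, `score`), the mean-score aggregation, and the `min_gap` threshold are all hypothetical.

```python
from itertools import combinations
from statistics import mean


def build_preference_pairs(ratings, min_gap=1.0):
    """Turn individual ratings into (preferred, rejected) video pairs.

    `ratings` is a list of dicts with hypothetical keys `prompt_id`,
    `video_id`, and `score`. Videos are compared only within the same
    prompt, and a pair is kept only when the mean scores differ by at
    least `min_gap` (an illustrative threshold).
    """
    # Group raw scores by prompt, then by video.
    scores_by_prompt = {}
    for r in ratings:
        videos = scores_by_prompt.setdefault(r["prompt_id"], {})
        videos.setdefault(r["video_id"], []).append(r["score"])

    pairs = []
    for prompt_id, videos in scores_by_prompt.items():
        mean_score = {vid: mean(s) for vid, s in videos.items()}
        # Compare every unordered pair of videos for this prompt.
        for a, b in combinations(sorted(mean_score), 2):
            gap = mean_score[a] - mean_score[b]
            if abs(gap) >= min_gap:
                preferred, rejected = (a, b) if gap > 0 else (b, a)
                pairs.append({
                    "prompt_id": prompt_id,
                    "preferred": preferred,
                    "rejected": rejected,
                })
    return pairs


# Toy data, invented purely for illustration.
example_ratings = [
    {"prompt_id": "p1", "video_id": "A", "score": 4},
    {"prompt_id": "p1", "video_id": "A", "score": 5},
    {"prompt_id": "p1", "video_id": "B", "score": 2},
    {"prompt_id": "p1", "video_id": "B", "score": 3},
    {"prompt_id": "p1", "video_id": "C", "score": 4},
]
example_pairs = build_preference_pairs(example_ratings)
```

Requiring a minimum gap drops near-ties, where annotator noise could flip the ordering. The right threshold would depend on the rating scale and on how much annotators agree.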
