Every humanoid policy, every public number.

We pulled together 25 well-known humanoid policies and models and the cited numbers each one reports. Almost every figure is self-reported on the lab's own task suite, and much of it comes from a simulator. There is no common, independent, real-world measurement. Physical Turing is the humanoid testing company running the trials to add that missing column: each metric below shows the lab's figure in white and our own real-world measurement in amber, which reads “Testing” until the trials land.


25: Humanoid policies tracked; real, source-linked numbers
9: Evaluated only in simulation; no real-hardware number
16: Report a real-world number; mostly self-defined tasks
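
The three counts above can be re-derived from the evaluation tag on each row of the table. The snippet below is a minimal sketch of that tally, assuming the tag ("Real-world", "Simulation", or "Sim + real") is what the counts are based on. The `SETTINGS` mapping and its short labels are illustrative names transcribed by hand from this page, not an official data file.

```python
from collections import Counter

# Evaluation tag of each tracked policy, transcribed from the table on this page.
# "real" = Real-world, "sim" = Simulation, "sim+real" = Sim + real.
# The dictionary name and the short labels are illustrative, not an official schema.
SETTINGS = {
    "ASAP": "real", "HOVER": "real", "HumanPlus": "real", "HumanUP": "real",
    "HoST": "real", "TWIST": "real", "TrajBooster": "real", "OmniH2O": "sim+real",
    "ExBody2": "real", "H2O": "sim", "GR00T N1": "real", "GR00T N1.5": "real",
    "Helix": "real", "EgoVLA": "sim", "iDP3": "real", "Berkeley Humanoid": "real",
    "Humanoid-Gym": "real", "TD-MPC2": "sim", "SimBa": "sim", "SimbaV2": "sim",
    "TDMPBC": "sim", "FastTD3": "sim+real", "DreamerV3": "sim", "SAC": "sim",
    "PPO": "sim",
}

counts = Counter(SETTINGS.values())
total = len(SETTINGS)                               # policies tracked
sim_only = counts["sim"]                            # evaluated only in simulation
with_real = counts["real"] + counts["sim+real"]     # tagged with real-hardware results

print(total, sim_only, with_real)  # 25 9 16
```

Under this reading, a "Sim + real" row counts toward the real-world total, which is how 14 real-world rows plus 2 mixed rows give 16.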


Policy	Category	Evaluation	Lab · Year	Embodiment	Success rate, task completion (reported [Physical Turing])	Tracking error, motion fidelity (reported [Physical Turing])	HumanoidBench normalized score, sim (reported [Physical Turing])
ASAP: Aligning Simulation and Real-World Physics for Agile Whole-Body Skills	Whole-body	Real-world	CMU · NVIDIA · 2025	Unitree G1	— [Testing]	112 mm (real; vs 159 mm baseline) [Testing]	— [Testing]
HOVER: Versatile Neural Whole-Body Controller for Humanoid Robots	Whole-body	Real-world	NVIDIA · CMU · 2024	Unitree H1	— [Testing]	47.4 mm (real; vs 51.0 mm specialist) [Testing]	— [Testing]
HumanPlus: Humanoid Shadowing and Imitation from Humans	Whole-body	Real-world	Stanford · 2024	Unitree H1 (custom)	60–100% (real; across 6 real skills) [Testing]	— [Testing]	— [Testing]
HumanUP: Learning Getting-Up Policies for Real-World Humanoid Robots	Whole-body	Real-world	UIUC · Simon Fraser · 2025	Unitree G1	78.3% (real; getting up; vs 41.7% OEM) [Testing]	— [Testing]	— [Testing]
HoST: Learning Humanoid Standing-up Control across Diverse Postures	Whole-body	Real-world	Shanghai AI Lab · 2025	Unitree G1	100% (real; 20/20, 4 terrains) [Testing]	— [Testing]	— [Testing]
TWIST: Teleoperated Whole-Body Imitation System	Whole-body	Real-world	Stanford · Simon Fraser · 2025	Unitree G1	— [Testing]	— [Testing]	— [Testing]
TrajBooster: Boosting Humanoid Whole-Body Manipulation via Trajectory-Centric Learning	Whole-body	Real-world	OpenHelix · 2025	Unitree G1	up to 100% (real; best task; ~10 min data) [Testing]	— [Testing]	— [Testing]
OmniH2O: Universal and Dexterous Human-to-Humanoid Whole-Body Teleoperation and Learning	Whole-body	Sim + real	CMU (LeCAR Lab) · 2024	Unitree H1	94.1% (sim; AMASS imitation; plus up to 10/10 real tasks) [Testing]	— [Testing]	— [Testing]
ExBody2: Advanced Expressive Humanoid Whole-Body Control	Whole-body	Real-world	UC San Diego · 2024	Unitree G1	— [Testing]	0.107 rad (real; mean per-joint; best vs baselines) [Testing]	— [Testing]
H2O: Learning Human-to-Humanoid Real-Time Whole-Body Teleoperation	Whole-body	Simulation	CMU (LeCAR Lab) · 2024	Unitree H1	72.5% (sim; AMASS; vs 85.5% oracle) [Testing]	167 mm (sim; global MPJPE) [Testing]	— [Testing]
GR00T N1: An Open Foundation Model for Generalist Humanoid Robots	Generalist VLA	Real-world	NVIDIA · 2025	Fourier GR-1	76.8% (real; +32% vs Diffusion Policy; 66.5% sim) [Testing]	— [Testing]	— [Testing]
GR00T N1.5: Isaac GR00T N1.5	Generalist VLA	Real-world, vendor-reported	NVIDIA · 2025	Unitree G1 / Fourier GR-1	98.8% (real; known objects; 84.2% novel) [Testing]	— [Testing]	— [Testing]
Helix: A Vision-Language-Action Model for Generalist Humanoid Control	Generalist VLA	Real-world, vendor-reported	Figure AI · 2025	Figure 02	88.2→94.4% (real; barcode scan; vendor-reported) [Testing]	— [Testing]	— [Testing]
EgoVLA: Learning Vision-Language-Action Models from Egocentric Human Videos	Generalist VLA	Simulation	UC San Diego · NVIDIA · 2025	Unitree H1 (sim)	77.8% (sim; 7 bimanual tasks) [Testing]	— [Testing]	— [Testing]
iDP3: Generalizable Humanoid Manipulation with 3D Diffusion Policies	Manipulation	Real-world	Stanford · Simon Fraser · UPenn · 2024	Fourier GR-1	9/10 (real; unseen objects; vs 0–3/10 DP) [Testing]	— [Testing]	— [Testing]
Berkeley Humanoid: A Research Platform for Learning-based Control	Locomotion	Real-world	UC Berkeley · 2024	Berkeley Humanoid	— [Testing]	0.058 m/s (real; vs 0.051 m/s sim) [Testing]	— [Testing]
Humanoid-Gym: Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real Transfer	Locomotion	Real-world	RobotEra · Tsinghua · 2024	RobotEra XBot-S / XBot-L	— [Testing]	— [Testing]	— [Testing]
TD-MPC2: Scalable, Robust World Models for Continuous Control	RL baseline	Simulation	UC San Diego · 2023	Unitree H1 (sim)	— [Testing]	— [Testing]	0.710 (sim; via SimbaV2) [Testing]
SimBa: Simplicity Bias for Scaling Up Parameters in Deep RL	RL baseline	Simulation	Sony AI · KAIST · 2024	Unitree H1 (sim)	— [Testing]	— [Testing]	0.606 (sim; via SimbaV2) [Testing]
SimbaV2: Hyperspherical Normalization for Scalable Deep RL	RL baseline	Simulation	KAIST · Sony AI · 2025	Unitree H1 (sim)	— [Testing]	— [Testing]	0.776 (sim; low-UTD; best in class) [Testing]
TDMPBC: Self-Imitative Reinforcement Learning for Humanoid Robot Control	RL baseline	Simulation	Westlake University · 2025	Unitree H1 (sim)	8/14 tasks (sim; at 2M steps; vs 1 baseline) [Testing]	— [Testing]	— [Testing]
FastTD3: Simple, Fast, and Capable Reinforcement Learning for Humanoid Control	RL baseline	Sim + real	UC Berkeley · CMU · 2025	Unitree H1 (sim) / Booster T1	— [Testing]	— [Testing]	— [Testing]
DreamerV3: Mastering Diverse Domains through World Models (HumanoidBench baseline)	RL baseline	Simulation	DeepMind algorithm · 2024	Unitree H1 (sim)	— [Testing]	— [Testing]	0.022 (sim; via SimbaV2; high-UTD) [Testing]
SAC: Soft Actor-Critic (HumanoidBench baseline)	RL baseline	Simulation	UC Berkeley algorithm · 2018	Unitree H1 (sim)	— [Testing]	— [Testing]	0.279 (sim; via SimbaV2) [Testing]
PPO: Proximal Policy Optimization (HumanoidBench baseline)	RL baseline	Simulation	OpenAI algorithm · 2017	Unitree H1 (sim)	— [Testing]	— [Testing]	— [Testing]

Reported by the authors · Physical Turing: independent, real-world (Testing) · real / sim marks where the lab measured it · — = not reported

HumanoidBench scores are normalized (0–1) and taken at the low-UTD setting, except DreamerV3's, which is at the high-UTD setting. The figures for TD-MPC2, SimBa, SAC and DreamerV3 are as compiled in the SimbaV2 benchmark (Lee et al., 2025). Every other figure links to its own source via the policy name.

These are other people's results, reported by the labs and vendors themselves on different robots, on different tasks, and often in simulation, so they don't compare cleanly to one another. Physical Turing publishes no number of its own here yet; every amber cell is a measurement we are running, not a claim. The amber cells fill in as trials complete.
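
One comparison that does stay within a single source is a reported figure against the comparison figure quoted alongside it in the same row. The snippet below is a small worked example of that arithmetic for three rows of the table. The helper `relative_change` and the `ROWS` list are illustrative names introduced here, and the sketch says nothing about how any lab computed its own numbers.

```python
def relative_change(value: float, reference: float) -> float:
    """Signed change of `value` relative to `reference`, as a fraction."""
    return (value - reference) / reference


# (policy, metric, reported value, comparison figure from the same row of the table)
ROWS = [
    ("ASAP", "tracking error in mm vs baseline", 112.0, 159.0),
    ("HOVER", "tracking error in mm vs specialist", 47.4, 51.0),
    ("HumanUP", "success rate in % vs OEM", 78.3, 41.7),
]

for policy, metric, value, reference in ROWS:
    change = relative_change(value, reference)
    print(f"{policy}: {metric}: {change:+.1%}")

# ASAP: tracking error in mm vs baseline: -29.6%
# HOVER: tracking error in mm vs specialist: -7.1%
# HumanUP: success rate in % vs OEM: +87.8%
```

A negative change is an improvement for tracking error and a positive change is an improvement for success rate. The three percentages still describe different robots and tasks, so they should not be ranked against one another.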

Want the first independent score on your policy?

Physical Turing runs the real-world trials so you can ship what's actually ready. Tell us about your policy or humanoid and we'll scope an evaluation.

Book a demo

Questions first? Email support@physicalturing.ai.