Mon Feb 2, 2026 1:10am PST
Show HN: Democlean – Score robot demos by motion quality
I built a CLI that scores robot demonstration episodes using mutual information between states and actions.

The problem: robot learning datasets contain bad demos (jerky movements, hesitation, inconsistent timing). Training on these can hurt policy performance, and manual review doesn't scale.

    pip install democlean
    democlean analyze lerobot/pusht

democlean scores each episode by how predictable the actions are given the states, measured as the mutual information (MI) between them. Smooth, purposeful motion scores high. Jerky, inconsistent motion scores low.
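
To make the scoring idea concrete, here is a minimal pure-Python sketch of a KSG-style (k-nearest-neighbor) mutual information estimate between per-timestep states and actions. It is an illustration only, not the code democlean ships: the function names, the choice of k=3, the max-norm distance, and the synthetic data are all assumptions made for the example.

```python
import math
import random

def digamma(x):
    """Digamma for x > 0: upward recurrence, then an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series

def chebyshev(a, b):
    """Max-norm distance between two equal-length vectors."""
    return max(abs(p - q) for p, q in zip(a, b))

def ksg_mi(states, actions, k=3):
    """KSG (algorithm 1) estimate of I(states; actions) in nats.

    states and actions are equal-length lists of tuples, one per timestep.
    Brute-force O(n^2) neighbor search, fine for a sketch.
    """
    n = len(states)
    total = 0.0
    for i in range(n):
        ds = [chebyshev(states[i], states[j]) for j in range(n)]
        da = [chebyshev(actions[i], actions[j]) for j in range(n)]
        # Distance to the k-th nearest neighbor in the joint space.
        joint = sorted(max(ds[j], da[j]) for j in range(n) if j != i)
        eps = joint[k - 1]
        # Count neighbors strictly inside eps in each marginal space.
        n_s = sum(1 for j in range(n) if j != i and ds[j] < eps)
        n_a = sum(1 for j in range(n) if j != i and da[j] < eps)
        total += digamma(n_s + 1) + digamma(n_a + 1)
    return digamma(k) + digamma(n) - total / n

# Synthetic check (made-up data): actions that track the state should
# score well above actions that ignore it.
rng = random.Random(0)
states = [(rng.gauss(0, 1),) for _ in range(200)]
coupled = [(s[0] + rng.gauss(0, 0.1),) for s in states]
unrelated = [(rng.gauss(0, 1),) for _ in range(200)]
mi_coupled = ksg_mi(states, coupled)
mi_unrelated = ksg_mi(states, unrelated)
print(f"coupled: {mi_coupled:.2f} nats, unrelated: {mi_unrelated:.2f} nats")
```

In this toy setup the coupled actions get a clearly higher estimate than the unrelated ones, which is the property the episode score relies on.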

Validation: I correlated MI scores with motion metrics on lerobot/pusht (human teleoperation data). High-MI episodes had 12% lower jerk (p=0.02) and 24% higher state-action correlation (p=0.03). I did not train policies to measure downstream improvement.
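
For anyone who wants to run a similar check on their own data, here is a rough sketch of a jerk metric and a group comparison. The post does not say exactly how jerk was computed, how the high-MI and low-MI groups were formed, or which significance test produced the p-values, so everything here (third finite difference of positions, mean Euclidean norm, the helper names, and the toy trajectories) is an assumption for illustration.

```python
import math

def mean_jerk_magnitude(trajectory, dt):
    """Mean Euclidean norm of the third finite difference, scaled by dt**3.

    trajectory is a list of position tuples sampled every dt seconds.
    """
    if len(trajectory) < 4:
        raise ValueError("need at least 4 samples to estimate jerk")
    norms = []
    windows = zip(trajectory, trajectory[1:], trajectory[2:], trajectory[3:])
    for p0, p1, p2, p3 in windows:
        jerk = [(d - 3 * c + 3 * b - a) / dt ** 3
                for a, b, c, d in zip(p0, p1, p2, p3)]
        norms.append(math.sqrt(sum(v * v for v in jerk)))
    return sum(norms) / len(norms)

def percent_lower(high_group, low_group):
    """Percent by which the mean of high_group sits below the mean of low_group."""
    mean_high = sum(high_group) / len(high_group)
    mean_low = sum(low_group) / len(low_group)
    return 100.0 * (mean_low - mean_high) / mean_low

# Toy trajectories: x = t^2 has zero jerk, x = t^3 has constant jerk 6.
dt = 0.1
quadratic = [((i * dt) ** 2,) for i in range(20)]
cubic = [((i * dt) ** 3,) for i in range(20)]
jerk_quadratic = mean_jerk_magnitude(quadratic, dt)
jerk_cubic = mean_jerk_magnitude(cubic, dt)
print(jerk_quadratic, jerk_cubic)
```

Given per-episode jerk values for two groups, `percent_lower(high_mi_jerks, low_mi_jerks)` gives the kind of relative difference quoted above; a significance test would be a separate step.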

Limitations I want to be upfront about:

- MI correlates with episode length (r≈0.8). Longer episodes score higher.

- This measures motion smoothness, not task success.

- Works best with 50+ episodes from a single task.

- Inspired by DemInf (Hejna et al., RSS 2025), but uses raw KSG (Kraskov-Stögbauer-Grassberger) nearest-neighbor estimation instead of their VAE pipeline. Simpler, but probably less accurate for high-dimensional observations.
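
The episode-length correlation in the first limitation is easy to check on your own scores. The sketch below computes Pearson's r between length and score, then subtracts a least-squares linear fit on length so the remainder is uncorrelated with it. This adjustment is not something democlean does as far as this post states; the helper names and the numbers are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def length_adjusted(scores, lengths):
    """Residuals of scores after removing a least-squares linear fit on length."""
    n = len(scores)
    ml, ms = sum(lengths) / n, sum(scores) / n
    sxy = sum((l - ml) * (s - ms) for l, s in zip(lengths, scores))
    sxx = sum((l - ml) ** 2 for l in lengths)
    slope = sxy / sxx
    return [s - ms - slope * (l - ml) for s, l in zip(scores, lengths)]

# Made-up example values: five episodes whose scores grow with length.
lengths = [100, 150, 200, 250, 300]
scores = [1.1, 1.4, 2.2, 2.4, 3.1]
raw_r = pearson_r(lengths, scores)
adjusted = length_adjusted(scores, lengths)
adjusted_r = pearson_r(lengths, adjusted)
print(f"raw r = {raw_r:.2f}, adjusted r = {adjusted_r:.2g}")
```

A linear fit only removes the linear part of the dependence, so treat the adjusted values as a rough ranking aid rather than a corrected MI estimate.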

Complements score_lerobot_episodes, which catches visual issues (blur, lighting). This catches behavioral issues.

GitHub: https://github.com/dipampaul17/democlean

Happy to answer questions about the approach or validation.
