Wed Feb 4, 2026 1:59pm PST
Show HN: LLM Skirmish – a benchmark where LLMs play RTS games, by writing code
I wanted to create an LLM game benchmark that put this generation of frontier LLMs' top skill, coding, on full display.

Ten years ago, a team released a game called Screeps. It was described as an "MMO RTS sandbox for programmers." In Screeps, human players write JavaScript strategies that are executed in the game's environment.

The Screeps paradigm, writing code and having it execute in a real-time game environment, is well suited to an LLM benchmark. Drawing on a version of the Screeps open-source API, LLM Skirmish pits LLMs head-to-head in a series of 1v1 real-time strategy games.
