Agentic PCG: Procedural Content Generation
via Tool-using LLMs

We build a tool-using LLM framework for procedural content generation, where an agent iteratively edits, evaluates, and optimizes game levels with environment feedback. The framework works in both static design tasks and games with dynamic gameplay mechanics, supports not only simple edits like tile placement but also classic PCG algorithms as tools, and can follow open-ended natural language instructions alongside explicit functional constraints.

Agentic PCG across four game domains

The agent iteratively optimizes levels across Binary Maze, Lode Runner, Zelda, and Sokoban.

Super Mario Bros level optimization

Super Mario Bros level editing with gameplay simulation feedback.

Perceive, Reason, Plan, and Edit

Instead of asking the LLM to directly generate an entire game level in one shot, we wrap the game as an interactive environment, similar to an RL environment. This allows the environment to handle gameplay dynamics, evaluate the current level, and provide structured feedback based on level metrics. Within this loop, the LLM agent can perceive the current level state, reason about what should be improved, make editing plans, and iteratively modify the level.

Our framework supports both static and dynamic game environments. In some tasks, the level can be evaluated directly from its structure using metrics such as tile counts, connectivity, or solvability. In more dynamic environments, evaluation can also depend on simulated gameplay, such as the actions taken by a deterministic A* agent interacting with the level. This makes it possible to provide rich environment feedback ranging from simple structural statistics to behavior-based signals.
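The loop above can be sketched in a few lines. Everything here is an illustrative stand-in, not the framework's actual API: the metric is a toy open-tile count (the real environments compute connectivity, solvability, and A*-gameplay signals), and the proposer is a plain function where the framework uses the LLM agent.

```python
# Minimal sketch of the perceive-reason-plan-edit loop. All names
# (LevelEnv, optimize, the open-tile metric) are illustrative.

class LevelEnv:
    """Wraps a game level and returns structured metric feedback."""

    def __init__(self, width, height, target_open_tiles):
        self.grid = [[1] * width for _ in range(height)]  # 1 = wall, 0 = floor
        self.target = target_open_tiles

    def evaluate(self):
        # Stand-in structural metric: open-tile count. A real environment
        # would also check connectivity or simulate an A* agent.
        open_tiles = sum(row.count(0) for row in self.grid)
        return {"open_tiles": open_tiles, "error": abs(open_tiles - self.target)}

    def apply_edit(self, edit):
        r, c, tile = edit  # a single tile placement; real tool sets are richer
        old = self.grid[r][c]
        self.grid[r][c] = tile
        return old


def optimize(env, propose_edit, steps=100):
    """Perceive metrics, let the proposer suggest an edit, keep it if it
    does not worsen the distance to the target, otherwise revert."""
    best = env.evaluate()["error"]
    for _ in range(steps):
        feedback = env.evaluate()                      # perceive
        r, c, tile = propose_edit(env.grid, feedback)  # reason + plan
        old = env.apply_edit((r, c, tile))             # edit
        err = env.evaluate()["error"]
        if err <= best:
            best = err
        else:
            env.grid[r][c] = old                       # revert regressing edits
    return best
```

Even with a proposer that just opens random tiles, this loop drives the level toward the target metric value; the point of the framework is that the LLM's proposals are informed by the structured feedback rather than random.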

See the detailed optimization process below ↓

Tool-Using Level Design

The tool set can include both simple and complex operations. At the simplest level, the agent may place individual tiles, draw lines, or edit patches of the map. But the framework is not restricted to these primitive edits. In the Binary Maze domain, for example, the agent can also invoke classic PCG algorithms that are naturally suited to maze generation, such as binary space partitioning and tree-search-based diggers.
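A mixed tool set of this kind might look as follows. The names and signatures are hypothetical, and the BSP routine is a generic textbook variant rather than the framework's implementation; it only illustrates how a whole classic PCG algorithm can sit beside primitive edits as a single callable tool.

```python
# Illustrative tool set mixing primitive edits with a classic PCG
# algorithm. All names and signatures here are hypothetical.
import random

def place_tile(grid, r, c, tile):
    """Primitive tool: set a single cell."""
    grid[r][c] = tile

def draw_line(grid, r, c0, c1, tile=0):
    """Primitive tool: carve a horizontal corridor along row r."""
    for c in range(min(c0, c1), max(c0, c1) + 1):
        grid[r][c] = tile

def bsp_rooms(grid, r0, c0, r1, c1, min_size=3, rng=random):
    """Classic-PCG tool: binary space partitioning. Recursively split the
    region along its longer axis, then carve a room in each leaf
    (0 = floor, 1 = wall)."""
    h, w = r1 - r0, c1 - c0
    if h <= 2 * min_size and w <= 2 * min_size:
        for r in range(r0 + 1, r1 - 1):          # leaf: carve the interior
            for c in range(c0 + 1, c1 - 1):
                grid[r][c] = 0
        return
    if h >= w:
        cut = rng.randint(r0 + min_size, r1 - min_size)
        bsp_rooms(grid, r0, c0, cut, c1, min_size, rng)
        bsp_rooms(grid, cut, c0, r1, c1, min_size, rng)
    else:
        cut = rng.randint(c0 + min_size, c1 - min_size)
        bsp_rooms(grid, r0, c0, r1, cut, min_size, rng)
        bsp_rooms(grid, r0, cut, r1, c1, min_size, rng)

# The agent sees each entry as one invocable tool, regardless of whether
# it edits one cell or regenerates a whole region.
TOOLS = {"place_tile": place_tile, "draw_line": draw_line, "bsp_rooms": bsp_rooms}
```

Exposing an entire algorithm as one tool lets the agent make large structural moves in a single step and then fall back to primitive edits for fine-tuning, which matches the coarse-to-fine pattern shown in the Binary Maze example below.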

Binary Maze optimization: from initial digger generation to BSP, manual corridor rerouting, and precise fine-tuning to hit the target path length.

Free-Form Language Instructions

Beyond functional constraints, the framework can also incorporate free-form natural language instructions. This enables open-ended language control on top of explicit metric-based requirements, so the agent can satisfy concrete functional targets while also adapting to higher-level design goals such as theme, story, or intended player experience.

Mario levels generated with natural language instructions guiding theme and structure alongside playability constraints.

Targeting Different Controllable Metrics

The agent can be directed to optimize toward specific target values for different controllable metrics in each game domain. Each grid below shows multiple trials where the agent targets a different metric value per trial, demonstrating fine-grained controllability over level properties.
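One simple way to read such grids is a normalized distance between the measured metric and its per-trial target. The sketch below is a hypothetical scoring function for illustration, not the paper's evaluation code.

```python
# Hypothetical controllability score: 1 means the measured metric hit its
# target exactly; 0 means it missed by at least the metric's full range.

def controllability_score(measured, targets, ranges):
    """Average over metrics of 1 - |measured - target| / range."""
    scores = []
    for name, target in targets.items():
        lo, hi = ranges[name]
        err = abs(measured[name] - target) / (hi - lo)
        scores.append(max(0.0, 1.0 - err))
    return sum(scores) / len(scores)
```

For example, a final maze whose path length measures 18 against a target of 20, on a 0–40 range, would score 0.95 under this definition.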

Binary Maze final maps

Binary Maze controllability across metrics

Binary Door final maps

Binary Door controllability across metrics

Lode Runner final maps

Lode Runner controllability across metrics

Zelda final maps

Zelda controllability across metrics

Sokoban final maps

Sokoban controllability across metrics

SMB final maps

SMB controllability across metrics

Detailed Optimization Process

Accepted edits across each game domain, showing the agent's rationale and level changes at each step.

Binary Door

Lode Runner

Zelda

Sokoban

Super Mario Bros