using LLMs as game engines

idea

I’ve been playing around with LLMs for a while, and I had this idea: what if we could use them not just for generating code, but as an actual game engine? It’s not practical, but I thought it might be an interesting and fun experiment.

The basic idea was to see if an LLM could:

  1. Take in the current game state and player input
  2. Process that information based on game rules
  3. Spit out a new game state
               Rules                  

               ┌─▼───────────┐        
   ┌───Input───▶             │        
   │           │ Game engine ┣━Sn+1━┓ 
┌──┴───┐  ┌─Sn─▶             │      ┃ 
│ User │  │    └─────────────┘      ┃ 
└──▲───┘  └───────────◀━━━━━━━━━━━━━┫ 
   │                                ┃ 
   │          ┌─────────────┐       ┃ 
   └──Image───│  Renderer   ◀━━━━━━━┛ 
              └─────────────┘         

Architecture of our game
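The loop in the diagram can be sketched as follows. This is a minimal sketch, not the project’s actual code: the prompt wording, the JSON state schema, and the `complete` callback (which would wrap a real chat-completion API call) are all assumptions, and `fake_llm` is a deterministic stand-in used only to exercise the loop locally.

```python
import json

# One tick of the architecture above: the "engine" is a single LLM call
# mapping (rules, current state, player input) -> next state.
RULES = (
    "You are a Snake game engine. Given the current state and the "
    "player's input, return the next state as JSON with the same schema."
)

def build_prompt(state: dict, player_input: str) -> str:
    """Assemble the prompt sent to the model on every tick."""
    return (
        f"{RULES}\n"
        f"State: {json.dumps(state)}\n"
        f"Input: {player_input}\n"
        "Next state (JSON only):"
    )

def step(state: dict, player_input: str, complete) -> dict:
    """Advance the game one tick; `complete` is any text-completion function."""
    return json.loads(complete(build_prompt(state, player_input)))

# Deterministic stand-in for the model, only to exercise the loop locally.
def fake_llm(prompt: str) -> str:
    state = json.loads(prompt.split("State: ")[1].split("\nInput:")[0])
    direction = prompt.split("Input: ")[1].split("\n")[0]
    dx, dy = {"up": (0, -1), "down": (0, 1),
              "left": (-1, 0), "right": (1, 0)}[direction]
    head = state["snake"][0]
    # Move the head one cell and drop the tail segment.
    state["snake"] = [[head[0] + dx, head[1] + dy]] + state["snake"][:-1]
    return json.dumps(state)

new_state = step({"snake": [[5, 5], [4, 5]], "food": [8, 2]}, "right", fake_llm)
```

In a real run, `complete` would call the model’s API; the rest of the loop doesn’t change.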

By doing this, we’re effectively using the LLM as a natural-language computer. The idea that LLMs and AI could replace traditional computers is not new.

experiment

To test this out, I built a simple Snake game. In my first attempt, I provided just the ASCII representation of the screen to the model, without any external state; however, the model couldn’t compute the new state correctly. I experimented with different ways of giving the state to the model and tried various models (Llama 3.1 8B and GPT-4o), but none produced satisfying results.
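To make the distinction concrete, here is roughly what the explicit-state approach looks like: the model works on a small structured state, and a separate renderer (the bottom box in the diagram) turns that state into the ASCII grid the player sees. The field names (`snake`, `food`) and grid size are hypothetical, not the project’s actual schema.

```python
# Hypothetical explicit game state, kept outside the model, plus the
# renderer that turns it into the ASCII screen. In the failed first
# attempt, only the ASCII output below was given to the model.
def render(state: dict, width: int = 10, height: int = 6) -> str:
    """Turn the explicit state into the ASCII grid the player sees."""
    grid = [["." for _ in range(width)] for _ in range(height)]
    fx, fy = state["food"]
    grid[fy][fx] = "F"  # food
    for i, (x, y) in enumerate(state["snake"]):
        grid[y][x] = "H" if i == 0 else "S"  # head vs. body segment
    return "\n".join("".join(row) for row in grid)

state = {"snake": [[2, 1], [1, 1]], "food": [4, 3]}
screen = render(state)
```

Asking the model to update the structured state, rather than the rendered grid, is what made the difference in my runs.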

Eventually, I settled on this configuration:

It actually worked, which was surprising. Obviously, though, this is a very stupid way of running a game:

  1. It’s slow. Like, really slow compared to a normal game engine.
  2. It eats up a lot of compute for nothing.
  3. The whole thing can break if you word the prompts slightly wrong.
  4. You can only do super simple games right now.
  5. If something breaks, you’re at the mercy of the model to figure out what’s going on.

GIF of the Snake game (shown at 4x speed)

While this isn’t going to replace Unity or Unreal Engine anytime soon, it’s still interesting, even impressive, that LLMs are capable of this at all. With LLM capabilities increasing and inference costs (in both money and time) decreasing, we might end up in a situation where replacing some programs with an LLM is a real possibility.

conclusion

This experiment is more of a “what if” scenario than a practical application, but it’s a fun way to explore the capabilities of these AI models and might spark some interesting ideas for future projects. Although it doesn’t seem impressive on paper, it’s quite fascinating to play a game that hasn’t been traditionally coded at all.

It’s also an interesting way to test what LLMs can do. It checks their ability to follow complex instructions, keep track of information, and solve problems in a changing environment.

I’ve put the project up on my GitHub if anyone wants to check it out or build on the idea. Feel free to contribute or just mess around with it.