Dec 23 AM

A tiny bit of energy in the wee hours of the morning.

So what do?

Of course I run off down a pointless mental exercise. 

Although. I do think this nails my paradoxical existence. On the one hand I still have lots of things in my head I can do. More than ever. So many things. So many projects. And I want to believe that I have the energy to do it - or if you want to be more brutal, I am continually in denial that I can't and refuse to accept reality.

And on the other hand. Most of my existence is crash outs and running along the edge of just ending it all.

On the surface, maybe that seems weird. From possibilities. To ending it all. But. I don't think it is. I think it's exactly that struggle. The wanting to do shit. The inability to do shit. The mental scream that I am held back. And all the health shit on top of it making me feel terrible. I think. I stand at a horrible intersection here. I can see patterns on patterns. I can see a landscape of how they all fit together. This is easy for me. I get that this isn't easy for everyone. I can also project forwards. And explore the boundaries where dragons are. I'm good at that. Pushing envelopes. As well as neatly utilising existing paradigms. And I've spent most of my life fucking around in that space. Perfecting my skills at that.

And now I can't. Not because I intellectually can't. Because I physically can't. And it means the loss is horrible. It's not just being ill. It's about having all those nice things. And having them taken away from me. To say it sucks is an understatement. It's. Existential scream of horror kind of thing. When you frame it like that. Just that bit. Let alone the rest of it. The damage, and the fact that loss kills me. It's. More than understandable that I want to end it all. I think. Not coming to that conclusion would be the insane bit.

Anyway.

Ho hum.

Bollocks.

So. This morning's waste-of-energy drain.

It has been kicking around in my head to embed an LLM into a game. As soon as LLMs trundled along, my mind immediately went to: oh cool, interesting NPCs to talk to. But the size constraints made that impossible. With each shrinking of local LLMs, my eyes get brighter.

There is a trend in AI space: 1) the models are getting more efficient and tighter, i.e. smaller, with less problematic hardware requirements; 2) the hardware is advancing, as hardware has ever done.

Inevitably you get to a tipping point.

Running AIs locally was a tipping point that was reached some time ago, assuming you have some kind of gaming GPU in your system. If you have a decent one, you're laughing. You can get some industry-leading performance out of it. Which is. Astonishing.

The critical thing here, though, is that games like their GPUs. They are greedy about their GPUs. Forcing a game to share its living space with an equally greedy AI is not great.

That being said.

Decent GPUs these days are pitched towards monster triple-A titles with huge texture sets. If you're living in the land of more modest indie gaming - certainly anything 2D is a shoo-in, and anything that isn't insane 3D is also OK - then you have some headroom on the GPU to drop in an indie game experience, with a smart LLM powering your bits and bobs that require immersion.

Oooh.

To that end, this evening I have given the tyres a kick, specifically llama.cpp's. This is an open-source bit of client software that lets you load and run a number of different models that adhere to a certain standard (GGUF). It's free. The models are free. You can fill your boots. All you need is a tiny amount of tech savvy - command prompts don't scare you.

For games, what you want is a super-compressed LLM. We don't want giant models that swamp the system. For this, there are a couple of considerations. The first is the model itself.

How fat is the model? This, if you like, is the brain. How big is the brain? You can do clever things here with optimisation - quantisation - to scrunch these down. Let's take Microsoft's Phi model. A quantised build unfolds itself into VRAM in something like 2 or 3GB.

Consider we take an 8GB graphics card as a bottom spec. For normal people this is enormous. For gamers, this is about the bottom of modern spec. Comfortable modern spec is at least 12GB, and ideally 16GB.

But let's stick with 8GB. If we load in a model that's 2 or 3GB, it leaves us with 5GB or so to play with for our game.

Except. We also need context. Context is how much the model can remember at one time. If the model is the brain, the context is its working memory. Starve it of memory and it won't get to the end of its thought chain before it forgets what it was talking about. Also, if you want to have a lengthy conversation, you're going to need more of that context memory.

In practice, you can have something between 2k and 8k of context in terms of tokens. This then pushes your GB allowance up again. To, say, 4GB on the low side. And 7GB on the upside.
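If you want to sanity-check those numbers yourself, here's a rough back-of-envelope sketch. The KV cache (the context's working memory) grows linearly with context length; all the model dimensions below are assumptions loosely based on a Phi-3-mini-class model, so treat the output as ballpark, not gospel.

```python
# Back-of-envelope VRAM budget: model weights + KV cache.
# All model dimensions below are assumptions for a Phi-3-mini-class
# model - check your actual GGUF's metadata for the real values.

MODEL_FILE_GB = 2.4   # quantised weights loaded into VRAM (assumed)
N_LAYERS      = 32    # transformer layers (assumed)
N_KV_HEADS    = 32    # key/value heads (assumed)
HEAD_DIM      = 96    # dimension per head (assumed)
BYTES_PER_VAL = 2     # fp16 KV cache entries

def kv_cache_gb(context_tokens: int) -> float:
    """KV cache = 2 (K and V) x layers x kv_heads x head_dim x tokens x bytes."""
    return (2 * N_LAYERS * N_KV_HEADS * HEAD_DIM
            * context_tokens * BYTES_PER_VAL) / 1024**3

for ctx in (2048, 4096, 8192):
    print(f"{ctx:>5} tokens: ~{kv_cache_gb(ctx):.2f}GB cache, "
          f"~{MODEL_FILE_GB + kv_cache_gb(ctx):.2f}GB total")
```

On top of that, the runtime also allocates compute buffers and scratch space, so real-world usage sits a chunk above these raw figures - which squares with the 4GB-to-7GB range above.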

This starts to get tight.

So. I've tested all of this this morning. 2k limits. 4k limits. 8k. What it means for your engine.

And a pattern emerges.

If you want a deeper conversation - say, go talk with your village elders - and assuming an 8GB budget, you want to load into some chat / village-elders screen that unloads most of your graphical content and loads an AI to take over most of your GPU. Chat with it. Have a good conversation. Then shut it down, and reload your world. This neatly fits in with scene transitions. Walking in and out of your "chatting hall". "Parliament". Whatever.
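As a minimal sketch of that load/unload dance, assuming you run llama.cpp's bundled llama-server binary as a child process: the flags below match recent llama.cpp builds as far as I know, but double-check against your version, and the model filename is obviously a placeholder.

```python
# Sketch: spin an LLM up/down around a scene transition.
# Assumes llama.cpp's llama-server binary is on PATH and model.gguf
# exists - adjust names and flags for your own setup.
import subprocess, time, urllib.request

llm = None

def enter_chat_scene():
    """Unload heavy graphics elsewhere, then hand the GPU to the LLM."""
    global llm
    llm = subprocess.Popen([
        "llama-server",
        "-m", "model.gguf",   # your quantised model file
        "-c", "4096",         # context size in tokens
        "-ngl", "99",         # offload all layers to the GPU
        "--port", "8080",
    ])
    wait_until_ready()

def leave_chat_scene():
    """Kill the server, freeing VRAM for the game world again."""
    global llm
    if llm:
        llm.terminate()
        llm.wait()
        llm = None

def wait_until_ready(timeout=60):
    """Poll the server's health endpoint until it answers."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            urllib.request.urlopen("http://127.0.0.1:8080/health", timeout=1)
            return
        except OSError:
            time.sleep(0.5)
    raise RuntimeError("LLM server never came up")
```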

Also. You can run a less deep, smaller-context AI in the background all the time within a 4GB limit. If you are careful with squeezing your world into the remaining 4GB. Or, if not, unload it entirely.

But there are options here. Non-stupid, not-over-the-top 3D worlds - background smart AI. Deeper smart AI - cut your interface down to a chat screen.

There's also another option I tested: running it not on the GPU but on the CPU. The speed difference is very noticeable. CPU is slow. Even on my gaming rig. However.

Another game mechanic shows up here. You could write letters to other places, and the CPU can formulate a reply slowly (say, over a minute) before a letter gets sent back to you. This is a perfect replica of slow communication methods. Mail. Courier. Emails. Whatever. And you can run that kind of background chatty AI all the time. Not that you'd really need it. But you could.
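A sketch of that letters idea, assuming the same local server but started CPU-only (with llama.cpp, -ngl 0 keeps every layer off the GPU): fire the request off on a worker thread and have the game loop poll a mailbox for the finished letter. The ask_llm function here is a stand-in placeholder, not a real API.

```python
# Sketch: slow in-game letters answered by a CPU-only model.
import threading, queue, time

mailbox = queue.Queue()   # finished replies land here

def ask_llm(prompt: str) -> str:
    # Placeholder - wire this to your local model however you like
    # (e.g. the HTTP call sketched further down). Here we just stall.
    time.sleep(2)
    return f"Dearest friend, regarding '{prompt}'..."

def send_letter(prompt: str) -> None:
    """Kick the reply off on a worker thread; the game keeps running."""
    def work():
        mailbox.put(ask_llm(prompt))  # slow on CPU - maybe a minute
    threading.Thread(target=work, daemon=True).start()

def check_mailbox():
    """Call from your game loop each tick; None means still in transit."""
    try:
        return mailbox.get_nowait()
    except queue.Empty:
        return None
```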

All of it is rather cool.

You can download llama.cpp and give it a go. Download Microsoft's Phi. Fire it up. Good to go. You can embed it and run it as its own child process under your game. Or fire it up as a server and interact with it over a local HTTP API with JSON. Easy peasy.
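For the server route, the interaction really is just JSON over local HTTP. The llama.cpp server builds I tried expose an OpenAI-compatible chat endpoint, so a minimal round-trip might look like this sketch - port and model are whatever you started the server with, and the elder persona is just for flavour.

```python
# Minimal chat round-trip against a local llama.cpp server.
# Assumes it was started with something like:
#   llama-server -m model.gguf -c 4096 --port 8080
import json, urllib.request

def ask_llm(prompt: str) -> str:
    payload = {
        "messages": [
            {"role": "system", "content": "You are a weary village elder."},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": 256,
    }
    req = urllib.request.Request(
        "http://127.0.0.1:8080/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

print(ask_llm("Any news from the northern villages?"))
```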


The upshot of all this is:

Today -

Yes, it's possible to have an AI in a game.
The hardware and technology are currently under constraints, but it's doable with care.
The basics of this are very accessible and not difficult at all.
The licensing allows for this use case.
There are many cool things to be explored here, game-design-wise.

 
