Running Your Own AI Chatbot Locally

I've been using ChatGPT and some other LLMs for a few months for basic programming assistance, treating them a bit like the person at the keyboard when pair programming, but only for simple, mundane tasks. This sort of thing: "Here's a Swift struct, please write me an SQL query to insert / find / update". Or this morning: I'd sketched out a load of pseudo-JSON to model some data structures I was thinking about, so instead of manually typing out Swift structs I gave the chatbot the pseudo-JSON and asked it to make the Swift types. With a few follow-up messages it got 90% of the way there and probably saved me 20 minutes of pretty dull work.
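
To give a flavour of that exchange, here's a made-up example (not my actual data): a rough pseudo-JSON sketch and the kind of Swift type the chatbot hands back.

    import Foundation

    // Hypothetical pseudo-JSON sketch you might feed the chatbot:
    //   { "title": "...", "tags": ["...", "..."], "published": "2024-04-19" }
    //
    // The sort of Swift type it produces in response (you'd still need a
    // dateDecodingStrategy to actually parse that plain-date format):
    struct Post: Codable {
        let title: String
        let tags: [String]
        let published: Date
    }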

I've also tried running some of these models locally, but never found one good enough to actually save me time. Yesterday, though, Meta released Llama 3, their latest model family, and I think this might tip the balance for me.

Here's how you can run your ChatGPT style chatbot locally:

  1. You can run AI models locally with a runtime called Ollama: download and install it from their website, ollama.com
  2. Then download model(s) on the command line, e.g. ollama pull llama3:8b or ollama pull llama3:70b (roughly 16GB of RAM for Llama 3 8B, 64GB for Llama 3 70B)
  3. Then interact with Ollama through a UI, e.g. the macOS/iOS/iPadOS app Enchanted (or, for something more full-featured, Open WebUI); you can also talk to its local API directly, as sketched below
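
Under the hood those apps are just talking to Ollama's local HTTP API, which listens on http://localhost:11434 by default. If you'd rather script it yourself, here's a minimal Swift sketch of a non-streaming request to the /api/generate endpoint; the model name and prompt are only examples.

    import Foundation

    // Request/response shapes for Ollama's /api/generate endpoint.
    struct GenerateRequest: Codable {
        let model: String
        let prompt: String
        let stream: Bool   // false = one complete JSON reply instead of a stream
    }

    struct GenerateResponse: Codable {
        let response: String   // the model's answer (other fields are ignored)
    }

    // Send a prompt to a locally running model and return its reply.
    func ask(_ prompt: String, model: String = "llama3:8b") async throws -> String {
        var request = URLRequest(url: URL(string: "http://localhost:11434/api/generate")!)
        request.httpMethod = "POST"
        request.setValue("application/json", forHTTPHeaderField: "Content-Type")
        request.httpBody = try JSONEncoder().encode(
            GenerateRequest(model: model, prompt: prompt, stream: false)
        )
        let (data, _) = try await URLSession.shared.data(for: request)
        return try JSONDecoder().decode(GenerateResponse.self, from: data).response
    }

    // Usage (from an async context):
    // let reply = try await ask("Write an SQL INSERT statement for this Swift struct: ...")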

You will need a fairly decent computer: at least 16GB of RAM (I have 64GB), plus either an ARM Mac (I have an M1 Ultra) or a recent Nvidia GPU.