Local AI Chatbot experience + resources

by admin, 2023-12-03 15:57:24

Here I've collected some of the things that worked for me and are relatively lightweight.
I wouldn't say the model performs as well as GPT-4 or even GPT-3.5, but it is fine for short, shallow conversations, considering that it runs right on the laptop with no internet connection needed.

My hardware: GPD Pocket 3 (16 GB RAM, i7-1195G7). I run models on the CPU, since the machine has no discrete GPU to speak of.

Chat UI: Amica
Server: llama.cpp
Model: Meta AI's Llama 2.
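For reference, the stack above wires together roughly like this. This is a minimal sketch: the model filename is an assumption (use whichever quantized GGUF file you actually downloaded), and the port is arbitrary.

```shell
# Build llama.cpp (plain CPU build, no GPU flags needed)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

# Start the HTTP server.
# -m: quantized model file (assumed name), -c: context size, -t: CPU threads
./server -m models/llama-2-7b-chat.Q4_K_M.gguf -c 2048 -t 8 --port 8080
```

A chat UI such as Amica can then be pointed at http://localhost:8080; the server also serves its own built-in web chat UI at that address.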

Although a 13B-parameter model runs just fine for me in the console (2-3 tokens per second), performance with the chat UI open becomes abysmal (roughly one token every 5-10 seconds).
So I opted for a 7B-parameter configuration. In my subjective judgment, it is decent enough; right on the edge, so to speak.
I have also tried a 3B-parameter configuration, but it is not good enough to hold even a plausible chat. The one upside is that it can run on a Raspberry Pi (32-bit Raspbian) :)
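A quick back-of-the-envelope calculation shows why these sizes fall where they do on a 16 GB machine. This is a rough sketch, not a benchmark: it assumes ~4.5 bits per weight (as in a typical llama.cpp Q4_K_M quantization) and ignores the extra RAM needed for the KV cache, the OS, and the browser running the chat UI.

```python
# Rough in-RAM footprint of quantized model weights.
# Assumption: ~4.5 bits/weight (Q4_K_M-style 4-bit quantization).

def model_size_gib(params_billion: float, bits_per_weight: float = 4.5) -> float:
    """Approximate weight size in GiB for a given parameter count."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30

for n in (3, 7, 13):
    print(f"{n:>2}B @ ~4.5 bpw: ~{model_size_gib(n):.1f} GiB")
```

So even the 13B weights fit in 16 GB on paper, but once the OS, the chat UI, and the context cache pile on top, the margin gets thin, which matches the slowdown above.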


Upd. 13B performance is somehow better in a web chat UI: the built-in one that llama.cpp's server ships with, not Amica.