🧱Choosing the hardware
Which mini-PC, how much RAM, which SSD? The jargon-free buying guide for a machine that runs a coding agent and AI models locally.
Before you buy anything, good news: you don’t need a beast of a machine. A coding agent and compact AI models run just fine on a small PC costing a few hundred euros. But there’s one criterion that matters more than all the others, and it’s not the one people usually look at. Let’s untangle it together, no jargon.
The three things that really matter (in order)
Forget the marketing. To run AI locally, here’s what counts, ranked from most to least important.
1. RAM, by far factor number 1
This is the decision that matters most. To answer you, an AI model has to fit entirely in memory. Without enough RAM, the model either won’t load at all or crawls horribly. Here are the concrete tiers:
- 16 GB: the bare minimum. Enough to run the coding agent (which can lean on a cloud model), but too tight for a real local AI model. Avoid it if you can.
- 32 GB: comfortable. You run a 30-billion-parameter model in a quantized (compressed) version, which already covers a huge range of needs. A solid entry point.
- 64 GB: the sweet spot. This is where local AI gets really serious: you load big models, you keep headroom for the system and the agent at the same time. If you’re torn between 32 and 64, take 64.
- 96 GB and up: the luxury. For the heaviest models or running several things in parallel. Nice, but not necessary to start.
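To see where these tiers come from, here is a minimal back-of-the-envelope sketch. The figures in it are assumptions, not measurements: roughly 4.5 bits per weight for a 4-bit quantized model, about 2 GB of working overhead, and 8 GB kept aside for the system and the agent. The function names are made up for this example.

```python
def model_memory_gb(params_billion: float, bits_per_weight: float = 4.5,
                    overhead_gb: float = 2.0) -> float:
    """Rough memory needed to hold a quantized model, in GB (1 GB = 1e9 bytes)."""
    # 1e9 parameters * (bits / 8) bytes each = that many GB of weights.
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb + overhead_gb


def fits_in_ram(params_billion: float, ram_gb: float, reserved_gb: float = 8.0) -> bool:
    """True if the model fits once reserved_gb is set aside for the system and agent."""
    return model_memory_gb(params_billion) <= ram_gb - reserved_gb


if __name__ == "__main__":
    print(f"30B model: ~{model_memory_gb(30):.1f} GB")
    for ram in (16, 32, 64, 96):
        print(f"{ram} GB RAM -> fits a 30B model: {fits_in_ram(30, ram)}")
```

Under these assumptions a 30-billion-parameter model needs roughly 19 GB, which is why 16 GB is too tight and 32 GB is a workable entry point.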
2. Memory bandwidth, the hidden speed
Less well known, but decisive for comfort. Memory bandwidth is the speed at which the processor reads RAM. And to produce each word, an AI model has to reread all the weights it uses from memory. The higher the bandwidth, the faster the words come out (speed is measured in tokens per second). A machine with fast memory “types” its text before your eyes; a slow machine doles it out word by word. Keep an eye on this point, especially on Macs (see below), where unified memory is particularly fast.
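That relationship gives a handy upper bound: tokens per second can’t exceed memory bandwidth divided by the amount of data read per token. Here is a minimal sketch, with made-up bandwidth figures chosen only for illustration (they are not measurements of any specific machine). Real speeds land below this ceiling.

```python
def max_tokens_per_second(bandwidth_gb_s: float, gb_read_per_token: float) -> float:
    """Theoretical ceiling: each token requires reading gb_read_per_token from memory."""
    if gb_read_per_token <= 0:
        raise ValueError("gb_read_per_token must be positive")
    return bandwidth_gb_s / gb_read_per_token


if __name__ == "__main__":
    model_gb = 20  # a ~20 GB model, assumed to be read in full for every token
    for bandwidth in (50, 100, 250, 500):  # hypothetical GB/s values
        ceiling = max_tokens_per_second(bandwidth, model_gb)
        print(f"{bandwidth:>3} GB/s -> at most {ceiling:.1f} tokens/s")
```

Doubling the bandwidth doubles the ceiling, which is why this number deserves a look on the spec sheet.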
3. The NVMe SSD, fast and roomy
Models are big: count on ~20 GB per model, and you’ll quickly collect several. Aim for an NVMe SSD of at least 1 TB. NVMe (rather than an older SATA SSD) because loading 20 GB into memory at launch should take a few seconds, not a coffee break.
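The arithmetic behind that advice fits in a few lines. This is a rough sketch: the read speeds are assumed round figures (about 0.55 GB/s for a SATA SSD, 3.5 GB/s for an NVMe drive), and the 100 GB set aside for the system is also an assumption. Actual load times depend on the drive and on the software doing the loading.

```python
def load_seconds(model_gb: float, read_gb_per_s: float) -> float:
    """Time to read a model file from disk at a sustained sequential read speed."""
    return model_gb / read_gb_per_s


def models_that_fit(disk_gb: float, model_gb: float, reserved_gb: float = 100.0) -> int:
    """How many models of model_gb fit after reserving space for the system."""
    return max(0, int((disk_gb - reserved_gb) // model_gb))


if __name__ == "__main__":
    for label, speed in (("SATA SSD", 0.55), ("NVMe SSD", 3.5)):  # assumed GB/s
        print(f"{label}: ~{load_seconds(20, speed):.0f} s to read a 20 GB model")
    print(f"1 TB drive: room for about {models_that_fit(1000, 20)} models of 20 GB")
```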
So what about the processor?
It matters less than you think for this use. Any AMD Ryzen or Intel Core from the last two or three generations does the job easily. Don’t pay extra for the fastest CPU; put that money into RAM. It’s the RAM that decides what you’ll be able to do.
Which form factor?
Several families, all valid. I’ll stay neutral: choose based on your budget and your preferences.
- Barebones or ready-to-use mini-PCs: Minisforum, Beelink, GMKtec, ASUS NUC. The most flexible choice: compact, frugal, and on the barebones versions you add the RAM and SSD yourself, so you can push memory to the max cheaply.
- Apple Mac mini (M chips): an excellent alternative, and even a well-kept secret for local AI. Its unified memory is fast and shared between CPU and GPU, so a 64 GB Mac mini runs models that a consumer graphics card can’t load. Small caveat: it runs macOS, not Linux.
iGPU or dedicated graphics card?
The big question, and the answer might surprise you. For this use:
- A dedicated NVIDIA graphics card makes models run much faster. But it’s capped by its VRAM (24 to 32 GB at best in practice), it adds noise, power draw and cost, and it rarely fits in a mini case.
- Most people get along just fine with a mini-PC with a processor/iGPU + lots of RAM, running compact MoE models (clever models that activate only a part of themselves for each answer, see Choosing your local model).
The trade-off, plainly: a dedicated GPU gives you speed, but limits your model size and adds noise. Abundant RAM gives you big models, slower but silent. To start, the “lots of RAM, no GPU” approach is the simplest and most worry-free.
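Why do MoE models suit the “lots of RAM, no GPU” route? All of the model has to sit in memory, but only the active part is read for each word. Here is a minimal sketch with assumed numbers: 4.5 bits per weight, a hypothetical model with 30 billion parameters in total and 3 billion active, and a made-up bandwidth of 80 GB/s. The results are theoretical ceilings, not benchmarks.

```python
def weights_gb(params_billion: float, bits_per_weight: float = 4.5) -> float:
    """Size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def moe_estimate(total_b: float, active_b: float, bandwidth_gb_s: float):
    """Return (RAM for the weights in GB, ceiling in tokens per second)."""
    ram_gb = weights_gb(total_b)          # every expert must be resident in memory
    read_per_token = weights_gb(active_b)  # only the active part is read per token
    return ram_gb, bandwidth_gb_s / read_per_token


if __name__ == "__main__":
    bandwidth = 80  # hypothetical GB/s
    for label, total, active in (("dense 30B", 30, 30), ("MoE 30B, 3B active", 30, 3)):
        ram, tps = moe_estimate(total, active, bandwidth)
        print(f"{label}: ~{ram:.1f} GB of weights, at most {tps:.1f} tokens/s")
```

Both models need the same RAM, but the MoE model’s ceiling is ten times higher because it reads a tenth of the data per word.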
Frugal and silent: perfect for 24/7
We’ll say it again because it matters: these machines draw 10 to 30 W at idle and are nearly silent. That’s exactly what you want for a box running permanently in a corner of the office. A dedicated GPU breaks that calm a bit; it’s up to you whether the speed is worth it.
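To put a number on “frugal,” here is a quick sketch that converts idle watts into yearly consumption. The electricity price of €0.25 per kWh is an assumption for illustration; plug in your own rate.

```python
HOURS_PER_YEAR = 24 * 365


def yearly_kwh(watts: float) -> float:
    """Energy used in a year by a machine drawing a constant number of watts."""
    return watts * HOURS_PER_YEAR / 1000


def yearly_cost(watts: float, price_per_kwh: float = 0.25) -> float:
    """Yearly cost at an assumed price per kWh (0.25 is a placeholder)."""
    return yearly_kwh(watts) * price_per_kwh


if __name__ == "__main__":
    for watts in (10, 30):
        print(f"{watts} W around the clock: {yearly_kwh(watts):.1f} kWh/year, "
              f"about {yearly_cost(watts):.0f} EUR/year at the assumed rate")
```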
Don’t want to buy? The VPS option
Let’s be honest: you can also buy nothing and rent a server in the cloud, a VPS (virtual private server). It’s a virtual machine at a host (Hetzner, OVH, Scaleway, DigitalOcean…), billed monthly, already running Linux and reachable from anywhere. Everything else in the path (agent, Docker, networking, deployment) applies just the same.
It’s a route I’ve tested less than the homemade mini-PC, so I’ll give it to you for what it is, with its trade-offs:
- For: no hardware to buy, nothing to plug in, a public IP and bandwidth from the get-go, and you scale power up or down in a few clicks.
- Against: it’s a subscription (from a few euros to a few dozen per month, ticking even while you sleep), your data lives on someone else’s machine, and above all the big local LLM isn’t part of the deal: affordable VPSes have no GPU, so you fall back on the hybrid approach (orchestrator in the cloud, the VPS serves your projects and, at best, small models). GPU offers exist, but the price climbs fast.
My take: to learn and host projects without buying anything, a small VPS is a perfectly decent playground. To run real local models, the heart of this site, nothing replaces a machine of your own, with its RAM and, if you want, its GPU. Up to you where you set the cursor.
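If the choice comes down to money, a break-even estimate helps. The sketch below uses purely illustrative prices (a €500 machine, a €20-per-month VPS, €3 per month of electricity), all of them assumptions. Keep in mind that the two options are not equivalent, since an affordable VPS won’t run big local models.

```python
import math


def break_even_months(hardware_eur: float, vps_eur_per_month: float,
                      power_eur_per_month: float = 0.0) -> float:
    """Months after which owning the machine costs less than renting the VPS."""
    monthly_saving = vps_eur_per_month - power_eur_per_month
    if monthly_saving <= 0:
        return math.inf  # the VPS is cheaper per month: the purchase never pays off
    return hardware_eur / monthly_saving


if __name__ == "__main__":
    months = break_even_months(500, 20, 3)  # illustrative prices only
    print(f"Break-even after about {months:.0f} months")
```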
The recommendation table
Three profiles, depending on your budget and ambition.
| Profile | RAM | What for | Reference |
|---|---|---|---|
| Discovery budget | 32 GB | Coding agent + a small quantized local model | Entry-level barebones mini-PC |
| Comfort (recommended) | 64 GB | The sweet spot: big local models + agent, headroom everywhere | Minisforum / Beelink 64 GB, or Mac mini 32 GB |
| Heavy-duty | 96 GB+ | The heaviest models, several workloads in parallel | Mac mini 64 GB, or mini-PC + dedicated GPU if you want the speed |
If you should remember just one line: aim for “Comfort,” 64 GB. It’s the best fun/price ratio, and you won’t feel cramped six months from now.
The machines I recommend (and have tested)
Enough generalities: here are concrete models, several of which I’ve put through the test bench. Four families, depending on what you want to do.
The all-rounder mini-PC, Minisforum M2
My recommended starting point: compact, frugal, silent, and built to run 24/7 in a corner. You run the agent and a quantized local model on it without breaking a sweat. It’s the healthiest footprint / performance / price balance to get started. → My Minisforum M2 review on Frandroid
Unified memory, Mac mini M4, Mac Studio & Framework Desktop
If local AI is your real subject, unified memory is a weapon. CPU and GPU share a single, very fast memory, which lets you load big models that no consumer graphics card can hold.
- Mac mini (M4, M5) / Mac Studio: the best of its kind on the Apple side, with ultra-fast unified memory in a tiny, silent machine. The latest M5 chips gain even more speed, and the Mac Studio scales very high on memory for the heaviest models. (Reminder: macOS, not Linux, as noted above.)
- Framework Desktop: my favorite on the x86 unified-memory side. It carries an AMD chip with generous and very fast unified memory, and it is repairable and open like everything Framework makes. An excellent host for beefy local models, and it runs Linux. → My Framework Desktop review on Frandroid
Why unified memory is a game changer (and how to spot it)
This is the most important technical point in this whole guide, so let’s take our time. On a classic PC, there are two separate memories: the processor’s RAM, and the graphics card’s VRAM. The GPU can only use its VRAM, often 8, 12 or 16 GB, and not a byte more. It’s a wall.
Unified memory breaks that wall: CPU and GPU share one and the same pool of memory, very fast. So you can allocate a large share of that memory to the GPU when an AI model needs it. Concretely: a 64 GB unified-memory machine can present, say, 48 GB “as VRAM” to a model. No consumer graphics card knows how to do that. This is what lets a little Mac mini or a Framework Desktop load models that a €1,500 RTX simply cannot hold.
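The “wall” can be written as a simple check. In this sketch the 0.75 share mirrors the 48-out-of-64 GB example above; the share a real machine can hand to the GPU depends on the system and its settings, so treat it as an assumption. The 30 GB model is hypothetical.

```python
def usable_gpu_memory_gb(total_gb: float, unified: bool, gpu_share: float = 0.75) -> float:
    """Memory the GPU can use for a model.

    Dedicated card: total_gb is its VRAM, and that is all it gets.
    Unified memory: an assumed share of the common pool.
    """
    return total_gb * gpu_share if unified else total_gb


def can_load(model_gb: float, total_gb: float, unified: bool) -> bool:
    """True if a model of model_gb fits in the memory the GPU can use."""
    return model_gb <= usable_gpu_memory_gb(total_gb, unified)


if __name__ == "__main__":
    model_gb = 30  # hypothetical model size
    machines = (("16 GB graphics card", 16, False),
                ("64 GB unified-memory machine", 64, True))
    for name, total, unified in machines:
        verdict = "loads" if can_load(model_gb, total, unified) else "does not fit"
        print(f"{name}: {usable_gpu_memory_gb(total, unified):.0f} GB usable -> {verdict}")
```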
High bandwidth, a PC with a dedicated graphics card
Aiming for maximum speed, or also doing creative work (image generation, video, training)? There, a real graphics card with its dedicated memory makes full sense: its bandwidth crushes that of an iGPU, and the tokens fly. My value-for-money / memory tip is the GeForce RTX 5060 Ti 16 GB, with 16 GB of VRAM at a reasonable price: the sweet spot for running good models fast without blowing the budget.
The ports to check before buying
Before clicking “order,” run through this little checklist:
2.5 GbE Ethernet
Not mandatory, but pleasant: fast wired networking is comfortable for transferring models or serving a project. Wi-Fi gets you by, the cable reassures.
Enough USB
Enough to plug in a keyboard, an install stick, an external drive for backups. Check there are at least two or three ports.
SODIMM RAM, upgradeable
The detail that changes everything: on barebones models, the RAM comes as SODIMM sticks that you install yourself. You choose your capacity and you can upgrade later. Steer clear (for this project) of machines where the RAM is soldered and fixed. The exception is unified-memory machines such as Macs, where the memory is chosen at purchase and can’t be changed afterward.
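Back to the Ethernet item in this checklist: here is what 2.5 GbE changes when you copy a model between machines. A small sketch that uses raw line rates and ignores protocol overhead, so real transfers take a little longer.

```python
def transfer_seconds(size_gb: float, link_gbit_s: float) -> float:
    """Time to move size_gb gigabytes over a link rated in gigabits per second."""
    return size_gb * 8 / link_gbit_s  # 8 bits per byte


if __name__ == "__main__":
    for label, rate in (("1 GbE", 1.0), ("2.5 GbE", 2.5)):
        print(f"{label}: ~{transfer_seconds(20, rate):.0f} s for a 20 GB model")
```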