NewsPublished on June 16, 2026· 3 min read

Build a local AI inference appliance with Ubuntu Core 26 in a VM

Escritorio de Ubuntu 25.04 (Plucky Puffin) con GNOME — Imagen: Canonical Ltd. / GPL · Wikimedia Commons

Canonical has published a tutorial aimed at anyone who wants to play with Ubuntu Core 26 before committing to dedicated hardware. The premise is straightforward: spin up a virtual machine, install an inference snap, and end up with an AI appliance that answers through an OpenAI-compatible API. Everything runs locally, no cloud involved, and you never touch a physical device until you know exactly what you want to build.

Ubuntu Core is the flavour of Ubuntu built for appliances, edge devices, robotics and industrial systems. It’s minimal and entirely snap-based, which changes the way you work compared to a classic Ubuntu Server. The tutorial uses that design to show the full path, from a quick VM test to the route that leads to a production image.

Spinning up the machine and loading the model

The starting point is Multipass, which pulls the Core 26 image and boots the VM with a single command:

multipass launch core26 -n aibox --cpus 4 --memory 10GB --disk 16GB

Four CPUs, 10 GB of memory, 16 GB of disk. Once inside, the workload arrives as another snap:

sudo snap install gemma4

The gemma4 snap installs the runtime and model configuration that suit the host machine, so there’s no dependency wrangling or manual environment setup. You can confirm the service is running with gemma4 status.

By default the inference service only listens on localhost inside the VM. To expose it to the host machine, you set two snap options:

sudo gemma4 set http.host=0.0.0.0 webui.http.host=0.0.0.0 --assume-yes

From there you get two entry points. The OpenAI-compatible API lives at http://<VM-IP>:8336/v1, which lets you point any client that already speaks that protocol without rewriting anything. The web UI, for hands-on testing, sits at http://<VM-IP>:8337.

Why the appliance model matters

What makes this interesting isn’t just that it works, it’s how it’s separated. The Ubuntu Core base system stays isolated from the application workload: the snap delivers the AI layer, services run as managed components, and configuration happens through snap options rather than editing system files by hand. That cuts down the places where something can break and keeps the operating system in a known state.

That same separation is what makes the jump to production easier. Manual snap installation is the development workflow; when you want this on real hardware, you define a custom Ubuntu Core image through model assertions. An assertion describes which snaps are included, their configuration, permissions and update policies, so the device boots with everything in place and nobody has to install anything manually.

The rest of the model fits that philosophy: transactional updates, application snaps that update independently, rollback when something goes wrong, and centralised fleet management with Landscape once you have many devices spread out. If you come from the traditional server world, this “everything is an immutable, reproducible snap” approach pays off most when you have to maintain hundreds of devices in places where nobody is going to SSH in to fix things.

For anyone weighing up AI at the edge, this tutorial is a cheap way to feel out the appliance model without buying anything. Build the VM, break it, rebuild it, and once the design is clear, move it to a Core image for whatever hardware you settle on.

Source

Original article by Canonical on the Ubuntu blog: A look into Ubuntu Core 26: Building a local AI inference appliance in a virtual machine. Published by Canonical.

Build a local AI inference appliance with Ubuntu Core 26 in a VM

Spinning up the machine and loading the model

Why the appliance model matters

Source

Related systems