Studio CodeAI
PROJECTS · June 2, 2026

HERMES Agent: Sovereign and Secure Deployment on VPS


Secure deployment of Hermes Agent on a VPS — first installment in our series on sovereign AI agents. A fundamental question for any organization adopting AI: who controls this assistant? Its data, its actions, its memory — where do they actually live?

Over the past two years, we have settled into a routine: open a tab, type a question, read an answer. For most of us, artificial intelligence lives in a browser tab, on someone else's servers.

A new generation of tools is quietly changing that logic. These are no longer chatbots we visit, but agents that reside somewhere: ideally, on a machine we control. They remember conversations across sessions, execute tasks, and remain reachable through the messaging applications we already use.

This article describes how we deployed Hermes Agent, an open-source framework, on a simple virtual private server (VPS), with security as the guiding principle from start to finish.

What is Hermes Agent?

Hermes Agent is an autonomous agent framework, published as open source (MIT licence) by the Nous Research lab. It belongs to a recent wave of similar tools sharing a common promise: moving AI from "responder" to "actor."

Concretely, it is neither a coding assistant grafted onto an editor, nor a wrapper around a single API. It is software that lives on a server, retains memory between sessions, and grows more useful the more you use it.

The distinction that clarifies everything: the framework and the model

When people say "Hermes," they are actually referring to two different things. Confusing them means missing the point entirely.

Fig. 01 — The framework and the model: two distinct layers

[Figure: two stacked layers. Top: HERMES AGENT (framework), covering memory, channels, tools, and security guardrails; this is the CAR (the chassis, stable). It calls the bottom layer: THE MODEL (the "brain"), Claude Haiku via API or a local model…; this is the ENGINE (interchangeable).]

The framework is the car: the chassis, the dashboard, the controls. It manages the agent's memory, communication channels, tools, and guardrails. It is stable and does not change from day to day.

The model is the engine: the intelligence itself. And it is interchangeable. The same framework can run with a cloud-based model (such as Claude, via API) or with a model running entirely on your own server. This modularity is what makes Hermes compelling: you choose the trade-off between capability, cost, and confidentiality, without changing tools.
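
To make the car-and-engine split concrete, here is a minimal sketch of the idea in Python. It is not Hermes's actual code: the `ModelBackend` interface, the two stand-in backends, and the `Agent` class are hypothetical names we invented for illustration, and the backends return canned strings instead of calling a real model.

```python
from dataclasses import dataclass, field
from typing import Protocol


class ModelBackend(Protocol):
    """Hypothetical interface: anything that turns a prompt into a reply."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class CloudBackend:
    """Stand-in for a model reached through a remote API."""
    name: str = "cloud-model"

    def complete(self, prompt: str) -> str:
        # A real backend would send the prompt over HTTPS here.
        return f"[{self.name}] reply to: {prompt}"


@dataclass
class LocalBackend:
    """Stand-in for a model running on the same server."""
    name: str = "local-model"

    def complete(self, prompt: str) -> str:
        # A real backend would run inference on the local machine here.
        return f"[{self.name}] reply to: {prompt}"


@dataclass
class Agent:
    """The chassis: owns the memory and delegates reasoning to the engine."""
    backend: ModelBackend
    memory: list[str] = field(default_factory=list)

    def ask(self, prompt: str) -> str:
        reply = self.backend.complete(prompt)
        self.memory.append(prompt)  # memory belongs to the framework, not the model
        return reply


agent = Agent(backend=CloudBackend())
first = agent.ask("hello")
agent.backend = LocalBackend()  # swap the engine, keep the chassis
second = agent.ask("hello again")
print(first)
print(second)
print(agent.memory)
```

The point of the sketch is that the memory survives the swap: changing the engine does not reset what the chassis remembers.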

Why Hermes, and how does it differ from similar tools?

Hermes is not alone in this space. Other open-source agent frameworks have paved the way. The category is young and maturing quickly.

What sets Hermes apart is a stated obsession with two qualities that have historically been hard to achieve in an agent: reliability and persistent memory. Where many agents forget everything between tasks, Hermes is designed to remember and pick up where it left off. And a key point for SMEs: it runs on modest hardware as long as the model is called via an API — no expensive GPU required.

Security first: think like an architect, not a tinkerer

This is where the real difference between a demo and a responsible deployment is made. Granting an agent the ability to act on a server means handing it keys. The question is not whether to hand them over, but how many, and to which rooms.

Least privilege. The agent receives only the capabilities it strictly needs. We disabled browser automation and desktop control — useless on a headless server. Every permission granted is a considered one.
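
As an illustration of the principle, least privilege can be expressed as an explicit allowlist: anything not named is refused. The sketch below is ours, not Hermes's configuration format, and the tool names (`terminal`, `file_read`, `browser_automation`, `desktop_control`) are hypothetical labels chosen for the example.

```python
# Hypothetical tool names; the real framework may label its tools differently.
ALLOWED_TOOLS = frozenset({"terminal", "file_read"})


def filter_tools(requested, allowed=ALLOWED_TOOLS):
    """Split requested tools into granted and denied, defaulting to deny."""
    granted = sorted(tool for tool in requested if tool in allowed)
    denied = sorted(tool for tool in requested if tool not in allowed)
    return granted, denied


granted, denied = filter_tools(
    ["terminal", "browser_automation", "desktop_control", "file_read"]
)
print("granted:", granted)
print("denied:", denied)
```

The design choice that matters is the default: a tool added to the framework later stays off until someone deliberately adds it to the list.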

Isolation, or controlling the blast radius. Without precautions, the agent acts directly on the entire server. With Docker container isolation, it works inside a disposable sandbox: a closed room that confines what it can touch.

Fig. 02 — The "blast radius": without and with isolation

[Figure: two panels. WITHOUT ISOLATION: on the VPS, the agent acts on the entire server; an error can affect the entire server. WITH DOCKER ISOLATION: on the VPS, the agent runs in a sandbox, door closed; an error remains confined to the container.]

The analogy holds in a single image: would you rather have a trainee roam freely through all your offices, or work in a dedicated room with the door closed? Isolation does not eliminate risk — it limits its reach.
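
For readers who want to see what a "closed door" can look like in practice, the sketch below assembles a `docker run` command with standard Docker hardening options. It only builds and prints the command; nothing is executed. The image name `hermes-agent:example` is a placeholder rather than the official image, the limits are arbitrary, and whether the agent tolerates a read-only root filesystem is an assumption to verify against the official documentation.

```python
def isolation_flags(memory: str = "1g", pids_limit: int = 256) -> list[str]:
    """Return `docker run` options that narrow what the container can do."""
    return [
        "--read-only",                          # root filesystem cannot be modified
        "--tmpfs", "/tmp",                      # scratch space that vanishes with the container
        "--cap-drop", "ALL",                    # drop every Linux capability
        "--security-opt", "no-new-privileges",  # block privilege escalation via setuid binaries
        "--memory", memory,                     # cap RAM usage
        "--pids-limit", str(pids_limit),        # cap the number of processes
    ]


def sandboxed_command(image: str) -> list[str]:
    """Assemble the full command; nothing is executed here."""
    return ["docker", "run", "--rm", *isolation_flags(), image]


# "hermes-agent:example" is a placeholder, not the official image name.
command = sandboxed_command("hermes-agent:example")
print(" ".join(command))
```

Each flag shrinks the blast radius a little further; none of them makes the container escape-proof.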

No secrets in plaintext. Access keys never appear in a readable config file, a conversation, or a code repository. They live separately, in environment variables. A secret exposed even for a second must be treated as compromised and rotated immediately.
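
A minimal sketch of that rule, assuming the key is exposed to the process as an environment variable. The variable name `MODEL_API_KEY` is hypothetical; use whatever name your setup expects. The helper fails loudly when the variable is missing, and the `mask` function shows a safe way to log that a key is present without printing it.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is absent from the environment."""


def load_secret(name: str, env=None) -> str:
    """Read a secret from the environment and refuse to continue without it."""
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise MissingSecretError(f"environment variable {name} is not set")
    return value


def mask(secret: str, visible: int = 4) -> str:
    """Hide all but the last few characters so logs never contain the key."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]


# Demonstration with a fake environment; MODEL_API_KEY is a hypothetical name.
fake_env = {"MODEL_API_KEY": "sk-abcdef1234"}
key = load_secret("MODEL_API_KEY", env=fake_env)
print("key loaded:", mask(key))
```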

Deployment in four steps

Hermes's official Docker container method follows a clean, reproducible logic. We approached it as a checklist, validating each step before moving to the next.

Fig. 03 — Deployment in four steps

[Figure: four boxes in sequence. 1. Docker image: official image retrieved. 2. Configuration: guided setup, local API key. 3. Test: agent responds before opening. 4. Gateway: permanent, auto-restart. Each step is validated before the next.]

1. The image. Pull the official Hermes Docker image — a ready-to-use package containing everything the agent needs to run.

2. Configuration. A guided setup asks the right questions: which model, which tools, which channels. The API key is entered directly in the server terminal — never displayed, never copied elsewhere.

3. The test. A rule we never skip: verify that the agent responds from the command line before opening it to the world. If it says hello correctly here, the foundation is sound.

4. The permanent gateway. Launch the container in the background with automatic restart. The agent now survives a server reboot without any manual intervention.
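
The "validate each step before the next" discipline can be sketched as a tiny checklist runner. This is our own illustration, not part of Hermes: the step names mirror the four steps above, and the checks are stubs that always succeed. In a real deployment each check would wrap an actual verification, such as confirming that the image is present or that the agent answers a test prompt.

```python
from typing import Callable

# A step is a label plus a check that returns True when the step is validated.
Step = tuple[str, Callable[[], bool]]


def run_checklist(steps: list[Step]) -> list[str]:
    """Run steps in order and stop at the first one that fails validation."""
    completed: list[str] = []
    for name, check in steps:
        if not check():
            raise RuntimeError(f"step failed: {name}; stopping before the next step")
        completed.append(name)
    return completed


# Stub checks for illustration only; each lambda stands in for a real verification.
steps: list[Step] = [
    ("image", lambda: True),
    ("configuration", lambda: True),
    ("test", lambda: True),
    ("gateway", lambda: True),
]
completed = run_checklist(steps)
print("validated:", ", ".join(completed))
```

Because the runner raises on the first failure, the gateway is never opened if the test step did not pass.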

One detail that changes everything: all memory, configuration, and sessions live in a persistent volume, separate from the software itself. The image can be updated without losing anything — and that volume is backed up to a private Git repository.
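
The sketch below shows how these two ideas can be written down: a background container with a restart policy and a mounted data directory, and a backup expressed as three Git commands. It only builds the commands and prints them. The image name, host directory, and in-container path are placeholders we made up; the official documentation gives the real ones. It also assumes the host directory is already a Git repository with a private remote named `origin`, and that any file holding secrets is excluded from it, in line with the no-plaintext rule.

```python
def gateway_command(image: str, host_dir: str, data_path: str) -> list[str]:
    """Build a `docker run` command for a long-lived, self-restarting container."""
    return [
        "docker", "run",
        "--detach",                            # run in the background
        "--name", "hermes-gateway",            # hypothetical container name
        "--restart", "unless-stopped",         # restart after crashes and daemon restarts
        "--volume", f"{host_dir}:{data_path}", # state lives outside the image
        image,
    ]


def backup_commands(repo_dir: str, message: str) -> list[list[str]]:
    """Build the Git commands that snapshot the data directory."""
    git = ["git", "-C", repo_dir]
    return [
        git + ["add", "--all"],
        git + ["commit", "--message", message],
        git + ["push", "origin", "HEAD"],
    ]


# All three values below are placeholders, not official names or paths.
run_cmd = gateway_command("hermes-agent:example", "/srv/hermes-data", "/data")
print(" ".join(run_cmd))
for cmd in backup_commands("/srv/hermes-data", "nightly backup"):
    print(" ".join(cmd))
```

Two practical notes: the restart policy only survives a reboot if the Docker daemon itself starts at boot, and `git commit` exits with an error when nothing has changed, so a scheduled backup should tolerate that case.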

Fig. 04 — The overall architecture

[Figure: three zones. YOU: send a message (terminal, messaging...) and receive a response. YOUR VPS: HERMES AGENT in a Docker container, with memory (persistent volume), confined tools, communication channels, and guardrails; backed up to a private Git repo. THE CLOUD: the model API (Claude...), which receives requests and returns responses.]

What this changes in practice

At the end of this deployment, the agent is always available, hosted on infrastructure we control, with permissions, memory, and access channels all under our governance.

To be precise, because intellectual honesty matters: as long as the model is called via a cloud API, the reasoning happens off your server, and every prompt and piece of context sent to the model leaves it too. Sovereignty is never absolute by default; it is a dial you adjust. What self-hosting gives back is control over everything else: the infrastructure, stored memories, permitted actions, and the option, tomorrow, to replace the cloud engine with a fully local model for the most sensitive workloads.
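
One way to picture the dial is as a routing rule: the more sensitive the workload, the closer to home the model must run. The sketch below is purely illustrative. The three sensitivity levels and the `choose_backend` function are our own hypothetical names, not a Hermes feature.

```python
from enum import Enum


class Sensitivity(Enum):
    """Hypothetical classification of a workload."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3


def choose_backend(level: Sensitivity, local_available: bool) -> str:
    """Send confidential work to a local model and refuse it otherwise."""
    if level is Sensitivity.CONFIDENTIAL:
        if not local_available:
            raise RuntimeError("confidential workload requires a local model")
        return "local"
    return "cloud"


print(choose_backend(Sensitivity.PUBLIC, local_available=False))
print(choose_backend(Sensitivity.CONFIDENTIAL, local_available=True))
```

The notable choice is the refusal: when no local model is available, the confidential request fails instead of silently falling back to the cloud.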

For an SME or a public body, this is exactly the right starting point: understand the architecture, stay in control, and increase the level of data confidentiality at the pace your real needs demand.

Sovereign AI Agents series — installment 1
