# Hosted and self-hosted

> Two ways to run Adrian's backend: hosted by Secure Agentics, or self-hosted from the open-source release.

Adrian is available as a free hosted service managed by Secure Agentics and as a self-hosted open-source release. Both share the same architecture and SDK integration. The hosted route is the quickest way to get started and is free forever, subject to a generous fair-use policy. The self-hosted route is for those who prefer to run everything locally or who need data sovereignty.

## Hosted

The hosted backend is run by Secure Agentics on managed AWS infrastructure with server-grade GPUs. No setup is needed beyond installing the SDK and pointing it at the hosted backend at `wss://adrian.secureagentics.ai/ws`.
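As a minimal sketch of "pointing it at the hosted backend", the snippet below resolves the backend URL from an environment variable and falls back to the hosted endpoint. The variable name `ADRIAN_BACKEND_URL` and the helper `resolve_backend_url` are hypothetical, introduced only for illustration; the SDK's actual configuration mechanism may differ.

```python
import os
from urllib.parse import urlparse

# Hosted endpoint, as given in the text above.
HOSTED_BACKEND_URL = "wss://adrian.secureagentics.ai/ws"

# Hypothetical variable name used only in this sketch; the SDK may
# read its backend URL from a different setting.
ENV_VAR = "ADRIAN_BACKEND_URL"


def resolve_backend_url(env=None):
    """Return the backend URL, defaulting to the hosted endpoint."""
    env = os.environ if env is None else env
    url = env.get(ENV_VAR) or HOSTED_BACKEND_URL
    parsed = urlparse(url)
    # Reject anything that is not a WebSocket URL with a host part.
    if parsed.scheme not in ("ws", "wss") or not parsed.netloc:
        raise ValueError(f"expected a ws:// or wss:// URL, got {url!r}")
    return url


if __name__ == "__main__":
    print(resolve_backend_url())
```

Keeping the URL in one overridable setting means the same integration code can later be pointed at a self-hosted backend without other changes, which fits the note above that both routes share the same SDK integration.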

Expected latency is roughly **100-600 ms per event classification**. Treat this as rough guidance. Actual latency depends on:

* Region
* Server load at the time
* Size of the event being classified
* Severity and complexity of the classification

Latency was benchmarked using L4, L40S, and H100 GPUs. Other GPU classes have not yet been measured.

Sub-60 ms latencies have been achieved in testing with optimisations that are not yet in production. Rolling these optimisations out to production is planned and should bring the hosted latency figures above down further.

## Self-hosted

Self-hosted Adrian runs the Go backend, the Next.js dashboard, and a bundled llama.cpp container serving Gemma 4 (E2B or E4B), all on your own infrastructure. After a one-shot bootstrap, bring-up is a single `docker compose --profile llm up`. See the [Backend reference](/reference/backend) for configuration and endpoints.

Expected latency is roughly **500 ms per event classification** with Gemma 4 E4B running on an NVIDIA RTX 5070 Mobile. Treat this as rough guidance. Actual latency depends on:

* GPU class (server-grade GPUs run faster; older or smaller-VRAM GPUs run slower)
* Model variant (Gemma 4 E2B is smaller and faster than E4B)
* Server load at the time
* Size of the event being classified
* Severity and complexity of the classification
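
Because the latency figures on this page are rough guidance, it can help to measure classification latency in your own environment. The sketch below times a callable once per event and reports the median, 95th percentile, and maximum. The `classify` argument is a hypothetical stand-in for whatever call your integration makes to classify an event; it is not an SDK function.

```python
import statistics
import time


def measure_latencies_ms(classify, events):
    """Time one call to `classify` per event and return latencies in ms."""
    latencies = []
    for event in events:
        start = time.perf_counter()
        classify(event)  # hypothetical stand-in for your classification call
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies_ms):
    """Return median, p95, and max for a list of at least two latencies."""
    # With n=20, quantiles() returns 19 cut points; the last is the 95th
    # percentile. The inclusive method keeps it within the observed range.
    p95 = statistics.quantiles(latencies_ms, n=20, method="inclusive")[-1]
    return {
        "median_ms": statistics.median(latencies_ms),
        "p95_ms": p95,
        "max_ms": max(latencies_ms),
    }
```

Reporting a median alongside a tail percentile, rather than a single average, makes the effect of server load and event size easier to see across runs.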
