Documentation Index

Fetch the complete documentation index at: https://docs.adrian.secureagentics.ai/llms.txt

Use this file to discover all available pages before exploring further.

Adrian is available as a free hosted service managed by Secure Agentics and as a self-hosted open-source release. Both share the same architecture and SDK integration. The hosted route is the quickest way to get started and is free forever under a generous fair-use policy. The self-hosted route suits those who prefer to run everything locally, or who need data sovereignty.

Hosted

Run by Secure Agentics on managed AWS infrastructure with server-grade GPUs. There is no setup beyond installing the SDK and pointing it at the hosted backend at wss://adrian.secureagentics.ai/ws. Expected latency is roughly 100-600 ms per event classification. Treat this as rough guidance. Actual latency depends on:
  • Region
  • Server load at the time
  • Size of the event being classified
  • Severity and complexity of the classification
Latency was benchmarked using L4, L40S, and H100 GPUs. Other GPU classes have not yet been measured. Sub-60 ms latencies have been achieved in testing with optimisations that are not yet in production. Production rollout is planned and will reduce these numbers further.
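To make the latency figures above concrete, the sketch below times a single classification round-trip. Everything here is an assumption for illustration: the event payload shape, the response format, and the transport wiring are hypothetical, and the real SDK handles the WebSocket connection for you. The timing helper is transport-agnostic so it can wrap whatever client the SDK provides.

```python
# Minimal latency probe for one event classification round-trip.
# ASSUMPTIONS: the JSON event shape and the response format are
# illustrative only; the real SDK wraps the transport.
import asyncio
import json
import time

HOSTED_WS_URL = "wss://adrian.secureagentics.ai/ws"  # hosted backend (from the docs)

async def time_classification(send_and_receive, event: dict) -> float:
    """Return the round-trip latency in milliseconds for one event.

    `send_and_receive` is any coroutine that submits the serialized
    event and returns the classification, so the timing logic does not
    depend on a particular WebSocket library.
    """
    start = time.perf_counter()
    await send_and_receive(json.dumps(event))
    return (time.perf_counter() - start) * 1000.0

async def main() -> None:
    # Stand-in transport. A real probe would do the round-trip over a
    # WebSocket connection to HOSTED_WS_URL instead.
    async def fake_round_trip(payload: str) -> str:
        await asyncio.sleep(0.1)  # simulate ~100 ms of network + inference
        return '{"severity": "low"}'  # hypothetical response shape

    latency_ms = await time_classification(
        fake_round_trip, {"type": "tool_call", "name": "read_file"}
    )
    print(f"round-trip: {latency_ms:.0f} ms")

if __name__ == "__main__":
    asyncio.run(main())
```

Averaging this measurement over many events, across regions and at different times of day, is how you would validate the 100-600 ms guidance for your own workload.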

Self-hosted

Self-hosted Adrian runs the Go backend, the Next.js dashboard, and a bundled Llama.cpp container serving Gemma 4 (E2B or E4B) entirely on your own infrastructure. Bring-up is a single docker compose --profile llm up after a one-shot bootstrap. See the Backend reference for configuration and endpoints. Expected latency is roughly 500 ms per event classification for Gemma 4 E4B running on an NVIDIA RTX 5070 Mobile. Treat this as rough guidance. Actual latency depends on:
  • GPU class (server-grade GPUs run faster; older or smaller-VRAM GPUs run slower)
  • Model variant (Gemma 4 E2B is smaller and faster than E4B)
  • Server load at the time
  • Size of the event being classified
  • Severity and complexity of the classification
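The bring-up described above can be sketched as a short shell session. Only the docker compose --profile llm up command comes from this page; the bootstrap script name and the no-profile variant are assumptions, so check the Backend reference for the exact steps.

```shell
# Hypothetical bring-up for self-hosted Adrian.

# One-shot bootstrap (exact script name may differ -- see the Backend reference):
./bootstrap.sh

# Start the Go backend, the Next.js dashboard, and the bundled Llama.cpp
# container serving Gemma 4 (E2B or E4B), all on your own infrastructure:
docker compose --profile llm up

# Without the "llm" profile, compose starts only the services outside that
# profile -- an assumed configuration for running without the local model:
docker compose up
```

The --profile flag is standard Docker Compose behavior: services assigned to a profile in the compose file are started only when that profile is requested, which is what lets the LLM container be opt-in.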