HostYourAI

Pricing

OpenAI-compatible inference with clear tenancy choices: shared, dedicated, or private.

HostYourAI offers three execution modes side by side on a single account and credit balance. Start with EU Hosted inference, then move to dedicated capacity when the workload or compliance profile requires it.

1. EU Hosted Gateway — pay per token

One OpenAI-compatible API key, model catalog, hyai/auto, and scale-to-zero shared capacity. Best for SaaS integrations, agencies, agent apps, and experimentation. Indicative tariffs (EUR per million tokens):

| Model class | Input (EUR / M tokens) | Output (EUR / M tokens) |
| --- | --- | --- |
| ≤ 8B (Llama 3.2, Qwen 3 small, Phi-4 mini) | 0.03 | 0.06 |
| ≤ 16B (Llama 3.1 8B, Qwen 2.5 7B, Mistral 7B) | 0.05 | 0.10 |
| ≤ 32B (Qwen 3 14B, Phi-4, Mistral Nemo 12B) | 0.10 | 0.18 |
| ≤ 48B (Mistral Small 3.1, DeepSeek Coder V2 Lite) | 0.15 | 0.25 |
| ≤ 80B (Qwen 3 32B, DeepSeek R1 distill 32B) | 0.25 | 0.40 |
| Large (70B+, Mixtral, DeepSeek-V3) | 0.40 | 0.60 |

Current beta framing: EU Hosted means EU-located GPU processing with shared router capacity. EU Sovereignty Mode is sold separately once a fully EU-sovereign provider chain, DPA, subprocessors, audit export, and support-access controls are active.
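As a rough guide, the per-token tariffs above can be turned into a spend estimate with a few lines of Python. This is a minimal sketch, not an official calculator: the dictionary keys and the function name are hypothetical labels chosen for this example, and the rates are the indicative figures from the table, which may differ from what is billed.

```python
from decimal import Decimal

# Indicative tariffs from the table above: (input, output) in EUR per million tokens.
# The keys are hypothetical labels for this sketch, not official model-class identifiers.
TARIFFS_EUR_PER_M = {
    "<=8B": (Decimal("0.03"), Decimal("0.06")),
    "<=16B": (Decimal("0.05"), Decimal("0.10")),
    "<=32B": (Decimal("0.10"), Decimal("0.18")),
    "<=48B": (Decimal("0.15"), Decimal("0.25")),
    "<=80B": (Decimal("0.25"), Decimal("0.40")),
    "large": (Decimal("0.40"), Decimal("0.60")),
}

MILLION = Decimal(1_000_000)


def gateway_cost_eur(model_class: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate pay-per-token spend in EUR for one model class."""
    input_rate, output_rate = TARIFFS_EUR_PER_M[model_class]
    return (input_rate * input_tokens + output_rate * output_tokens) / MILLION


if __name__ == "__main__":
    # 2M input tokens and 0.5M output tokens on a "<=32B" class model.
    print(gateway_cost_eur("<=32B", 2_000_000, 500_000))
```

With these assumptions, 2 million input tokens plus 500,000 output tokens on a ≤ 32B model come to € 0.29 (0.20 for input, 0.09 for output). Decimal is used instead of float so that small per-token amounts do not pick up rounding noise.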

2. Dedicated EU Deployment — pay per hour

You pick a GPU class and region, deploy your own vLLM instance, and pay for as long as it runs. Best for custom Hugging Face models, BYOK upstreams, steady high-volume workloads, or when you need full control over the deployment.

| GPU class | Typical use | From (EUR / hr) |
| --- | --- | --- |
| 1× L40S / RTX 4090 (24–48 GB) | Models up to ~20B | € 0.40 |
| 1× A100 80 GB / H100 80 GB | Models up to ~70B (quantised) | € 2.20 |
| 2× H100 (160 GB) | 70B fp16 / HA | € 4.40 |
| 4× H100 (320 GB) | Large models / high throughput | € 8.80 |

These prices are indicative: live pricing depends on EU-region GPU availability at our providers. The exact price for each offer is shown before you deploy.
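To get a feel for when hourly capacity starts to make sense, the "from" prices in the table above can be compared with pay-per-token spend. The sketch below is illustrative only: the keys and function names are hypothetical, the rates are the indicative "from" prices, and the break-even figure counts output tokens only and says nothing about how many tokens a given GPU can actually serve per hour.

```python
from decimal import Decimal

# Indicative "from" prices in EUR per hour, copied from the table above.
# Keys are hypothetical labels for this sketch; live offers may be priced differently.
GPU_FROM_EUR_PER_HOUR = {
    "1x-l40s-or-rtx4090": Decimal("0.40"),
    "1x-a100-or-h100-80gb": Decimal("2.20"),
    "2x-h100": Decimal("4.40"),
    "4x-h100": Decimal("8.80"),
}


def dedicated_cost_eur(gpu_class: str, hours) -> Decimal:
    """Cost of keeping one deployment running for the given number of hours."""
    return GPU_FROM_EUR_PER_HOUR[gpu_class] * Decimal(str(hours))


def break_even_output_tokens_per_hour(gpu_class: str, output_rate_eur_per_m) -> Decimal:
    """Output tokens per hour at which gateway spend equals the hourly GPU price.

    Simplification: input tokens are ignored, so this is only a rough guide.
    """
    hourly_price = GPU_FROM_EUR_PER_HOUR[gpu_class]
    return hourly_price / Decimal(str(output_rate_eur_per_m)) * 1_000_000


if __name__ == "__main__":
    print(dedicated_cost_eur("2x-h100", 10))
    print(break_even_output_tokens_per_hour("1x-l40s-or-rtx4090", "0.10"))
```

For example, ten hours on 2× H100 at the "from" price is € 44.00, and a € 0.40 / hr instance matches gateway output pricing of € 0.10 per million tokens at 4 million output tokens per hour.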

3. Single-Tenant Encrypted — capacity per month

Private runtime with dedicated, isolated GPU(s) per customer, at-rest encryption, optional BYOK (your own KMS key), and a private network policy. Best for healthcare, government, legal, finance, and workloads that cannot use shared capacity.

| Configuration | VRAM | Per month | Setup (one-off) |
| --- | --- | --- | --- |
| 1× L40S | 48 GB | € 1,200 | € 500 |
| 1× H100 | 80 GB | € 3,500 | € 1,000 |
| 2× H100 | 160 GB | € 6,500 | € 1,000 |
| 4× H100 | 320 GB | € 12,500 | € 1,500 |

Always-on or on-demand availability. BYOK onboarding adds € 1,000 one-off. Setup includes provisioning, isolation configuration, network policy, and DPA onboarding. Confidential computing (TEE) is on the roadmap; we will not price what we have not yet validated.
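The monthly and one-off figures above combine into a simple total for a contract period. The following is a sketch under stated assumptions: whole months are billed at the listed monthly price with no proration or discount, the BYOK onboarding fee is charged once, and the keys and function name are hypothetical.

```python
from decimal import Decimal

# (monthly price, one-off setup) in EUR, copied from the table above.
# Keys are hypothetical labels for this sketch.
SINGLE_TENANT_EUR = {
    "1x-l40s": (Decimal("1200"), Decimal("500")),
    "1x-h100": (Decimal("3500"), Decimal("1000")),
    "2x-h100": (Decimal("6500"), Decimal("1000")),
    "4x-h100": (Decimal("12500"), Decimal("1500")),
}

# One-off fee for BYOK onboarding (your own KMS key), as stated above.
BYOK_ONBOARDING_EUR = Decimal("1000")


def single_tenant_total_eur(config: str, months: int, byok: bool = False) -> Decimal:
    """Total for a number of whole months, including one-off fees.

    Assumption: no proration, discounts, or taxes are applied.
    """
    monthly, setup = SINGLE_TENANT_EUR[config]
    total = monthly * months + setup
    if byok:
        total += BYOK_ONBOARDING_EUR
    return total


if __name__ == "__main__":
    print(single_tenant_total_eur("1x-h100", 12, byok=True))
```

Under these assumptions, a first year on 1× H100 with BYOK onboarding comes to € 44,000: € 42,000 in monthly fees, € 1,000 setup, and € 1,000 BYOK onboarding.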

BYOK — bring your own API key

You can attach your own OpenAI / Anthropic / Google / Mistral API key to an instance. We forward your traffic to the upstream provider under your contract with them. We don't bill compute; we charge only a thin proxy fee per request (€ 0.0005 / request, capped at € 5 / month per BYOK instance), which covers our routing, audit log, and uptime. Useful for hybrid setups that mix EU-hosted open-weights models with frontier closed models.
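The capped proxy fee described above is a per-request charge with a monthly ceiling per BYOK instance. A minimal sketch of that arithmetic follows; the constant and function names are hypothetical, and it assumes the cap resets each calendar month and is applied per instance as stated.

```python
from decimal import Decimal

# Figures from the paragraph above; names are hypothetical.
PROXY_FEE_EUR_PER_REQUEST = Decimal("0.0005")
PROXY_FEE_CAP_EUR_PER_MONTH = Decimal("5")


def byok_proxy_fee_eur(requests_this_month: int) -> Decimal:
    """Monthly proxy fee for one BYOK instance: per-request fee, capped."""
    if requests_this_month < 0:
        raise ValueError("request count cannot be negative")
    uncapped = PROXY_FEE_EUR_PER_REQUEST * requests_this_month
    return min(uncapped, PROXY_FEE_CAP_EUR_PER_MONTH)


if __name__ == "__main__":
    for count in (4_000, 10_000, 50_000):
        print(count, byok_proxy_fee_eur(count))
```

At these rates the cap is reached at 10,000 requests in a month (10,000 × € 0.0005 = € 5), so any additional requests on that instance add no further proxy fee. Upstream provider charges are billed separately under your own contract.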

Free tier

Billing

What's not on the price list

Bespoke procurement, custom contracts, NEN 7510 / BIO audit packages, white-label / reseller arrangements, and confidential-computing deployments are quoted per project. Talk to us via /contact.