The EU LLM router

Your AI. Your Infra. In Europe.

One central LLM router for every open model, billed per token. Call it over plain HTTP, or drop it into the OpenAI or Anthropic SDK you already use. Need isolation? Spin up a dedicated GPU instance. Every model is tested on our own hardware, and it all runs in Europe.

main.py
from openai import OpenAI

client = OpenAI(
    base_url="https://api.hostyour.ai/v1",
    api_key="hyai_..."
)

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "Hello!"}]
)

Trusted by teams at

Rijksuniversiteit Groningen · Hanzehogeschool · Provincie Drenthe · Frisius AI · Jumbo

4 simple steps

How it works

From sign-up to a live AI pipeline in four steps.

1

Create an account

Sign up for free with your email. No credit card required.

2

Create an API key

Generate a Router key in seconds. It's one drop-in endpoint for every model.

3

Pick a model

Use a Router model and pay per token, let hyai/auto choose, or deploy a dedicated GPU instance.

4

Go live

Point your existing OpenAI SDK at our base URL and ship. There's nothing else to change.

zsh · python
$ pip install openai
Successfully installed openai-1.x
$ python
>>> from openai import OpenAI
>>> client = OpenAI(
... base_url="https://api.hostyour.ai/v1",
... api_key="hyai_..."
... )
>>> response = client.chat.completions.create(
... model="llama-3.3-70b",
... messages=[{"role": "user", "content": "Hi!"}]
... )
>>> print(response.choices[0].message.content)
Hello! How can I help you?
Features

Open-weight LLMs, hosted in Europe

A shared Router that drops into your OpenAI SDK, plus dedicated GPU instances. We only list models we've run and tested ourselves.

EU Inference Router

One endpoint that drops into your OpenAI SDK, billed per token. Open-weight models run on vLLM in the EU, with no dependency on US clouds.

Dedicated GPU instances

A single-tenant vLLM server on your own GPU. No shared resources, no one else's rate limits, and it's ready in minutes.

Drop-in for OpenAI & Anthropic

Keep the OpenAI or Anthropic SDK you already use. Point it at our base URL, swap the model, and you're done. No new client, no rewrite.

Verified Model Garden

Every week we boot and test each model on our own hardware, so you only ever see the ones that actually work.

EU hosted & GDPR

European datacenters, end-to-end encryption, and full data sovereignty. Your data never leaves the EU.

Playground & teams

Try any model right in the browser, then bring your team along with role-based access.

Router

One endpoint for every open model

The Router is a shared inference gateway that drops straight into your OpenAI or Anthropic SDK. Point it at one base URL, pick a model (or let hyai/auto choose for you), and pay only per token. Every model runs open-weight on vLLM inside the EU. It warms up when you need it and scales back to zero when you don't.

  • A true drop-in for the OpenAI and Anthropic SDKs: change one base URL and you're live
  • hyai/auto picks the best available model for each request
  • Pay per token, with no idle GPU cost. It scales to zero when nothing's running
  • Open-weight models served on vLLM, all hosted in the EU
POST /v1/chat/completions
{ "model": "hyai/auto",
"messages": [ … ] }
Routed to
llama-3.3-70b · warm · EU
qwen-2.5-72b · warm · EU
mistral-small · warming up · EU
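As a rough sketch of the request shown above, the snippet below builds the same POST /v1/chat/completions call with only the Python standard library and reads back which model answered. It assumes the Router reports the model it picked in the response's "model" field, as OpenAI-compatible APIs usually do. The helper names build_chat_request, routed_model, and ask are illustrative and not part of the HostYourAI API.

```python
import json
import urllib.request

BASE_URL = "https://api.hostyour.ai/v1"  # base URL used in the examples on this page


def build_chat_request(api_key, messages, model="hyai/auto"):
    """Build (but do not send) a POST request for /chat/completions."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def routed_model(response_json):
    """Return the model name reported in a chat completion response.

    Assumption: the Router echoes the model it chose in the "model" field.
    """
    return response_json.get("model")


def ask(api_key, prompt):
    """Send one user message and return (routed_model, reply_text)."""
    request = build_chat_request(api_key, [{"role": "user", "content": prompt}])
    with urllib.request.urlopen(request) as resp:
        data = json.load(resp)
    return routed_model(data), data["choices"][0]["message"]["content"]
```

Calling ask(your_key, "Hello!") would send the request and return the routed model's name together with its reply. Nothing is sent until ask is called.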
Instances

Your own AI instance

Each instance is a dedicated AI model running on its own GPU. Choose from 100+ text models or image generation models like FLUX and SDXL, select your European region, and deploy with one click. You can also deploy any custom HuggingFace model. You get a private OpenAI-compatible API endpoint with no shared resources and no rate limits caused by other customers' traffic.

  • Text generation (Llama, Qwen, DeepSeek) and image generation (FLUX, SDXL) on dedicated GPUs
  • Deploy any custom HuggingFace model or choose from our curated list
  • OpenAI-compatible API endpoint ready in ~10 minutes, works with any SDK
  • Start, stop and scale on demand. Pay only when running
Already have an API key?

Prefer to use your own key from OpenAI, Groq, or another provider? Connect it in seconds and get the same proxy endpoint, knowledge base, chatbots, and all other features. No GPU needed.

OpenAI · Groq · Mistral · DeepSeek · Together · Custom
Instances
DeepSeek R1 70B · A100 80GB · Amsterdam · running
Llama 3.3 70B · A100 80GB · Frankfurt · running
FLUX.1 Schnell (image) · RTX 4090 · Amsterdam · running
gpt-4o-mini (BYOK) · OpenAI · own key · running
Qwen 2.5 72B · H100 80GB · Helsinki · stopped
3 active · 1 stopped
Team: Engineering
Martijn de Vries · martijn@company.com · admin
Sophie Bakker · sophie@company.com · member
Jan Koster · jan@company.com · member
3 instances · 2 knowledge bases
Teams

Collaborate with your team

Invite colleagues to your workspace and work together on AI projects. Share instances, knowledge bases, and bots across your team. Each member gets a role (admin or member) so you control who can deploy, edit, or only view.

  • Invite team members by email. They join instantly with one click
  • Share instances, knowledge bases, and bots across the team
  • Role-based access: admin (full control) or member (use & view)
Try it live

API Playground

Test our OpenAI-compatible API directly in your browser. Same interface, your own infrastructure.

curl https://api.hostyour.ai/v1/chat/completions \
  -H "Authorization: Bearer hyai_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1-distill-llama-70b",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of the Netherlands?"}
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'
Playground
User: What is the capital of the Netherlands?
DeepSeek R1 (127 tokens · 342ms): The capital of the Netherlands is Amsterdam. However, The Hague (Den Haag) is the seat of government, where the parliament is located.
100% OpenAI compatible
<100ms time to first token
0 code changes needed
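To check time to first token for your own workload, a small timing helper is enough. The sketch below is one possible approach, not an official benchmark. The function time_to_first_token and the stand-in fake_stream are hypothetical names, and the simulated 50 ms delay is made-up data.

```python
import time


def time_to_first_token(start_stream):
    """Time how long a stream takes to produce its first non-empty text chunk.

    start_stream: zero-argument callable that opens the stream and returns an
    iterable of text chunks. The clock starts before the stream is opened, so
    connection and queueing time are included.
    Returns (seconds, first_chunk), or (None, None) if no text arrives.
    """
    start = time.perf_counter()
    for text in start_stream():
        if text:  # skip empty or None chunks
            return time.perf_counter() - start, text
    return None, None


def fake_stream():
    """Stand-in for a real streaming response (hypothetical data)."""
    time.sleep(0.05)  # simulated network and model latency
    yield ""
    yield "Hello"
    yield " world"


elapsed, first = time_to_first_token(fake_stream)
print(f"first chunk {first!r} after {elapsed * 1000:.0f} ms")
```

With the OpenAI SDK, start_stream could be a small generator function that calls client.chat.completions.create with stream=True and yields chunk.choices[0].delta.content for each chunk that has choices.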
100+ models

Text & Image Models

From Llama to FLUX. Deploy text or image generation models with one click, or bring your own HuggingFace model.

DeepSeek R1 32B
DeepSeek R1 70B
DeepSeek R1 7B
DeepSeek Coder V2
Qwen 2.5 72B
Qwen 2.5 32B
Qwen Coder 32B
Llama 3.3 70B
Llama 3.1 70B
Llama 3.1 8B
Mixtral 8x22B
Mixtral 8x7B
Mistral Small 22B
Mistral Nemo 12B
Mistral 7B
Codestral 22B
Gemma 2 27B
Gemma 2 9B
Phi 3.5 MoE
Phi 3 Medium
CodeLlama 70B
CodeLlama 34B
StarCoder2 15B
Command R+
Command R
Yi 1.5 34B
InternLM 2.5 20B
Vicuna 13B
FLUX.1 Schnell
SDXL 1.0
FLUX.1 Dev
SD 3.5 Medium
+ 50 more

We set up your GPU

No CUDA drivers, no Docker, no MLOps. We set up and tune the GPU so your model runs well, for both text and image models.

Custom HuggingFace Models

Don't see your model? Deploy any HuggingFace model directly. Just enter the model ID and required VRAM.
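If you are unsure what to enter for required VRAM, a common rule of thumb is weight size (parameters times bytes per parameter) plus some headroom. The sketch below applies that rule. The 20% overhead factor is an assumption for KV cache and activations, not a figure from HostYourAI, and real requirements vary with context length and batch size.

```python
def estimate_vram_gb(params_billion, bytes_per_param=2.0, overhead=1.2):
    """Rough VRAM estimate in GB: weight size times an overhead factor.

    bytes_per_param: 2.0 for FP16/BF16, 1.0 for 8-bit, 0.5 for 4-bit weights.
    overhead: assumed 20% headroom for KV cache and activations (a guess).
    """
    weights_gb = params_billion * bytes_per_param  # billions of params * bytes each
    return weights_gb * overhead


print(round(estimate_vram_gb(8), 1))       # 8B model in FP16 → 19.2
print(round(estimate_vram_gb(8, 0.5), 1))  # same model with 4-bit weights → 4.8
```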

0 DevOps needed
100+ AI models
6 providers (BYOK)
4 EU datacenters
EU Sovereign

Your data, safe in Europe

Complete data sovereignty. No American cloud, no CLOUD Act, no worries.

EU Datacenters

Amsterdam, Frankfurt, Paris, Helsinki

GDPR Compliant

Full compliance with EU privacy legislation

No CLOUD Act

Beyond reach of American legislation

Dedicated Hardware

Your model on your own GPU, no sharing

GDPR

No vendor lock-in, no unexpected price increases, no content policies limiting you, no data being used for training purposes. Open-source models, European infrastructure.

Testimonials

What our customers say

Teams across Europe are building with HostYourAI.

"Finally a platform where we don't have to manage GPUs ourselves. Deploy in 10 minutes, OpenAI-compatible API, and everything runs in the EU."

Martijn de Vries
CTO, DataFlow AI

"We switched from AWS Bedrock. Costs are 40% lower and we now have full control over which model we run."

Sophie Bakker
Lead Developer, TechNL

"For our research, GDPR compliance was essential. HostYourAI offers dedicated instances in Amsterdam with complete data sovereignty."

Dr. Jan Koster
AI Researcher, RUG
Pricing

Simple and transparent

GPU instances are pay-as-you-go. BYOK is free.

Pay as you go
Credits for GPU instances. BYOK at no platform cost
From €0.10 /hour
GPU price varies per type • BYOK instances are free
  • GPU deploy or bring your own API key
  • All models, agents & knowledge bases
  • Top up with iDEAL or credit card
  • Teams, workflows & templates included
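Because a GPU instance is billed only while it runs, a monthly estimate is simple arithmetic. The sketch below uses the €0.10 per hour starting price from this page. The running hours are made-up examples, the helper name monthly_gpu_cost is hypothetical, and actual rates depend on the GPU type.

```python
def monthly_gpu_cost(hourly_rate_eur, hours_per_day, days=30):
    """Estimated cost in euros for an instance running a fixed number of hours per day."""
    return round(hourly_rate_eur * hours_per_day * days, 2)


print(monthly_gpu_cost(0.10, 8))   # office hours only → 24.0
print(monthly_gpu_cost(0.10, 24))  # running around the clock → 72.0
```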
Create account

Need enterprise? Contact us

Ready to build your AI pipeline?

Connect your data, deploy your model, and go live. Up and running in minutes.