HostYourAI


gemma 2 27b

Name: gemma 2 27b hosting (EU)
Brand: HostYourAI
Availability: Limited availability

This model runs as a dedicated deployment on large GPUs and isn't in the shared playground by default. Get in touch and we'll set it up for you.

gemma 2 27b is an open-weights language model from Google with 27B parameters, released under the Gemma license and hosted on EU GPUs behind an OpenAI-compatible API.


Model ID: google/gemma-2-27b (vLLM ready)
Modality: text->text · Publisher: google · EU-hosted

Parameters: 27B
Context window: not listed
Minimum VRAM: 160 GB
Endpoint: POST /api/v1/chat/completions (on request)

Specifications

Parameters: 27B
Minimum VRAM: 160 GB
Architecture: Gemma2ForCausalLM (vLLM)
License: gemma
Modality: text->text
Released: June 2024
Publisher: google

Pricing

Shared router · per token: not available for this model. Pricing is on request as a dedicated GPU deployment.

Dedicated GPU · per hour: on request. Dedicated deployment, from 160 GB of VRAM, billed per GPU-hour.

The shared EU router is pay-per-token with scale-to-zero. Dedicated GPU deployments are billed hourly; see Pricing.

✓ Verified working on 24-07-2026: the test request completed in 81026 ms (about 81 seconds) on our EU infrastructure.

Call it now

Drop-in replacement for OpenAI: change only the base URL and API key. The Anthropic format (/v1/messages) is supported too.

curl https://hostyourai.com/api/v1/chat/completions \
  -H "Authorization: Bearer hyai-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemma-2-27b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
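
The curl call above uses the OpenAI format. For the Anthropic format mentioned above, a request might be built along these lines in Python. This is a sketch under assumptions, not a documented recipe: the page only gives the path /v1/messages, so the full URL below, the reuse of the Bearer header from the curl example, and the helper names build_messages_request and send are all illustrative. The max_tokens field is included because the Anthropic Messages format requires it.

```python
import json
import urllib.request

# The path /v1/messages and the model id come from the page. Joining the path
# to the host as below, and reusing the Bearer header from the curl example,
# are assumptions; check the provider's documentation for the exact URL and auth.
MESSAGES_URL = "https://hostyourai.com/api/v1/messages"
MODEL = "google/gemma-2-27b"


def build_messages_request(
    prompt: str, api_key: str, max_tokens: int = 256
) -> urllib.request.Request:
    """Build an Anthropic-style Messages request without sending it."""
    body = {
        "model": MODEL,
        "max_tokens": max_tokens,  # required field in the Anthropic Messages format
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=MESSAGES_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 120.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Building the request performs no network I/O; call send(request) to execute it.
request = build_messages_request("Hello!", api_key="hyai-...")
```

The generous default timeout is a deliberate choice for a large dedicated deployment, where a first response can take well over a minute.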

Frequently asked questions

Can I run gemma 2 27b in the EU?

Yes. HostYourAI runs gemma 2 27b on GPUs in European datacenters via vLLM. Prompts and outputs never leave the EU and there is no US cloud provider in the chain.

Is hosting gemma 2 27b GDPR-compliant?

Yes. All processing happens inside the EU, a Data Processing Agreement (DPA) is available, and the subprocessor list is public. Because the model runs from open weights on our own GPUs, your prompts and outputs are not used for training.

How much does gemma 2 27b cost?

gemma 2 27b needs several GPUs at once, so it runs as a dedicated deployment billed per GPU-hour rather than per token. Tell us your volume and we will work it out with you.
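
To see how GPU-hour billing translates into a budget, the arithmetic can be sketched as below. Only the 160 GB minimum comes from this page. The 80 GB card size, the rate of 2.50 per GPU-hour, and the 500 million tokens per month are hypothetical placeholders, not quoted prices or measured throughput.

```python
import math

MIN_VRAM_GB = 160  # minimum VRAM stated on this page


def gpus_needed(vram_per_gpu_gb: float, min_vram_gb: float = MIN_VRAM_GB) -> int:
    """Smallest number of identical GPUs whose combined VRAM meets the minimum."""
    return math.ceil(min_vram_gb / vram_per_gpu_gb)


def monthly_cost(
    gpu_count: int,
    rate_per_gpu_hour: float,
    hours_per_day: float = 24,
    days: int = 30,
) -> float:
    """Cost of keeping the deployment up, billed per GPU-hour."""
    return gpu_count * rate_per_gpu_hour * hours_per_day * days


def cost_per_million_tokens(total_cost: float, tokens_per_month: int) -> float:
    """Effective per-token price implied by a fixed monthly bill."""
    return total_cost * 1_000_000 / tokens_per_month


# Hypothetical inputs, for illustration only.
count = gpus_needed(vram_per_gpu_gb=80)                 # 2 GPUs of 80 GB each
bill = monthly_cost(count, rate_per_gpu_hour=2.50)      # 2 * 2.50 * 24 * 30
per_million = cost_per_million_tokens(bill, 500_000_000)
print(count, bill, per_million)
```

The point of the last function is that a dedicated deployment costs the same whether it is busy or idle, so the effective price per token falls as volume rises.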

Is the API OpenAI-compatible?

Yes. You use the standard OpenAI SDKs with a custom base URL (https://hostyourai.com/api/v1). The Anthropic Messages API is supported as a drop-in as well.
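
As an illustration of the custom base URL, a minimal sketch with the OpenAI Python SDK (the third-party openai package, version 1.x) could look like this. The base URL and model id are taken from this page; the helper names client_settings and ask are made up for the example.

```python
BASE_URL = "https://hostyourai.com/api/v1"  # from this page
MODEL = "google/gemma-2-27b"                # from this page


def client_settings(api_key: str) -> dict:
    """The only two settings that differ from a stock OpenAI setup."""
    return {"base_url": BASE_URL, "api_key": api_key}


def ask(prompt: str, api_key: str) -> str:
    """Send one user message and return the reply text."""
    # Imported lazily so the sketch can be loaded without the third-party
    # `openai` package installed; it is only needed when ask() is called.
    from openai import OpenAI

    client = OpenAI(**client_settings(api_key))
    response = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

Calling ask("Hello!", api_key="hyai-...") with a real key would send the same request as the curl example.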

More models from Google

gemma 4 31B it qat w4a16 ct · 34B · 262K context
Note: This model card is for the new versions of the Gemma 4 family optimized with Quantization-Aware Training (QAT), which allows preserving similar quality to bfloat16 while dramatically reducing the memory requirements to load the model. Four versions of the QAT checkpoints are available. Unquantized QAT checkpoints (Q40): half-precision weights extracted from the QAT pipeline, ideal for custom downstream compilation and research; available for Gemma 4 E2B, E4B, 12B, 26B A4B, and 31B, and their drafter models. GGUF (Q40): ready-to-deploy formats for broad ecosystem compatibility. …

gemma 4 26B A4B it qat q4 0 unquantized · 27B · 262K context

gemma 4 31B it qat q4 0 unquantized · 33B · 262K context

gemma 4 31B · 33B · 262K context
Gemma is a family of open models built by Google DeepMind. Gemma 4 models are multimodal, handling text and image input (with audio supported on E2B, E4B, and 12B) and generating text output. This release includes open-weights models in both pre-trained and instruction-tuned variants. Gemma 4 features a context window of up to 256K tokens and maintains multilingual support in over 140 languages.

gemma 4 26B A4B · 27B · 262K context

gemma 4 26B A4B it · 27B · 262K context

Request access

gemma 2 27b isn't available by default yet. Leave your details and we'll arrange a dedicated deployment.
