LongWriter llama3.1 8b

Available directly through the EU router or as a dedicated GPU deployment. Data stays in Europe.

LongWriter-llama3.1-8b is trained from Meta-Llama-3.1-8B and can generate 10,000+ words in a single response.

zai-org/LongWriter-llama3.1-8b

text->text · zai-org · EU-hosted

8B parameters · 131K context window · 19 GB minimum VRAM

POST /api/v1/chat/completions · 200 OK

Specifications

Parameters: 8B
Context window: 131,072 tokens
Minimum VRAM: 19 GB
Architecture: LlamaForCausalLM (vLLM)
License: llama3.1
Modality: text->text
Released: August 2024
Publisher: zai-org

Pricing

Input: €0.10 per 1M tokens
Output: €0.18 per 1M tokens

Shared EU router, pay-per-token, scale-to-zero. Dedicated GPU deployments are billed per hour; see the pricing page.
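To make the per-token rates concrete, here is a minimal Python sketch that estimates what a request costs on the shared EU router, using the listed prices (€0.10 per 1M input tokens, €0.18 per 1M output tokens). The helper name and the example token counts are illustrative assumptions and not part of the HostYourAI API.

```python
from decimal import Decimal

# Listed shared-router prices in EUR per 1M tokens.
INPUT_EUR_PER_MILLION = Decimal("0.10")
OUTPUT_EUR_PER_MILLION = Decimal("0.18")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_eur(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the pay-per-token cost in EUR (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_EUR_PER_MILLION
        + Decimal(output_tokens) * OUTPUT_EUR_PER_MILLION
    )
    return total / TOKENS_PER_MILLION


# Example: a 2,000-token prompt that produces a 15,000-token long-form answer.
example_cost = estimate_cost_eur(2_000, 15_000)
```

Decimal is used instead of float so that small per-request amounts add up without rounding drift. Dedicated deployments are billed per hour and fall outside this estimate.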

Call it directly

Drop-in replacement for OpenAI: change only the base URL and the API key. The Anthropic format (/v1/messages) is also supported.

curl https://hostyourai.com/api/v1/chat/completions \
  -H "Authorization: Bearer hyai-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "zai-org/LongWriter-llama3.1-8b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
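The same request can be sketched in Python with only the standard library. The URL, headers, and body mirror the curl call; the helper names are hypothetical, and the response parsing assumes an OpenAI-style `choices[0].message.content` shape, which the compatibility claim implies but this page does not show.

```python
import json
import urllib.request

BASE_URL = "https://hostyourai.com/api/v1"
MODEL = "zai-org/LongWriter-llama3.1-8b"


def build_chat_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build the same POST request as the curl example (hypothetical helper)."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat(prompt: str, api_key: str, timeout: float = 60.0) -> str:
    """Send the request and return the reply text (assumed response shape)."""
    request = build_chat_request(prompt, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling `send_chat("Hello!", api_key)` with a valid key performs the network call. Building the request separately makes it easy to inspect the URL and headers without sending anything.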

Frequently asked questions

Can I run LongWriter llama3.1 8b in the EU?

Yes. HostYourAI runs LongWriter llama3.1 8b on GPUs in European data centers via vLLM. Prompts and outputs do not leave the EU, and no US cloud provider is part of the chain.

Is hosting LongWriter llama3.1 8b GDPR-compliant?

Yes. All processing takes place within the EU, a data processing agreement (DPA) is available, and the subprocessor list is public. Open-source weights also mean no training on your data.

What does LongWriter llama3.1 8b cost?

On the shared EU router you pay €0.10 per million input tokens and €0.18 per million output tokens, with no fixed costs. For high volumes or isolation you can also run LongWriter llama3.1 8b as a dedicated GPU instance billed per hour.

Is the API compatible with OpenAI?

Yes. You use the standard OpenAI SDKs with a custom base URL (https://hostyourai.com/api/v1). The Anthropic Messages API is also supported as a drop-in.
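As a sketch of what the drop-in claim looks like with the official OpenAI Python SDK (v1 or later), only `base_url` and `api_key` differ from a stock OpenAI setup. The `ask` helper is hypothetical, and the example assumes the `openai` package is installed.

```python
BASE_URL = "https://hostyourai.com/api/v1"
MODEL = "zai-org/LongWriter-llama3.1-8b"


def ask(prompt: str, api_key: str) -> str:
    """Send one chat message through the OpenAI SDK (hypothetical helper)."""
    # Imported here so this module loads even where the SDK is not installed.
    from openai import OpenAI  # pip install openai

    # Point the standard client at the EU router instead of api.openai.com.
    client = OpenAI(base_url=BASE_URL, api_key=api_key)
    response = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
    )
    # content may be None in the SDK's types, so fall back to an empty string.
    return response.choices[0].message.content or ""
```

Existing code that already uses the OpenAI client would, under this assumption, need no change beyond the two constructor arguments and the model name.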

Other models from Z.AI

GLM 5.2

We're introducing GLM-5.2, our latest flagship model for long-horizon tasks. It marks a substantial leap in long-horizon task capability over its predecessor GLM-5.1 and, for the first time, delivers that capability on a solid 1M-token context. GLM-5.2's new capabilities include:

- Solid 1M Context: A solid 1M-token context that stably sustains long-horizon work
- Advanced Coding with Flexible Effort: Stronger coding capabilities with multiple thinking effort levels to balance performance and latency
- Improved Architecture: We propose IndexShare, which reuses the same indexer across every fou

753B · 1M context

GLM 5.2 FP8

753B · 1M context

GLM 5.1 FP8

GLM-5.1 is our next-generation flagship model for agentic engineering, with significantly stronger coding capabilities than its predecessor. It achieves state-of-the-art performance on SWE-Bench Pro and leads GLM-5 by a wide margin on NL2Repo (repo generation) and Terminal-Bench 2.0 (real-world terminal tasks).

754B · 203K context

GLM 5.1

754B · 203K context

GLM 5

We are launching GLM-5, targeting complex systems engineering and long-horizon agentic tasks. Scaling is still one of the most important ways to improve the intelligence efficiency of Artificial General Intelligence (AGI). Compared to GLM-4.5, GLM-5 scales from 355B parameters (32B active) to 744B parameters (40B active), and increases pre-training data from 23T to 28.5T tokens. GLM-5 also integrates DeepSeek Sparse Attention (DSA), largely reducing deployment cost while preserving long-context capacity.

754B · 203K context

GLM 5 FP8

754B · 203K context

Try LongWriter llama3.1 8b for free

Creating an account takes a minute. Test LongWriter llama3.1 8b directly in the playground.