HostYourAI

An open Dutch AI project

Meet Loes. Sovereign Dutch AI.

A model picks up its habits from whatever data it’s fed. When that data sits outside Europe, where none of us can look, so do its blind spots. Loes is the opposite: a strong European base, taught on clean Dutch data we can actually show you, trained and hosted entirely in the EU. Making her a “she” is deliberate too: a quiet counterweight in a field that mostly leans the other way.

Local, Open, Ethical, Sovereign.

European base weights/ Clean, accountable data/ Trained & hosted in the EU/ Built in the open

Why Loes is different

Sovereign by design.

Built for EU rules

Trained and hosted on GPUs inside the EU, under EU law, on a data mix you can audit. We build it to fit the rules from the start, not patch it up afterwards.

Bias you can inspect

Trained on data we can actually point to, in our own languages. So when it leans a certain way, we can show you why, and you can call it out.

Built in the open, together

Open base models from EU labs, clean licensed data, and room for anyone who wants to help push it forward.

How it works

Exactly how we build Loes.

No magic, and nothing hidden. Here’s the whole pipeline, start to finish, with the real settings we use.

  1. 01

    Line up European base models

    We take a few openly licensed European bases, Salamandra from the Barcelona Supercomputing Center, EuroLLM-9B, Mistral-7B, Gemma, and run them against each other instead of betting on one.

  2. 02

    Gather clean, licensed data

    A hand-picked mix of openly licensed public data, Aya, xP3x, Dolly and more, with the licence checked on every single set. No pirated text, no scraped-from-nowhere corpora.

  3. 03

    Fill the Dutch gaps ourselves

    Good open Dutch conversation data is thin, so we generate our own, using an open teacher model (Gemma), never a closed one like GPT-4. That keeps the result clean enough to actually sell.

  4. 04

    Fine-tune with QLoRA

    We train each base on that mix with QLoRA, LoRA rank 32, learning rate 2e-4, two epochs, 4k context, all on GPUs inside the EU. Light enough to do in hours, not weeks.

  5. 05

    Merge and store the weights

    The trained adapter is merged back into the base model, and the full weights are pushed to a private, EU-hosted repository. Nothing ever leaves European servers.

  6. 06

    Test it on real Dutch

    Every candidate gets the same held-out Dutch test. Only the one that clears the bar, a Dutch-quality score of 70% or higher, ships. The rest get dropped, no exceptions.

  7. 07

    Serve it behind a normal API

    The winner runs on EU GPUs with vLLM, behind a standard OpenAI-compatible API. Point your existing code at a new URL and you’re done.

Where this is going

The roadmap.

We’re building Loes in stages, cheapest and fastest first. Each stage has to prove itself before we spend on the next.

Phase 1 In progress

Loes already beats GPT-NL.

The base bake-off, instruction-tuning on the clean mix, and our own Dutch conversation data, all gated on a real Dutch eval. This is the part running now, and it already clears the existing public Dutch baseline.

Days of work · tens of euros of compute

Phase 2 Next

Close whatever Dutch gap is left.

If phase 1 still falls short, we keep pre-training the base on around 100 billion tokens of clean Dutch text, the GPT-NL public corpus, FineWeb-2, Wikipedia, then fine-tune and test again.

Days to weeks · low thousands of euros

Phase 3 Planned

Law and government, done properly.

A version that’s genuinely good at Dutch law and public-sector work, trained on open sources the big frontier models barely touch, published case law, legislation, and EUR-Lex. This is where a sovereign model earns its keep.

After phases 1 and 2 land

01  Contribute

Help build Loes,
in any way.

No single team builds something like Loes. Every part of her is something someone out there could help with, and if that’s you, we’d genuinely like to hear from you.

Data

Clean, openly licensed Dutch & EU text or instruction data, so we can cover more ground without wandering into legal grey areas.

Compute

GPUs, credits, or spare capacity, so we can train and test more models, sooner.

Expertise

NLP, evaluation, linguistics, or deep knowledge of a field, to help make Loes genuinely good.

Funding & partners

Back the project, or team up with us to push European AI forward.

Offer a hand

Tell us what you can bring. We read everything that comes in.

Made in the Netherlands.
Built in the open.

Trained and hosted in the EU, by the people who actually run the GPUs. Come and help us build it.