
UserLM 8b


Available instantly via the shared EU router or as a dedicated GPU deployment. Data stays in Europe.

Unlike typical LLMs that are trained to play the role of the "assistant" in conversation, we trained UserLM-8b to simulate the "user" role in conversation (by training it to predict user turns in a large corpus of conversations called WildChat). This model is useful in simulating...


microsoft/UserLM-8b

text->text · microsoft · EU-hosted


Specifications

Parameters: 8B
Context window: 8,192 tokens
Minimum VRAM: 19 GB
Architecture: LlamaForCausalLM (vLLM)
License: MIT
Modality: text->text
Released: September 2025
Publisher: microsoft

Pricing

Input (per 1M tokens): €0.10
Output (per 1M tokens): €0.18

Shared EU router, pay-per-token, scale-to-zero. Dedicated GPU deployments are billed hourly; see pricing.
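
As a rough illustration of how the per-token rates above add up, the sketch below estimates the cost of a workload on the shared router. The helper name `estimate_cost_eur` is hypothetical and the rates are copied from this page, so check current pricing before relying on the result.

```python
from decimal import Decimal

# Rates copied from the pricing section of this page (EUR per 1M tokens).
INPUT_EUR_PER_MILLION = Decimal("0.10")
OUTPUT_EUR_PER_MILLION = Decimal("0.18")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_eur(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the pay-per-token cost in EUR (hypothetical helper)."""
    # Decimal avoids binary floating-point rounding in money arithmetic.
    input_cost = Decimal(input_tokens) * INPUT_EUR_PER_MILLION / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_EUR_PER_MILLION / TOKENS_PER_MILLION
    return input_cost + output_cost
```

For example, `estimate_cost_eur(250_000, 50_000)` works out to €0.034: €0.025 for input plus €0.009 for output. Dedicated GPU deployments are billed hourly and are not covered by this estimate.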

Call it now

Drop-in replacement for OpenAI: change only the base URL and API key. The Anthropic format (/v1/messages) is supported too.

curl https://hostyourai.com/api/v1/chat/completions \
  -H "Authorization: Bearer hyai-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "microsoft/UserLM-8b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
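
The same request can be sketched in Python using only the standard library. The function names `build_chat_request` and `send` are illustrative, not part of any HostYourAI client. The URL, headers, and payload mirror the curl call above.

```python
import json
import urllib.request

# Endpoint taken from the curl example on this page.
API_URL = "https://hostyourai.com/api/v1/chat/completions"


def build_chat_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {
        "model": "microsoft/UserLM-8b",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `send(build_chat_request(api_key, "Hello!"))` performs the actual network call. Assuming the response follows the OpenAI chat completions shape, the generated text would be found under `choices[0]["message"]["content"]`.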

Frequently asked questions

Can I run UserLM 8b in the EU?

Yes. HostYourAI runs UserLM 8b on GPUs in European datacenters via vLLM. Prompts and outputs never leave the EU and there is no US cloud provider in the chain.

Is hosting UserLM 8b GDPR-compliant?

Yes. All processing happens inside the EU, a Data Processing Agreement (DPA) is available, and the subprocessor list is public. Because the model runs from open-source weights, your data is not used for training.

How much does UserLM 8b cost?

Via the shared EU router you pay €0.10 per million input tokens and €0.18 per million output tokens, with no fixed costs. For high volume or isolation you can also run UserLM 8b on a dedicated GPU instance billed by the hour.

Is the API OpenAI-compatible?

Yes. Use the standard OpenAI SDKs with a custom base URL (https://hostyourai.com/api/v1). The Anthropic Messages API is also supported as a drop-in.
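
A minimal sketch of the base-URL swap with the official `openai` Python package (version 1.x, installed separately) might look like the following. The helper names `client_settings` and `ask` are hypothetical, and the SDK is imported lazily so the module loads even where the package is absent.

```python
# Base URL from the FAQ answer above; model ID from this page.
BASE_URL = "https://hostyourai.com/api/v1"
MODEL = "microsoft/UserLM-8b"


def client_settings(api_key: str) -> dict:
    """Keyword arguments that point the OpenAI client at HostYourAI."""
    return {"base_url": BASE_URL, "api_key": api_key}


def ask(prompt: str, api_key: str) -> str:
    """Send one user message and return the generated text."""
    from openai import OpenAI  # lazy import: requires `pip install openai`

    client = OpenAI(**client_settings(api_key))
    response = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

In practice the key would typically come from an environment variable (for instance a hypothetical `HOSTYOURAI_API_KEY`) rather than being hard-coded. Only the base URL and the key differ from a stock OpenAI setup.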

More models from Microsoft

FastContext 1.0 4B RL

FastContext-1.0 is a lightweight repository-exploration subagent for LLM coding agents. Instead of letting a single model both explore the repository and solve the task, FastContext separates these two roles: it is invoked on demand by a main coding agent, issues parallel read-only tool calls (READ, GLOB, GREP), and returns compact file paths and line ranges as focused context.

4B 262K context View model →

FastContext 1.0 4B SFT

4B 262K context View model →

X Reasoner 7B

We introduce X-Reasoner, a vision-language model post-trained solely on general-domain text for generalizable reasoning, using a two-stage approach: an initial supervised fine-tuning phase with distilled long chain-of-thoughts, followed by reinforcement learning with verifiable rewards. Experiments show that X-Reasoner successfully transfers reasoning capabilities to both multimodal and out-of-domain settings, outperforming existing state-of-the-art models trained with in-domain and multimodal data across various general and medical benchmarks. More details can be found in the paper: X-Reasoner: T

8.3B 128K context View model →

FrogBoss 32B 2510

FrogBoss is built on the Qwen3-32B transformer architecture with a maximum context length of 64k tokens. The model uses multi-turn debugging workflows and complex code reasoning. Unlike general-purpose LLMs, FrogBoss is specialized for software engineering tasks.

32B 41K context View model →

OptiMind SFT

OptiMind-SFT is a specialized 20B parameter model designed to bridge the gap between natural language and executable optimization solvers. It automates the translation of complex decision-making problems—such as supply chain planning, scheduling, and resource allocation—into correct MILP formulations.

21B 131K context View model →

Fara 7B

Fara-7B is Microsoft's first agentic small language model (SLM) designed specifically for computer use. With only 7 billion parameters, Fara-7B is an ultra-compact Computer Use Agent (CUA) that achieves state-of-the-art performance within its size class and is competitive with larger, more resource-intensive agentic systems.

8.3B 128K context View model →

Try UserLM 8b for free

Creating an account takes a minute. Test UserLM 8b straight away in the playground.
