Comparison
GPUBox vs Together.ai
Both serve open-source models behind an OpenAI-compatible API. The honest difference: catalog breadth vs jurisdiction. Together hosts hundreds of models in US infrastructure. We host a curated few in the UK.
If catalog breadth matters more than where the data lives, Together wins on that axis. If your stakeholders ask "is the inference happening in the UK?" and "adequate" isn't the answer they want, GPUBox is the answer.
| Attribute | GPUBox | Together.ai |
|---|---|---|
| API surface | OpenAI-compatible. Drop-in replacement at /v1. | OpenAI-compatible. Drop-in replacement at /v1. |
| Hosting jurisdiction | United Kingdom. UK-incorporated operating company. UK VAT registered. | United States primarily. SOC 2 Type II. |
| Model catalog size | Three live models, curated. Quality over breadth. | 200+ open-source models — Llama, Mixtral, Qwen, DeepSeek, Stable Diffusion, audio, embeddings. |
| Frontier model access | Qwen2.5-32B today. We pin model versions; you pick what you call. | DeepSeek-V3, Llama 3.3 405B, Mixtral, Qwen variants. Larger frontier OSS options. |
| Pricing — chat completions | £1.00 per 1M tokens (blended input + output). Currently ~$1.25 at GBP/USD. | Tiered by model. ~$0.18/M for Qwen2.5-7B → ~$0.88/M for DeepSeek-V3 → $5/M for Llama 3.3 405B. |
| Pricing transparency | Single blended rate per model. No separate input/output rates. Published at /pricing. | Per-model pricing. Separate input vs output rates. Discounted for batch. |
| Currency | GBP. VAT-compliant invoicing for UK and EU. | USD. |
| Streaming + tools | Streaming SSE, JSON mode, function calling — all OpenAI-compatible. | Streaming SSE, JSON mode, function calling. |
| Fine-tuning service | Not yet on the API. Roadmap (Factory product). | LoRA + full fine-tuning available. Bring data, get a serving endpoint. |
| Dedicated capacity | Available for sovereign / regulated customers via gpubox.uk. Reserved hardware, signed DPA. | Together Reserved tier — dedicated GPU clusters. Enterprise sales. |
| Audit log | Per-call audit log retained 30 days minimum. | Usage analytics in dashboard. Audit log details vary by tier. |
| Audience | UK developers, regulated industries, sovereignty-conscious enterprises. | Global AI developers, OSS researchers, anyone wanting a wide model catalog. |
Pick GPUBox if
- UK data residency is a contractual or regulatory requirement.
- GBP invoicing matters for accounts payable.
- You want one blended rate, not per-model pricing maps.
- Curated models cover your use case (Qwen + Whisper + embeddings).
- You want a UK-incorporated counterparty for the DPA.
Pick Together.ai if
- You need a specific OSS model not on our menu.
- You want managed fine-tuning today, not Q3-2026.
- You're running OSS-research breadth across many model families.
- US data residency is fine for your customers.
- You need 405B-class models — we run a 32B today.
Try the drop-in for yourself.
Email us for a same-day API key. First £20 of usage is on us.