gpubox.ai

Comparison

GPUBox vs Together.ai

Both serve open-source models behind an OpenAI-compatible API. The honest difference: catalog breadth vs jurisdiction. Together hosts hundreds of models in US infrastructure. We host a curated few in the UK.

If catalog breadth matters more than where the data lives, Together wins on that axis. If your stakeholders ask "is the inference happening in the UK?" and "adequate" isn't the answer they want, GPUBox is the answer.

AttributeGPUBoxTogether.ai
API surfaceOpenAI-compatible. Drop-in replacement at /v1.OpenAI-compatible. Drop-in replacement at /v1.
Hosting jurisdictionUnited Kingdom. UK-incorporated operating company. UK VAT registered.United States primarily. SOC 2 Type II.
Model catalog sizeThree live models, curated. Quality over breadth.200+ open-source models — Llama, Mixtral, Qwen, DeepSeek, Stable Diffusion, audio, embeddings.
Frontier model accessQwen2.5-32B today. We pin model versions; you pick what you call.DeepSeek-V3, Llama 3.3 405B, Mixtral, Qwen variants. Larger frontier OSS options.
Pricing — chat completions£1.00 per 1M tokens (blended input + output). Currently ~$1.25 at GBP/USD.Tiered by model. ~$0.18/M for Qwen2.5-7B → ~$0.88/M for DeepSeek-V3 → $5/M for Llama 3.3 405B.
Pricing transparencySingle blended rate per model. No separate input/output rates. Published at /pricing.Per-model pricing. Separate input vs output rates. Discounted for batch.
CurrencyGBP. VAT-compliant invoicing for UK and EU.USD.
Streaming + toolsStreaming SSE, JSON mode, function calling — all OpenAI-compatible.Streaming SSE, JSON mode, function calling.
Fine-tuning serviceNot yet on the API. Roadmap (Factory product).LoRA + full fine-tuning available. Bring data, get a serving endpoint.
Dedicated capacityAvailable for sovereign / regulated customers via gpubox.uk. Reserved hardware, signed DPA.Together Reserved tier — dedicated GPU clusters. Enterprise sales.
Audit logPer-call audit log retained 30 days minimum.Usage analytics in dashboard. Audit log details vary by tier.
AudienceUK developers, regulated industries, sovereignty-conscious enterprises.Global AI developers, OSS researchers, anyone wanting a wide model catalog.

Pick GPUBox if

  • UK data residency is a contractual or regulatory requirement.
  • GBP invoicing matters for accounts payable.
  • You want one blended rate, not per-model pricing maps.
  • Curated models cover your use case (Qwen + Whisper + embeddings).
  • You want a UK-incorporated counterparty for the DPA.

Pick Together.ai if

  • You need a specific OSS model not on our menu.
  • You want managed fine-tuning today, not Q3-2026.
  • You're running OSS-research breadth across many model families.
  • US data residency is fine for your customers.
  • You need 405B-class models — we run a 32B today.

Try the drop-in for yourself.

Email us for a same-day API key. First £20 of usage is on us.