LLM Hosting Cost Research (Metis, weekly)

Goal: find the most cost-effective way to run MiniMax-M3-class large models at >10 tok/s on a budget of 10,000 AUD per year. Metis researches this weekly via web search and revisits the findings as prices and technology change.

Research Summary (2026-06-25)

Best Options for M3 at >10 tok/s, <$10k AUD/yr:

Spheron spot (H100 SXM 80GB): ~$1,800 USD/yr (~$2,800 AUD). Cheapest viable option, but spot interruptions are possible.
Vast.ai marketplace (H100): ~$2,660–4,050 USD/yr (~$4,200–6,300 AUD). Flexible, but reliability varies.
Lambda Labs (H100 SXM 80GB): ~$5,400 USD/yr (~$8,400 AUD). Reliable, with no spot risk.
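
The AUD figures above can be sanity-checked with a few lines of arithmetic. The sketch below assumes an exchange rate of about 1.556 AUD per USD, which is only the ratio implied by the figures in this summary (2,800 / 1,800) and not a quoted market rate. The names `OPTIONS_USD`, `to_aud`, and `within_budget` are illustrative.

```python
# Annual cost estimates in USD as (low, high), taken from the list above.
OPTIONS_USD = {
    "Spheron spot (H100 SXM 80GB)": (1800, 1800),
    "Vast.ai marketplace (H100)": (2660, 4050),
    "Lambda Labs (H100 SXM 80GB)": (5400, 5400),
}

# Assumption: rate implied by the summary's own figures (2,800 AUD / 1,800 USD).
AUD_PER_USD = 2800 / 1800
BUDGET_AUD = 10_000


def to_aud(usd, rate=AUD_PER_USD):
    """Convert a USD amount to AUD, rounded to the nearest 100."""
    return round(usd * rate / 100) * 100


def within_budget(high_usd, budget_aud=BUDGET_AUD):
    """Check whether the upper cost estimate still fits the annual budget."""
    return to_aud(high_usd) <= budget_aud


for name, (low, high) in OPTIONS_USD.items():
    status = "OK" if within_budget(high) else "over budget"
    print(f"{name}: {to_aud(low)}-{to_aud(high)} AUD/yr ({status})")
```

Under this assumed rate, all three options stay below the 10,000 AUD ceiling even at the upper end of their ranges. The converted values should land within rounding distance of the AUD figures in the list.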

Key Constraint: The GPU must have ≥80GB VRAM to run M3 at any useful quantization level. An RTX 4090 (24GB) is insufficient, and an A100 PCIe 40GB is also too small. The minimum viable configuration is an H100 SXM 80GB or A100 SXM 80GB running UD-IQ3_XXS/UD-IQ4_XS quantization.
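
The VRAM constraint can be written as a simple filter over candidate GPUs. This is a minimal sketch that uses only the cards and capacities named above. `GPU_VRAM_GB` and `viable_gpus` are hypothetical names, and the check ignores everything except total VRAM.

```python
# Minimum VRAM from the key constraint above.
MIN_VRAM_GB = 80

# Cards and capacities as named in this summary.
GPU_VRAM_GB = {
    "RTX 4090": 24,
    "A100 PCIe 40GB": 40,
    "A100 SXM 80GB": 80,
    "H100 SXM 80GB": 80,
}


def viable_gpus(gpus, min_vram_gb=MIN_VRAM_GB):
    """Return the names of GPUs that meet the VRAM floor, sorted by name."""
    return sorted(name for name, vram in gpus.items() if vram >= min_vram_gb)


print(viable_gpus(GPU_VRAM_GB))
```

New cards found in later research passes can be added to the dictionary and screened with the same threshold.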

Full analysis: 2026-06-25
