⚡ LLM Inference
Hosted model APIs, GPU inference, fast token serving
#1
Cerebras
Fastest inference in the world
⭐ 1.2k
$0/mo
#2
Groq
Fastest LLM inference on LPUs
Free
#3
Together AI
Run open-source models in production
Free
#4
Fireworks AI
Production inference for open models
Free
#5
OpenRouter
Unified API for hundreds of LLMs
Free
#6
Modal
Serverless GPUs and containers
Free
#7
RunPod
GPU cloud for AI workloads
$0/mo