Cerebras

Fastest inference in the world

About

A wafer-scale chip that delivers 2,000+ tokens/sec on Llama 3.3, billed as the fastest LLM inference available.

inference, fast, wafer-scale, llama

Metrics

GitHub Stars: 1.2k
Forks: 80
Open Issues: 15
