⚡ LLM Inference Speed Calculator
Select your GPU, model, and quantization — get estimated tokens/sec and VRAM usage
🖥 GPU Hardware
Model
Search GPU…
VRAM
Select GPU
🧠 LLM Model
Model
Search model…
Parameters
Select model
Quantization
Q2_K
Q3_K_M
Q4_K_M
Q5_K_M
Q6_K
Q8_0
FP16
Select a GPU, VRAM, model, and parameters to see results
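Estimates like the ones this calculator reports usually come from two simple formulas: model VRAM ≈ parameters × bits-per-weight ÷ 8 (plus KV-cache and runtime overhead), and decode speed ≈ memory bandwidth ÷ model size, since autoregressive decoding is memory-bandwidth-bound. A minimal sketch of that logic, where the bits-per-weight figures are approximate GGUF-style values and the `overhead_gb` allowance is an assumption for illustration:

```python
# Rough estimator for single-GPU LLM decode speed and VRAM use.
# Bits-per-weight values are approximate GGUF-style figures (assumption),
# not exact sizes for any particular model file.
BITS_PER_WEIGHT = {
    "Q2_K": 2.6, "Q3_K_M": 3.9, "Q4_K_M": 4.8,
    "Q5_K_M": 5.7, "Q6_K": 6.6, "Q8_0": 8.5, "FP16": 16.0,
}

def estimate(params_b: float, quant: str, vram_gb: float,
             bandwidth_gbps: float, overhead_gb: float = 1.5) -> dict:
    """params_b: parameters in billions; bandwidth_gbps: GPU memory bandwidth in GB/s."""
    model_gb = params_b * BITS_PER_WEIGHT[quant] / 8   # 1e9 params * bits / 8 bits-per-byte = GB
    needed_gb = model_gb + overhead_gb                 # KV cache + runtime buffers (rough allowance)
    fits = needed_gb <= vram_gb
    # Decoding reads roughly the whole model once per token,
    # so memory bandwidth / model size gives an upper bound on tokens/sec.
    tok_per_s = bandwidth_gbps / model_gb if fits else 0.0
    return {"model_gb": round(model_gb, 1), "fits": fits, "tok_per_s": round(tok_per_s, 1)}

# Example: a 7B-parameter model at Q4_K_M on a 24 GB GPU with ~1000 GB/s bandwidth.
print(estimate(7, "Q4_K_M", vram_gb=24, bandwidth_gbps=1000))
```

Real throughput lands below this bound because of compute overhead, batch size, and context length, which is why a calculator also needs the GPU and parameter inputs above.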