How We Calculate
Understand the methodology behind CanIRunAI recommendations
Overview
CanIRunAI analyzes your hardware (GPU, CPU, and RAM) and calculates compatibility with each local AI model. All scores are based on real-world measurements, with the RTX 3060 as the baseline GPU.
Tier System
Each model receives a tier based on its estimated speed (tokens per second) on your hardware. From fastest to slowest, the tiers are:
- Near-instant responses. A fluid experience for any use case.
- Good speed. Comfortable for chat, coding, and general use.
- Functional but noticeably slower. OK for non-interactive tasks.
- Usable with patience. Long responses may take minutes.
- Doesn't fit in memory, or too slow for practical use.
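As a rough illustration, tier assignment can be expressed as a threshold lookup on estimated tokens per second. The tier names and cutoff values below are assumptions for illustration only, not the exact thresholds CanIRunAI uses.

```ts
// Hypothetical tier labels and thresholds; the real cutoffs may differ.
type Tier = "S" | "A" | "B" | "C" | "F";

function assignTier(tokensPerSecond: number, fitsInMemory: boolean): Tier {
  if (!fitsInMemory) return "F";          // doesn't fit in memory at all
  if (tokensPerSecond >= 40) return "S";  // near-instant responses
  if (tokensPerSecond >= 15) return "A";  // comfortable for chat and coding
  if (tokensPerSecond >= 5) return "B";   // functional but noticeably slower
  if (tokensPerSecond >= 1) return "C";   // usable with patience
  return "F";                             // too slow for practical use
}
```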
VRAM Estimation
The model needs to fit in GPU memory (VRAM). We use Q4_K_M quantization as the default, the best balance between quality and size.
Quantization reduces the precision of model weights to decrease size and speed up inference.
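A minimal sketch of the estimate, assuming roughly 4.5 bits per weight for Q4_K_M plus a fixed overhead for the KV cache and runtime buffers. The constants here are illustrative assumptions, not CanIRunAI's exact values.

```ts
// Rough VRAM estimate for a Q4_K_M-quantized model.
// bitsPerWeight and overheadGiB are illustrative assumptions.
function estimateVramGiB(
  paramsBillions: number,
  bitsPerWeight = 4.5,
  overheadGiB = 1.5
): number {
  const weightsGiB = (paramsBillions * 1e9 * bitsPerWeight) / 8 / 1024 ** 3;
  return weightsGiB + overheadGiB; // weights + KV cache / runtime buffers
}

// e.g. a 7B model: ~3.7 GiB of weights + overhead ≈ 5.2 GiB
```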
How the model fits
- Full fit: the model uses up to 85% of VRAM. Full performance, no penalties.
- Tight fit: the model fits in VRAM but with no headroom. 30% penalty from memory pressure.
- Partial offload: part on the GPU, the rest in RAM. 40-80% penalty depending on the ratio.
- CPU-only: no dedicated GPU. Uses 60% of RAM. 88-94% penalty versus a GPU.
- Incompatible: insufficient memory. The model cannot be loaded.
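A sketch of how these cases could translate into a single speed multiplier. The penalty ranges come from the list above; the interpolation for partial offload and the fixed CPU-only multiplier are assumptions.

```ts
// Speed multiplier derived from how the model fits in memory.
// Percentages mirror the list above; interpolation details are assumptions.
function fitMultiplier(
  modelGiB: number,
  vramGiB: number,
  ramGiB: number,
  hasGpu: boolean
): number {
  if (!hasGpu) {
    const usableRam = ramGiB * 0.6;          // CPU-only: 60% of RAM is usable
    return modelGiB <= usableRam ? 0.09 : 0; // ~88-94% penalty vs GPU (here ~91%)
  }
  if (modelGiB <= vramGiB * 0.85) return 1.0; // full fit: no penalty
  if (modelGiB <= vramGiB) return 0.7;        // tight fit: 30% penalty
  if (modelGiB <= vramGiB + ramGiB * 0.6) {
    // Partial offload: 40-80% penalty, scaled by how much spills to RAM.
    const spillRatio = (modelGiB - vramGiB) / (ramGiB * 0.6);
    return 0.6 - 0.4 * spillRatio;            // 0.6 (40% penalty) down to 0.2 (80%)
  }
  return 0;                                   // insufficient memory: cannot load
}
```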
Bandwidth Scaling
Estimated speed scales linearly with GPU memory bandwidth. The RTX 3060 (360 GB/s) is the baseline.
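In code, this is a single ratio against the baseline figure given above:

```ts
// Linear bandwidth scaling relative to the RTX 3060 baseline (360 GB/s).
const BASELINE_BANDWIDTH_GBPS = 360;

function bandwidthFactor(gpuBandwidthGbps: number): number {
  return gpuBandwidthGbps / BASELINE_BANDWIDTH_GBPS;
}

// e.g. an RTX 4090 (~1008 GB/s) gets a factor of ~2.8
```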
RAM Factor
The amount of system RAM also influences performance. 16 GB is the baseline (factor 1.0); the factor ranges from 0.65× at 4 GB to 1.18× at 32 GB or more.
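One way to express this is interpolation between the anchor points given above (0.65× at 4 GB, 1.0× at 16 GB, 1.18× at 32 GB+). The anchor points come from the text; the linear interpolation between them is an assumption.

```ts
// RAM factor: 16 GB is the baseline (1.0). Anchor points from the text;
// linear interpolation between them is an assumption.
function ramFactor(ramGiB: number): number {
  const anchors: [number, number][] = [[4, 0.65], [16, 1.0], [32, 1.18]];
  if (ramGiB <= anchors[0][0]) return anchors[0][1];
  if (ramGiB >= anchors[anchors.length - 1][0]) return anchors[anchors.length - 1][1];
  for (let i = 1; i < anchors.length; i++) {
    const [x0, y0] = anchors[i - 1];
    const [x1, y1] = anchors[i];
    if (ramGiB <= x1) return y0 + ((ramGiB - x0) / (x1 - x0)) * (y1 - y0);
  }
  return 1.0;
}
```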
CPU Factor
In GPU mode, the CPU accounts for roughly 40% of performance (tokenization, KV cache management, data transfer). In CPU-only mode, it accounts for 100%.
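A sketch of how a relative CPU score could be blended in, assuming a normalized cpuScore where 1.0 is the baseline CPU. The 40%/100% split comes from the text; the blending formula itself is an assumption.

```ts
// Blend a relative CPU score (1.0 = baseline CPU) into the speed estimate.
// In GPU mode the CPU drives ~40% of the factor; in CPU-only mode, all of it.
function cpuFactor(cpuScore: number, gpuMode: boolean): number {
  return gpuMode ? 0.6 + 0.4 * cpuScore : cpuScore;
}
```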
Final Score (0-100)
The overall score is a weighted average based on how many models your hardware can run at each tier.
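As an illustration, the weighted average could look like the sketch below. The per-tier weights are hypothetical values for the example, not CanIRunAI's actual weighting.

```ts
// Same hypothetical tier labels as in the earlier sketch.
type Tier = "S" | "A" | "B" | "C" | "F";

// Overall score: weighted share of models in each tier, scaled to 0-100.
// The weights below are illustrative assumptions.
const TIER_WEIGHTS: Record<Tier, number> = { S: 1.0, A: 0.8, B: 0.5, C: 0.25, F: 0 };

function overallScore(tiers: Tier[]): number {
  if (tiers.length === 0) return 0;
  const total = tiers.reduce((sum, t) => sum + TIER_WEIGHTS[t], 0);
  return Math.round((total / tiers.length) * 100);
}
```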