
Quantization

Quantization is a technique for reducing the precision of the numbers a neural network uses, such as its weights and activations. Instead of storing each value as a 32-bit floating-point number, a quantized model might store it as an 8-bit integer, which takes a quarter of the space. The result is a model that is smaller, usually faster, and easier to deploy on resource-constrained devices, often with only a small loss in accuracy.
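To make the idea concrete, here is a minimal sketch in plain Python of one simple scheme: symmetric 8-bit quantization with a single shared scale. The function names `quantize_int8` and `dequantize` and the sample weights are made up for illustration; real frameworks use more elaborate variants (per-channel scales, zero points, calibration data).

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale.

    Assumption: symmetric quantization, where the largest absolute
    value in the list maps to 127.
    """
    max_abs = max(abs(v) for v in values)
    # Avoid dividing by zero when every value is 0.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the stored integers."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.8, -1.27, 0.02, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

print(q)         # small integers that each fit in 8 bits
print(restored)  # close to the original weights, but not exact
```

Each restored value differs from its original by at most half of `scale`. That rounding error is the precision the model gives up in exchange for the smaller size.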
