Quantization
Quantization is a technique for reducing the numerical precision of the values a neural network uses, such as its weights and activations. Instead of 32-bit floating-point numbers, a quantized model might store them as 8-bit integers, which cuts their storage roughly fourfold and often speeds up inference. This makes the model easier to deploy on resource-constrained devices, usually with only a small loss in accuracy.
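To make the idea concrete, here is a minimal sketch of one common scheme, symmetric per-tensor 8-bit quantization, written with NumPy. The function names `quantize_int8` and `dequantize` and the random "weights" are illustrative assumptions, not part of any particular library, and real toolkits add refinements such as per-channel scales and calibration data.

```python
import numpy as np

def quantize_int8(x):
    """Map a float32 array to int8 using one shared scale (symmetric, per-tensor)."""
    max_abs = float(np.max(np.abs(x)))
    # The largest magnitude maps to 127; fall back to 1.0 for an all-zero array.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 values from the int8 codes."""
    return q.astype(np.float32) * scale

# Hypothetical weights standing in for one layer of a network.
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.1, size=1000).astype(np.float32)

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

print(weights.nbytes, q.nbytes)              # 4000 1000
print(np.max(np.abs(weights - restored)))    # rounding error, about scale / 2 at most
```

The int8 array takes a quarter of the memory of the float32 original, and each restored value differs from its original by at most about half a quantization step. That bounded rounding error is why accuracy usually drops only slightly.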