v2.11.0

KodiaqQ released this 17 Jun 11:02


Post-training Quantization:

Features:

(OpenVINO) Added Scale Estimation algorithm for 4-bit data-aware weights compression. The optional scale_estimation parameter was introduced to nncf.compress_weights() and can be used to minimize accuracy degradation of compressed models (note that this algorithm increases the compression time).
(OpenVINO) Added GPTQ algorithm for 8/4-bit data-aware weights compression, supporting INT8, INT4, and NF4 data types. The optional gptq parameter was introduced to nncf.compress_weights() to enable the GPTQ algorithm.
(OpenVINO) Added support for models with BF16 weights in the weights compression method, nncf.compress_weights().
(PyTorch) Added support for quantization and weight compression of custom modules.
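
As a rough illustration of the new scale_estimation and gptq options, the sketch below shows one way they might be passed to nncf.compress_weights() for 4-bit data-aware compression. The helper names (data_aware_options, compress_int4), the default ratio and group_size values, and the choice of INT4_SYM mode are assumptions made for this example and do not come from the release notes. Whether the two algorithms can be enabled together is not stated here, so the sketch enables one at a time.

```python
def data_aware_options(algorithm, ratio=0.8, group_size=128):
    """Build keyword arguments for nncf.compress_weights().

    `algorithm` is "scale_estimation" or "gptq", the two optional
    parameters introduced in this release. The ratio and group_size
    defaults are illustrative values, not recommendations.
    """
    if algorithm not in ("scale_estimation", "gptq"):
        raise ValueError(f"unknown algorithm: {algorithm!r}")
    return {algorithm: True, "ratio": ratio, "group_size": group_size}


def compress_int4(ov_model, calibration_items, algorithm="scale_estimation"):
    """Hypothetical wrapper: compress an OpenVINO model to 4-bit weights.

    `calibration_items` is assumed to be an iterable of inputs that the
    model accepts; both algorithms are data-aware and need a dataset.
    """
    # Imported lazily so data_aware_options() works without NNCF installed.
    import nncf

    return nncf.compress_weights(
        ov_model,
        mode=nncf.CompressWeightsMode.INT4_SYM,
        dataset=nncf.Dataset(calibration_items),
        **data_aware_options(algorithm),
    )
```

Calling compress_int4(model, samples, algorithm="gptq") would switch to GPTQ. Either option is expected to lengthen compression time, as noted above for Scale Estimation.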

Fixes:

(OpenVINO) Fixed incorrect detection of nodes with bias in the Fast-/BiasCorrection and ChannelAlignment algorithms.
(OpenVINO, PyTorch) Fixed incorrect behaviour of nncf.compress_weights() when the input model is already compressed.
(OpenVINO, PyTorch) Fixed SmoothQuant algorithm to work with Split ports correctly.

Improvements:

(OpenVINO) Aligned the compression subgraphs produced by nncf.compress_weights() across different FP precisions.
Aligned 8-bit scheme for NPU target device with the CPU.

Examples:

(OpenVINO, ONNX) Updated ignored scope for YOLOv8 examples utilizing a subgraphs approach.

Tutorials:

Compression-aware training:

Features:

(PyTorch) The nncf.quantize method is now the recommended way to initialize quantization for Quantization-Aware Training.
(PyTorch) The placement of compression modules in the model can now be serialized and restored with new API functions: compressed_model.nncf.get_config() and nncf.torch.load_from_config. Documentation for saving and loading a quantized model is available, and the ResNet18 example was updated to use the new API.
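
A minimal sketch of how the new save/restore functions might fit together is shown below. The helper names (pack_checkpoint, restore_quantized) and the checkpoint dictionary keys are assumptions chosen for this example; only compressed_model.nncf.get_config() and nncf.torch.load_from_config come from the notes above. The quantized model is assumed to come from nncf.quantize().

```python
def pack_checkpoint(quantized_model):
    """Collect what is needed to rebuild a quantized PyTorch model.

    The weights come from the usual state_dict(); the placement of the
    compression modules comes from the new nncf.get_config() call.
    The dictionary keys are arbitrary names picked for this sketch.
    """
    return {
        "state_dict": quantized_model.state_dict(),
        "nncf_config": quantized_model.nncf.get_config(),
    }


def restore_quantized(fresh_model, checkpoint, example_input):
    """Rebuild the quantized model from an uncompressed `fresh_model`.

    `example_input` is assumed to be an input the model accepts; it is
    passed to load_from_config along with the saved configuration.
    """
    # Imported lazily so pack_checkpoint() works without NNCF installed.
    from nncf.torch import load_from_config

    restored = load_from_config(fresh_model, checkpoint["nncf_config"], example_input)
    restored.load_state_dict(checkpoint["state_dict"])
    return restored
```

The dictionary returned by pack_checkpoint() can be written to disk with torch.save() and read back with torch.load() before calling restore_quantized().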

Fixes:

(PyTorch) Fixed compatibility with torch.compile.

Improvements:

(PyTorch) Base parameters were extended for the EvolutionOptimizer (LeGR algorithm part).
(PyTorch) Improved wrapping for parameters which are not tensors.

Examples:

(PyTorch) Added an example for STFPM model from Anomalib.

Tutorials:

Quantization-Sparsity Aware Training of PyTorch ResNet-50 Model

Deprecations/Removals:

Removed the extra dependencies for installing backends from setup.py (such as [torch], [tf], [onnx] and [openvino]).
Removed openvino-dev dependency.

Requirements:

Updated PyTorch (2.3.0) and Torchvision (0.18.0) versions.

Acknowledgements

Thanks for contributions from the OpenVINO developer community:
@DaniAffCH
@UsingtcNower
@anzr299
@AdiKsOnDev
@Viditagarwal7479
@truhinnm

