v2.9.0

@KodiaqQ KodiaqQ released this 06 Mar 11:39
· 2168 commits to develop since this release

Post-training Quantization:

Features:

  • (OpenVINO) Added a modified AWQ algorithm for 4-bit data-aware weight compression. The algorithm is applied only to MatMul->Multiply->MatMul patterns. To enable it, an optional awq parameter has been added to nncf.compress_weights(); it can be used to minimize the accuracy degradation of compressed models (note that this option increases compression time).
  • (ONNX) Introduced support for the ONNX backend in the nncf.quantize_with_accuracy_control() method. Users can now perform quantization with accuracy control for onnx.ModelProto. By leveraging this feature, users can enhance the accuracy of quantized models while minimizing performance impact.
  • (ONNX) Added an example based on the YOLOv8n-seg model for demonstrating the usage of quantization with accuracy control for the ONNX backend.
  • (PT) Added SmoothQuant algorithm for PyTorch backend in nncf.quantize().
  • (OpenVINO) Added an example of hyperparameter tuning for the TinyLlama model.
  • Introduced nncf.AdvancedAccuracyRestorerParameters.
  • Introduced the subset_size option for nncf.compress_weights().
  • Introduced TargetDevice.NPU as the replacement for TargetDevice.VPU.
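
As an illustration of the new awq and subset_size options, here is a minimal sketch of how they might be passed to nncf.compress_weights(). The helper names (awq_compression_options, compress_with_awq), the choice of the INT4_SYM mode, and the default subset_size of 128 are assumptions made for this example and are not taken from the release notes. The sketch also assumes that AWQ needs a calibration dataset because the notes describe it as data-aware.

```python
def awq_compression_options(dataset, subset_size=128):
    """Collect keyword arguments for data-aware 4-bit weight compression.

    Hypothetical helper: it only bundles the options named in the notes
    (awq, subset_size) together with the calibration dataset.
    """
    if dataset is None:
        # Assumption: AWQ is data-aware, so a dataset must be supplied.
        raise ValueError("AWQ needs a calibration dataset")
    if subset_size <= 0:
        raise ValueError("subset_size must be positive")
    return {"dataset": dataset, "awq": True, "subset_size": subset_size}


def compress_with_awq(ov_model, calibration_items, subset_size=128):
    """Compress an OpenVINO model to 4 bits with AWQ (requires NNCF)."""
    import nncf  # imported lazily so the helper above works without NNCF

    dataset = nncf.Dataset(calibration_items)
    return nncf.compress_weights(
        ov_model,
        mode=nncf.CompressWeightsMode.INT4_SYM,  # assumed 4-bit mode
        **awq_compression_options(dataset, subset_size),
    )
```

A call such as compress_with_awq(ov_model, calibration_items) is expected to take longer than compression without AWQ, as noted above.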

Fixes:

  • Fixed an issue with serialization/deserialization of API Enums.
  • Fixed an issue with the required arguments of the revert_operations_to_floating_point_precision method.

Improvements:

  • (ONNX) Aligned statistics collection with OpenVINO and PyTorch backends.
  • Extended nncf.compress_weights() with Convolution and Embeddings compression to reduce the memory footprint.

Deprecations/Removals:

  • (OpenVINO) Removed outdated examples with nncf.quantize() for BERT and YOLOv5 models.
  • (OpenVINO) Removed outdated example with nncf.quantize_with_accuracy_control() for SSD MobileNetV1 FPN model.
  • (PyTorch) Deprecated the binarization algorithm.
  • Removed the Post-training Optimization Tool as an OpenVINO backend.
  • Removed Dockerfiles.
  • TargetDevice.VPU was replaced by TargetDevice.NPU.
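
For code or configuration files that still refer to the old device name, a small migration shim along the following lines may help. The helper names (migrate_device_name, to_target_device) and the idea of storing device names as strings are assumptions for illustration; only the VPU to NPU rename comes from these notes.

```python
# Hypothetical mapping from outdated device names to current ones.
# Only the VPU -> NPU rename is taken from the release notes.
_RENAMED_DEVICES = {"VPU": "NPU"}


def migrate_device_name(name: str) -> str:
    """Return the current device name for a possibly outdated one."""
    upper = name.upper()
    return _RENAMED_DEVICES.get(upper, upper)


def to_target_device(name: str):
    """Look up the matching nncf.TargetDevice member (requires NNCF)."""
    from nncf import TargetDevice  # imported lazily; not needed for the mapping

    return TargetDevice[migrate_device_name(name)]
```

With this in place, a stored value of "VPU" resolves to TargetDevice.NPU, which can then be passed as the target_device argument of nncf.quantize().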

Tutorials:

Compression-aware training:

Fixes:

  • (PyTorch) Fixed an issue with missing arguments in NNCFNetworkInterface.get_clean_shallow_copy.

Acknowledgements

Thanks for contributions from the OpenVINO developer community:
@AishwaryaDekhane
@UsingtcNower
@Om-Doiphode