v1.6.0

@vshampor released this 29 Jan 14:25
  • Added AutoQ, an AutoML-based mixed-precision initialization mode for quantization that uses reinforcement learning to select the quantizer configuration that best preserves the quality metric for a given HW architecture type (see the config sketch after this list).
  • NNCF now supports inserting compression operations as pre-hooks to PyTorch operations instead of relying solely on post-hooks; this makes quantization setups more flexible.
  • Improved the pruning algorithm to group dependent filters from different layers of the network and prune them together
  • Extended the ONNX compressed model exporting interface with an option to explicitly name input and output tensors (see the export sketch after this list)
  • Changed the compression scheduler so that the corresponding epoch_step and step methods should now be called at the beginning of the epoch and before the optimizer step, respectively (previously they were called at the end of the epoch and after the optimizer step); see the training-loop sketch after this list
  • Data-dependent compression algorithm initialization is now specified in terms of dataset samples instead of training batches, e.g. "num_init_samples" should be used in place of "num_init_steps" in NNCF config files (see the config sketch after this list).
  • Custom user modules registered for compression can now be marked as ignored for specific compression algorithms
  • Batch norm adaptation is now applied by default for all compression algorithms
  • Bumped target PyTorch version to 1.7.0
  • Custom OpenVINO operations such as "FakeQuantize" that appear in NNCF-exported ONNX models now have their ONNX domain set to org.openvinotoolkit
  • The quantization algorithm will now quantize nn.Embedding and nn.EmbeddingBag weights when targeting CPU
  • Added an option to optimize logarithms of quantizer scales instead of scales themselves directly, a technique which improves convergence in certain cases
  • Added reference checkpoints for filter-pruned models: UNet@Mapillary (25% of filters pruned), SSD300@VOC (40% of filters pruned)
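
The following is a minimal sketch of how the new initialization options might be combined in an NNCF quantization config, written as a Python dict for illustration; the key names used for AutoQ ("autoq") and batch norm adaptation ("num_bn_adaptation_samples"), as well as the sample counts and input shape, are assumptions to be checked against the NNCF config schema.

```python
# Sketch of an NNCF quantization config using the v1.6.0 initialization options.
# Key names such as "autoq" and "num_bn_adaptation_samples" are assumptions here;
# consult the NNCF config schema for the authoritative spelling.
nncf_config_dict = {
    "input_info": {"sample_size": [1, 3, 224, 224]},  # example input shape
    "compression": {
        "algorithm": "quantization",
        "initializer": {
            # Data-dependent init is now counted in dataset samples, not batches:
            "range": {"num_init_samples": 256},
            # AutoQ mixed-precision initialization (RL-based bit-width selection):
            "precision": {"type": "autoq"},
            # Batch norm adaptation now runs by default; the key is shown only to
            # illustrate where its sample count would be tuned:
            "batchnorm_adaptation": {"num_bn_adaptation_samples": 2048},
        },
    },
}
```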
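
The scheduler change can be illustrated with a training-loop sketch; compressed_model and compression_ctrl are assumed to come from nncf.create_compressed_model, while criterion, train_loader, and num_epochs are placeholders.

```python
import torch

# Sketch of the new scheduler call order (v1.6.0+): epoch_step() at the
# beginning of each epoch, step() before optimizer.step().
# compressed_model / compression_ctrl are assumed to come from
# nncf.create_compressed_model(model, nncf_config); criterion, train_loader,
# and num_epochs are placeholders.
optimizer = torch.optim.SGD(compressed_model.parameters(), lr=0.01)

for epoch in range(num_epochs):
    compression_ctrl.scheduler.epoch_step()  # previously called at the end of the epoch
    for inputs, targets in train_loader:
        optimizer.zero_grad()
        loss = criterion(compressed_model(inputs), targets)
        loss.backward()
        compression_ctrl.scheduler.step()    # previously called after optimizer.step()
        optimizer.step()
```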
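
Finally, a sketch of exporting the compressed model to ONNX with explicitly named tensors; the input_names/output_names keyword names and the tensor names themselves are assumptions to verify against the NNCF API reference, and model is a placeholder for the original PyTorch network.

```python
from nncf import NNCFConfig, create_compressed_model

# Sketch: build a compressed model and export it to ONNX with explicitly named
# input/output tensors. The input_names/output_names keyword names and the
# tensor names are assumptions; model is a placeholder for the original network.
nncf_config = NNCFConfig.from_json("nncf_config.json")
compression_ctrl, compressed_model = create_compressed_model(model, nncf_config)

compression_ctrl.export_model(
    "compressed_model.onnx",
    input_names=["image"],
    output_names=["logits"],
)
```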