v1.6.0

@vshampor released this 29 Jan 14:25
  • Added AutoQ, an AutoML-based mixed-precision initialization mode for quantization that uses reinforcement learning to select the quantizer configuration that best preserves the quality metric for a given HW architecture type (see the config sketch after this list).
  • NNCF now supports inserting compression operations as pre-hooks to PyTorch operations instead of relying solely on post-hooks; this makes quantization setups more flexible.
  • Improved the pruning algorithm to group dependent filters from different layers of the network and prune them together
  • Extended the ONNX compressed model exporting interface with an option to explicitly name input and output tensors (see the export sketch after this list)
  • Changed the compression scheduler so that the corresponding epoch_step and step methods should now be called at the beginning of the epoch and before the optimizer step, respectively (previously they were called at the end of the epoch and after the optimizer step); see the training-loop sketch after this list
  • Data-dependent compression algorithm initialization is now specified in terms of dataset samples instead of training batches, e.g. "num_init_samples" should be used in place of "num_init_steps" in NNCF config files (see the config sketch after this list).
  • Custom user modules registered for compression can now be marked as ignored for specific compression algorithms
  • Batch norm adaptation is now applied by default for all compression algorithms
  • Bumped target PyTorch version to 1.7.0
  • Custom OpenVINO operations such as "FakeQuantize" that appear in NNCF-exported ONNX models now have their ONNX domain set to org.openvinotoolkit
  • The quantization algorithm will now quantize nn.Embedding and nn.EmbeddingBag weights when targeting CPU
  • Added an option to optimize logarithms of quantizer scales instead of scales themselves directly, a technique which improves convergence in certain cases
  • Added reference checkpoints for filter-pruned models: UNet@Mapillary (25% of filters pruned), SSD300@VOC (40% of filters pruned)
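
The following is a minimal sketch of how the new initialization options might be combined in an NNCF quantization config, written as a Python dict for illustration; the key names used for AutoQ ("autoq") and batch norm adaptation ("num_bn_adaptation_samples"), as well as the sample counts and input shape, are assumptions to be checked against the NNCF config schema.

```python
# Sketch of an NNCF quantization config using the v1.6.0 initialization options.
# Key names such as "autoq" and "num_bn_adaptation_samples" are assumptions here;
# consult the NNCF config schema for the authoritative spelling.
nncf_config_dict = {
    "input_info": {"sample_size": [1, 3, 224, 224]},  # example input shape
    "compression": {
        "algorithm": "quantization",
        "initializer": {
            # Data-dependent init is now counted in dataset samples, not batches:
            "range": {"num_init_samples": 256},
            # AutoQ mixed-precision initialization (RL-based bit-width selection):
            "precision": {"type": "autoq"},
            # Batch norm adaptation now runs by default; the key is shown only to
            # illustrate where its sample count would be tuned:
            "batchnorm_adaptation": {"num_bn_adaptation_samples": 2048},
        },
    },
}
```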
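
The scheduler change can be illustrated with a training-loop sketch; compressed_model and compression_ctrl are assumed to come from nncf.create_compressed_model, while criterion, train_loader, and num_epochs are placeholders.

```python
import torch

# Sketch of the new scheduler call order (v1.6.0+): epoch_step() at the
# beginning of each epoch, step() before optimizer.step().
# compressed_model / compression_ctrl are assumed to come from
# nncf.create_compressed_model(model, nncf_config); criterion, train_loader,
# and num_epochs are placeholders.
optimizer = torch.optim.SGD(compressed_model.parameters(), lr=0.01)

for epoch in range(num_epochs):
    compression_ctrl.scheduler.epoch_step()  # previously called at the end of the epoch
    for inputs, targets in train_loader:
        optimizer.zero_grad()
        loss = criterion(compressed_model(inputs), targets)
        loss.backward()
        compression_ctrl.scheduler.step()    # previously called after optimizer.step()
        optimizer.step()
```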
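
Finally, a sketch of exporting the compressed model to ONNX with explicitly named tensors; the input_names/output_names keyword names and the tensor names themselves are assumptions to verify against the NNCF API reference, and model is a placeholder for the original PyTorch network.

```python
from nncf import NNCFConfig, create_compressed_model

# Sketch: build a compressed model and export it to ONNX with explicitly named
# input/output tensors. The input_names/output_names keyword names and the
# tensor names are assumptions; model is a placeholder for the original network.
nncf_config = NNCFConfig.from_json("nncf_config.json")
compression_ctrl, compressed_model = create_compressed_model(model, nncf_config)

compression_ctrl.export_model(
    "compressed_model.onnx",
    input_names=["image"],
    output_names=["logits"],
)
```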