v2.1.0

@vshampor vshampor released this 12 Jan 11:29

New features

  • (PyTorch) All PyTorch operations are now NNCF-wrapped automatically.
  • (TensorFlow) Scales for concat-affecting quantizers are now unified.
  • (PyTorch) Pruned filters are now set to 0 in the exported ONNX file instead of being removed from the ONNX definition.
  • (PyTorch, TensorFlow) Extended accuracy-aware training pipeline with the early_exit mode.
  • (PyTorch, TensorFlow) Added support for quantization presets to be specified in NNCF config.
  • (PyTorch, TensorFlow) Extended pruning statistics displayed to the user.
  • (PyTorch, TensorFlow) Users may now register a dump_checkpoints_fn callback to control the location of checkpoint saving during accuracy-aware training.
  • (PyTorch, TensorFlow) Default pruning schedule is now exponential.
  • (PyTorch) The SiLU activation is now supported.
  • (PyTorch) The dynamic graph is no longer traced during compressed model execution, which improves training performance of models compressed with NNCF.
  • (PyTorch) Added BERT-MRPC quantization results and integration instructions to the HuggingFace Transformers integration patch.
  • (PyTorch) Knowledge distillation extended with the option to specify temperature for the softmax mode.
  • (TensorFlow) Added mixed_min_max option for quantizer range initialization.
  • (PyTorch, TensorFlow) ReLU6-based HSwish and HSigmoid activations are now properly fused.
  • (PyTorch - Experimental) Added an algorithm to search the model's architecture for basic building blocks.
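
Several of the items above are driven from the NNCF config file. As a rough illustration of how they might be expressed, the sketch below builds config fragments as Python dictionaries and prints one as JSON. Every key name and value in it (preset, schedule, temperature, the early_exit params, input_info, the "mixed" preset value) is an assumption inferred from the feature names in these notes, and build_config is a hypothetical helper, so check the NNCF config schema for this release before relying on any of it.

```python
import json

# NOTE: every key name and value below is an assumption inferred from the
# feature names in the release notes, not a verified NNCF schema.

# Quantization configured through a preset instead of per-quantizer settings.
quantization = {"algorithm": "quantization", "preset": "mixed"}

# Filter pruning with the schedule spelled out (exponential is now the default).
pruning = {
    "algorithm": "filter_pruning",
    "params": {"schedule": "exponential"},
}

# Knowledge distillation in softmax mode with an explicit temperature.
distillation = {
    "algorithm": "knowledge_distillation",
    "type": "softmax",
    "temperature": 2.0,
}

# Accuracy-aware training switched to the early_exit mode.
accuracy_aware_training = {
    "mode": "early_exit",
    "params": {
        "maximal_relative_accuracy_degradation": 1.0,
        "maximal_total_epochs": 100,
    },
}


def build_config(compression):
    """Hypothetical helper: wrap one compression fragment in a full config dict."""
    return {
        "input_info": {"sample_size": [1, 3, 224, 224]},
        "accuracy_aware_training": accuracy_aware_training,
        "compression": compression,
    }


if __name__ == "__main__":
    # Serialize one variant the way it could be saved to a JSON config file.
    print(json.dumps(build_config(quantization), indent=4))
```

The fragments are kept separate because these notes do not say which algorithms can be combined in a single config; pass whichever fragment applies to build_config.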

Bugfixes

  • (TensorFlow) Fixed a bug where NNCF attempted to quantize an operation with int32 inputs (following a Cast op).
  • (PyTorch, TensorFlow) LeakyReLU is now properly handled during pruning.
  • (PyTorch) Fixed errors with custom modules failing at the determine_subtype stage of metatype assignment.
  • (PyTorch) Fixed handling of modules with torch.nn.utils.weight_norm.WeightNorm applied.