TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
[CVPR 2024 Highlight] TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models
EfficientNetV2 (efficientnetv2-b2) with INT8 and FP32 quantization (QAT and PTQ) on the CK+ dataset: fine-tuning, augmentation, handling of class imbalance, etc.
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
Post-training quantization performed on a model trained on the CLIC dataset.
Archive of research experiments on post-training quantization with TensorRT. Accepted at IEEE EDGE 2024.
[ICCV 2023] Q-Diffusion: Quantizing Diffusion Models.
[ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models"
A post-training quantization (PTQ) method for improving LLMs. Unofficial implementation of https://arxiv.org/abs/2309.02784
A model compression and acceleration toolbox based on PyTorch.
Notes on quantization in neural networks
Comprehensive study on the quantization of various CNN models, employing techniques such as Post-Training Quantization and Quantization Aware Training (QAT).
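As a minimal sketch of how those two techniques differ, using PyTorch's eager-mode quantization API (the model loader, calibration loader, and backend string here are illustrative assumptions, not code from the repository above):

    import torch
    import torch.ao.quantization as tq

    # Post-Training Quantization: calibrate a trained float model, then convert.
    model = load_trained_model().eval()                  # hypothetical helper
    model.qconfig = tq.get_default_qconfig("fbgemm")
    tq.prepare(model, inplace=True)                      # insert observers
    with torch.no_grad():
        for images, _ in calib_loader:                   # hypothetical calibration data
            model(images)                                # collect activation statistics
    tq.convert(model, inplace=True)                      # swap in int8 modules

    # Quantization-Aware Training: fine-tune with fake quantization in the loop.
    model = load_trained_model().train()
    model.qconfig = tq.get_default_qat_qconfig("fbgemm")
    tq.prepare_qat(model, inplace=True)                  # insert fake-quant modules
    # ... fine-tune for a few epochs ...
    tq.convert(model.eval(), inplace=True)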
Implementation of EPTQ - an Enhanced Post-Training Quantization algorithm for DNN compression
Improved the performance of 8-bit PTQ4DM, especially on FID.
Post-training quantization of an NVIDIA NeMo ASR model
Quantization examples for PTQ & QAT
A framework to train a ResUNet architecture, quantize, compile and execute it on an FPGA.
Generating a TensorRT model from ONNX
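A minimal sketch of that workflow, assuming a PyTorch source model (the model class, input shape, and file names are illustrative):

    import torch

    model = MyModel().eval()                   # hypothetical trained model
    dummy = torch.randn(1, 3, 224, 224)        # example input shape
    torch.onnx.export(model, dummy, "model.onnx", opset_version=13)

    # Build a serialized TensorRT engine from the ONNX file, e.g. with trtexec:
    #   trtexec --onnx=model.onnx --saveEngine=model.engine --fp16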
Low-bit (2/4/8/16) Post Training Quantization for ResNet20
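Low-bit PTQ of this kind is typically built on uniform affine quantization; a minimal NumPy sketch of the quantize-dequantize round trip at each bit-width (the random tensor is illustrative):

    import numpy as np

    def quant_dequant(x, num_bits):
        # Uniform affine quantization followed by dequantization.
        qmin, qmax = 0, 2 ** num_bits - 1
        scale = (x.max() - x.min()) / (qmax - qmin)
        zero_point = np.round(qmin - x.min() / scale)
        q = np.clip(np.round(x / scale + zero_point), qmin, qmax)
        return (q - zero_point) * scale        # float approximation of x

    x = np.random.randn(64).astype(np.float32)
    for bits in (2, 4, 8, 16):
        err = np.abs(x - quant_dequant(x, bits)).max()
        print(f"{bits}-bit max abs error: {err:.4f}")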