Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
[ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling
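A generic DAG executor sketch of the parallel function-calling idea behind LLMCompiler: a planner emits a dependency graph of tool calls, and every call whose prerequisites are satisfied runs concurrently. The names, signatures, and dict layout here are assumptions for illustration, not the repo's API:

    import asyncio

    async def execute_dag(tasks, deps):
        # tasks: name -> async fn(results dict); deps: name -> prerequisite names.
        # Each "wave" gathers all currently runnable calls in parallel.
        results, pending = {}, set(tasks)
        while pending:
            ready = [n for n in pending if all(d in results for d in deps[n])]
            assert ready, "dependency cycle in plan"
            outs = await asyncio.gather(*(tasks[n](results) for n in ready))
            results.update(zip(ready, outs))
            pending -= set(ready)
        return results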
EfficientFormerV2 [ICCV 2023] & EfficientFormer [NeurIPS 2022]
Code for the paper "AdderNet: Do We Really Need Multiplications in Deep Learning?"
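A minimal NumPy sketch of the core AdderNet idea: the multiply-accumulate inside convolution is replaced by a negative L1 distance between filter and input patch, so the layer needs only additions. The function name and naive loops are illustrative, not the repo's implementation:

    import numpy as np

    def adder_conv2d(x, filters):
        # x: (H, W, C_in), filters: (K, K, C_in, C_out)
        K, _, _, c_out = filters.shape
        H, W, _ = x.shape
        out = np.zeros((H - K + 1, W - K + 1, c_out))
        for m in range(out.shape[0]):
            for n in range(out.shape[1]):
                patch = x[m:m + K, n:n + K, :]
                for t in range(c_out):
                    # -sum |patch - filter|: similarity via additions only
                    out[m, n, t] = -np.abs(patch - filters[..., t]).sum()
        return out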
[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free
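A toy PyTorch module illustrating the caching idea: deep U-Net features change slowly across adjacent diffusion steps, so the expensive branch is recomputed only every few steps while the cheap shallow path runs every step. The layer split and names below are made up for the sketch and are not the repo's diffusers integration:

    import torch
    import torch.nn as nn

    class ToyCachedUNet(nn.Module):
        def __init__(self, ch=64, interval=3):
            super().__init__()
            self.shallow_in = nn.Conv2d(3, ch, 3, padding=1)
            self.deep = nn.Sequential(*[nn.Conv2d(ch, ch, 3, padding=1)
                                        for _ in range(8)])  # costly branch
            self.shallow_out = nn.Conv2d(2 * ch, 3, 3, padding=1)
            self.interval, self.cache = interval, None

        def forward(self, x, step):
            h = self.shallow_in(x)
            if step % self.interval == 0 or self.cache is None:
                self.cache = self.deep(h)  # refresh the deep features
            return self.shallow_out(torch.cat([h, self.cache], dim=1))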
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
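A NumPy sketch of the dense-and-sparse decomposition: the largest-magnitude outlier weights are split off as a sparse full-precision matrix, and the dense remainder is quantized to low bits. Plain uniform quantization stands in here for SqueezeLLM's sensitivity-weighted non-uniform (k-means) codebooks, and the parameter names are illustrative:

    import numpy as np

    def dense_and_sparse(w, outlier_pct=0.5, bits=3):
        cut = np.percentile(np.abs(w), 100.0 - outlier_pct)
        sparse = np.where(np.abs(w) >= cut, w, 0.0)  # outliers, kept in FP
        dense = w - sparse                           # remainder to quantize
        scale = max(np.abs(dense).max(), 1e-8) / (2 ** (bits - 1) - 1)
        q = np.round(dense / scale)                  # low-bit integer codes
        return q * scale + sparse                    # dequantized reconstruction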
Learning Efficient Convolutional Networks through Network Slimming, in ICCV 2017.
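A PyTorch sketch of the slimming recipe from the paper: an L1 penalty on BatchNorm scale factors pushes unimportant channels toward zero during training, after which channels with small |gamma| are pruned. Hyperparameters and helper names are illustrative:

    import torch
    import torch.nn as nn

    def bn_l1_penalty(model, lam=1e-4):
        # Added to the task loss during training.
        return lam * sum(m.weight.abs().sum()
                         for m in model.modules()
                         if isinstance(m, nn.BatchNorm2d))

    def slim_masks(model, keep_ratio=0.7):
        # Global threshold over all BN gammas; returns a keep-mask per layer.
        gammas = torch.cat([m.weight.abs().detach().flatten()
                            for m in model.modules()
                            if isinstance(m, nn.BatchNorm2d)])
        thresh = gammas.sort().values[int((1 - keep_ratio) * len(gammas))]
        return {name: (m.weight.abs() >= thresh)
                for name, m in model.named_modules()
                if isinstance(m, nn.BatchNorm2d)}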
"LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS", Zhiwen Fan, Kevin Wang, Kairun Wen, Zehao Zhu, Dejia Xu, Zhangyang Wang
List of papers related to neural network quantization in recent AI conferences and journals.
[CVPR 2021] Exploring Sparsity in Image Super-Resolution for Efficient Inference
(CVPR 2021, Oral) Dynamic Slimmable Network
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
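A generic low-bit KV cache quantizer sketch: asymmetric min/max quantization along one axis, which is the basic recipe KVQuant builds on (its full method adds pre-RoPE per-channel key quantization, non-uniform datatypes, and outlier isolation). Names and defaults are illustrative:

    import torch

    def quantize_kv(t, bits=4, dim=-1):
        lo = t.amin(dim=dim, keepdim=True)
        hi = t.amax(dim=dim, keepdim=True)
        scale = (hi - lo).clamp_min(1e-8) / (2 ** bits - 1)
        q = ((t - lo) / scale).round().to(torch.uint8)  # bit-packing elided
        return q, scale, lo                             # dequant: q * scale + lo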
Deep Face Model Compression
[ECCV2022] Efficient Long-Range Attention Network for Image Super-resolution
[ECCV 2022] Official implementation of the paper "DeciWatch: A Simple Baseline for 10x Efficient 2D and 3D Pose Estimation"
Explorations into some recent techniques surrounding speculative decoding
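A sketch of one draft-then-verify round, the loop these explorations revolve around: a small model proposes k tokens, the large model scores them all in a single forward pass, and the matching prefix is accepted. Greedy acceptance is used here for brevity; the full algorithm of Leviathan et al. accepts stochastically with probability min(1, p/q). Assumes batch size 1 and HF-style models whose forward returns .logits:

    import torch

    @torch.no_grad()
    def speculative_step(target, draft, ids, k=4):
        proposal = ids
        for _ in range(k):                               # draft k tokens
            nxt = draft(proposal).logits[:, -1].argmax(-1, keepdim=True)
            proposal = torch.cat([proposal, nxt], dim=1)
        logits = target(proposal).logits                 # one verify pass
        verified = logits[:, ids.shape[1] - 1:].argmax(-1)  # target's picks
        drafted = proposal[:, ids.shape[1]:]                # draft's picks
        n_ok = int((verified[:, :k] == drafted).long().cumprod(-1).sum())
        # keep the accepted prefix, then the target's own next token
        return torch.cat([ids, drafted[:, :n_ok],
                          verified[:, n_ok:n_ok + 1]], dim=1)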
On-device LLM Inference Powered by X-Bit Quantization
Soft Threshold Weight Reparameterization for Learnable Sparsity
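A PyTorch sketch of the reparameterization from this paper: the effective weight is sign(w) * relu(|w| - sigmoid(s)), with the threshold parameter s learned jointly with w, so per-layer sparsity emerges during ordinary training. The class name and init value are illustrative:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class STRLinear(nn.Linear):
        def __init__(self, in_f, out_f, s_init=-8.0):
            super().__init__(in_f, out_f)
            # Very negative init keeps the threshold near zero at the start.
            self.s = nn.Parameter(torch.tensor(s_init))

        def forward(self, x):
            w = torch.sign(self.weight) * F.relu(
                self.weight.abs() - torch.sigmoid(self.s))
            return F.linear(x, w, self.bias)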
[NeurIPS'23] Speculative Decoding with Big Little Decoder
[ICLR 2022] Code for Graph-less Neural Networks: Teaching Old MLPs New Tricks via Distillation (GLNN)
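A sketch of the GLNN training objective: a plain MLP on node features is trained to match a teacher GNN's soft predictions, mixing KL distillation with the usual cross-entropy. The mixing weight and temperature names are illustrative defaults:

    import torch.nn.functional as F

    def glnn_loss(mlp_logits, teacher_logits, labels, lam=0.5, T=1.0):
        ce = F.cross_entropy(mlp_logits, labels)
        kd = F.kl_div(F.log_softmax(mlp_logits / T, dim=-1),
                      F.softmax(teacher_logits / T, dim=-1),
                      reduction='batchmean') * T * T
        return lam * ce + (1 - lam) * kd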