An image recognizer that identifies 20 different pasta shapes.
Updated May 14, 2024 · Jupyter Notebook
A high-performance inference system for large language models, designed for production environments.
🤗 Optimum Intel: Accelerate inference with Intel optimization tools
A high-throughput and memory-efficient inference and serving engine for LLMs
High-efficiency floating-point neural network inference operators for mobile, server, and Web
Port of OpenAI's Whisper model in C/C++
Large Language Model Text Generation Inference
A fast, easy-to-use, production-ready inference server for computer vision supporting deployment of many popular model architectures and fine-tuned models.
A universal scalable machine learning model deployment solution
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
Woodwork is a Python library that provides robust methods for managing and communicating data typing information.
Triton Model Analyzer is a CLI tool that helps you understand the compute and memory requirements of models served by the Triton Inference Server.
Utilities to use the Hugging Face Hub API
TypeDB: the polymorphic database powered by types
A large-scale simulation framework for LLM inference
Seamlessly integrate with top LLM APIs for fast, robust, and scalable querying. Ideal for developers who need quick, reliable AI-powered responses.
ncnn is a high-performance neural network inference framework optimized for the mobile platform