This is the repository holding code and data for "FrugalML: How to Use ML Prediction APIs More Accurately and Cheaply".
Bringing local LLMs to a Minecraft front-end through commands.
LLM Kit - Python Large Language Model Kit for generating data of your choice
Large Multi-Language Models for News Translation
AccIo - Enterprise LLM: Unifying intelligence at your command!
Python-based WebSocket interface for CLI LLaVA inference.
Effortlessly create and manage your own AI infrastructure with Radiantloom AI. Privacy, security, and flexibility meet ease-of-use in this innovative open-source platform.
Mamba for Vision, Perception and Action
Detailed code explanation of Google's Gemini LLM.
How to stream LLM responses using AWS API Gateway Websockets and Lambda
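As a rough illustration of that pattern (not code from the repo), a Lambda handler on a WebSocket route can push tokens back through the API Gateway management API as they are produced; the event fields used here match the WebSocket request context, but the toy token generator and route setup are assumptions.

```python
import json
import boto3

def handler(event, context):
    """Hypothetical Lambda handler for a WebSocket route that streams tokens back."""
    ctx = event["requestContext"]
    connection_id = ctx["connectionId"]
    # The management API endpoint is derived from the request context.
    endpoint_url = f"https://{ctx['domainName']}/{ctx['stage']}"
    gateway = boto3.client("apigatewaymanagementapi", endpoint_url=endpoint_url)

    prompt = json.loads(event.get("body") or "{}").get("prompt", "")

    # Placeholder for a streaming LLM call; swap in a real streaming client here.
    def generate_tokens(text):
        for word in ("Echo:", *text.split()):
            yield word + " "

    # Forward each token to the connected client as soon as it is available.
    for token in generate_tokens(prompt):
        gateway.post_to_connection(ConnectionId=connection_id, Data=token.encode("utf-8"))

    return {"statusCode": 200}
```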
In this workshop, we demonstrate how to choose the right container and instance types, optimize container parameters, set up the right autoscaling policies, and use APIs to get recommendations with Amazon SageMaker.
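For the recommendations piece specifically, the SageMaker Inference Recommender can be driven from boto3; the sketch below is a hedged example (job name, role ARN, and model package ARN are placeholders, not material from the workshop).

```python
import boto3

sagemaker = boto3.client("sagemaker")

# Ask Inference Recommender to benchmark instance types for a registered model.
sagemaker.create_inference_recommendations_job(
    JobName="llm-endpoint-rightsizing",  # placeholder name
    JobType="Default",
    RoleArn="arn:aws:iam::111122223333:role/SageMakerExecutionRole",  # placeholder
    InputConfig={
        "ModelPackageVersionArn": (
            "arn:aws:sagemaker:us-east-1:111122223333:model-package/my-llm/1"  # placeholder
        )
    },
)

# Each recommendation pairs an endpoint configuration with cost/latency metrics.
job = sagemaker.describe_inference_recommendations_job(JobName="llm-endpoint-rightsizing")
for rec in job.get("InferenceRecommendations", []):
    print(rec["EndpointConfiguration"]["InstanceType"], rec["Metrics"])
```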
Simple chat interface for local AI using llama-cpp-python and llama-cpp-agent
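A minimal sketch of such a chat loop with llama-cpp-python (the model path and generation settings are assumptions, and llama-cpp-agent's higher-level helpers are omitted):

```python
from llama_cpp import Llama

# Path to a local GGUF model is an assumption; point it at any chat-tuned model.
llm = Llama(model_path="models/mistral-7b-instruct.Q4_K_M.gguf", n_ctx=2048)

history = [{"role": "system", "content": "You are a helpful assistant."}]

while True:
    user_input = input("you> ")
    if user_input.strip().lower() in {"exit", "quit"}:
        break
    history.append({"role": "user", "content": user_input})

    # create_chat_completion applies the model's chat template and returns
    # an OpenAI-style response dict.
    reply = llm.create_chat_completion(messages=history, max_tokens=256)
    answer = reply["choices"][0]["message"]["content"]
    history.append({"role": "assistant", "content": answer})
    print("ai>", answer)
```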
Plug-and-play implementation of Tree of Thoughts: Deliberate Problem Solving with Large Language Models, which elevates model reasoning by at least 70%.
This repository contains a question-answering model exposed as an interface that retrieves answers from a vector database for a given question. Embeddings (tokenised vectors) are computed via OpenAI API calls and inserted into ChromaDB for retrieval-augmented generation (RAG). An OpenAI API key is required to run this service.
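A minimal sketch of that flow, assuming an in-memory ChromaDB collection, OpenAI embeddings, and a gpt-3.5-turbo answering step (the collection name and sample documents are made up for illustration):

```python
import os
import chromadb
from chromadb.utils import embedding_functions
from openai import OpenAI

# OpenAI embeddings are used both when indexing and when querying the collection.
openai_ef = embedding_functions.OpenAIEmbeddingFunction(
    api_key=os.environ["OPENAI_API_KEY"],
    model_name="text-embedding-ada-002",
)

chroma = chromadb.Client()  # in-memory; a real service would use a persistent client
docs = chroma.get_or_create_collection(name="knowledge", embedding_function=openai_ef)
docs.add(
    ids=["doc1", "doc2"],
    documents=[
        "ChromaDB is an open-source embedding database.",
        "RAG augments an LLM prompt with retrieved context.",
    ],
)

def answer(question: str) -> str:
    # Retrieve the most similar passages, then let the LLM answer from them only.
    hits = docs.query(query_texts=[question], n_results=2)
    context = "\n".join(hits["documents"][0])
    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": f"Answer using only this context:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content

print(answer("What is RAG?"))
```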
A framework for multiple LLM models to operate in a non-adversarial fashion based on the structure of a bee colony working together to maintain a hive.
Our project addresses the challenge of multi-document summarization with Large Language Models (LLMs), which are constrained by token length limitations. We propose a novel approach that combines the strengths of LLMs and Maximal Marginal Relevance (MMR).
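For reference, MMR greedily selects items that are relevant to the query yet dissimilar to what has already been chosen; a small NumPy sketch of that selection step (the lambda weight and toy embeddings are assumptions, not the project's code):

```python
import numpy as np

def mmr(query_vec, doc_vecs, k=3, lam=0.7):
    """Select k documents balancing query relevance against redundancy (MMR)."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

    relevance = [cos(query_vec, d) for d in doc_vecs]
    selected, candidates = [], list(range(len(doc_vecs)))

    while candidates and len(selected) < k:
        def score(i):
            # Penalise similarity to anything already selected.
            redundancy = max((cos(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Toy example: pick 2 of 3 chunk embeddings that are relevant but not redundant.
rng = np.random.default_rng(0)
chunks = rng.normal(size=(3, 8))
query = rng.normal(size=8)
print(mmr(query, chunks, k=2))
```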
A Production-Ready, Scalable RAG-powered LLM-based Context-Aware QA App
GUI for GGML Alpaca models