Issues: huggingface/text-generation-inference


Issues list

LoRA adapter from local model leads to error
#1893 opened May 14, 2024 by philschmid
2 of 4 tasks
TGI 2.0.2 CodeLlama error piece id is out of range.
#1891 opened May 14, 2024 by philschmid
2 of 4 tasks
Min P generation parameter
#1885 opened May 13, 2024 by LawrenceGrigoryan
Question about KV cache
#1883 opened May 13, 2024 by martinigoyanes
SnapKV support
#1881 opened May 13, 2024 by icyxp
concurrent requests permit limit is broken
#1877 opened May 10, 2024 by oOraph
1 of 4 tasks
text generation details not working when stream=False
#1876 opened May 10, 2024 by uyeongkim
2 of 4 tasks
Automatic NUMA binding
#1874 opened May 10, 2024 by fxmarty
[Question] Onnx support in TGI
#1873 opened May 9, 2024 by Ben-Epstein
Regarding llama3-70b-instruct
#1864 opened May 6, 2024 by chintanshrinath
Install error when installing the vllm package
#1862 opened May 6, 2024 by for-just-we
2 of 4 tasks
TGI 2.0.2 encounters "CUDA is not available"
#1861 opened May 6, 2024 by Cucunnber
2 of 4 tasks