I have followed the steps in the official Unsloth notebook (Alpaca + Llama-3 8B full example) and fine-tuned a Llama 3 8B model, which I now want to serve using vLLM. However, it does not seem to work.
This is the command I used to serve the local model, where /content/merged_llama3 is the directory containing all the model files:
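Something along these lines (a minimal sketch assuming vLLM's OpenAI-compatible server entrypoint; any flags beyond --model are assumptions, not necessarily what I ran):

```bash
# Sketch: serve a merged local checkpoint with vLLM's
# OpenAI-compatible API server; only --model is known from the post.
python -m vllm.entrypoints.openai.api_server \
    --model /content/merged_llama3
```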
This returns an error:
I don't think I need to pass a quantization method, since it should already be recorded in the config file; the error looks like a problem reading those files. In addition, I did save the model and push it to the Hub using the code given in the Unsloth notebook.
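For reference, the merge-and-push step in the notebook looks roughly like this (a sketch: the repo id and token are placeholders, and merged_16bit is my assumption about the save method used):

```python
# Sketch of the Unsloth save/push step; repo id and token are placeholders.
# save_method="merged_16bit" merges the LoRA adapters into the fp16 base
# weights, so the output directory holds a standalone model vLLM can load.
model.save_pretrained_merged(
    "/content/merged_llama3",
    tokenizer,
    save_method = "merged_16bit",
)
model.push_to_hub_merged(
    "your-username/merged_llama3",   # placeholder Hub repo id
    tokenizer,
    save_method = "merged_16bit",
    token = "hf_...",                # placeholder Hugging Face token
)
```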
My model files:

What went wrong?