[Usage]: Seems nn.module definition may affect the output tokens. Don't know the reason. #4805
Comments
This is quite interesting. Can you double check by setting …
If this is real, I suspect it has something to do with a memory leak and the PyTorch caching allocator. Maybe we leaked some object reference, and when you create a new nn module, the PyTorch caching allocator recycles memory it thinks is no longer used but that is actually still used somewhere? I might be wrong, anyway. If this is the case, the root cause would be quite difficult to debug.
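A minimal way to probe the leaked-reference hypothesis (a sketch, not something tried in this thread; it reuses the reporter's model path and prompt) is to force a garbage collection after the first LLM is deleted and count the tensors that are still reachable:

```python
import gc

import torch
from vllm import LLM

def count_live_tensors() -> int:
    """Count torch.Tensor objects still tracked by the garbage collector."""
    gc.collect()
    return sum(1 for obj in gc.get_objects() if isinstance(obj, torch.Tensor))

before = count_live_tensors()

llm = LLM(model="/home/zhenzhong/model/chatglm2-6b", trust_remote_code=True, seed=666)
llm.generate(["你好"])
del llm

after = count_live_tensors()

# If `after` stays far above `before`, tensor references leaked when the LLM was
# torn down, which would make the caching-allocator explanation more plausible.
print(f"live tensors before: {before}, after deleting the LLM: {after}")
```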
@simon-mo Hi, here is the code:

from vllm import LLM
import torch

prompts = ["你好"]

llm1 = LLM(model="/home/zhenzhong/model/chatglm2-6b", trust_remote_code=True, seed=666)  # Create an LLM.
torch.nn.Linear(in_features=4096, out_features=8888, bias=True, dtype=torch.bfloat16)
outputs1 = llm1.generate(prompts)  # Generate texts from the prompts.
print(outputs1)

llm2 = LLM(model="/home/zhenzhong/model/chatglm2-6b", trust_remote_code=True, seed=666)  # Create an LLM.
torch.nn.Linear(in_features=4096, out_features=9999, bias=True, dtype=torch.bfloat16)
outputs2 = llm2.generate(prompts)  # Generate texts from the prompts.

llm3 = LLM(model="/home/zhenzhong/model/chatglm2-6b", trust_remote_code=True, seed=666)  # Create an LLM.
outputs3 = llm3.generate(prompts)  # Generate texts from the prompts.

print("outputs1 = ", outputs1)
print("outputs2 = ", outputs2)
print("outputs3 = ", outputs3)

I set the same seed, but it still produces three different results. Actually, LLM() already has a default seed (seed: int = 0).
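One side effect worth ruling out (my assumption, not something established in this thread): constructing `torch.nn.Linear` initializes its weights from PyTorch's global random generator, so the extra constructions advance the generator's state even though the modules are never used. Whether that state actually feeds into vLLM's sampling here is not verified; the snippet below only shows that the state does change:

```python
import torch

torch.manual_seed(666)
state_before = torch.get_rng_state().clone()

# Weight initialization draws from the default generator, even for an unused module.
torch.nn.Linear(in_features=4096, out_features=8888, bias=True, dtype=torch.bfloat16)

state_after = torch.get_rng_state()
print("RNG state unchanged:", torch.equal(state_before, state_after))  # prints False
```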
Your current environment
Env: CPU device
vllm: 0.4.2+cpu
For this code, as long as I define the torch.nn modules within the scope of the current vLLM model, they affect the output tokens even though I never use them. In other words, if I move these unused nn modules above the LLM() definition, they don't affect the results (sketched below).
llm1 is the same as llm2, because both define an nn.Module inside the current model's scope. But llm3 is different, because I don't define anything there, and llm3 gives the correct result I want.
Shouldn't all three produce the same result? Please check the screenshots or the text.
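For reference, a sketch of the reordering described above (same model path and prompt as the report; the observation that this restores the expected output is the reporter's, not independently verified):

```python
import torch
from vllm import LLM

prompts = ["你好"]

# The unused modules are constructed *before* the LLM, which per the report
# leaves the generated output identical to a run with no extra modules at all.
torch.nn.Linear(in_features=4096, out_features=8888, bias=True, dtype=torch.bfloat16)
torch.nn.Linear(in_features=4096, out_features=9999, bias=True, dtype=torch.bfloat16)

llm = LLM(model="/home/zhenzhong/model/chatglm2-6b", trust_remote_code=True, seed=666)
outputs = llm.generate(prompts)
print("outputs = ", outputs)
```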
Output screenshots:
Besides, if I change the out_features of the torch.nn module, it also affects the output tokens.
I only change out_features, but the results are different.
outputs:
As you can see, I never actually use these nn modules, yet they still affect the results. I provide five output results, and they are all different; the only change between runs is the nn.Module definition.
Need some help. Thank you!
How would you like to use vllm
It seems that the nn.Module definition may affect the output tokens, and I don't know the reason.