Questions about the parameters needed for forward propagation during model conversion #926

Open
huangshilong911 opened this issue May 11, 2024 · 2 comments


@huangshilong911

Hello John, once again I sincerely have to ask for your advice.

In a previous demo, I was able to successfully convert a .pth to a .engine model with the following code:

x = torch.ones((1, 3, 1024, 1024)).cuda()

model_trt = torch2trt(model, [x], fp16_mode=True)
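
(For reference, my understanding is that the actual .engine file comes from serializing the engine that torch2trt attaches to the converted module; a minimal sketch, assuming model_trt.engine holds the deserialized TensorRT engine:)

# Hedged sketch: write the serialized TensorRT engine to a standalone file
# (assumes the converted TRTModule exposes the engine as model_trt.engine)
with open("model.engine", "wb") as f:
    f.write(model_trt.engine.serialize())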

However, when converting the SlimSAM model, the forward pass requires additional input parameters, so I had to modify the code to the following form:


            batched_input = [{
                'image': image,
                'original_size': original_size,
                'point_coords': point_coords,
                'point_labels': point_labels,
                'boxes': boxes,
                'mask_inputs': mask_inputs
            }]

            multimask_output = False

            model_trt = torch2trt(SlimSAM_model, [batched_input, multimask_output], fp16_mode=True)

            torch.save(model_trt.state_dict(), "./vit_b_slim_step2_trt.pth")

But even so, the script still does not run: it keeps telling me that I passed the wrong parameters. Is there any way to resolve this?
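
As far as I can tell, torch2trt calls the module directly with whatever inputs I pass in, so I also wonder whether it needs a module whose forward takes plain tensors instead of a list of dicts. A rough sketch of the kind of wrapper I have in mind (untested; the class name and the output keys are my own assumptions):

import torch
import torch.nn as nn

class SlimSAMTensorWrapper(nn.Module):
    """Hypothetical wrapper exposing tensor-only inputs for torch2trt."""
    def __init__(self, sam_model, original_size=(1024, 1024), multimask_output=False):
        super().__init__()
        self.sam_model = sam_model
        self.original_size = original_size
        self.multimask_output = multimask_output

    def forward(self, image, point_coords, point_labels):
        # Rebuild the batched_input structure that SAM's forward expects
        batched_input = [{
            'image': image,
            'original_size': self.original_size,
            'point_coords': point_coords,
            'point_labels': point_labels,
        }]
        outputs = self.sam_model(batched_input, self.multimask_output)
        # Assumes the forward returns a list of dicts with these keys
        return outputs[0]['masks'], outputs[0]['iou_predictions']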

In addition, I looked at the nanosam work you recommended to me, which also appears to add parameters to the conversion process and seems informative. Should I follow the same approach as that work? If so, what should I do?

The relevant operations for the nanosam work are as follows:
[screenshot of the nanosam conversion steps]
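
If I read it correctly, nanosam exports the image encoder on its own (with a plain image tensor as input) rather than converting the whole SAM model at once. A minimal sketch of what I think the equivalent would look like for SlimSAM (shapes and file names are my guesses):

import torch
from torch2trt import torch2trt

# Convert only the image encoder, which takes a single image tensor
image_encoder = SlimSAM_model.image_encoder.eval().cuda()
x = torch.ones((1, 3, 1024, 1024)).cuda()

encoder_trt = torch2trt(image_encoder, [x], fp16_mode=True)
torch.save(encoder_trt.state_dict(), "image_encoder_trt.pth")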

@huangshilong911
Author

Here is my conversion script. Following up on the previous question, I tried to pass the parameters required by the forward pass during the conversion:

import torch
from torch2trt import torch2trt
from torch2trt import TRTModule

device = torch.device("cuda")
print("CUDA visible devices: " + str(torch.cuda.device_count()))
print("CUDA Device Name: " + str(torch.cuda.get_device_name(device)))

model_path = "/home/jetson/Workspace/aicam/ircamera/vit_b_slim_step2_.pth"
SlimSAM_model = torch.load(model_path)
SlimSAM_model.image_encoder = SlimSAM_model.image_encoder.module
SlimSAM_model.to(device)
SlimSAM_model.eval()
print("model_path:",model_path)

SlimSAM_model.eval().cuda()

image = torch.zeros((3, 1024, 1024)).cuda()            # dummy input image (C, H, W)
original_size = (1024, 1024)                           # original image size (H, W)
point_coords = torch.zeros((1, 5, 2)).cuda()           # dummy point prompt coordinates (B, N, 2)
point_labels = torch.zeros((1, 5)).cuda()              # dummy point prompt labels (B, N)
boxes = torch.zeros((1, 4)).cuda()                     # dummy box prompt (B, 4)
mask_inputs = torch.zeros((1, 1, 1024, 1024)).cuda()   # dummy mask input (B, 1, H, W)

batched_input = [{
    'image': image,
    'original_size': original_size,
    'point_coords': point_coords,
    'point_labels': point_labels,
    'boxes': boxes,
    'mask_inputs': mask_inputs
}]

multimask_output = False

model_trt = torch2trt(SlimSAM_model, [batched_input, multimask_output], fp16_mode=True)

torch.save(model_trt.state_dict(), "./vit_b_slim_step2_trt.pth")

model_trt_engine = TRTModule()
model_trt_engine.load_state_dict(model_trt.state_dict())
torch.save(model_trt_engine.state_dict(), "vit_b_slim_step2_trt.engine")

But the following error was reported:

CUDA visible devices: 1
CUDA Device Name: Orin
model_path: /home/jetson/Workspace/aicam/ircamera/vit_b_slim_step2_.pth
Traceback (most recent call last):
  File "convert-sam-trt_copy3.py", line 54, in <module>
    model_trt = torch2trt(SlimSAM_model, [batched_input, multimask_output], fp16_mode=True)
  File "/home/jetson/miniconda3/envs/sam0/lib/python3.8/site-packages/torch2trt-0.5.0-py3.8.egg/torch2trt/torch2trt.py", line 558, in torch2trt
    outputs = module(*inputs)
  File "/home/jetson/miniconda3/envs/sam0/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1111, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jetson/miniconda3/envs/sam0/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/jetson/Workspace/aicam/ircamera/segment_anything_kd/modeling/sam.py", line 98, in forward
    image_embeddings = self.image_encoder(input_images)
  File "/home/jetson/miniconda3/envs/sam0/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1111, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/jetson/Workspace/aicam/ircamera/segment_anything_kd/modeling/image_encoder.py", line 136, in forward
    qkv_emb1 = torch.cat([qkv_emb1,qkv_outputs[num]],dim=0)
RuntimeError: Sizes of tensors must match except in dimension 0. Expected size 300 but got size 396 for tensor number 1 in the list.

@jaybdub
Contributor

jaybdub commented May 17, 2024

Hi @huangshilong911 ,

Thanks for reaching out again.

It looks like the error occurs during the execution of the PyTorch model.

Do you experience this error simply by calling the following?

output = SlimSAM_model(*[batched_input, multimask_output])
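
For example, something along these lines (reusing the dummy inputs from your script; torch.no_grad() just avoids building autograd state):

import torch

# Run the plain PyTorch forward pass, without torch2trt involved
with torch.no_grad():
    output = SlimSAM_model(batched_input, multimask_output)

print(type(output))

If the same size-mismatch error appears here, that would suggest the issue is in the model or its inputs rather than in the conversion itself.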

Best,
John
