Code fails to run in both jupyter and console #166

Open

marcjasner opened this issue Nov 16, 2022 · 11 comments
@marcjasner

marcjasner commented Nov 16, 2022

Sorry for the less than descriptive title, but I wasn't sure how else to title it.

I've got a 4 GB Jetson Nano (Seeed Studio Jetson reComputer J2010 carrier board) with a 128 GB SSD as the root storage device. It's running JetPack 4.6 (output of 'apt-cache show nvidia-jetpack' below).

```
$ sudo apt-cache show nvidia-jetpack
Package: nvidia-jetpack
Version: 4.6-b199
Architecture: arm64
Maintainer: NVIDIA Corporation
Installed-Size: 194
Depends: nvidia-cuda (= 4.6-b199), nvidia-opencv (= 4.6-b199), nvidia-cudnn8 (= 4.6-b199), nvidia-tensorrt (= 4.6-b199), nvidia-visionworks (= 4.6-b199), nvidia-container (= 4.6-b199), nvidia-vpi (= 4.6-b199), nvidia-l4t-jetson-multimedia-api (>> 32.6-0), nvidia-l4t-jetson-multimedia-api (<< 32.7-0)
Homepage: http://developer.nvidia.com/jetson
Priority: standard
Section: metapackages
Filename: pool/main/n/nvidia-jetpack/nvidia-jetpack_4.6-b199_arm64.deb
Size: 29368
SHA256: 69df11e22e2c8406fe281fe6fc27c7d40a13ed668e508a592a6785d40ea71669
SHA1: 5c678b8762acc54f85b4334f92d9bb084858907a
MD5sum: 1b96cd72f2a434e887f98912061d8cfb
Description: NVIDIA Jetpack Meta Package
Description-md5: ad1462289bdbc54909ae109d1d32c0a8

Package: nvidia-jetpack
Version: 4.6-b197
Architecture: arm64
Maintainer: NVIDIA Corporation
Installed-Size: 194
Depends: nvidia-cuda (= 4.6-b197), nvidia-opencv (= 4.6-b197), nvidia-cudnn8 (= 4.6-b197), nvidia-tensorrt (= 4.6-b197), nvidia-visionworks (= 4.6-b197), nvidia-container (= 4.6-b197), nvidia-vpi (= 4.6-b197), nvidia-l4t-jetson-multimedia-api (>> 32.6-0), nvidia-l4t-jetson-multimedia-api (<< 32.7-0)
Homepage: http://developer.nvidia.com/jetson
Priority: standard
Section: metapackages
Filename: pool/main/n/nvidia-jetpack/nvidia-jetpack_4.6-b197_arm64.deb
Size: 29356
SHA256: 104cd0c1efefe5865753ec9b0b148a534ffdcc9bae525637c7532b309ed44aa0
SHA1: 8cca8b9ebb21feafbbd20c2984bd9b329a202624
MD5sum: 463d4303429f163b97207827965e8fe0
Description: NVIDIA Jetpack Meta Package
Description-md5: ad1462289bdbc54909ae109d1d32c0a8
```
I've set up a Python 3.6 virtualenv and followed the installation instructions for all required packages. There were no errors during any of the installations, and I've verified that all of the packages import properly from the Python command line. When I run the Jupyter notebook 'live_demo.ipynb' I am able to run all of the steps up until the following step:

```python
import torch2trt

model_trt = torch2trt.torch2trt(model, [data], fp16_mode=True, max_workspace_size=1<<25)
```

When I attempt to run that step the system thinks about it for a bit, and then a dialog pops up saying the Python kernel has crashed and will be automatically restarted. I cannot get past this step.

To help debug/diagnose, I took all of the code from the notebook and incrementally added it to a Python file to see if I could reproduce the issue. The code I have so far is:

```python
import cv2
import json
import trt_pose.coco
import trt_pose.models
import torch
import torch2trt
from torch2trt import TRTModule
import time
import torchvision.transforms as transforms
import PIL.Image
from trt_pose.draw_objects import DrawObjects
from trt_pose.parse_objects import ParseObjects
from jetcam.usb_camera import USBCamera
from jetcam.csi_camera import CSICamera
from jetcam.utils import bgr8_to_jpeg
import ipywidgets
from IPython.display import display

# Load the COCO keypoint topology used by trt_pose
with open('human_pose.json', 'r') as f:
    human_pose = json.load(f)

topology = trt_pose.coco.coco_category_to_topology(human_pose)

num_parts = len(human_pose['keypoints'])
num_links = len(human_pose['skeleton'])

# Build the model and load the pretrained weights
model = trt_pose.models.resnet18_baseline_att(num_parts, 2 * num_links).cuda().eval()
MODEL_WEIGHTS = 'resnet18_baseline_att_224x224_A_epoch_249.pth'
model.load_state_dict(torch.load(MODEL_WEIGHTS))

WIDTH = 224
HEIGHT = 224

# Convert to a TensorRT engine and save the optimized weights
data = torch.zeros((1, 3, HEIGHT, WIDTH)).cuda()
print("Calling torch2trt.torch2trt\n")
model_trt = torch2trt.torch2trt(model, [data], fp16_mode=True, max_workspace_size=1<<25)
print("Done\n")
OPTIMIZED_MODEL = 'resnet18_baseline_att_224x224_A_epoch_249_trt.pth'
print("Calling torch.save\n")
torch.save(model_trt.state_dict(), OPTIMIZED_MODEL)
print("Done\n")
```

When I run this code I see "Calling torch2trt.torch2trt" in the console and then, after a pause, the following error repeated many times:

```
[TensorRT] ERROR: 3: [builderConfig.cpp::canRunOnDLA::341] Error Code 3: Internal Error (Parameter check failed at: optimizer/api/builderConfig.cpp::canRunOnDLA::341, condition: dlaEngineCount > 0 )
```

After that the system seems to hang. The desktop displays a low-memory warning (if I run 'watch -n1 free -h' in another console window I can see free memory drop from 3+ GB to as little as 96 MB). After some time the process just reports "Killed" and exits back to the command line.
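
From what I can tell, the repeated canRunOnDLA error is just TensorRT noting that the Nano has no DLA engines (dlaEngineCount is 0), so the fatal part looks like the out-of-memory kill during the engine build. A minimal sketch of what I could try next, assuming memory pressure is the cause (the smaller 1<<24 workspace is just a guess, not a verified fix):

```python
import gc

# Untested idea: free cached GPU memory and shrink the builder workspace
# before converting, in case the engine build is exhausting the Nano's 4 GB.
gc.collect()
torch.cuda.empty_cache()
model_trt = torch2trt.torch2trt(model, [data], fp16_mode=True,
                                max_workspace_size=1<<24)
```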

I am at a loss. Can you please provide any helpful information you have that might help me correct this issue and continue?

Thanks
Marc

@ArcadeHustle

bump

@janezlapajne

@marcjasner have you figured out what the problem was? In my case I am able to get past this point, but I see the same message in the console, and the FPS is lower (8) than claimed (22). My program then crashes at the line `torch.save(model_trt.state_dict(), OPTIMIZED_MODEL)`, where the optimized model weights should be saved.
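
In case it is the same memory pressure, here is a minimal sketch of what I may try before the save (variable names as in the notebook; just an idea, not a verified fix):

```python
# Drop the original PyTorch model before serializing the TensorRT engine,
# to free GPU memory ahead of torch.save (an untested workaround).
del model
torch.cuda.empty_cache()
torch.save(model_trt.state_dict(), OPTIMIZED_MODEL)
```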

@marcjasner
Author

@janezlapajne, no, I never resolved that issue. I ended up investigating other pose detection methods and found that the jetson_inference posenet code worked reliably and gave pretty good performance (about 17 FPS).

@janezlapajne

Thanks for telling me, I wish I had found out sooner - I wouldn't have spent so much time on it. Can you point me to the repo/model? I hope it can be set up and tested quickly. Is it this one: https://github.com/dusty-nv/jetson-inference/blob/master/docs/posenet.md ?

If you have any other suggestions etc. please let me know. I would just like to make it work ASAP.

@dusty-nv
Member

@janezlapajne yes, that is the one

@marcjasner
Author

Yep, as @dusty-nv pointed out, that is the correct one. Also, Dusty is super helpful on the Nvidia forums. He's helped me a number of times!

I think you'll find getting the code compiled and the posenet sample running is pretty straightforward, and adapting it to any projects you're working on should be similarly easy.
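
For reference, a minimal sketch along the lines of the posenet.py example in that repo (the network name and camera URI are assumptions; adjust them for your setup):

```python
import jetson.inference
import jetson.utils

# Load the body pose network and open a camera stream (CSI here; use
# "/dev/video0" for a USB camera).
net = jetson.inference.poseNet("resnet18-body", threshold=0.15)
camera = jetson.utils.videoSource("csi://0")
display = jetson.utils.videoOutput("display://0")

while display.IsStreaming():
    img = camera.Capture()
    poses = net.Process(img, overlay="links,keypoints")
    display.Render(img)
    display.SetStatus("poseNet | {:.0f} FPS".format(net.GetNetworkFPS()))
```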

@janezlapajne

Hello, thank you both! Yes, I agree, @dusty-nv never disappoints 💪
Also, this jetson-inference package is amazing - it made the pose model work in literally a couple of minutes, inside a Docker container. Sincerely, dusty, great work! I really appreciate it! A few months ago I also used the SSD detector from jetson-inference and it worked like a charm. The retraining process worked out of the box, with an additional script that automatically downloads the open-images dataset. Really helpful if you want to quickly test and prototype something.

Anyway, to conclude, I will ask something else (correct me if this is not the right place to ask). We plan to use two models concurrently on a Jetson Nano - a pose model and a detection model, preferably YOLOv7 via the DeepStream package. Can the Jetson Nano handle resources appropriately in such cases? Thank you!

@MAVProxyUser

For anyone that may care, I got tired of messing with all the various pose implementations that did NOT work in the Jetson environment out of the box, due to compile issues or poor documentation.

I moved over to edge-based processing ON the camera instead, via Luxonis. Here is the Luxonis pose example; all resources run on a Movidius VPU inside the camera: https://github.com/luxonis/depthai-experiments/tree/master/gen2-human-pose
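
A rough sketch of what the on-camera setup looks like with the depthai API (node wiring only; the blob path is hypothetical, and the linked example has the full keypoint decoding):

```python
import depthai as dai

# Build a pipeline that runs the pose network entirely on the camera's VPU.
pipeline = dai.Pipeline()
cam = pipeline.create(dai.node.ColorCamera)
cam.setPreviewSize(456, 256)  # input size assumed from the example's model

nn = pipeline.create(dai.node.NeuralNetwork)
nn.setBlobPath("human-pose-estimation.blob")  # hypothetical blob path
cam.preview.link(nn.input)

xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("nn")
nn.out.link(xout.input)

with dai.Device(pipeline) as device:
    q = device.getOutputQueue("nn", maxSize=4, blocking=False)
    while True:
        result = q.get()  # NNData with raw heatmaps; decode as in the example
```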

@dusty-nv
Member

> a pose model and a detection model, preferably YOLOv7 via the DeepStream package. Can the Jetson Nano handle resources appropriately in such cases?

Due to its limited compute resources, it's hard to say with the Nano, so you may need to play around with it. Also, DeepStream supports pose estimation (https://docs.nvidia.com/tao/tao-toolkit/text/bodypose_estimation/bodyposenet.html), so ultimately you may find that doing both detection + pose in DeepStream gives you better performance.

@janezlapajne

@dusty-nv ok, will see how it goes. Will report to the forum if I have any further questions.

@patriciamold33

@marcjasner:

> When I run the jupyter notebook 'live_demo.ipynb' I am able to run all of the steps up until the following step:

```python
import torch2trt

model_trt = torch2trt.torch2trt(model, [data], fp16_mode=True, max_workspace_size=1<<25)
```

I have a similar issue, except that it doesn't crash; the code just stays stuck... Does anyone have some advice for me?
