Docker Error - Unknown or Invalid Runtime Name: Nvidia #132

Open
Buddies-as-you-know opened this issue Nov 15, 2023 · 9 comments

@Buddies-as-you-know

Buddies-as-you-know commented Nov 15, 2023

I am encountering a runtime error with Docker when trying to use the Nvidia runtime. The error occurs even though an initial Docker command runs successfully and I have since edited the Docker configuration.

Steps to Reproduce

  1. Run the following Docker command which executes successfully:

    sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
  2. Edit /etc/docker/daemon.json as follows:

    {
        "runtimes": {
            "nvidia": {
                "path": "nvidia-container-runtime",
                "runtimeArgs": []
            }
        },
        "default-runtime": "nvidia"
    }
  3. After making these changes, attempt to execute a script with the command:

    scripts/run_dev.sh ~/workspaces/isaac_ros-dev/

    This results in the following error:

    docker: Error response from daemon: unknown or invalid runtime name: nvidia.
    

Expected Behavior

The Docker container should recognize the Nvidia runtime without errors, especially since the initial command runs without issues.

Actual Behavior

The system throws an error stating "unknown or invalid runtime name: nvidia" when trying to run a script that utilizes Docker with the Nvidia runtime.

Environment

  • Docker version: Docker version 24.0.7, build afdd53b
  • Operating System: Ubuntu 22.04
  • Any other relevant environmental details

Attempts to Resolve

  • Verified that the initial Docker command runs successfully.
  • Checked the syntax and paths in the daemon.json file.
  • Searched for similar issues in forums and GitHub Issues.

Request for Help

Could anyone provide insights or suggest potential solutions to resolve this runtime error? Any advice or guidance would be greatly appreciated.

hemalshahNV self-assigned this Nov 17, 2023
@hemalshahNV
Contributor

It looks like you may not have nvidia-container-toolkit installed. See the installation instructions here for how to install it on your x86_64 system running Ubuntu 22.04 (Jammy).

hemalshahNV added the labels documentation, good first issue, and verify to close on Nov 17, 2023
@Buddies-as-you-know
Author

Buddies-as-you-know commented Nov 17, 2023

We have installed nvidia-container-toolkit and then started Docker, but we still get this error.

@solix

solix commented Nov 17, 2023

I am experiencing the same issue; nvidia-container-toolkit is also installed.

@weirdsim14

I am experiencing the same issue; nvidia-container-toolkit is also installed.

@hemalshahNV
Contributor

We're looking into this but haven't been able to reproduce this yet with the same OS and Docker version. We're still running a few more experiments on freshly provisioned machines to see if we can narrow it down.

Our theory is that the setup instructions for nvidia-container-toolkit differ from what our machine provisioning scripts do (listed below):

# Install Nvidia Docker runtime
curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | \
  sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
sudo apt-get update
sudo apt-get install -y nvidia-container-runtime
sudo systemctl restart docker

sudo gpasswd -a $USER docker
sudo usermod -a -G docker $(whoami)
newgrp docker
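
After an install like the one above, one quick sanity check is to confirm that the runtime binary is actually on the PATH, since the daemon.json entries in this thread name it as a bare `nvidia-container-runtime` rather than an absolute path. The following is a minimal sketch, not part of the provisioning script; the helper name `check_binary` is hypothetical, and it assumes a POSIX shell.

```shell
#!/bin/sh
# check_binary is a hypothetical helper, not part of Docker or the NVIDIA tooling.
# It prints "found: <path>" and returns 0 if NAME resolves on PATH;
# otherwise it prints "missing: NAME" and returns 1.
check_binary() {
    bin_path=$(command -v "$1" 2>/dev/null) || bin_path=""
    if [ -n "$bin_path" ]; then
        printf 'found: %s\n' "$bin_path"
    else
        printf 'missing: %s\n' "$1"
        return 1
    fi
}

# Binaries referenced in this thread; "|| true" keeps the loop going
# when one of them is missing.
for name in docker nvidia-container-runtime; do
    check_binary "$name" || true
done
```

If `nvidia-container-runtime` is reported as missing, the "nvidia" entry in daemon.json may point at a binary the daemon cannot find.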

@mrlreable

Hi,

Is there any update regarding this issue? I'm experiencing the same on Ubuntu 22.04, Docker v4.30.0.

@sid-isq

sid-isq commented May 28, 2024

I was facing the same issue and solved it by following the steps below.

Edit the file /etc/docker/daemon.json to include:

{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}

and then running:

sudo systemctl daemon-reload
sudo systemctl restart docker

The error stops showing and we are able to see the GPUs inside the containers when we run:

sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi

Prior to all this, we followed this tutorial (NVIDIA Container Toolkit instructions).
However, it did not say that the file needed to be edited as described above.
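
To confirm that a fix like the one above took effect, it can help to ask the running daemon which runtimes it has registered, rather than relying on the contents of daemon.json alone. The sketch below assumes `docker info --format '{{json .Runtimes}}'` prints the registered runtimes as a JSON object keyed by name; the helper `has_runtime` is hypothetical and does a plain text match, not real JSON parsing.

```shell
#!/bin/sh
# has_runtime is a hypothetical helper: it reads the JSON printed by
# `docker info --format '{{json .Runtimes}}'` on stdin and succeeds if a
# runtime with the given name appears as a key. This is a text match on
# "name": and not a full JSON parse.
has_runtime() {
    grep -q "\"$1\":"
}

# Only query the daemon when the docker CLI is available.
if command -v docker >/dev/null 2>&1; then
    if docker info --format '{{json .Runtimes}}' 2>/dev/null | has_runtime nvidia; then
        echo "nvidia runtime is registered with the daemon"
    else
        echo "nvidia runtime is NOT registered; check daemon.json and restart Docker"
    fi
fi
```

If the runtime is listed in daemon.json but not reported here, the daemon may not have been restarted, or it may be reading a different configuration file than the one that was edited.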

@EmanuelCastanho

The previous solution did not solve my problem.
My original daemon.json was:

{
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}

I changed it to the one above, but that did not solve the problem. I have already installed nvidia-container-toolkit. I am using Ubuntu 22.04.3 LTS.

@tanelikor

Happened to run across this thread, so will give my experience:

I had the same problem a couple of weeks ago, also with Ubuntu 22.04. I had Docker installed via snap, and that caused some of the paths to be different from what the Nvidia tools expect. I'm sure it should be fixable for the snap installation as well, but I quickly ran out of patience trying, so for me the easiest solution was to remove Docker entirely and re-install it via apt-get as instructed here in the Docker guides.

So if you haven't already, you might want to check how your docker is installed.
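
One rough way to check how Docker is installed is to look at where the `docker` command resolves. The sketch below assumes that snap-installed commands are exposed under `/snap/` (for example `/snap/bin/docker`), which is the usual layout on Ubuntu; the helper name `docker_install_kind` is hypothetical.

```shell
#!/bin/sh
# docker_install_kind is a hypothetical helper that classifies a docker
# binary path. It assumes snap exposes its commands under /snap/.
docker_install_kind() {
    case "$1" in
        "")      echo "not-found" ;;
        /snap/*) echo "snap" ;;
        *)       echo "non-snap" ;;
    esac
}

# Resolve the docker command on PATH (empty if it is not installed).
docker_path=$(command -v docker 2>/dev/null) || docker_path=""
echo "docker binary: ${docker_path:-<none>} ($(docker_install_kind "$docker_path"))"
```

Running `snap list docker` is another way to confirm whether a snap package named docker is present.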
