[Bug]: Container Crashing during Downloading nvidia driver #148

Closed
Cribz679 opened this issue Jun 6, 2024 · 5 comments
Labels
status:awaiting-triage type:bug Something isn't working

Comments


Cribz679 commented Jun 6, 2024

Describe the Bug

My container seems to be crashing/restarting while downloading the NVIDIA drivers. Any help is appreciated.

Steps to Reproduce

No response

Expected Behavior

No response

Screenshots

No response

Relevant Settings

No response

Version

Build: [2024-05-25 02:40:32] [master] [47f6f7a] [debian]

Platform

TrueNAS Scale via TrueCharts
Distribution: Debian GNU/Linux - 12 (bookworm)
Linux Kernel: 6.6.20-production+truenas unknown unknown GNU/Linux
GPU Drivers: | NVIDIA-SMI 545.23.08 Driver Version: 545.23.08 CUDA Version: 12.3 |

Relevant log output

`Build: [2024-05-25 02:40:32] [master] [47f6f7a176ee0c6f6c870c29397cc1a8d6d57839] [debian]

[ /etc/cont-init.d/10-setup_user.sh: executing... ]
**** Configure default user ****
  - Setting default user uid=568(default) gid=568(default)
  - Adding default user to any additional required device groups
  - Adding user 'default' to group: 'video'
  - Adding user 'default' to group: 'audio'
  - Adding user 'default' to group: 'input'
  - Adding user 'default' to group: 'pulse'
  - Adding user 'default' to group: 'render' for device: /dev/input/event0
  - Adding user 'default' to group: 'video' for device: /dev/dri/card0
  - Adding user 'default' to group: 'avahi' for device: /dev/dri/renderD128
  - Setting umask to 0022
  - Create the user XDG_RUNTIME_DIR path '/tmp/.X11-unix/run'
  - Setting ownership of all log files in '/home/default/.cache/log'
  - Setting root password
  - Setting user password
DONE

[ /etc/cont-init.d/11-setup_sysctl_values.sh: executing... ]
**** Configure some system kernel parameters ****
  - The vm.max_map_count is already greater than '524288'
DONE

[ /etc/cont-init.d/30-configure_dbus.sh: executing... ]
**** Configure container dbus ****
  - Container configured to run its own dbus
DONE

[ /etc/cont-init.d/30-configure_udev.sh: executing... ]
**** Configure udevd ****
  - Disable udevd - /sys is mounted RO
  - Enable dumb-udev service
  - Ensure the default user has permission to r/w on input devices
DONE

[ /etc/cont-init.d/40-setup_locale.sh: executing... ]
**** Configure local ****
  - Locales already set correctly to en_US.UTF-8 UTF-8
DONE

[ /etc/cont-init.d/50-configure_pulseaudio.sh: executing... ]
**** Configure pulseaudio ****
  - Enable pulseaudio service.
  - Configure pulseaudio to pipe audio to a socket
DONE

[ /etc/cont-init.d/60-configure_gpu_driver.sh: executing... ]
**** Found Intel device '12th Gen Intel(R) Core(TM) i5-12500T 12th Gen Intel(R) Core(TM) i5-12500T To Be Filled By O.E.M. CPU @ 1.9GHz' ****
  - Mesa has already been installed into this container
**** No AMD device found ****
**** Found NVIDIA device 'NVIDIA GeForce RTX 4070' ****
  - Downloading driver v545.23.08`

Cribz679 added the status:awaiting-triage and type:bug (Something isn't working) labels on Jun 6, 2024
Contributor

alansari commented Jun 6, 2024

v545.23.08 is not hosted on https://download.nvidia.com/XFree86/Linux-x86_64, though that's the version TrueNAS ships. You will need to get the .run file, rename it to NVIDIA_545.23.08.run, and place it in the dataset you mounted for /home/default/Downloads/. Alternatively, you could downgrade TrueNAS to a previous snapshot/version from the GRUB boot manager, or jump the gun to Electric Eel and migrate to using native Docker Compose to spin up the container.
Edit: fingers crossed today's scheduled maintenance release includes a different NVIDIA driver release.
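
For anyone scripting the rename-and-place workaround, here is a minimal Python sketch. The target name follows the NVIDIA_545.23.08.run example above; generalizing it to NVIDIA_<version>.run is an assumption, as is the upstream URL layout used by `expected_url`. The function names and any paths you pass in are hypothetical and not part of the container.

```python
import shutil
from pathlib import Path

# Base URL quoted in the comment above.
NVIDIA_BASE_URL = "https://download.nvidia.com/XFree86/Linux-x86_64"


def expected_url(version: str) -> str:
    """Build the URL where a driver would normally be hosted.

    Assumption: the archive uses <base>/<version>/NVIDIA-Linux-x86_64-<version>.run.
    """
    return f"{NVIDIA_BASE_URL}/{version}/NVIDIA-Linux-x86_64-{version}.run"


def stage_driver(run_file: Path, downloads_dir: Path, version: str) -> Path:
    """Copy a manually obtained .run installer into the Downloads dataset.

    Assumption: the container looks for NVIDIA_<version>.run, generalized
    from the NVIDIA_545.23.08.run name given in the comment above.
    """
    if not run_file.is_file():
        raise FileNotFoundError(run_file)
    downloads_dir.mkdir(parents=True, exist_ok=True)
    target = downloads_dir / f"NVIDIA_{version}.run"
    shutil.copy2(run_file, target)  # copy2 also preserves the file's mode bits and timestamps
    return target
```

Call `stage_driver` with the path to the installer you obtained, the host path of the dataset mounted at /home/default/Downloads/, and the version string reported by your host (545.23.08 in this issue). Opening `expected_url("545.23.08")` in a browser is a quick way to confirm whether a given version is hosted upstream before resorting to the manual route.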

Author

Cribz679 commented Jun 6, 2024

Amazing. Thank you. That worked. :)

Contributor

alansari commented Jun 9, 2024

Just an FYI, I bit the bullet and updated my TrueNAS installation to the latest version.
In order to get the container up and running I used these steps:

Edit: I would highly recommend moving away from the Helm/TrueCharts version to the jailmaker/nspawn implementation. A detailed walkthrough can be found in a thread on the steam-headless Discord server.


seisdr commented Jul 5, 2024

> Just an FYI, I bit the bullet and updated my TrueNAS installation to the latest version. In order to get the container up and running I used these steps:
>
> Edit: I would highly recommend moving away from the Helm/TrueCharts version to the jailmaker/nspawn implementation. A detailed walkthrough can be found in a thread on the steam-headless Discord server.

home/default or home/defaults/Downloads?
Regardless, it doesn't work!

@WillKirkmanM
Contributor

This has been fixed by the merged pull request #153. This issue can be closed.

Josh5 closed this as completed on Jul 31, 2024