| From | Luca Toscano <ltoscano@wikimedia.org> |
|---|---|
| Newsgroups | linux.debian.bugs.dist |
| Subject | Bug#1071269: runc: access syscall with F_OK flag fails when used to check a device char file attached to a container |
| Date | 2024-05-17 15:10 +0200 |
| Message-ID | <IF5vr-dXTo-5@gated-at.bofh.it> |
| Organization | linux.* mail to news gateway |
Package: runc
X-Debbugs-Cc: ltoscano@wikimedia.org
Version: 1.0.0~rc93+ds1-5+deb11u3
Severity: important
Tags: upstream

Dear Maintainer,

As reported in https://github.com/ROCm/k8s-device-plugin, the version of runc shipped by Bullseye does not include the following commit:

https://github.com/opencontainers/runc/commit/81707abd33d2ddebcd8ceeb08dfc01bf86d8badd

Packages like PyTorch (ROCm variant) at some point issue a syscall like access(/dev/dri/renderDXXX, F_OK) while initializing the GPU config/settings, to check whether the GPU char device exists. In a container, runc is responsible for running eBPF checks when a GPU device is attached, to make sure that the permissions set by the admin are respected. Due to a bug, if the device settings are 'rw' the eBPF code will return EPERM even if the user running in the container has all the rights to access the device.

The patch seems really small, and it may save others a long debugging session to figure out the root cause of the EPERM error (which in turn causes PyTorch's GPU init to fail, etc.).

Would it be possible to backport the patch to the current Bullseye version? As far as I can see it was shipped with 1.0.0~rc94 from upstream, which is very close to the current runc version in Bullseye.

Thanks in advance!

Luca

-- System Information:
Debian Release: 11.9
APT prefers oldstable-updates
APT policy: (500, 'oldstable-updates'), (500, 'oldstable-security'), (500, 'oldstable-debug'), (500, 'oldstable')
Architecture: amd64 (x86_64)

Kernel: Linux 5.10.0-28-amd64 (SMP w/72 CPU threads)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages runc depends on:
ii  libc6        2.31-13+deb11u9
ii  libseccomp2  2.5.1-1+deb11u1
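
For anyone trying to reproduce the existence check described in this report, the following is a minimal sketch and not taken from PyTorch, ROCm, or runc. It assumes a Linux system with Python 3 and calls access(path, F_OK) through libc so that the errno (for example EPERM versus ENOENT) is visible. The helper name `probe_exists` is hypothetical, and the glob over /dev/dri/renderD* is an assumption used only to avoid guessing a device number.

```python
import ctypes
import errno
import glob
import os


def probe_exists(path: str) -> str:
    """Call access(path, F_OK) via libc; return "ok" or the errno name."""
    # Load the C library already linked into the interpreter and ask
    # ctypes to preserve errno across the foreign call.
    libc = ctypes.CDLL(None, use_errno=True)
    ctypes.set_errno(0)
    rc = libc.access(os.fsencode(path), os.F_OK)
    if rc == 0:
        return "ok"
    return errno.errorcode.get(ctypes.get_errno(), "unknown")


if __name__ == "__main__":
    # Assumption: render nodes, if any, live under /dev/dri.
    for node in sorted(glob.glob("/dev/dri/renderD*")):
        print(node, probe_exists(node))
```

Run inside the container, a result of EPERM for a device node that is attached with 'rw' settings would match the symptom described above, while ENOENT would suggest the device was never attached.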