
Bug#1071269: runc: access syscall with F_OK flag fails when used to check a device char file attached to a container

From Luca Toscano <ltoscano@wikimedia.org>
Newsgroups linux.debian.bugs.dist
Subject Bug#1071269: runc: access syscall with F_OK flag fails when used to check a device char file attached to a container
Date 2024-05-17 15:10 +0200
Message-ID <IF5vr-dXTo-5@gated-at.bofh.it>
Organization linux.* mail to news gateway


Package: runc
X-Debbugs-Cc: ltoscano@wikimedia.org
Version: 1.0.0~rc93+ds1-5+deb11u3
Severity: important
Tags: upstream

Dear Maintainer,

As reported in https://github.com/ROCm/k8s-device-plugin the version of runc
shipped by Bullseye does not include the following commit:

https://github.com/opencontainers/runc/commit/81707abd33d2ddebcd8ceeb08dfc01bf86d8badd

Packages like PyTorch (ROCm variant) at some point issue a syscall like
access(/dev/dri/renderDXXX, F_OK)
to check whether the GPU char device exists while initializing the GPU
config/settings. In a container, runc is responsible for running eBPF
checks when a GPU device is attached, to make sure that the permissions
set by the admin are respected. Due to a bug, if the device settings are
'rw' the eBPF code returns EPERM even if the user running in the container
has all the rights to access the device. The patch seems really small, and
it may save others a long debugging session to figure out the root cause
of the EPERM error (which in turn causes PyTorch's GPU init to fail, etc.).
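
For anyone trying to reproduce this inside a container, here is a minimal
sketch of the same existence check. It is an illustration only, not the
code PyTorch/ROCm actually runs: the helper name probe_device is
hypothetical, and it assumes a Linux system where ctypes can load the C
library. It calls access(path, F_OK) directly so that the errno is visible,
which makes it possible to tell EPERM (the behaviour described above) apart
from ENOENT (the device node is really missing).

```python
import ctypes
import errno
import os

# Load the C library with errno capture enabled (assumes Linux/glibc or musl).
_libc = ctypes.CDLL(None, use_errno=True)
_libc.access.argtypes = [ctypes.c_char_p, ctypes.c_int]
_libc.access.restype = ctypes.c_int


def probe_device(path):
    """Hypothetical helper: call access(path, F_OK).

    Returns (True, None) if the call succeeds, otherwise
    (False, <symbolic errno name>), e.g. (False, "EPERM").
    """
    ctypes.set_errno(0)
    rc = _libc.access(os.fsencode(path), os.F_OK)
    if rc == 0:
        return True, None
    err = ctypes.get_errno()
    return False, errno.errorcode.get(err, str(err))


# /dev/null is only a stand-in here; pass the real render node path
# from the container when reproducing the problem.
print(probe_device("/dev/null"))
```

Running probe_device on the attached render node from inside the container
should show whether the failure is EPERM rather than ENOENT.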

Would it be possible to backport the patch to the current Bullseye version?
As far as I can see it was shipped upstream with 1.0.0~rc94, which is very
close to the current runc version in Bullseye.

Thanks in advance!

Luca

-- System Information:
Debian Release: 11.9
  APT prefers oldstable-updates
  APT policy: (500, 'oldstable-updates'), (500, 'oldstable-security'), (500, 'oldstable-debug'), (500, 'oldstable')
Architecture: amd64 (x86_64)

Kernel: Linux 5.10.0-28-amd64 (SMP w/72 CPU threads)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages runc depends on:
ii  libc6        2.31-13+deb11u9
ii  libseccomp2  2.5.1-1+deb11u1
