Need advice about fixing PROC mount failures in a DIY Linux container

From Lew Pitcher <lew.pitcher@digitalfreehold.ca>
Newsgroups alt.os.linux.slackware, comp.os.linux.misc, comp.os.linux.development.apps, comp.unix.programmer
Subject Need advice about fixing PROC mount failures in a DIY Linux container
Date 2023-01-07 01:27 +0000
Organization A noiseless patient Spider
Message-ID <tpahpv$3a27i$1@dont-email.me> (permalink)


Hi, all

I've come late to the party, and have just started learning
about the ins and outs of Linux containers. To get a better
understanding of the subject, I decided to learn about the
underlying technologies by building my own container software.

I've modelled my DIY container on Brian Swetland's mkbox
container[1], and have a demonstration program that works
on my development system (a 64-bit AMD Ryzen 5 3400G with
Radeon Vega Graphics, running Slackware Linux 14.2 with
the 4.4.301 kernel and all available patches applied).
[1] https://github.com/swetland/mkbox


However, when I run either Brian's mkbox or my demo program
on my "production" system (another 64-bit AMD Ryzen 5 3400G
with Radeon Vega Graphics, running Slackware Linux 14.2 with
the 4.4.301 kernel and all available patches applied), the
container breaks while trying to mount the proc filesystem
to the new (isolated) root fs.

Specifically, I get an "Operation not permitted" error when
I try to
  mount("proc","proc","proc",MS_REC,NULL)
/but/ ONLY ON THIS ONE SYSTEM.

This failure affects both my DIY container and Brian's mkbox
container.

With my DIY container, I've checked the capabilities given
to the container process, and they are identical and complete
on both systems. On both systems, I run the container process
(mine and Brian's) from the same unprivileged UID/GID.

I have to conclude that there's a difference in the two
environments that causes this problem, but I don't know what
that difference is. Both systems use the same type of CPU, the
same amount of memory, the same 64-bit addressing mode,
the same kernel, and the same distribution (with the same
essential utilities).

There /are/ differences in the two systems:
on the development system, my user is a member of a
number of groups that it is not a member of on the
"production" system. I run a root pulseaudio (I have my
reasons) on the development system that I do not on
the "production" system. Et cetera.

Can anyone suggest an environmental factor or set of
factors that might cause this behaviour?

For reference, I include a copy of a minimal implementation
of my DIY container that illustrates the problem, along with
captures of both a successful run on my development system
and an unsuccessful run on my production system.

========== demo.c ==========
/*
** demonstrate selective problem with Slackware Linux 14.2
** user namespace creation (Kernel 4.4.301)
*/

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <sys/mount.h>
#include <sched.h>
#include <string.h>
#include <errno.h>

/* pivot_root() prototype not supplied by headers */
extern int pivot_root(const char *new_root, const char *put_old);

void Die(int line);	/* generate error message and exit process */
#define	DIE() Die(__LINE__)

int main(void)
{
  char	*fauxRoot = "./.fauxroot",	/* will be our new root filesystem */
	*oldRoot  = ".oldroot",		/* where pivot_root puts old root fs */
	*oldProc  = ".oldproc",		/* where we temp relocate /proc to */
	*newProc  = "proc";		/* where we mount /proc to */
  pid_t	init_pid;

  umask(0);

  rmdir(fauxRoot); if (mkdir(fauxRoot,0777))			DIE();

  if (unshare(CLONE_NEWUSER|CLONE_NEWNS|CLONE_NEWPID))		DIE();

  if (mount("none","/",NULL,MS_REC|MS_PRIVATE,NULL))		DIE();
  if (mount(fauxRoot,fauxRoot,NULL,MS_BIND|MS_NOSUID,NULL))	DIE();
  if (chdir(fauxRoot))						DIE();

  rmdir(oldRoot);	if (mkdir(oldRoot,0751))		DIE();
  rmdir(oldProc);	if (mkdir(oldProc,0755))		DIE();
  rmdir(newProc);	if (mkdir(newProc,0755))		DIE();

  if (mount("/proc",oldProc,NULL,MS_BIND|MS_REC,NULL))		DIE();

  /* set new uid, gid */
  {
    FILE *map;

    if ((map = fopen("/proc/self/uid_map","w")) == NULL)	DIE();
    fprintf(map,"0 %lu 1\n",(unsigned long)getuid());
    fclose(map);

    if ((map = fopen("/proc/self/setgroups","w")) == NULL)	DIE();
    fwrite("deny",4,1,map);
    fclose(map);

    if ((map = fopen("/proc/self/gid_map","w")) == NULL)	DIE();
    fprintf(map,"0 %lu 1\n",(unsigned long)getgid());
    fclose(map);
  }

  if (pivot_root(".",oldRoot))					DIE();
  if (umount2(oldRoot,MNT_DETACH))				DIE();
  if (rmdir(oldRoot))						DIE();

  switch (init_pid = fork())
  {
    case -1:
      DIE();
      break;

    case 0:
      if (mount("/proc",newProc,"proc",MS_REC,NULL))		DIE();
      if (umount2(oldProc,MNT_DETACH))				DIE();
      if (rmdir(oldProc))					DIE();
      printf("INIT: my pid is %lu\n",(unsigned long)getpid());
      break;

    default:
      printf("PARENT: INIT pid is %lu\n",(unsigned long)init_pid);
      wait(NULL);
      break;
  }

  return EXIT_SUCCESS;
}

void Die(int line)
{
  fprintf(stderr,"Error encountered at line %d: %s\n",line,strerror(errno));
  exit(EXIT_FAILURE);
}

========== successful execution on development system ==========
Script started on Fri 06 Jan 2023 08:20:12 PM EST
20:20 $ uname -a
Linux wordsworth 4.4.301 #1 SMP Mon Jan 31 20:27:28 CST 2022 x86_64 AMD Ryzen 5 3400G with Radeon Vega Graphics AuthenticAMD GNU/Linux
20:20 $ cat /etc/slackware-version
Slackware 14.2
20:20 $ rm demo
20:20 $ rm -rf .fauxroot
20:20 $ cc -o demo demo.c
20:20 $ ./demo
PARENT: INIT pid is 558
INIT: my pid is 1
20:20 $ ls -laR .fauxroot
.fauxroot:
total 12
drwxrwxrwx 3 lpitcher users 4096 Jan  6 20:20 .
drwxr-xr-x 6 lpitcher users 4096 Jan  6 20:20 ..
drwxr-xr-x 2 lpitcher users 4096 Jan  6 20:20 proc

.fauxroot/proc:
total 8
drwxr-xr-x 2 lpitcher users 4096 Jan  6 20:20 .
drwxrwxrwx 3 lpitcher users 4096 Jan  6 20:20 ..
20:21 $ exit
exit

Script done on Fri 06 Jan 2023 08:21:02 PM EST


========== unsuccessful execution on production system ==========
Script started on Fri Jan  6 20:21:11 2023
~/code/namespaces $ uname -a
Linux merlin 4.4.301 #1 SMP Mon Jan 31 20:27:28 CST 2022 x86_64 AMD Ryzen 5 3400G with Radeon Vega Graphics AuthenticAMD GNU/Linux
~/code/namespaces $ cat /etc/slackware-version
Slackware 14.2
~/code/namespaces $ rm demo
~/code/namespaces $ rm -rf .fauxroot
~/code/namespaces $ cc -o demo demo.c
~/code/namespaces $ ./demo
PARENT: INIT pid is 1651
Error encountered at line 77: Operation not permitted
~/code/namespaces $ nl -ba demo.c | grep ' 77'
    77	      if (mount("/proc",newProc,"proc",MS_REC,NULL))		DIE();
~/code/namespaces $ ls -laR .fauxroot
.fauxroot:
total 16
drwxrwxrwx 4 lpitcher users 4096 Jan  6 20:21 .
drwxr-xr-x 6 lpitcher users 4096 Jan  6 20:21 ..
drwxr-xr-x 2 lpitcher users 4096 Jan  6 20:21 .oldproc
drwxr-xr-x 2 lpitcher users 4096 Jan  6 20:21 proc

.fauxroot/.oldproc:
total 8
drwxr-xr-x 2 lpitcher users 4096 Jan  6 20:21 .
drwxrwxrwx 4 lpitcher users 4096 Jan  6 20:21 ..

.fauxroot/proc:
total 8
drwxr-xr-x 2 lpitcher users 4096 Jan  6 20:21 .
drwxrwxrwx 4 lpitcher users 4096 Jan  6 20:21 ..
~/code/namespaces $ exit
exit

Script done on Fri Jan  6 20:22:50 2023




-- 
Lew Pitcher
"In Skills, We Trust"
