pid_namespaces

       For an overview of namespaces, see namespaces(7).

       PID  namespaces  isolate the process ID number space, meaning that pro-
       cesses in different PID namespaces can have the same PID.   PID  names-
       paces  allow  containers  to  provide  functionality  such  as suspend-
       ing/resuming the set of processes in the container  and  migrating  the
       container  to a new host while the processes inside the container main-
       tain the same PIDs.

       PIDs in a new PID namespace start at 1, somewhat like a standalone sys-
       tem, and calls to fork(2), vfork(2), or clone(2) will produce processes
       with PIDs that are unique within the namespace.

       Use of PID namespaces requires a kernel that  is  configured  with  the
       CONFIG_PID_NS option.

   The namespace init process
       The first process created in a new namespace (i.e., the process created
       using clone(2) with the CLONE_NEWPID flag, or the first  child  created
       by  a  process  after a call to unshare(2) using the CLONE_NEWPID flag)
       has the PID 1, and  is  the  "init"  process  for  the  namespace  (see
       init(1)).   A  child process that is orphaned within the namespace will
       be reparented to this process rather than init(1) (unless  one  of  the
       ancestors  of the child in the same PID namespace employed the prctl(2)
       PR_SET_CHILD_SUBREAPER command to mark itself as the reaper of orphaned
       descendant processes).
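
       The following sketch is an illustration added here, not part of the
       original page.  It creates a child with clone(2) and CLONE_NEWPID and
       reports whether that child sees itself as PID 1.  The function name
       child_sees_pid_one() and the stack size are assumptions made for the
       example; creating the namespace requires CAP_SYS_ADMIN.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)    /* arbitrary size chosen for the example */

/* Runs as the first process in the new PID namespace. */
static int check_own_pid(void *arg)
{
    (void)arg;
    /* The return value becomes the exit status: 0 means "I am PID 1". */
    return getpid() == 1 ? 0 : 1;
}

/*
 * Hypothetical helper: returns 1 if the child saw itself as PID 1,
 * 0 if it saw some other PID, and -1 if the namespace could not be
 * created (for example, EPERM without CAP_SYS_ADMIN).
 */
int child_sees_pid_one(void)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The stack grows downward on most architectures, so pass its top. */
    pid_t child = clone(check_own_pid, stack + STACK_SIZE,
                        CLONE_NEWPID | SIGCHLD, NULL);
    if (child == -1) {
        free(stack);
        return -1;
    }

    int status;
    pid_t waited = waitpid(child, &status, 0);
    free(stack);
    if (waited == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

       Note that the value returned by clone(2) in the parent is the child's
       PID in the parent's namespace, which in general is not 1.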

       If  the "init" process of a PID namespace terminates, the kernel termi-
       nates all of the processes in the namespace via a SIGKILL signal.  This
       behavior reflects the fact that the "init" process is essential for the
       correct operation of a PID  namespace.   In  this  case,  a  subsequent
       fork(2)  into this PID namespace will fail with the error ENOMEM; it is
       not possible to create a new process in a PID namespace whose  "init"
       process  has terminated.  Such scenarios can occur when, for example, a
       process uses an open file descriptor for a /proc/[pid]/ns/pid file cor-
       responding  to  a process that was in a namespace to setns(2) into that
       namespace after the "init" process has  terminated.   Another  possible
       scenario  can occur after a call to unshare(2): if the first child sub-
       sequently created by a fork(2) terminates,  then  subsequent  calls  to
       fork(2) will fail with ENOMEM.
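
       The unshare(2) scenario just described can be sketched as follows.
       This is an illustrative experiment under stated assumptions, not a
       definitive test: the function name errno_of_fork_after_init_exit() is
       hypothetical, and the work is done in a helper process so that the
       caller's own state is not altered.  CAP_SYS_ADMIN is required.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper.  Returns:
 *   the errno of the second fork(2) (ENOMEM is expected),
 *   0 if the second fork(2) unexpectedly succeeded,
 *  -1 if the experiment could not be set up (for example, EPERM).
 */
int errno_of_fork_after_init_exit(void)
{
    pid_t helper = fork();
    if (helper == -1)
        return -1;

    if (helper == 0) {
        if (unshare(CLONE_NEWPID) == -1)
            _exit(255);             /* typically EPERM */

        pid_t init = fork();        /* becomes PID 1 in the new namespace */
        if (init == -1)
            _exit(255);
        if (init == 0)
            _exit(0);               /* the "init" process terminates at once */
        if (waitpid(init, NULL, 0) == -1)
            _exit(255);

        pid_t second = fork();      /* the namespace now has no "init" */
        if (second == 0)
            _exit(0);
        if (second > 0) {
            waitpid(second, NULL, 0);
            _exit(0);               /* unexpected success */
        }
        _exit(errno);               /* report why fork(2) failed */
    }

    int status;
    if (waitpid(helper, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    int code = WEXITSTATUS(status);
    return code == 255 ? -1 : code;
}
```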

       Only signals for which the "init" process has established a signal han-
       dler can be sent to the "init" process by  other  members  of  the  PID
       namespace.   This restriction applies even to privileged processes, and
       prevents other members of the PID namespace from  accidentally  killing
       the "init" process.

       Likewise,  a process in an ancestor namespace can--subject to the usual
       permission checks described in  kill(2)--send  signals  to  the  "init"
       process  of a child PID namespace only if the "init" process has estab-
       lished a handler for that signal.  (Within the handler,  the  siginfo_t
       si_pid  field  described  in  sigaction(2)  will  be zero.)  SIGKILL or
       SIGSTOP are treated exceptionally: these signals are forcibly delivered
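       when sent from an ancestor PID namespace.  Neither of these signals can
       be caught by the "init" process, so they result in the usual actions
       associated with those signals (respectively, terminating and stopping
       the process).

   Nesting PID namespaces
       PID namespaces can be nested: each PID namespace has a parent, except
       for the initial ("root") PID namespace.  The parent of a PID namespace
       is the PID namespace of the process that created the namespace using
       clone(2)  or  unshare(2).   PID  namespaces  thus form a tree, with all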
       namespaces ultimately tracing their ancestry to the root namespace.

       A process is visible to other processes in its PID  namespace,  and  to
       the  processes  in each direct ancestor PID namespace going back to the
       root PID namespace.  In this context, "visible" means that one  process
       can  be  the target of operations by another process using system calls
       that specify a process ID.  Conversely, the processes in  a  child  PID
       namespace  can't see processes in the parent and further removed ances-
       tor namespaces.  More succinctly: a process can see (e.g., send signals
       with kill(2), set nice values with setpriority(2), etc.) only processes
       contained in its own PID namespace and in descendants  of  that  names-
       pace.
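
       One way to probe visibility in the sense used above is to send the
       null signal with kill(2).  The sketch below is an added illustration;
       the function name pid_is_visible() is hypothetical.  It treats EPERM
       as "visible", because that error means the target exists but the
       caller may not signal it.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/*
 * Hypothetical helper: returns 1 if `pid` names a process visible in the
 * caller's PID namespace, 0 if no such process is visible, and -1 for
 * arguments or errors this sketch does not handle.
 */
int pid_is_visible(pid_t pid)
{
    if (pid <= 0)
        return -1;      /* 0 and negative values address process groups */

    if (kill(pid, 0) == 0)
        return 1;       /* exists, and the caller may signal it */
    if (errno == EPERM)
        return 1;       /* exists, but the permission check failed */
    if (errno == ESRCH)
        return 0;       /* no such PID in the caller's namespace */
    return -1;
}
```

       Because the PID is interpreted in the caller's namespace, the same
       number passed by a process in another PID namespace may refer to a
       different process, or to none at all.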

       A process has one process ID in each layer of the PID namespace
       hierarchy in which it is visible: one in its own PID namespace and one
       in each direct ancestor namespace, back to the root PID namespace.
       System calls
       that operate on process IDs always operate using the process ID that is
       visible in the PID namespace of the caller.  A call to getpid(2) always
       returns the PID associated with the namespace in which the process  was
       created.
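
       The per-layer process IDs can be inspected through the NSpid field of
       /proc/[pid]/status.  That field is described in proc(5), not in this
       page, and is assumed here to be available (Linux 4.1 and later).  The
       leftmost value is relative to the PID namespace of the procfs mount,
       and each following value belongs to the next nested namespace.  The
       function names below are hypothetical, and the sketch is a minimal
       parser rather than a complete tool.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a line of the form "NSpid:\t<pid>\t<pid>..." into out[].
 * Returns the number of PIDs stored (at most `max`), or -1 if the
 * line does not start with "NSpid:".
 */
int parse_nspid_line(const char *line, long *out, int max)
{
    static const char key[] = "NSpid:";
    if (strncmp(line, key, sizeof key - 1) != 0)
        return -1;

    const char *p = line + sizeof key - 1;
    int n = 0;
    while (n < max) {
        char *end;
        long value = strtol(p, &end, 10);   /* skips leading whitespace */
        if (end == p)
            break;                          /* no further number on the line */
        out[n++] = value;
        p = end;
    }
    return n;
}

/*
 * Read the caller's own NSpid values from /proc/self/status.
 * Returns the number of namespace levels found, or -1 if the file
 * or the field is unavailable.
 */
int read_own_nspid(long *out, int max)
{
    FILE *f = fopen("/proc/self/status", "r");
    if (f == NULL)
        return -1;

    char line[256];
    int n = -1;
    while (fgets(line, sizeof line, f) != NULL) {
        n = parse_nspid_line(line, out, max);
        if (n >= 0)
            break;
    }
    fclose(f);
    return n;
}
```

       For a single-threaded caller, the last value returned by
       read_own_nspid() should match the result of getpid(2).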

       Some  processes in a PID namespace may have parents that are outside of
       the namespace.  For example, the parent of the initial process  in  the
       namespace  (i.e.,  the  init(1)  process  with PID 1) is necessarily in
       another namespace.  Likewise, the direct children  of  a  process  that
       uses  setns(2)  to  cause its children to join a PID namespace are in a
       different PID namespace from the caller of setns(2).   Calls  to  getp-
       pid(2) for such processes return 0.

       While  processes  may  freely  descend into child PID namespaces (e.g.,
       using setns(2) with CLONE_NEWPID), they  may  not  move  in  the  other
       direction.  That is to say, processes may not enter any ancestor
       namespaces (parent, grandparent, etc.).  Changing PID namespaces is a
       one-way operation.

   setns(2) and unshare(2) semantics
       Calls  to  setns(2)  that  specify  a PID namespace file descriptor and
       calls to unshare(2) with the CLONE_NEWPID flag  cause  children  subse-
       quently created by the caller to be placed in a different PID namespace
       from the caller.  These calls do not, however, change the PID namespace
       of the calling process, because doing so would change the caller's idea
       of its own PID (as reported by getpid(2)), which would break many
       applications and libraries.

       To  put  things  another  way:  a process's PID namespace membership is
       determined when the process is created and  cannot  be  changed  there-
       after.   Among  other things, this means that the parental relationship
       between processes mirrors the parental relationship between PID  names-
       paces:  the  parent  of  a  process  is either in the same namespace or
       resides in the immediate parent PID namespace.
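
       A minimal sketch of these semantics follows.  It is an added
       illustration; the function name unshare_pid_view() and the flag names
       are hypothetical.  A helper process calls unshare(2), checks that its
       own PID is unchanged, and then creates a child that reports its view
       of getpid(2) and getppid(2).  CAP_SYS_ADMIN is required.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Bits in the value returned by unshare_pid_view(). */
#define CALLER_PID_UNCHANGED  0x1   /* getpid() same before and after unshare */
#define CHILD_IS_PID_ONE      0x2   /* first child saw getpid() == 1 */
#define CHILD_PARENT_IS_ZERO  0x4   /* first child saw getppid() == 0 */

/*
 * Hypothetical helper: returns a combination of the bits above, or -1
 * if the experiment could not be set up (for example, EPERM).
 */
int unshare_pid_view(void)
{
    pid_t helper = fork();
    if (helper == -1)
        return -1;

    if (helper == 0) {
        pid_t before = getpid();
        if (unshare(CLONE_NEWPID) == -1)
            _exit(255);
        int result = (getpid() == before) ? CALLER_PID_UNCHANGED : 0;

        pid_t child = fork();       /* first process in the new namespace */
        if (child == -1)
            _exit(255);
        if (child == 0) {
            int seen = 0;
            if (getpid() == 1)
                seen |= CHILD_IS_PID_ONE;
            if (getppid() == 0)     /* parent is outside this namespace */
                seen |= CHILD_PARENT_IS_ZERO;
            _exit(seen);
        }

        int status;
        if (waitpid(child, &status, 0) == -1 || !WIFEXITED(status))
            _exit(255);
        _exit(result | WEXITSTATUS(status));
    }

    int status;
    if (waitpid(helper, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    int code = WEXITSTATUS(status);
    return code == 255 ? -1 : code;
}
```

       With sufficient privilege, all three bits are expected to be set
       (a return value of 7).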

   Compatibility of CLONE_NEWPID with other CLONE_* flags
       CLONE_NEWPID can't be combined with some other CLONE_* flags:
       *  CLONE_VM  requires  all  of the threads to be in the same PID names-
          pace, because, from the point of view of a core dump,  if  two  pro-
          cesses  share  the same address space then they are threads and will
          be core dumped together.  When a core dump is written,  the  PID  of
          each  thread is written into the core dump.  Writing the process IDs
          could not meaningfully succeed if some of the process IDs were in  a
          parent PID namespace.

       To   summarize:   there   is   a  technical  requirement  for  each  of
       CLONE_THREAD, CLONE_SIGHAND, and CLONE_VM to  share  a  PID  namespace.
       (Note furthermore that clone(2) requires  CLONE_VM  to be specified  if
       CLONE_THREAD or CLONE_SIGHAND is specified.)  Thus, call sequences such
       as the following will fail (with the error EINVAL):

           unshare(CLONE_NEWPID);
           clone(..., CLONE_VM, ...);    /* Fails */

           setns(fd, CLONE_NEWPID);
           clone(..., CLONE_VM, ...);    /* Fails */

           clone(..., CLONE_VM, ...);
           setns(fd, CLONE_NEWPID);      /* Fails */

           clone(..., CLONE_VM, ...);
           unshare(CLONE_NEWPID);        /* Fails */

   /proc and PID namespaces
       A  /proc filesystem shows (in the /proc/PID directories) only processes
       visible in the PID namespace of the process that performed  the  mount,
       even  if  the /proc filesystem is viewed from processes in other names-
       paces.

       After creating a new PID namespace, it  is  useful  for  the  child  to
       change  its  root directory and mount a new procfs instance at /proc so
       that tools such as ps(1) work correctly.  If a new mount  namespace  is
       simultaneously  created  by including CLONE_NEWNS in the flags argument
       of clone(2) or unshare(2), then it isn't necessary to change  the  root
       directory: a new procfs instance can be mounted directly over /proc.

       From a shell, the command to mount /proc is:

           $ mount -t proc proc /proc

       Calling readlink(2) on the path /proc/self yields the process ID of the
       caller in the PID namespace of the procfs mount (i.e., the  PID  names-
       pace  of  the process that mounted the procfs).  This can be useful for
       introspection purposes, when a process wants to  discover  its  PID  in
       other namespaces.
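
       The readlink(2) technique can be sketched as follows.  This is an
       added illustration; the function name pid_from_proc_self() is
       hypothetical, and the sketch assumes that a procfs instance is mounted
       at /proc.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns the caller's PID as seen by the procfs
 * instance mounted at /proc, or -1 if the link cannot be read or parsed.
 */
long pid_from_proc_self(void)
{
    char target[32];
    ssize_t len = readlink("/proc/self", target, sizeof target - 1);
    if (len <= 0)
        return -1;
    target[len] = '\0';     /* readlink(2) does not add a terminating NUL */

    char *end;
    long pid = strtol(target, &end, 10);
    if (end == target || *end != '\0' || pid <= 0)
        return -1;
    return pid;
}
```

       When /proc was mounted from the caller's own PID namespace, the
       result equals getpid(2).  When it was mounted from an ancestor
       namespace, the result is the caller's PID in that ancestor namespace.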

   Miscellaneous
       When a process ID is passed over a UNIX domain socket to a process in a
       different PID namespace (see  the  description  of  SCM_CREDENTIALS  in
       unix(7)),  it  is  translated  into  the corresponding PID value in the
       receiving process's PID namespace.
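
       The sketch below, added as an illustration, shows where the
       translated PID appears on the receiving side.  It uses SO_PASSCRED
       and SCM_CREDENTIALS as described in unix(7); the function names are
       hypothetical.  Sender and receiver here are the same process, so no
       translation is observable; a receiver in a different PID namespace
       would instead see the sender's PID as known in the receiver's
       namespace.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Receive one message on `sock` and return the sender PID from its
 * SCM_CREDENTIALS ancillary data, or -1 if none was found. */
static pid_t receive_sender_pid(int sock)
{
    char data;
    struct iovec iov = { .iov_base = &data, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(struct ucred))];
        struct cmsghdr align;       /* forces suitable alignment */
    } control;
    struct msghdr msg = {
        .msg_iov = &iov,
        .msg_iovlen = 1,
        .msg_control = control.buf,
        .msg_controllen = sizeof control.buf,
    };

    if (recvmsg(sock, &msg, 0) == -1)
        return -1;

    for (struct cmsghdr *c = CMSG_FIRSTHDR(&msg); c != NULL;
         c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == SOL_SOCKET && c->cmsg_type == SCM_CREDENTIALS) {
            struct ucred cred;
            memcpy(&cred, CMSG_DATA(c), sizeof cred);
            return cred.pid;
        }
    }
    return -1;
}

/* Hypothetical helper: send one byte over a socket pair and return the
 * sender PID that the kernel reports to the receiver, or -1 on error. */
pid_t pid_reported_by_scm_credentials(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) == -1)
        return -1;

    pid_t result = -1;
    int on = 1;
    char byte = 'x';
    /* Ask the kernel to attach sender credentials to received messages. */
    if (setsockopt(sv[1], SOL_SOCKET, SO_PASSCRED, &on, sizeof on) == 0 &&
        send(sv[0], &byte, 1, 0) == 1)
        result = receive_sender_pid(sv[1]);

    close(sv[0]);
    close(sv[1]);
    return result;
}
```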

       This  page  is  part of release 4.04 of the Linux man-pages project.  A
       description of the project, information about reporting bugs,  and  the
       latest     version     of     this    page,    can    be    found    at
       http://www.kernel.org/doc/man-pages/.

Linux                             2015-01-10                 PID_NAMESPACES(7)