Background
I'll begin with some simplified background on how processes are created in Linux. I'll not cover all options or all the details, but instead I'll focus on the key ideas.
Generally, new processes are created using the fork() system call. On success, fork() results in a new process that is running the same program as the original (effectively, it's a clone of the program that invoked fork(), taken at the point at which it called fork()). The fork() function returns in both the "parent" process and in the "child" (the newly-created) process. Each process can examine the return value of fork() to determine whether it is the "parent" or the "child," and it can use that to decide what to do next.
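To make the two return values concrete, here is a minimal Python sketch (Python's os.fork() is a thin wrapper around the system call). The function name fork_roles and the exit code 7 are my own choices for illustration; they are not part of any standard API.

```python
import os


def fork_roles():
    """Fork once and return (child_pid, child_exit_code) in the parent."""
    pid = os.fork()
    if pid == 0:
        # fork() returned 0, so this copy of the program is the child.
        # Exit at once with a recognizable code (7 is arbitrary).
        os._exit(7)
    # fork() returned a non-zero PID, so this copy is the parent.
    _, status = os.waitpid(pid, 0)
    return pid, os.waitstatus_to_exitcode(status)


if __name__ == "__main__":
    child_pid, code = fork_roles()
    print(f"parent saw child pid {child_pid}, child exited with {code}")
```

Both branches of the if run, but in different processes: the child takes the first branch and the parent takes the second.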
Often creating a new process means we want to run a different program, and up until now we only have a way to create copies of the same program. Fortunately, there's a separate system call, exec(), that replaces the currently running program with a new program.
Consider the case where you have a shell (I'll assume bash here), and you type ls to list the contents of the current directory:
    (P1:bash) calls fork()
        --- the kernel creates P2 that is a copy of P1
        --- the kernel starts running P2
    (P1:bash) fork() returns with the PID of P2, so it knows it's the parent
    (P1:bash) waits for P2 to finish (details elided)
    (P2:bash) fork() returns with 0, so it knows it's the child
    (P2:bash) calls exec("ls")
        --- the kernel replaces bash with ls in P2 and starts ls running
    (P2:ls)   starts running
    ...
    (P2:ls)   eventually terminates
    (P1:bash) wakes up since P2 is finished, continues about its business
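The sequence above can be sketched in a few lines of Python. This is only an illustration of the fork/exec/wait pattern, not how bash is actually implemented; the helper name run_program and the use of exit code 127 for "could not exec" are assumptions of mine (127 mirrors the shell convention).

```python
import os


def run_program(argv):
    """Run argv the way a shell roughly does: fork, exec in the child, wait."""
    pid = os.fork()
    if pid == 0:
        # Child (P2): replace this copy of the program with the new one.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            # exec failed (for example, no such program); never fall
            # through into the parent's code path from the child.
            os._exit(127)
    # Parent (P1): sleep until the child has finished.
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)


if __name__ == "__main__":
    print("exit code:", run_program(["ls"]))
```

Note that exec only ever happens in the child. The parent never stops being the original program, which is why your shell is still there after ls finishes.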
The Problem
You begin with:
    # unshare --pid /bin/bash
    bash: fork: Cannot allocate memory
    bash-5.2#
Notice the error "bash: fork: Cannot allocate memory": that's not a good sign.
In this case, the unshare program (1) creates a new PID namespace and (2) execs /bin/bash. Recall from the Background section that exec replaces the program running in the current process (unshare) with a new program (/bin/bash); it doesn't create a new process.
Up to now, no processes are running in the newly-created PID namespace. The namespace exists, but the process that created the namespace hasn't yet fork()-ed anything.
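If you want to observe this state, one option (a sketch, assuming Linux 4.12 or later with /proc mounted) is to compare two symbolic links under /proc: ns/pid names the PID namespace a process itself is in, while ns/pid_for_children names the namespace its future children will be placed in. The helper name pid_namespaces is hypothetical.

```python
import os


def pid_namespaces(pid="self"):
    """Return (own_ns, children_ns) for a process; None where unreadable."""

    def read(name):
        try:
            # The link target looks like "pid:[4026531836]". An empty
            # target is reported as None.
            return os.readlink(f"/proc/{pid}/ns/{name}") or None
        except OSError:
            # Older kernel, no /proc, or insufficient permission.
            return None

    return read("pid"), read("pid_for_children")


if __name__ == "__main__":
    own, children = pid_namespaces()
    print("own PID namespace:      ", own)
    print("namespace for children: ", children)
```

For an ordinary process I'd expect the two values to match. For the bash started by unshare --pid, I'd expect them not to match: either pid_for_children is still empty (the link only gains a value once the first child has been created) or it names a different namespace than ns/pid.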
As bash starts running, it typically runs some set of programs. Here, "runs" means the fork()/exec() combination described in the Background section. The kernel places the first process that bash fork()s in the new PID namespace, and that process becomes the init process for that namespace (the process in that namespace with pid = 1). The program that bash runs is likely short-lived, so it runs and terminates. Once the init process of a PID namespace has terminated, the kernel will not create any more processes in that namespace; for practical purposes, the namespace is gone.
Next, bash tries to run some other command. It wants to put that command in the new PID namespace, but that namespace can no longer accept new processes. As a result, fork() fails (with ENOMEM, which bash reports as "Cannot allocate memory"), producing the error message that you see. You'll see it again if you try to run any other command:
    bash-5.2# ls
    bash: fork: Cannot allocate memory
    bash-5.2#
The Solution
As you note in your question, the unshare program has another option that is useful in this scenario. From man unshare:
-f, --fork
Fork the specified program as a child process of unshare rather than running it directly. This is useful when creating a new PID namespace. Note that when unshare is waiting for the child process, then it ignores SIGINT and SIGTERM and does not forward any signals to the child. It is necessary to send signals to the child process.
You can replace your first command with:
    # unshare --fork --pid /bin/bash
    #
Notice that in this case there is no error.
This option causes unshare to change its behavior. Instead of immediately using exec() to replace itself with /bin/bash, it uses the fork()/exec() behavior described in the Background section above:
    (P1:unshare) calls fork()
        --- the kernel creates P2 that is a copy of P1
        --- the kernel starts running P2
    (P1:unshare) fork() returns with the PID of P2, so it knows it's the parent
    (P1:unshare) waits for P2 to finish (details elided)
    (P2:unshare) fork() returns with 0, so it knows it's the child
        --- P2 is running in the new PID namespace and has pid = 1
    (P2:unshare) calls exec("/bin/bash")
        --- the kernel replaces unshare with /bin/bash in P2 and starts bash running
    (P2:bash)    starts running
You can confirm that in this case /bin/bash is the init process (i.e., the process with pid 1) by printing its process id:
    # echo $$
    1
    #
Answers to your questions
- The second window does not show any indication of unshare --pid /bin/bash. Is this because the /bin/bash command or the /bin/bash process had already terminated? This is why many Linux users on the internet recommend using the --fork so that the /bin/bash runs in the newly created namespace?
The second window does not show any indication of unshare because it is no longer running: it used exec() to replace itself with /bin/bash.
The --fork option changes the behavior of unshare so that it uses fork() to first create a new process (a process in the newly-created PID namespace); that process then uses exec() to replace itself with /bin/bash.
- The accepted answer stated this: "After bash start to run, bash will fork several new sub-processes to do somethings." I do not understanding the meaning of this sentence. So in the second terminal window, I ran this:
The new sub-processes are likely short-lived, so they're no longer running by the time you run ps.
The -H option to ps might make things clearer.