6. Linux Namespaces
Namespaces are used to create a sandbox in which user applications run.
A sandbox lets us run apps in an isolated environment on a Linux box.
Linux namespaces are the basis of all Linux containerization tools such as LXC and Docker.
They are simple, well designed, and useful, and understanding them allows us to use them in our own products.
Think of it this way: we know pthreads, fork(), and async I/O, so we use them in our production software.
In the same way, if we know Linux namespaces well, we may get ideas about where to use them in our software.
I often feel that we should explore them before we jump to Docker.
Linux namespaces (and control groups) are the foundations on which any Linux container (Docker/LXC) runs. Namespaces are provided by the Linux kernel.
6.1 Namespaces
Some of the Linux namespaces are listed below. Later, we will see how to create, use, and destroy them.
- pid namespace
Every Linux process belongs to a PID namespace. When we run ps or top, the process list is read from
/proc, which shows the processes of the PID namespace it was mounted for.
We can create our own PID namespace so that our applications do not interfere with the parent PID namespace.
- mount namespace
This namespace isolates the mounting and unmounting of file systems. The kernel stores that information in a mount table.
All our real devices such as HDD/SSD and virtual file systems such as ramfs, procfs, and sysfs are mounted for us. We can mount and unmount a new FS as needed.
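To see which namespaces a process belongs to, a small sketch like the following may help. It assumes a Linux system with /proc mounted; the variable name pid_ns is only illustrative. Each entry under /proc/<pid>/ns is a symbolic link whose target names the namespace type and an identifier, and two processes share a namespace when their identifiers match.

```shell
# Print the namespace identifiers of the current shell ($$ is its PID).
# Each link target looks like "pid:[<number>]".
for ns in pid mnt net uts user; do
    printf '%-5s %s\n' "$ns" "$(readlink "/proc/$$/ns/$ns")"
done

# Keep the PID namespace identifier so it can be compared later.
pid_ns=$(readlink "/proc/$$/ns/pid")
echo "current pid namespace: $pid_ns"
```

Running the same loop inside a sandbox and comparing the identifiers shows which namespaces were actually unshared.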
6.2 unshare command
The unshare command lets us create new namespaces. The newly created namespaces are not shared with the parent process (host) unless we explicitly ask for that.
That is the meaning of "un" in unshare: not shared.
unshare is a wrapper over the unshare(2) system call, a close relative of clone(2).
It is usually easier to use the unshare command than to call the C API directly.
Also, after running unshare, the parent's resources are still visible to the child by default.
But we can hide the parent's mounts from the child (the new unshared environment).
For example:
- after running unshare on a shell, the entire file system of the host is still visible, but we can change this.
- after running unshare on a shell, the parent's entire /proc file system, which holds the process information used by commands like ps and top and by PID-related APIs, is still visible. We can change this too, as we will see as we move on.
unshare runs a program in new namespaces, i.e., new namespaces are created for the program and are not shared with the parent. Each kind of namespace we want isolated must be requested explicitly; anything we do not unshare stays shared with the parent and can be touched by the new program.
unshare [options] [child_program [arguments]]
Options:
- --pid: create a new PID namespace.
- --mount-proc: if we create a PID namespace, mount a new /proc file system so that the child program does not disturb the parent's /proc. This implies --mount as well.
It is similar to manually running mount -t proc none /proc, but better, because procfs is mounted by the unshare command before our program runs. This avoids any chance of our program disturbing the parent's procfs, so using --mount-proc is the clean way of doing things.
- --fork: fork before running the child program. Combined with --pid, the child program gets PID 1 in the new namespace.
- --mount: create a new mount namespace for the file system. This does not mount any new FS; it tells unshare that any subsequent mount command should not be shared with the parent.
- --net: create a network namespace that is not shared with the parent.
- --uts: create an unshared UTS (UNIX Time-sharing System) namespace. It covers the hostname and has nothing to do with time; the name comes from older Unix systems.
- --user | -U: create a user namespace and unshare it.
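One way to check that these options really create new namespaces is to compare the namespace identifiers outside and inside unshare. The following is a minimal sketch, not a definitive recipe. It assumes the util-linux unshare and a kernel that allows unprivileged user namespaces, and it uses --user --map-root-user in place of sudo so that it can run without root (that substitution is my assumption). The names flags, cmd, and result are illustrative. If unshare is missing or not permitted, the check is skipped.

```shell
# Options under test: a network and a UTS namespace inside a user namespace.
flags="--user --map-root-user --net --uts"
# Command that prints the net and uts namespace identifiers of its own process.
cmd='for ns in net uts; do readlink /proc/self/ns/$ns; done'

# Identifiers as seen from the current (parent) namespaces.
outer=$(sh -c "$cmd")

if command -v unshare >/dev/null 2>&1 && unshare $flags true 2>/dev/null; then
    # Same command, but run by unshare in freshly created namespaces.
    inner=$(unshare $flags sh -c "$cmd")
    if [ "$outer" != "$inner" ]; then
        result="namespaces differ"
    else
        result="namespaces are the same"
    fi
    printf 'outside:\n%s\ninside:\n%s\n' "$outer" "$inner"
else
    result="skipped"
fi
echo "result: $result"
```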
6.2.1 Example 1
We want the following:
- the unshare command to call fork() (--fork) before executing the child program, /bin/bash,
- the unshare command to create a new PID namespace that is not shared with the parent namespace (--pid),
- the unshare command to mount a new unshared /proc file system for the process table (--mount-proc). Inside the sandbox, the parent's procfs is replaced by the new unshared procfs.
Thus, if we run ps -ef, we will not see the parent's processes.
6.2.2 running command
sudo unshare --fork --mount-proc --pid /bin/bash
sleep 100 &
ps -ef
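The effect of --fork together with --pid can also be checked non-interactively. The sketch below assumes the util-linux unshare and permission to create unprivileged user namespaces; it uses --user --map-root-user in place of sudo, which is my substitution and not part of the command above. The names flags and inner_pid are illustrative. Mounting a fresh /proc can be refused inside some containers, in which case the check is skipped.

```shell
# Same options as the example, with a user namespace standing in for sudo.
flags="--user --map-root-user --fork --pid --mount-proc"

if command -v unshare >/dev/null 2>&1 && unshare $flags true 2>/dev/null; then
    # The single quotes make the inner shell expand $$, so it reports
    # its own PID as seen inside the new PID namespace.
    inner_pid=$(unshare $flags sh -c 'echo $$')
else
    inner_pid="skipped"
fi
echo "PID of the child shell inside the namespace: $inner_pid"
```

When the sandbox can be created, the child shell reports PID 1, because it is the first process in its PID namespace.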
6.2.3 Example 2
Now we will add the ability to mount (in fact, it was already added when we used --mount-proc).
The general form of the mount command is: mount -t type device dir
sudo unshare --fork --mount-proc --pid --mount /bin/bash
mount -t tmpfs tmpfs /mnt
mount | grep mnt
echo "hello" >> /mnt/afile.txt
If we open a new terminal and run ls /mnt/afile.txt, the file will not exist. The reason is that
the mount is not shared with the rest of the system. It is visible only in the terminal (processes) where unshare ran.
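The same behaviour can be sketched in a single script that needs no second terminal. It assumes the util-linux unshare and permission to create unprivileged user namespaces, uses --user --map-root-user in place of sudo (my substitution), and mounts the tmpfs on a temporary directory rather than /mnt so the host is left untouched. The names dir, flags, and visible are illustrative.

```shell
# Scratch directory that plays the role of /mnt in the example.
dir=$(mktemp -d)
flags="--user --map-root-user --mount"

# Inside the new mount namespace: mount a tmpfs on the directory and write a file.
if command -v unshare >/dev/null 2>&1 &&
   unshare $flags sh -c 'mount -t tmpfs tmpfs "$1" && echo hello > "$1/afile.txt"' sh "$dir" 2>/dev/null; then
    # Back outside: the tmpfs was mounted only in the sandbox,
    # so the file should not be visible here.
    if [ -e "$dir/afile.txt" ]; then visible=yes; else visible=no; fi
else
    visible="skipped"
fi
echo "file visible outside the namespace: $visible"

# The directory is empty on the host, so it can simply be removed.
rmdir "$dir"
```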
6.2.4 network and uts ns
The --net option creates a new network namespace that is not shared with the parent. This means
we can have our own IPv4/IPv6 stack.
The --uts option creates a hostname namespace and unshares it. This means we can change the hostname without affecting the host.
sudo unshare --fork --net --uts /bin/bash
hostname sandbox
hostname # yes, it prints sandbox, but the system's hostname does not change.
sandbox
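To confirm that the host keeps its own hostname, the name can be read before and after the sandbox runs. This is a sketch under the assumption that the util-linux unshare and the hostname command are installed and that unprivileged user namespaces are allowed; --user --map-root-user stands in for sudo (my substitution), and uname -n is used to read the name. The variables before, inner, and after are illustrative.

```shell
flags="--user --map-root-user --uts"

# Hostname of the host before the sandbox runs.
before=$(uname -n)

if command -v unshare >/dev/null 2>&1 && unshare $flags hostname sandbox 2>/dev/null; then
    # Change the hostname inside the new UTS namespace and read it back there.
    inner=$(unshare $flags sh -c 'hostname sandbox && uname -n')
else
    inner="skipped"
fi

# Hostname of the host after the sandbox has exited.
after=$(uname -n)

echo "inside the namespace: $inner"
echo "host before: $before, host after: $after"
```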
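6.2.5 user namespace
A user namespace lets the sandbox have its own users, even root. -U creates the user namespace and -r maps the current user to root inside it.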
sudo unshare -Ur /bin/bash
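The mapping to root can be observed by printing the user ID outside and inside the user namespace. The sketch below assumes the util-linux unshare and a kernel that permits unprivileged user namespaces; it uses the long option names for -U and -r and runs without sudo, which is my variation on the command above. The names outer_uid and inner_uid are illustrative.

```shell
# User ID in the parent namespace.
outer_uid=$(id -u)

if command -v unshare >/dev/null 2>&1 && unshare --user --map-root-user true 2>/dev/null; then
    # --user is the long form of -U, --map-root-user the long form of -r.
    inner_uid=$(unshare --user --map-root-user id -u)
else
    inner_uid="skipped"
fi

echo "uid outside: $outer_uid"
echo "uid inside:  $inner_uid"
```

Inside the namespace the process sees itself as UID 0, while on the host it keeps the original user's privileges.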
6.3 References
- http://redhatgov.io/workshops/containers_the_hard_way/exercise1/
- http://redhatgov.io/workshops/containers_101/
- https://stackoverflow.com/questions/44666700/unshare-pid-bin-bash-fork-cannot-allocate-memory
- https://www.toptal.com/linux/separation-anxiety-isolating-your-system-with-linux-namespaces