6. Linux Namespaces

Namespaces are used to create a sandbox in which user applications run.
A sandbox lets us run applications in an isolated environment on a Linux machine.

Linux namespaces are the basis of all Linux containerization tools such as LXC and Docker. They are simple, well designed, and useful, and understanding them allows us to use them in our own products.
Think of it this way: we know pthreads, fork(), and async I/O, so we use them in our production software. In the same way, if we know Linux namespaces well, we may get ideas about where to use them in our software.
I often feel that we should explore namespaces before we jump to Docker.

Linux namespaces (together with control groups) are the foundation on which every Linux container (Docker, LXC) runs. Namespaces are provided by the Linux kernel.

6.1 Namespaces

The following is a list of commonly used Linux namespaces. Later, we will see how to create, use, and destroy them.

  • pid namespace: Every Linux process belongs to a PID namespace. When we run ps or top, the process list is read from the /proc file system, which exposes the processes of a PID namespace.
    We can create our own PID namespace, in which our applications do not interfere with the parent PID namespace.

  • mount namespace: This namespace isolates the set of mounted file systems. The kernel stores that information in a mount table.
    All our real devices such as HDD/SSD and virtual file systems such as ramfs, procfs, and sysfs are mounted for us. Inside our own mount namespace we can mount and unmount file systems as needed.
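
The namespaces a process belongs to can be inspected through /proc. The following is a minimal sketch, assuming a Linux system with /proc mounted and the readlink utility available; the helper name ns_id is made up for this example.

```shell
#!/bin/sh
# Each entry in /proc/<pid>/ns is a symlink such as "pid:[4026531836]".
# Two processes are in the same namespace when the link targets match.
ns_id() {
    # $1 = a pid or "self", $2 = namespace type (pid, mnt, net, uts, user)
    readlink "/proc/$1/ns/$2"
}

for ns in pid mnt net uts user; do
    printf '%-5s %s\n' "$ns" "$(ns_id self "$ns")"
done
```

Running the same loop in two different shells on the host should print identical identifiers, because both shells live in the same namespaces.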

6.2 unshare command

The unshare command lets us create new namespaces. The newly created namespaces are not shared with the parent process (the host) unless we explicitly ask for that. "Not shared" is the meaning of the "un" in unshare.

The unshare command is a wrapper around the unshare(2) system call, a close relative of the clone(2) C API. Using the command is usually more convenient than calling the C API directly.

Also, after running unshare, the parent's resources are still visible to the child by default. We can, however, remove the parent's mounts from the child (the new unshared environment). For example:

  • After running unshare on a shell, the entire file system of the host is still visible, but we can remove it.
  • After running unshare on a shell, the entire /proc file system of the parent is still visible. It holds the process information used by commands such as ps and top. We can replace this too, as we will see below.

In short, unshare runs a program in new namespaces: the namespaces are created for the program and are not shared with the parent. If we do not want the new program to touch the parent's resources, we must remove them from the new namespaces explicitly.

unshare [options] [child_program [arguments]]

Options:

  • --pid: create a new PID namespace.
  • --mount-proc: when creating a PID namespace, mount a new /proc file system so that the child program does not disturb the parent's process information. This implies --mount. It is similar to manually running mount -t proc none /proc, but better, because unshare mounts procfs before our program starts. We do not want to risk our program disturbing the parent's procfs, so --mount-proc is the clean way of doing things.
  • --fork: fork before running the child program. Combined with --pid, the child program gets PID 1 in the new namespace.
  • --mount: create a new mount namespace. This does not mount any new file system; it tells unshare that subsequent mount and umount commands must not be shared with the parent.
  • --net: create a new network namespace that is not shared with the parent.
  • --uts: create a new UTS (Unix Time-sharing System) namespace. It isolates the hostname and has nothing to do with time; the name is historical.
  • --user (-U): create a new user namespace.
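
To see how --pid and --fork work together, the following sketch prints the PID that the child program sees for itself, which should be 1. It assumes a util-linux style unshare and a kernel that allows unprivileged user namespaces; --user --map-root-user is an assumption added here so the sketch runs without sudo, and the function name pid_in_new_ns is made up for this example.

```shell
#!/bin/sh
# Run a tiny child program in a new PID namespace and print its own PID.
# --user --map-root-user replaces sudo in this sketch (an assumption, the
# chapter's own examples use sudo instead).
pid_in_new_ns() {
    unshare --user --map-root-user --pid --fork --mount-proc \
        sh -c 'echo "$$"' 2>/dev/null
}

if pid=$(pid_in_new_ns); then
    echo "PID seen by the child program: $pid"
else
    echo "unshare is not permitted in this environment"
fi
```

Without --fork, the program started by unshare stays in the old PID namespace and only its children enter the new one, which is why the two options are normally used together.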

6.2.1 Example 1

We want the following from the unshare command:

  • do a fork() (--fork) before executing the child program, /bin/bash,
  • create a new PID namespace that is not shared with the parent (--pid),
  • mount a new, unshared /proc file system for the process table (--mount-proc). We no longer see the parent's procfs, because the new one replaces it.

Thus, if we run ps -ef, we will not see the parent's processes.

6.2.2 running command

sudo unshare --fork --mount-proc --pid /bin/bash
sleep 100 &
ps -ef
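
One way to confirm that the shell really runs in a separate PID namespace is to compare namespace identifiers. This is a minimal sketch, assuming /proc is mounted and unprivileged user namespaces are allowed; --user --map-root-user is an assumption that stands in for sudo.

```shell
#!/bin/sh
# Compare the PID namespace of this shell with the one unshare creates.
outside=$(readlink /proc/self/ns/pid)

# --user --map-root-user replaces sudo in this sketch (an assumption).
inside=$(unshare --user --map-root-user --fork --pid --mount-proc \
    readlink /proc/self/ns/pid 2>/dev/null) || inside="unavailable"

echo "outside: $outside"
echo "inside:  $inside"
if [ "$inside" != "unavailable" ] && [ "$inside" != "$outside" ]; then
    echo "the child ran in its own PID namespace"
fi
```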

6.2.3 Example 2

Now we add a private mount namespace with --mount (in fact, it was already added when we used --mount-proc).
The general form of the mount command is: mount -t type device dir

sudo unshare --fork --mount-proc --pid --mount /bin/bash
mount -t tmpfs tmpfs /mnt
mount | grep mnt
echo "hello" >> /mnt/afile.txt

If we open a new terminal and run ls /mnt/afile.txt, the file will not exist. The reason is that the mount is not shared with the rest of the system. It is visible only in the terminal (and its child processes) where unshare ran.
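
The same check can be scripted instead of opening a second terminal. The sketch below assumes that /mnt exists and that unprivileged user namespaces are allowed; --user --map-root-user replaces sudo, and both the marker file name and the function name check_mount_isolation are made up for this example.

```shell
#!/bin/sh
# Hypothetical marker file, used only by this sketch.
marker=/mnt/ns_demo_marker

# Returns 0 if a file created on a tmpfs inside a new mount namespace is
# invisible outside, 1 if it leaked, 2 if the demo could not run here.
check_mount_isolation() {
    unshare --user --map-root-user --mount sh -c '
        mount -t tmpfs tmpfs /mnt && echo hello > "$1" && [ -e "$1" ]
    ' sh "$marker" 2>/dev/null || return 2
    # Back in the original mount namespace the file must not exist.
    [ ! -e "$marker" ]
}

rc=0
check_mount_isolation || rc=$?
case $rc in
    0) echo "the tmpfs mount stayed private to the new namespace" ;;
    1) echo "the mount leaked into the parent namespace" ;;
    *) echo "could not create a mount namespace in this environment" ;;
esac
```

The tmpfs disappears together with the namespace when the child exits, so nothing has to be cleaned up afterwards.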

6.2.4 network and uts ns

The --net option creates a new network namespace that is not shared with the parent. This means we can have our own IPv4/IPv6 stack.
The --uts option creates a new hostname (UTS) namespace that is not shared with the parent. This means we can change the hostname without affecting the host.

sudo unshare --fork --net --uts /bin/bash
hostname sandbox
hostname # prints sandbox, but the system's hostname does not change
sandbox
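
The effect of both options can be checked from a script. This is a sketch under assumptions: unprivileged user namespaces are allowed (--user --map-root-user replaces sudo), the hostname command is installed, and uname -n is used for reading because it reports the same node name.

```shell
#!/bin/sh
# uname -n prints the same node name that the hostname command reports.
host_before=$(uname -n)

# Change the hostname inside new UTS and network namespaces only.
host_inside=$(unshare --user --map-root-user --uts --net \
    sh -c 'hostname sandbox && uname -n' 2>/dev/null) \
    || host_inside="unavailable"

# List the network interfaces that a fresh network namespace starts with.
ifaces_inside=$(unshare --user --map-root-user --net \
    sh -c 'tail -n +3 /proc/net/dev | cut -d: -f1 | xargs' 2>/dev/null) \
    || ifaces_inside="unavailable"

host_after=$(uname -n)

echo "hostname before: $host_before"
echo "hostname inside: $host_inside"
echo "hostname after:  $host_after"
echo "interfaces inside: $ifaces_inside"
```

A fresh network namespace normally contains little more than the loopback device lo, so any other interface has to be created or moved into it explicitly.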

6.2.5 user namespace

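A user namespace gives a process its own set of user and group IDs; it can even be root inside the namespace. The -U option creates the user namespace, and -r maps the current user to root inside it.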
sudo unshare -Ur /bin/bash
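
The mapping can be observed by comparing user IDs. The sketch below assumes a kernel that allows unprivileged user namespaces, so it runs unshare without sudo.

```shell
#!/bin/sh
uid_outside=$(id -u)

# -U creates the user namespace, -r maps the current user to root inside it.
uid_inside=$(unshare -U -r id -u 2>/dev/null) || uid_inside="unavailable"

echo "UID outside the namespace: $uid_outside"
echo "UID inside the namespace:  $uid_inside"
```

When the call succeeds, the inner UID is 0 even for an ordinary user, but that root identity only carries privileges over resources owned by the new user namespace.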

6.3 References

  • http://redhatgov.io/workshops/containers_the_hard_way/exercise1/
  • http://redhatgov.io/workshops/containers_101/
  • https://stackoverflow.com/questions/44666700/unshare-pid-bin-bash-fork-cannot-allocate-memory
  • https://www.toptal.com/linux/separation-anxiety-isolating-your-system-with-linux-namespaces