# Controlling ill-behaving applications with Linux Cgroups

For some time, I have been wanting to read more on Linux Cgroups and explore the possibility of using them to control ill-behaving applications. Being stuck in travel at the moment has finally given me some time to look into it.

In our Free Software world, most things are a do-ocracy, i.e. when your use case is not the common one, it is typically you who has to explore possible solutions. It could be bugs, feature requests or, as in my case, performance issues. But that is not to say that we do not have high-quality software in the Free Software world. In fact, in my opinion, some of the tools available are far better than the competition in terms of features, with the added sweetener (or nutritional fact) that Free Software liberates the user.

One of my favorite tools for photo management is Digikam. Digikam is a big project, very featureful, with some functionality that may not be available in the competition. But as with most Free Software projects, Digikam is a tool which underneath consumes many more subprojects from the Free Software ecosystem.

Anyone who has used Digikam may know some of the bugs that surface in it. Not necessarily a bug in Digikam itself, but maybe in one of the underlying libraries/tools that it consumes (Exiv2, libkface, Marble, OpenCV, libPGF etc.). But the bottom line is that the overall Digikam experience (and, if I may say, the overall GNU/Linux experience) takes a hit.

Digikam has pretty powerful features for annotation, tagging and facial recognition, and together they make it a compelling product. But the problem is that many of these underlying projects are independent, so tight integration is a challenge. And at times, bugs can be hard to find, root-cause and fix.

Let’s take a real example here. If you were to use Digikam today (version 4.13.0) with annotation, tagging and facial recognition as some of the core features of your use case, you may run into a frustrating overall experience. Not just that: the bugs would also affect your overall GNU/Linux experience.

The facial recognition feature, if triggered, will eat up all your memory, leading you to uncover Linux’s long-standing memory-pressure woes.

The tagging feature, if triggered, will again lead to frequent I/O, and thus to a stalled Linux system with CPU cycles blocked on I/O, for nothing.

So one of the items on my TODO list was to explore Linux Cgroups and see if it was cleanly possible to confine a process, so that even if it misbehaves (for whatever reason), your machine does not take the beating.

And now that the cgroups consumer dust has somewhat settled, systemd was my first obvious choice to look at. systemd provides a helper utility, systemd-run, for exactly such tasks. With systemd-run, you can apply the resource controller logic (typically cpu, memory and blkio) to a given process and restrict it to a certain set. You can also define which user to run the service as.

```
rrs@learner:/var/tmp/Debian-Build/Result$ systemd-run -p BlockIOWeight=10 find /
Running as unit run-23805.service.
2015-10-20 / 21:37:44 ♒♒♒  ☺
rrs@learner:/var/tmp/Debian-Build/Result$ systemctl status -l run-23805.service
● run-23805.service - /usr/bin/find /
   Drop-In: /run/systemd/system/run-23805.service.d
            └─50-BlockIOWeight.conf, 50-Description.conf, 50-ExecStart.conf
    Active: active (running) since Tue 2015-10-20 21:37:44 CEST; 6s ago
  Main PID: 23814 (find)
    Memory: 12.2M
       CPU: 502ms
    CGroup: /system.slice/run-23805.service
            └─23814 /usr/bin/find /

Oct 20 21:37:45 learner find[23814]: /proc/3/net/raw6
Oct 20 21:37:45 learner find[23814]: /proc/3/net/snmp
Oct 20 21:37:45 learner find[23814]: /proc/3/net/stat
Oct 20 21:37:45 learner find[23814]: /proc/3/net/stat/rt_cache
Oct 20 21:37:45 learner find[23814]: /proc/3/net/stat/arp_cache
Oct 20 21:37:45 learner find[23814]: /proc/3/net/stat/ndisc_cache
Oct 20 21:37:45 learner find[23814]: /proc/3/net/stat/ip_conntrack
Oct 20 21:37:45 learner find[23814]: /proc/3/net/stat/nf_conntrack
Oct 20 21:37:45 learner find[23814]: /proc/3/net/tcp6
Oct 20 21:37:45 learner find[23814]: /proc/3/net/udp6
2015-10-20 / 21:37:51 ♒♒♒  ☺
```

But, out of the box, graphical applications do not work. I haven’t looked into it, but it should be doable by giving the unit the correct environment details.
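An untested sketch of what that might look like: systemd-run has a `--setenv` option, so forwarding the session's display environment to the unit should cover the simple X11 case. The property values, display and Xauthority path below are illustrative assumptions, not something I have verified with Digikam; the snippet only prints the command it would run, since systemd-run needs a running systemd instance.

```shell
# Printed rather than executed here: systemd-run needs a running systemd.
# --setenv and -p are real systemd-run options; the values are examples only.
cmd="systemd-run --setenv=DISPLAY=:0 --setenv=XAUTHORITY=$HOME/.Xauthority -p MemoryLimit=2G -p BlockIOWeight=10 digikam"
printf '%s\n' "$cmd"
```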

Underneath, systemd is using the same Linux Control Groups to limit resources for individual applications. So, in cases where you have such a requirement but do not have systemd, or you want to make use of cgroups directly, it can easily be done with basic tooling like cgroup-tools.

With cgroup-tools, I now have a simple cgroups hierarchy set up for my current use case, i.e. Digikam:
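Underneath, cgroup-tools is just manipulating plain files in the cgroup filesystem. As a sketch, the snippet below rehearses those file operations against a scratch directory standing in for the real mount; on an actual system the paths live under /sys/fs/cgroup, creating the top-level group needs root, and the kernel itself creates the control files when the directory is made (this assumes the cgroup v1 layout shown in the listings below).

```shell
# Scratch-directory stand-in for /sys/fs/cgroup/memory (assumption: cgroup v1 layout).
CG="$(mktemp -d)/rrs_customCG"
mkdir -p "$CG/digikam"

# On a real mount the kernel creates memory.limit_in_bytes etc. on mkdir;
# setting a limit is nothing more than writing a number into the file.
echo 2764369920 > "$CG/digikam/memory.limit_in_bytes"   # ~2.6 GiB cap

# Confining a process is just writing its PID into the group's tasks file.
echo $$ > "$CG/digikam/tasks"

cat "$CG/digikam/memory.limit_in_bytes"   # prints 2764369920
```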

```
rrs@learner:/var/tmp/Debian-Build/Result$ ls /sys/fs/cgroup/memory/rrs_customCG/
cgroup.clone_children           memory.kmem.tcp.limit_in_bytes      memory.numa_stat
cgroup.event_control            memory.kmem.tcp.max_usage_in_bytes  memory.oom_control
cgroup.procs                    memory.kmem.tcp.usage_in_bytes      memory.pressure_level
digikam/                        memory.kmem.usage_in_bytes          memory.soft_limit_in_bytes
memory.failcnt                  memory.limit_in_bytes               memory.stat
memory.force_empty              memory.max_usage_in_bytes           memory.swappiness
memory.kmem.failcnt             memory.memsw.failcnt                memory.usage_in_bytes
memory.kmem.limit_in_bytes      memory.memsw.limit_in_bytes         memory.use_hierarchy
memory.kmem.max_usage_in_bytes  memory.memsw.max_usage_in_bytes     notify_on_release
memory.kmem.slabinfo            memory.memsw.usage_in_bytes         tasks
memory.kmem.tcp.failcnt         memory.move_charge_at_immigrate
2015-10-20 / 21:45:38 ♒♒♒  ☺
rrs@learner:/var/tmp/Debian-Build/Result$ ls /sys/fs/cgroup/memory/rrs_customCG/digikam/
cgroup.clone_children           memory.kmem.tcp.max_usage_in_bytes  memory.oom_control
cgroup.event_control            memory.kmem.tcp.usage_in_bytes      memory.pressure_level
cgroup.procs                    memory.kmem.usage_in_bytes          memory.soft_limit_in_bytes
memory.failcnt                  memory.limit_in_bytes               memory.stat
memory.force_empty              memory.max_usage_in_bytes           memory.swappiness
memory.kmem.failcnt             memory.memsw.failcnt                memory.usage_in_bytes
memory.kmem.limit_in_bytes      memory.memsw.limit_in_bytes         memory.use_hierarchy
memory.kmem.max_usage_in_bytes  memory.memsw.max_usage_in_bytes     notify_on_release
memory.kmem.tcp.failcnt         memory.move_charge_at_immigrate
memory.kmem.tcp.limit_in_bytes  memory.numa_stat
2015-10-20 / 21:45:53 ♒♒♒  ☺
```

```
rrs@learner:/var/tmp/Debian-Build/Result$ cat /sys/fs/cgroup/cpu/rrs_customCG/cpu.shares
1024
2015-10-20 / 21:48:44 ♒♒♒  ☺
rrs@learner:/var/tmp/Debian-Build/Result$ cat /sys/fs/cgroup/cpu/rrs_customCG/digikam/cpu.shares
512
2015-10-20 / 21:49:05 ♒♒♒  ☺
rrs@learner:/var/tmp/Debian-Build/Result$ cat /sys/fs/cgroup/memory/rrs_customCG/memory.limit_in_bytes
9223372036854771712
2015-10-20 / 22:20:14 ♒♒♒  ☺
rrs@learner:/var/tmp/Debian-Build/Result$ cat /sys/fs/cgroup/memory/rrs_customCG/digikam/memory.limit_in_bytes
2764369920
2015-10-20 / 22:20:27 ♒♒♒  ☺
rrs@learner:/var/tmp/Debian-Build/Result$ cat /sys/fs/cgroup/blkio/rrs_customCG/blkio.weight
500
2015-10-20 / 21:51:43 ♒♒♒  ☺
rrs@learner:/var/tmp/Debian-Build/Result$ cat /sys/fs/cgroup/blkio/rrs_customCG/digikam/blkio.weight
10
2015-10-20 / 21:51:50 ♒♒♒  ☺
```

Creating the base group, \$USER_customCG, needs super admin privileges. Once that is set up with appropriate ownership, the user can self-define further sub-groups, and can then also define separate limits per sub-group.
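As a hedged sketch of that setup with cgroup-tools: cgcreate's `-a`/`-t` options hand ownership of the attribute and task files to a user, after which that user can manage sub-groups without sudo. The group and user names are just the ones from my session; the snippet only prints the commands rather than running them, since they need root and a mounted cgroup v1 hierarchy.

```shell
# Print the commands rather than run them: they need root and cgroup-tools.
sketch='
# Once, as root: create the base group, owned by the user (-a attrs, -t tasks)
sudo cgcreate -a rrs:rrs -t rrs:rrs -g cpu,memory,blkio:/rrs_customCG

# Then, as the user: a sub-group with its own limits
cgcreate -g cpu,memory,blkio:/rrs_customCG/digikam
cgset -r memory.limit_in_bytes=2764369920 rrs_customCG/digikam
cgset -r cpu.shares=512 rrs_customCG/digikam
cgset -r blkio.weight=10 rrs_customCG/digikam

# Finally: launch Digikam inside the confined group
cgexec -g cpu,memory,blkio:rrs_customCG/digikam digikam
'
printf '%s\n' "$sketch"
```

(For the record, the parent group's memory.limit_in_bytes shown above, 9223372036854771712, is just the kernel's page-aligned "no limit" default.)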

With the resource limitations set in place, my overall experience on very recent hardware (Intel Haswell Core i7, 8 GiB RAM, 500 GB SSHD, 128 GB SSD) has improved considerably. It still is not perfect, but it definitely is a huge improvement over what I had to go through earlier: a machine stalled for hours.

```
top - 21:54:38 up 1 day,  6:46,  1 user,  load average: 7.22, 7.51, 7.37
Tasks: 299 total,   1 running, 298 sleeping,   0 stopped,   0 zombie
%Cpu0  :  7.1 us,  3.0 sy,  1.0 ni, 11.1 id, 77.8 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :  6.0 us,  4.0 sy,  2.0 ni, 49.0 id, 39.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  :  5.0 us,  2.0 sy,  0.0 ni, 24.8 id, 68.3 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :  5.9 us,  5.0 sy,  0.0 ni, 21.8 id, 67.3 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem : 7908.926 total,   96.449 free, 4634.922 used, 3177.555 buff/cache
MiB Swap: 8579.996 total, 3454.746 free, 5125.250 used. 2753.324 avail Mem
PID to signal/kill [default pid = 8879]

  PID  PPID nTH USER      PR  NI S %CPU %MEM     TIME+ COMMAND            UID
 8879  8868  18 rrs       20   0 S  8.2 31.2  37:44.64 digikam           1000
10255  9960   4 rrs       39  19 S  1.0  0.8  19:47.73 tracker-miner-f   1000
10157  9960   7 rrs       20   0 S  0.5  3.0  32:29.76 gnome-shell       1000
    7     2   1 root      20   0 S  0.2        0:53.48 rcu_sched            0
  401     1   1 root      20   0 S  0.2  1.3   0:54.93 systemd-journal      0
10269  9937   4 rrs       20   0 S  0.2  0.4   2:34.50 gnome-terminal-   1000
15316     1  14 rrs       20   0 S  0.2  3.7  30:05.96 evolution         1000
23777     2   1 root      20   0 S  0.2        0:05.73 kworker/u16:0        0
23814     1   1 root      20   0 D  0.2  0.0   0:02.00 find                 0
24049     2   1 root      20   0 S  0.2        0:01.29 kworker/u16:3        0
24052     2   1 root      20   0 S  0.2        0:02.94 kworker/u16:4        0
    1     0   1 root      20   0 S       0.1   0:18.24 systemd              0
```

The reporting tools may not be entirely accurate here, because going by what is reported above, I should have a stalled machine, heavily paging, with the kernel scanning its list of processes to find the best one to kill.

The major side effect I can see of this approach of jailing processes is that the process (Digikam) is now starved of resources and will take much, much more time than it usually would. But then, in the usual case, it takes up everything and ends up starving (and getting killed) for consuming all available resources.

So I guess it is better to be on a balanced resource diet. :-)