For some time, I have been wanting to read more about Linux cgroups, to explore the possibility of using them to control ill-behaving applications. Now, while I'm stuck in travel, I have finally had some time to look into it.
In our Free Software world, most things are a do-ocracy, i.e. when your use case is not the common one, it is typically you who has to explore possible solutions. It could be bugs, feature requests or, as in my case, performance issues. But that is not to say that we do not have high-quality software in the Free Software world. In fact, in my opinion, some of the tools available are far better than the competition in terms of features, and as a sweetener (or a nutritional fact) on top, Free Software liberates the user.
One of my favorite tools, for photo management, is Digikam. Digikam is a big project, very featureful, with some functionality that may not be available in the competition. But as with most Free Software projects, Digikam is a tool that underneath consumes many more subprojects from the Free Software ecosystem.
Anyone who has used Digikam may know some of the bugs that surface in it. Not necessarily a bug in Digikam itself, but maybe in one of the underlying libraries/tools that it consumes (Exiv2, libkface, Marble, OpenCV, libPGF etc.). But the bottom line is that the overall Digikam experience (and, if I may say so, the overall GNU/Linux experience) takes a hit.
Digikam has pretty powerful features for annotation, tagging and facial recognition. Together, these features make Digikam a compelling product. But the problem is that many of these subprojects are independent, so tight integration is a challenge, and at times bugs can be hard to find, root-cause and fix.
Let’s take a real example here. If you were to use Digikam today (version 4.13.0) with annotation, tagging and facial recognition as some of the core features for your use case, you may run into a frustrating overall experience. Not just that: the bugs would also affect your overall GNU/Linux experience.
The tagging feature, when triggered, again leads to frequent I/O, and thus again to a stalled Linux system, with CPU cycles blocked for nothing.
So one of the items on my TODO list was to explore Linux cgroups, and see whether it was cleanly possible to tame a process within a confinement, so that even if it misbehaves (for whatever reason), your machine does not take the beating.
And now that the cgroups consumer dust has somewhat settled, systemd was my first obvious choice to look at. systemd provides a helper utility, systemd-run, for such tasks. With systemd-run, you can apply resource-controller logic to a given process, typically for the cpu, memory and blkio controllers, and restrict it to a certain share of resources. You can also define which user to run the process as.
rrs@learner:/var/tmp/Debian-Build/Result$ systemd-run -p BlockIOWeight=10 find /
Running as unit run-23805.service.
2015-10-20 / 21:37:44 ♒♒♒ ☺
rrs@learner:/var/tmp/Debian-Build/Result$ systemctl status -l run-23805.service
● run-23805.service - /usr/bin/find /
   Loaded: loaded
  Drop-In: /run/systemd/system/run-23805.service.d
           └─50-BlockIOWeight.conf, 50-Description.conf, 50-ExecStart.conf
   Active: active (running) since Tue 2015-10-20 21:37:44 CEST; 6s ago
 Main PID: 23814 (find)
   Memory: 12.2M
      CPU: 502ms
   CGroup: /system.slice/run-23805.service
           └─23814 /usr/bin/find /

Oct 20 21:37:45 learner find: /proc/3/net/raw6
Oct 20 21:37:45 learner find: /proc/3/net/snmp
Oct 20 21:37:45 learner find: /proc/3/net/stat
Oct 20 21:37:45 learner find: /proc/3/net/stat/rt_cache
Oct 20 21:37:45 learner find: /proc/3/net/stat/arp_cache
Oct 20 21:37:45 learner find: /proc/3/net/stat/ndisc_cache
Oct 20 21:37:45 learner find: /proc/3/net/stat/ip_conntrack
Oct 20 21:37:45 learner find: /proc/3/net/stat/nf_conntrack
Oct 20 21:37:45 learner find: /proc/3/net/tcp6
Oct 20 21:37:45 learner find: /proc/3/net/udp6
2015-10-20 / 21:37:51 ♒♒♒ ☺
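Several controllers can also be combined in a single invocation. A sketch (values purely illustrative; the property names here are the cgroup-v1 era ones like CPUShares and MemoryLimit, which newer systemd versions replace with CPUWeight and MemoryMax):

```shell
# Sketch: restrict CPU, memory and block I/O for one command in a single
# transient unit, and run it as a specific user. Not a recommendation for
# particular values.
systemd-run \
    -p CPUShares=512 \
    -p MemoryLimit=2G \
    -p BlockIOWeight=10 \
    --uid=rrs \
    /usr/bin/find /
```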
But, out of the box, graphical applications do not work. I haven’t looked closely, but it should be doable by passing the correct environment details to the unit.
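A plausible sketch of what that could look like (an assumption on my part, not something I have verified): use systemd-run’s --setenv option to hand the display environment through, so the transient unit can talk to the running X server.

```shell
# Assumptions: --setenv is available in this systemd version, and an X11
# session is in use (a Wayland session would need different variables).
systemd-run \
    -p BlockIOWeight=10 \
    --setenv=DISPLAY="$DISPLAY" \
    --setenv=XAUTHORITY="$XAUTHORITY" \
    digikam
```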
Underneath, systemd uses the same Linux control groups to limit resources for individual applications. So, in cases where you have such a requirement but do not have systemd, or you want to use cgroups directly, it can easily be done with basic tools like cgroup-tools.
With cgroup-tools, I now have a simple cgroups hierarchy set up for my current use case, i.e. Digikam:
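A hierarchy like this can be created and populated with the cgroup-tools commands. A sketch, on cgroup v1, using the same limits I settled on (the values are specific to my machine, not recommendations):

```shell
# Create the sub-group under all three controllers, set its limits,
# then launch the application inside it.
sudo cgcreate -g cpu,memory,blkio:/rrs_customCG/digikam
sudo cgset -r cpu.shares=512 rrs_customCG/digikam
sudo cgset -r memory.limit_in_bytes=2764369920 rrs_customCG/digikam
sudo cgset -r blkio.weight=10 rrs_customCG/digikam
cgexec -g cpu,memory,blkio:rrs_customCG/digikam digikam
```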
rrs@learner:/var/tmp/Debian-Build/Result$ ls /sys/fs/cgroup/memory/rrs_customCG/
cgroup.clone_children           memory.kmem.tcp.limit_in_bytes      memory.numa_stat
cgroup.event_control            memory.kmem.tcp.max_usage_in_bytes  memory.oom_control
cgroup.procs                    memory.kmem.tcp.usage_in_bytes      memory.pressure_level
digikam/                        memory.kmem.usage_in_bytes          memory.soft_limit_in_bytes
memory.failcnt                  memory.limit_in_bytes               memory.stat
memory.force_empty              memory.max_usage_in_bytes           memory.swappiness
memory.kmem.failcnt             memory.memsw.failcnt                memory.usage_in_bytes
memory.kmem.limit_in_bytes      memory.memsw.limit_in_bytes         memory.use_hierarchy
memory.kmem.max_usage_in_bytes  memory.memsw.max_usage_in_bytes     notify_on_release
memory.kmem.slabinfo            memory.memsw.usage_in_bytes         tasks
memory.kmem.tcp.failcnt         memory.move_charge_at_immigrate
2015-10-20 / 21:45:38 ♒♒♒ ☺
rrs@learner:/var/tmp/Debian-Build/Result$ ls /sys/fs/cgroup/memory/rrs_customCG/digikam/
cgroup.clone_children           memory.kmem.tcp.max_usage_in_bytes  memory.oom_control
cgroup.event_control            memory.kmem.tcp.usage_in_bytes      memory.pressure_level
cgroup.procs                    memory.kmem.usage_in_bytes          memory.soft_limit_in_bytes
memory.failcnt                  memory.limit_in_bytes               memory.stat
memory.force_empty              memory.max_usage_in_bytes           memory.swappiness
memory.kmem.failcnt             memory.memsw.failcnt                memory.usage_in_bytes
memory.kmem.limit_in_bytes      memory.memsw.limit_in_bytes         memory.use_hierarchy
memory.kmem.max_usage_in_bytes  memory.memsw.max_usage_in_bytes     notify_on_release
memory.kmem.slabinfo            memory.memsw.usage_in_bytes         tasks
memory.kmem.tcp.failcnt         memory.move_charge_at_immigrate
memory.kmem.tcp.limit_in_bytes  memory.numa_stat
2015-10-20 / 21:45:53 ♒♒♒ ☺
rrs@learner:/var/tmp/Debian-Build/Result$ cat /sys/fs/cgroup/cpu/rrs_customCG/cpu.shares
1024
2015-10-20 / 21:48:44 ♒♒♒ ☺
rrs@learner:/var/tmp/Debian-Build/Result$ cat /sys/fs/cgroup/cpu/rrs_customCG/digikam/cpu.shares
512
2015-10-20 / 21:49:05 ♒♒♒ ☺
rrs@learner:/var/tmp/Debian-Build/Result$ cat /sys/fs/cgroup/memory/rrs_customCG/memory.limit_in_bytes
9223372036854771712
2015-10-20 / 22:20:14 ♒♒♒ ☺
rrs@learner:/var/tmp/Debian-Build/Result$ cat /sys/fs/cgroup/memory/rrs_customCG/digikam/memory.limit_in_bytes
2764369920
2015-10-20 / 22:20:27 ♒♒♒ ☺
rrs@learner:/var/tmp/Debian-Build/Result$ cat /sys/fs/cgroup/blkio/rrs_customCG/blkio.weight
500
2015-10-20 / 21:51:43 ♒♒♒ ☺
rrs@learner:/var/tmp/Debian-Build/Result$ cat /sys/fs/cgroup/blkio/rrs_customCG/digikam/blkio.weight
10
2015-10-20 / 21:51:50 ♒♒♒ ☺
The base group, $USER_customCG, needs super-admin privileges to create. Once it is set up with the appropriate ownership, it allows the user to further define sub-groups themselves, with separate limits per sub-group.
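That delegation can be sketched with cgcreate’s -a (admin) and -t (task) ownership options, roughly as follows. The user/group names and the "rawconverter" sub-group are purely illustrative:

```shell
# Root creates the base group and hands its ownership to the user ...
sudo cgcreate -a rrs:rrs -t rrs:rrs -g cpu,memory,blkio:/rrs_customCG
# ... who can then create and tune sub-groups without further root help.
cgcreate -g cpu,memory,blkio:/rrs_customCG/rawconverter
cgset -r cpu.shares=256 rrs_customCG/rawconverter
```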
With the resource limitations in place, my overall experience on fairly recent hardware (Intel Haswell Core i7, 8 GiB RAM, 500 GB SSHD, 128 GB SSD) has improved considerably. It still is not perfect, but it definitely is a huge improvement over what I had to go through earlier: a stalled machine for hours.
top - 21:54:38 up 1 day,  6:46,  1 user,  load average: 7.22, 7.51, 7.37
Tasks: 299 total,   1 running, 298 sleeping,   0 stopped,   0 zombie
%Cpu0  :  7.1 us,  3.0 sy,  1.0 ni, 11.1 id, 77.8 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :  6.0 us,  4.0 sy,  2.0 ni, 49.0 id, 39.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  :  5.0 us,  2.0 sy,  0.0 ni, 24.8 id, 68.3 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :  5.9 us,  5.0 sy,  0.0 ni, 21.8 id, 67.3 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem : 7908.926 total,   96.449 free, 4634.922 used, 3177.555 buff/cache
MiB Swap: 8579.996 total, 3454.746 free, 5125.250 used.  2753.324 avail Mem
PID to signal/kill [default pid = 8879]

  PID  PPID nTH USER  PR  NI S %CPU %MEM    TIME+ COMMAND          UID
 8879  8868  18 rrs   20   0 S  8.2 31.2 37:44.64 digikam         1000
10255  9960   4 rrs   39  19 S  1.0  0.8 19:47.73 tracker-miner-f 1000
10157  9960   7 rrs   20   0 S  0.5  3.0 32:29.76 gnome-shell     1000
    7     2   1 root  20   0 S  0.2       0:53.48 rcu_sched          0
  401     1   1 root  20   0 S  0.2  1.3  0:54.93 systemd-journal    0
10269  9937   4 rrs   20   0 S  0.2  0.4  2:34.50 gnome-terminal- 1000
15316     1  14 rrs   20   0 S  0.2  3.7 30:05.96 evolution       1000
23777     2   1 root  20   0 S  0.2       0:05.73 kworker/u16:0      0
23814     1   1 root  20   0 D  0.2  0.0  0:02.00 find               0
24049     2   1 root  20   0 S  0.2       0:01.29 kworker/u16:3      0
24052     2   1 root  20   0 S  0.2       0:02.94 kworker/u16:4      0
    1     0   1 root  20   0 S  0.1       0:18.24 systemd            0
The reporting tools may not be accurate here, because from what is being reported above, I should have had a stalled, heavily paging machine, with the kernel scanning its process list to find the best candidate to kill.
With this approach of jailing processes, the major side effect I can see is that the process (Digikam) is now starved of resources and will take much more time than it usually would. But in the usual case, it takes up everything, and ends up starving (and getting killed) for consuming all available resources.
So I guess it is better to be on a balanced resource diet. :-)