Support for CRIU in Buildroot

A couple of months ago I started to evaluate [1] CRIU [2] for a project I'm working on. The project itself is using Buildroot to build and generate the root filesystem. Unfortunately, Buildroot lacks support for CRIU, so there was some work to do.

/media/buildroot-plus-criu.png

Writing the package was not straightforward. CRIU is only supported on certain architectures, and the utils/test-pkg script failed for a few toolchains. Julien Olivain was really helpful in sorting it out, and he even wrote runtime scripts for it. Thanks for that.

I do not understand why projects still use custom Makefiles instead of CMake or Autotools though. Is there something essential that I've completely missed?

Kernel configuration

CRIU makes use of a lot of features that have to be enabled in the Linux kernel for full usage.

CONFIG_CHECKPOINT_RESTORE will be set by the package itself, but there are more configuration options that could be useful depending on how you intend to use the tool.

Relevant configuration options are:

General setup options
  • CONFIG_CHECKPOINT_RESTORE=y (Checkpoint/restore support)
  • CONFIG_NAMESPACES=y (Namespaces support)
  • CONFIG_UTS_NS=y (Namespaces support -> UTS namespace)
  • CONFIG_IPC_NS=y (Namespaces support -> IPC namespace)
  • CONFIG_SYSVIPC_SYSCTL=y
  • CONFIG_PID_NS=y (Namespaces support -> PID namespaces)
  • CONFIG_NET_NS=y (Namespaces support -> Network namespace)
  • CONFIG_FHANDLE=y (Open by fhandle syscalls)
  • CONFIG_EVENTFD=y (Enable eventfd() system call)
  • CONFIG_EPOLL=y (Enable eventpoll support)
  • CONFIG_RSEQ=y (Enable rseq() system call)
Networking support -> Networking options (options for sock-diag subsystem)
  • CONFIG_UNIX_DIAG=y (Unix domain sockets -> UNIX: socket monitoring interface)
  • CONFIG_INET_DIAG=y (TCP/IP networking -> INET: socket monitoring interface)
  • CONFIG_INET_UDP_DIAG=y (TCP/IP networking -> INET: socket monitoring interface -> UDP: socket monitoring interface)
  • CONFIG_PACKET_DIAG=y (Packet socket -> Packet: sockets monitoring interface)
  • CONFIG_NETLINK_DIAG=y (Netlink socket -> Netlink: sockets monitoring interface)
  • CONFIG_NETFILTER_XT_MARK=y (Networking support -> Networking options -> Network packet filtering framework (Netfilter) -> Core Netfilter Configuration -> Netfilter Xtables support (required for ip_tables) -> nfmark target and match support)
  • CONFIG_TUN=y (Networking support -> Universal TUN/TAP device driver support)
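
To verify a kernel configuration against the list above, something along these lines could be used. This is only a minimal sketch: the function names are made up for the illustration, and it assumes the usual .config syntax where enabled options appear as CONFIG_FOO=y and disabled ones as "# CONFIG_FOO is not set".

```python
import re

# Options copied from the list above; trim or extend to match your use case.
REQUIRED_OPTIONS = [
    "CONFIG_CHECKPOINT_RESTORE", "CONFIG_NAMESPACES", "CONFIG_UTS_NS",
    "CONFIG_IPC_NS", "CONFIG_SYSVIPC_SYSCTL", "CONFIG_PID_NS",
    "CONFIG_NET_NS", "CONFIG_FHANDLE", "CONFIG_EVENTFD", "CONFIG_EPOLL",
    "CONFIG_RSEQ", "CONFIG_UNIX_DIAG", "CONFIG_INET_DIAG",
    "CONFIG_INET_UDP_DIAG", "CONFIG_PACKET_DIAG", "CONFIG_NETLINK_DIAG",
    "CONFIG_NETFILTER_XT_MARK", "CONFIG_TUN",
]


def parse_kernel_config(text):
    """Return a dict mapping CONFIG_* names to their values ('y', 'm', ...)."""
    options = {}
    for line in text.splitlines():
        # Lines such as "# CONFIG_FOO is not set" do not match and are skipped.
        match = re.match(r"^(CONFIG_[A-Za-z0-9_]+)=(.*)$", line.strip())
        if match:
            options[match.group(1)] = match.group(2)
    return options


def missing_options(text, required=REQUIRED_OPTIONS):
    """Return the required options that are not set to 'y' in the config text.

    Options built as modules (=m) are reported too, since the list asks for =y.
    """
    options = parse_kernel_config(text)
    return [name for name in required if options.get(name) != "y"]
```

Calling missing_options() with the contents of the kernel's .config file returns the options that still need to be enabled; an empty list means everything from the list is set.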

In the beginning of the project, CRIU had its own custom kernel, which contained some experimental CRIU-related patches. Nowadays many of those patches have been mainlined.

One such patch [3] that was missing from my current kernel version (v5.10) was introduced in v5.12. It relates to how CRIU gets the process state and is essential for creating a checkpoint of a running process:

commit 90f093fa8ea48e5d991332cee160b761423d55c1
Author: Piotr Figiel <figiel@google.com>
Date:   Fri Feb 26 14:51:56 2021 +0100

    rseq, ptrace: Add PTRACE_GET_RSEQ_CONFIGURATION request

    For userspace checkpoint and restore (C/R) a way of getting process state
    containing RSEQ configuration is needed.

    There are two ways this information is going to be used:
     - to re-enable RSEQ for threads which had it enabled before C/R
     - to detect if a thread was in a critical section during C/R

    Since C/R preserves TLS memory and addresses RSEQ ABI will be restored
    using the address registered before C/R.

    Detection whether the thread is in a critical section during C/R is needed
    to enforce behavior of RSEQ abort during C/R. Attaching with ptrace()
    before registers are dumped itself doesn't cause RSEQ abort.
    Restoring the instruction pointer within the critical section is
    problematic because rseq_cs may get cleared before the control is passed
    to the migrated application code leading to RSEQ invariants not being
    preserved. C/R code will use RSEQ ABI address to find the abort handler
    to which the instruction pointer needs to be set.

    To achieve above goals expose the RSEQ ABI address and the signature value
    with the new ptrace request PTRACE_GET_RSEQ_CONFIGURATION.

    This new ptrace request can also be used by debuggers so they're aware
    of stops within restartable sequences in progress.

    Signed-off-by: Piotr Figiel <figiel@google.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Michal Miroslaw <emmir@google.com>
    Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Acked-by: Oleg Nesterov <oleg@redhat.com>
    Link: https://lkml.kernel.org/r/20210226135156.1081606-1-figiel@google.com

With that said, to make use of CRIU with the latest features it's highly recommended to use a recent kernel.
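
A quick way to see whether a target runs a new enough kernel is to compare its release string against a minimum version. The sketch below is just an illustration with hypothetical helper names. The minimum is a parameter you pick yourself, and a version comparison is only a heuristic, since vendor kernels may carry backported patches.

```python
import platform
import re


def parse_kernel_release(release):
    """Extract (major, minor) from a release string such as '5.10.0-rc1-custom'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_at_least(release, minimum):
    """Return True if the release string is at or above the (major, minor) minimum."""
    return parse_kernel_release(release) >= minimum


def running_kernel_at_least(minimum):
    """Same check, applied to the kernel this script is running on."""
    return kernel_at_least(platform.release(), minimum)
```

On the target, running_kernel_at_least(minimum) with the release that introduced the patch you depend on tells you whether an upgrade is worth considering.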

And soon it will be available as a package in Buildroot.