Mounting with systemd and udev

/media/systemd.png

Systemd has not always been my first choice of init system for embedded systems, but I cannot ignore that it has many good and handy features that other init systems don't. At the same time, that is just what I don't like about systemd: it does not follow the "Do one thing and do it well" philosophy that I like so much. I am very torn about it.

However, when trying to do things with systemd the way you used to do them with other systems, you sometimes encounter difficulties. Mostly this is simply because there is another way to accomplish what you want, the "systemd way", which is usually a better and safer way, but sometimes you simply don't want to.

One such thing I encountered was mounting filesystems from udev rules. This used to work, but since v239 of systemd, two separate changes were introduced that altered this default behavior.
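For illustration, such a rule could look like this (the device match and mount point are hypothetical, and the mount helper path may differ on your system):

```
# Hypothetical rule: mount the first partition of a disk as soon as it appears
ACTION=="add", SUBSYSTEM=="block", KERNEL=="sd[a-z]1", RUN+="/bin/mount /dev/%k /mnt/usb"
```

With systemd >= v239, the mount either fails with EPERM or happens only inside udevd's private mount namespace, for the reasons described below.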

units: switch from system call blacklist to whitelist

Commit ee8f26180d01e3ddd4e5f20b03b81e5e737657ae [1]

units: switch from system call blacklist to whitelist

This is generally the safer approach, and is what container managers
(including nspawn) do, hence let's move to this too for our own
services. This is particularly useful as this this means the new
@System-service system call filter group will get serious real-life
testing quickly.

This also switches from firing SIGSYS on unexpected syscalls to
returning EPERM. This would have probably been a better default anyway,
but it's hard to change that these days. When whitelisting system calls
SIGSYS is highly problematic as system calls that are newly introduced
to Linux become minefields for services otherwise.

Note that this enables a system call filter for udev for the first time,
and will block @clock, @mount and @swap from it. Some downstream
distributions might want to revert this locally if they want to permit
unsafe operations on udev rules, but in general this shiuld be mostly
safe, as we already set MountFlags=shared for udevd, hence at least
@mount won't change anything.

This patch changes the default filter behavior from a blacklist to a whitelist, and @mount is no longer allowed:

+ SystemCallFilter=@system-service @module @raw-io
+ SystemCallErrorNumber=EPERM

units: switch udev service to use PrivateMounts=yes

Commit b2e8ae7380d009ab9f9260a34e251ac5990b01ca [2]

units: switch udev service to use PrivateMounts=yes

Given that PrivateMounts=yes is the "successor" to MountFlags=slave in
unit files, let's make use of it for udevd.

What does systemd says about PrivateMounts? [3]

PrivateMounts=
Takes a boolean parameter. If set, the processes of this unit will be run in their own private file system (mount) namespace with all mount propagation from the processes towards the host's main file system namespace turned off. This means any file system mount points established or removed by the unit's processes will be private to them and not be visible to the host. However, file system mount points established or removed on the host will be propagated to the unit's processes. See mount_namespaces(7) for details on file system namespaces. Defaults to off.

When turned on, this executes three operations for each invoked process: a new CLONE_NEWNS namespace is created, after which all existing mounts are remounted to MS_SLAVE to disable propagation from the unit's processes to the host (but leaving propagation in the opposite direction in effect). Finally, the mounts are remounted again to the propagation mode configured with MountFlags=, see below.

File system namespaces are set up individually for each process forked off by the service manager. Mounts established in the namespace of the process created by ExecStartPre= will hence be cleaned up automatically as soon as that process exits and will not be available to subsequent processes forked off for ExecStart= (and similar applies to the various other commands configured for units). Similarly, JoinsNamespaceOf= does not permit sharing kernel mount namespaces between units, it only enables sharing of the /tmp/ and /var/tmp/ directories.

Other file system namespace unit settings — PrivateMounts=, PrivateTmp=, PrivateDevices=, ProtectSystem=, ProtectHome=, ReadOnlyPaths=, InaccessiblePaths=, ReadWritePaths=, … — also enable file system namespacing in a fashion equivalent to this option. Hence it is primarily useful to explicitly request this behaviour if none of the other settings are used.

This option is only available for system services, or for services running in per-user instances of the service manager when PrivateUsers= is enabled.

If PrivateMounts=true, the process gets its own mount namespace, which means that the mounted filesystem is visible only to the process (udevd) itself and is not propagated to the rest of the system.

Conclusion

There are certainly reasons not to let udev mount filesystems, but if you still want to do it you have to revert these changes, either by modifying /lib/systemd/system/systemd-udevd.service directly or by overriding it with a drop-in.
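A sketch of such an override as a drop-in (the exact directives are assumptions based on the two commits quoted above; verify them against your systemd version):

```ini
# /etc/systemd/system/systemd-udevd.service.d/override.conf
[Service]
# An empty assignment clears the inherited filter; then re-add the groups
# from the commit, plus the mount-related ones that were blocked
SystemCallFilter=
SystemCallFilter=@system-service @module @raw-io @mount @swap @clock
# Let mounts made by udevd propagate to the rest of the system again
PrivateMounts=no
MountFlags=shared
```

After that, run systemctl daemon-reload and restart systemd-udevd for the drop-in to take effect.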

GPLv2 and GPLv3

Open Source

"Free as in freedom - not as in free beer". Free beer is nice, but freedom is even nicer.

I have been working with companies from different sectors, including consumer electronics, military applications, automotive and aeronautics. One common question, regardless of sector, is "Can we really use Open Source in our product?". The answer is usually: Yes, you can, but...

One common misunderstanding is to interpret Open Source as in free beer. This is kind of true for some Open Source, but it is nothing you can take for granted. The "rules" for how the code may be used are specified by its license.

Among those who think they have understood the difference, there is another common misunderstanding: that no Open Source software is free of obligations and therefore does not belong in any commercial product. Both misunderstandings are of course wrong, but you have to make sure that you understand the licenses you are using.

Before you start to work with any source code (not only Open Source) you always have to take the license into consideration. If you do your homework you can avoid surprises and practical implications that could otherwise delay your project or cause legal inconveniences.

In short, you have to know what you are doing, and that should not differ from other parts of your development.

Open Source Licenses

"Open source licenses are licenses that comply with the Open Source Definition — in brief, they allow software to be freely used, modified, and shared. To be approved by the Open Source Initiative (also known as the OSI), a license must go through the Open Source Initiative's license review process."

This text is taken from the Open Source Initiative webpage [4], an organization that works on defining criteria for Open Source and certifies licenses that comply with the OSD (Open Source Definition).

Open Source Definition

Many licenses [5] are certified, and they may place different requirements on their users, but they all comply with these "rules":

Free Redistribution

The license shall not restrict any party from selling or giving away the software as a component of an aggregate software distribution containing programs from several different sources. The license shall not require a royalty or other fee for such sale.

Source Code

The program must include source code, and must allow distribution in source code as well as compiled form. Where some form of a product is not distributed with source code, there must be a well-publicized means of obtaining the source code for no more than a reasonable reproduction cost, preferably downloading via the Internet without charge. The source code must be the preferred form in which a programmer would modify the program. Deliberately obfuscated source code is not allowed. Intermediate forms such as the output of a preprocessor or translator are not allowed.

Derived Works

The license must allow modifications and derived works, and must allow them to be distributed under the same terms as the license of the original software.

Integrity of The Author's Source Code

The license may restrict source-code from being distributed in modified form only if the license allows the distribution of "patch files" with the source code for the purpose of modifying the program at build time. The license must explicitly permit distribution of software built from modified source code. The license may require derived works to carry a different name or version number from the original software.

No Discrimination Against Persons or Groups

The license must not discriminate against any person or group of persons.

No Discrimination Against Fields of Endeavor

The license must not restrict anyone from making use of the program in a specific field of endeavor. For example, it may not restrict the program from being used in a business, or from being used for genetic research.

Distribution of License

The rights attached to the program must apply to all to whom the program is redistributed without the need for execution of an additional license by those parties.

License Must Not Be Specific to a Product

The rights attached to the program must not depend on the program's being part of a particular software distribution. If the program is extracted from that distribution and used or distributed within the terms of the program's license, all parties to whom the program is redistributed should have the same rights as those that are granted in conjunction with the original software distribution.

License Must Not Restrict Other Software

The license must not place restrictions on other software that is distributed along with the licensed software. For example, the license must not insist that all other programs distributed on the same medium must be open-source software.

License Must Be Technology-Neutral

No provision of the license may be predicated on any individual technology or style of interface.

GPL

GPL, or General Public License, is one of the most common Open Source Licenses you will find out there. At least version 2, GPLv2, is something you will encounter for sure if you intend to build an embedded Linux system as the kernel [6] itself is using this license.

GPLv2

So what do you need to do to comply with GPLv2 code? Basically, you need to provide the source code for all GPLv2-licensed code. Yes, that includes all your modifications too, and this part can seem scary at first glance.

But will you need to make any changes? Probably. If you want to run Linux on your system you will most likely have to make some adaptations to the Linux kernel specific to your board; those changes fall under the GPLv2 license and must be provided as well.

The license is stated as follows:

"The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. "

GPLv2, Sec.3 [2]

If any of those changes contain your top-secret algorithm, then you have done it wrong by design anyway. Please note that no installation information is required at all, which makes GPLv2 more suitable for embedded devices.

Tivoization

Tivoization: TiVo ran a GPLv2-only Linux kernel, but used hardware signature keys that made it possible to run only signed kernels. Even though TiVo provided the kernel source code, TiVo's customers could not build and install the firmware themselves.

The Free Software Foundation (FSF) found this objectionable, as it violates one of the purposes of the GPLv2 license. So the FSF created GPLv3 to address this.

GPLv3

/media/gplv3.png

(Yes, this logo is under Public Domain [7] )

One big difference between v2 and v3 is this part:

" “Installation Information” for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source. The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made. "

GPLv3, Sec .6 [1]

This states that "Installation Information" must be provided together with the source code; in short, you have to provide instructions for an end user to build and replace the GPLv3 parts of your product. There are a few exceptions, but most of them are hard to make any use of in a real-world product.

Exception 1

It is only required for "User Products". It is hard to say if this is an exception or not, as most products that use GPLv3 are user products. But the license states that it only affects User Products. Consult your lawyer, as it is not entirely clear what a "User Product" really is.

Exception 2

It only applies if the device is sold or offered for long-term lease. As with all legal matters, things are a bit unclear. Does remote access or temporary possession qualify, for example?

Please note that even a long-term lease requires you to provide installation information.

Exception 3

If the device is non-modifiable by anyone, you don't have to provide information on how to reinstall or modify binaries.

If you want to be able to update your software, then you will probably need to provide the "installation information".

Exception 4

You may void warranties if the binaries are modified, but you cannot prevent reinstallation of modified binaries.

Conclusion

There is a reason why the author of a piece of code chose a particular license, and it is important (for reasons of both principle and law) to respect that. Some licenses are more or less appropriate for specific products, but the general rule I follow is to avoid any GPLv3-licensed software (both GPLv3 and LGPLv3) in embedded systems, as it is hard to be fully compliant. The installation information is often something that companies want to keep to themselves, with all rights.

What is my opinion about this? Well, I do like to have the freedom to install whatever software I want in the products I own, but there are circumstances where I'm not sure it is a good idea, when it comes to safety and liability. If I buy a second-hand car, I don't want the software for my airbag or braking system to have been "fixed" by some random guy. I also think that too-restrictive licenses limit the use of Open Source, which is somehow counterproductive for Open Source itself.

Parsing command line options

Parsing command line options is something almost every command or application needs to handle in some way, and there are too many home-made argument parsers out there. Since so many programs need to parse options from the command line, this facility is encapsulated in the standard library function getopt(3).

The GNU C library provides an even more sophisticated API for parsing the command line, argp(), described in the glibc manual [1]. However, this function is not portable.

There are also many libraries that provide such facilities, but let's keep to what glibc provides.

Command line options

A typical UNIX command takes options in the following form

command [option] arguments

An option takes the form of a hyphen (-) followed by a unique character and a possible argument. If the option takes an argument, it may be separated from that argument by whitespace. When multiple options are specified, they can be grouped after a single hyphen, and only the last option in the group may take an argument.

Example of a single option

ls -l

Example of grouped options

ls -lI *hidden* .

In the example above, -l (long listing format) does not take an argument, while -I (ignore) takes *hidden* as its argument.

Long options

It is not unusual that a command allows both a short (-I) and a long (--ignore) option syntax. A long option begins with two hyphens, and the option itself is identified by a word. If the option takes an argument, it may be separated from that argument by an =.

To parse such options, use the getopt_long(3) glibc function, or the (nonportable) argp().

Example using getopt_long()

getopt_long() is quite simple to use. First we create an array of struct option, where each element defines the following fields:

  • name
    is the name of the long option.

  • has_arg
    is: no_argument (or 0) if the option does not take an argument; required_argument (or 1) if the option requires an argument; or optional_argument (or 2) if the option takes an optional argument.
  • flag
    specifies how results are returned for a long option. If flag is NULL, then getopt_long() returns val. (For example, the calling program may set val to the equivalent short option character.) Otherwise, getopt_long() returns 0, and flag points to a variable which is set to val if the option is found, but left unchanged if the option is not found.
  • val
    is the value to return, or to load into the variable pointed
    to by flag.

The last element of the array has to be filled with zeros.

The next step is to iterate through all options and take care of the arguments.

Example code

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <getopt.h>

struct arguments
{
    int a;
    int b;
    int c;
    int area;
    int perimeter;
};


void print_usage() {
    printf("Usage: triangle [Ap] -a num -b num -c num\n");
}

int main(int argc, char *argv[]) {
    int opt= 0;
    struct arguments arguments;

    /* Default values. */
    arguments.a = -1;
    arguments.b = -1;
    arguments.c = -1;
    arguments.area = 0;
    arguments.perimeter = 0;


    static struct option long_options[] = {
        {"area",      no_argument,       0,  'A' },
        {"perimeter", no_argument,       0,  'p' },
        {"hypotenuse",required_argument, 0,  'c' },
        {"opposite",  required_argument, 0,  'a' },
        {"adjecent",  required_argument, 0,  'b' },
        {0,           0,                 0,  0   }
    };

    int long_index =0;
    while ((opt = getopt_long(argc, argv,"Apa:b:c:",
                   long_options, &long_index )) != -1) {
        switch (opt) {
             case 'A':
                 arguments.area = 1;
                 break;
             case 'p':
                 arguments.perimeter = 1;
                 break;
             case 'a':
                 arguments.a = atoi(optarg);
                 break;
             case 'b':
                 arguments.b = atoi(optarg);
                 break;
             case 'c':
                 arguments.c = atoi(optarg);
                 break;
             default: print_usage();
                 exit(EXIT_FAILURE);
        }
    }


    if (arguments.a == -1 || arguments.b == -1 || arguments.c == -1) {
        print_usage();
        exit(EXIT_FAILURE);
    }

    if (arguments.area) {
        arguments.area = (arguments.a*arguments.b)/2;
        printf("Area: %d\n",arguments.area);
    }

    if (arguments.perimeter) {
        arguments.perimeter = arguments.a + arguments.b + arguments.c;
        printf("Perimeter: %d\n",arguments.perimeter);
    }

    return 0;
}

Examples of usage

Full example with short options

[13:49:00]marcus@little:~/tmp/cmdline$ ./getopt  -Ap -a 3 -b 4 -c 5
Area: 6
Perimeter: 12

Missing -c option

[14:07:37]marcus@little:~/tmp/cmdline$ ./getopt  -Ap -a 3 -b 4
Usage: triangle [Ap] -a num -b num -c num

Full example with long options

[14:09:38]marcus@little:~/tmp/cmdline$ ./getopt  --area --perimeter --opposite 3 --adjecent 4 --hypotenuse 5
Area: 6
Perimeter: 12

Invalid options

[14:10:14]marcus@little:~/tmp/cmdline$ ./getopt  --area --perimeter --opposite 3 --adjecent 4 -j=3
./getopt: invalid option -- 'j'
Usage: triangle [Ap] -a num -b num -c num

Full example with mixed syntaxes

[14:09:38]marcus@little:~/tmp/cmdline$ ./getopt  -A --perimeter --opposite=3 -b4 -c 5
Area: 6
Perimeter: 12

Variants

getopt_long_only() is like getopt_long(), but '-' as well as "--" can indicate a long option. If an option that starts with '-' (not "--") doesn't match a long option, but does match a short option, it is parsed as a short option instead.
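A small sketch of that behavior (the function name is made up; the option table mirrors the example above): with getopt_long_only(), both -area and --area match the long option.

```c
#include <getopt.h>
#include <stdio.h>

/* With getopt_long_only(), a single hyphen can also introduce a long
 * option, so "-area" is matched against the table below. */
static struct option long_options[] = {
    {"area", no_argument, 0, 'A'},
    {0,      0,           0,  0 }
};

int parse_area(int argc, char *argv[])
{
    int c, area = 0;

    optind = 1; /* reset parser state between calls */
    while ((c = getopt_long_only(argc, argv, "A", long_options, NULL)) != -1) {
        if (c == 'A')
            area = 1;
    }
    return area;
}
```

Since "-area" does not match the short option 'A' as a group, it falls through to the long-option table, just as the paragraph above describes.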

Example using argp()

argp() is more flexible and powerful than getopt() and friends, but it is not part of the POSIX standard and is therefore not portable between different POSIX-compatible operating systems. However, argp() provides a few interesting features that getopt() does not.

These features include automatically producing output in response to the ‘--help’ and ‘--version’ options, as described in the GNU coding standards. Using argp makes it less likely that programmers will neglect to implement these additional options or keep them up to date.

The implementation is pretty much straightforward and similar to getopt(), with a few notes.

const char *argp_program_version = "Triangle 1.0";
const char *argp_program_bug_address = "<marcus.folkesson@combitech.se>";

These are used in the automatic generation of the --help and --version output.

struct argp_option

This structure specifies a single option that an argp parser understands, as well as how to parse and document that option. It has the following fields:

  • const char *name
    The long name for this option, corresponding to the long option --name; this field may be zero if this option only has a short name. To specify multiple names for an option, additional entries may follow this one, with the OPTION_ALIAS flag set. See Argp Option Flags.
  • int key
    The integer key provided by the current option to the option parser. If key has a value that is a printable ASCII character (i.e., isascii (key) is true), it also specifies a short option ‘-char’, where char is the ASCII character with the code key.
  • const char *arg
    If non-zero, this is the name of an argument associated with this option, which must be provided (e.g., with the --name=value or -char value syntaxes), unless the OPTION_ARG_OPTIONAL flag (see Argp Option Flags) is set, in which case it may be provided.
  • int flags
    Flags associated with this option, some of which are referred to above. See Argp Option Flags.
  • const char *doc
    A documentation string for this option, for printing in help messages.

If both the name and key fields are zero, this string will be printed tabbed left from the normal option column, making it useful as a group header. This will be the first thing printed in its group. In this usage, it’s conventional to end the string with a : character.

Example code

Example code with a few more comments

#include <stdlib.h>
#include <argp.h>

const char *argp_program_version = "Triangle 1.0";
const char *argp_program_bug_address = "<marcus.folkesson@combitech.se>";

/* Program documentation. */
static char doc[] = "Triangle example";

/* A description of the arguments we accept. */
static char args_doc[] = "ARG1 ARG2";

/* The options we understand. */
static struct argp_option options[] = {
    {"area",        'A',    0,  0,  "Calculate area"},
    {"perimeter",   'p',    0,  0,  "Calculate perimeter"},
    {"hypotenuse",  'c',    "VALUE",  0,  "Specify hypotenuse of the triangle"},
    {"opposite",    'b',    "VALUE",  0,  "Specify opposite of the triangle"},
    {"adjecent",    'a',    "VALUE",  0,  "Specify adjecent of the triangle"},
    { 0 }
};

/* Used by main to communicate with parse_opt. */
struct arguments
{
    int a;
    int b;
    int c;
    int area;
    int perimeter;
};

/* Parse a single option. */
static error_t parse_opt (int key, char *arg, struct argp_state *state)
{
    struct arguments *arguments = (struct arguments*)state->input;

    switch (key) {
        case 'a':
            arguments->a = atoi(arg);
            break;
        case 'b':
            arguments->b = atoi(arg);
            break;
        case 'c':
            arguments->c = atoi(arg);
            break;
        case 'p':
            arguments->perimeter = 1;
            break;
        case 'A':
            arguments->area = 1;
            break;

        default:
            return ARGP_ERR_UNKNOWN;
    }
    return 0;
}

/* Our argp parser. */
static struct argp argp = { options, parse_opt, args_doc, doc };

int
main (int argc, char **argv)
{
    struct arguments arguments;

    /* Default values. */
    arguments.a = -1;
    arguments.b = -1;
    arguments.c = -1;
    arguments.area = 0;
    arguments.perimeter = 0;

    /* Parse our arguments; every option seen by parse_opt will
     *      be reflected in arguments. */
    argp_parse (&argp, argc, argv, 0, 0, &arguments);


    if (arguments.a == -1 || arguments.b == -1 || arguments.c == -1) {
        exit(EXIT_FAILURE);
    }

    if (arguments.area) {
        arguments.area = (arguments.a*arguments.b)/2;
        printf("Area: %d\n",arguments.area);
    }

    if (arguments.perimeter) {
        arguments.perimeter = arguments.a + arguments.b + arguments.c;
        printf("Perimeter: %d\n",arguments.perimeter);
    }

    return EXIT_SUCCESS;
}

Examples of usage

This application gives the same output as the getopt() usage, with the following extra features:

The options --help, --usage and --version are automatically generated:

[15:53:04]marcus@little:~/tmp/cmdline$ ./argp --help
Usage: argp [OPTION...] ARG1 ARG2
Triangle example

  -a, --adjecent=VALUE       Specify adjecent of the triangle
  -A, --area                 Calculate area
  -b, --opposite=VALUE       Specify opposite of the triangle
  -c, --hypotenuse=VALUE     Specify hypotenuse of the triangle
  -p, --perimeter            Calculate perimeter
  -?, --help                 Give this help list
      --usage                Give a short usage message
  -V, --version              Print program version

Mandatory or optional arguments to long options are also mandatory or optional
for any corresponding short options.

Report bugs to <marcus.folkesson@combitech.se>.

Version information

[15:53:08]marcus@little:~/tmp/cmdline$ ./argp --version
Triangle 1.0

Conclusion

Parsing command line options is simple. argp() provides a lot of features that I really appreciate.

When portability is not an issue, I always go for argp() as, besides the extra features, the interface is more appealing.

Embedded Linux Conference 2019

Here we go again! This trip got exciting even before it began. I checked my passport the day before we were due to leave and noticed that it had expired. Ouch. Fortunately, I was able to get a temporary passport at the airport. I must admit that I'm not traveling that often and do not have these 'must-checks' in my muscle memory..

This time we were heading to Lyon in France. The weather is not the best, but at least it is not freezing cold as it is in Sweden at this time of the year.

The conference

The conference this year is good as usual. Somehow, my focus has shifted from the technical talks to actually connecting with and talking to people. Of course, there is a group of people that I always meet (it is mostly the same people that show up at these conferences, after all), but I have met far more people than I used to. Am I beginning to be social? Anyway, as said before, it is fun to put a face to the patches I've reviewed or got comments on.

The talks

I mostly go for the "heavy technical" talks, but the talk I appreciated most this year had a very low technical level. It was given by Gardena [1], a company that makes gardening tools. Yes, water hoses and stuff. They described their journey from a product family that historically had no software at all to a full-blown embedded Linux system, with all the legal implications that you can encounter with open source licenses. Gardena was sued for breaking the GPL license, which could have been a very costly story. What Gardena did was absolutely the best way to handle it, and it was really nice to hear about. The result is that Gardena now has a public GitHub account [2] containing the software for their Garden Gateway products [3]. Support for the SoC they are using is not only published, but also mainlined(!!).

Gardena hired contractors from Denx [4] to mainline U-Boot and Linux kernel support, and also hired the maintainer of the radio chip that they were using. Thanks to this, all open parts of their product are mainlined, and Gardena even managed to get the radio certified, which has helped at least two other companies.

Hiring the right folks for the right tasks was really the best thing Gardena could do. The radio chip maintainer fixed their problem in 48 man-hours, something that could have taken months for Gardena to fix. The estimated cost of all this mainlining work was only 10% of their budget, which is really nothing. It also made it possible for Gardena to put their products on the market in time. One big bonus is that maintenance is far easier when the code is mainlined.

This is also how we work at Combitech. Linux is not part of our customers' core competence, and it really should not be. Linux is our core competence; that is why our customers let us take care of "our thing", IOW Linux development.

But why does all this make me so happy? First of all, the whole Open Source community is really a big thing to me. It has influenced both my career choices and my view on software. In fact, I'm not even sure that I would have enjoyed programming without Open Source.

So, Gardena, Keep up the good work!

/media/elce2019.jpg

libsegfault.so

The dynamic linker [1] in a Linux system uses several environment variables to customize its behavior. The most commonly used is probably LD_LIBRARY_PATH, which is a list of directories to search for libraries at execution time. Another variable I use quite often is LD_TRACE_LOADED_OBJECTS, which makes the program list its dynamic dependencies, just like ldd(1).

For example, consider the following output

$ LD_TRACE_LOADED_OBJECTS=1 /bin/bash
    linux-vdso.so.1 (0x00007ffece29e000)
    libreadline.so.7 => /usr/lib/libreadline.so.7 (0x00007fc9b82d1000)
    libdl.so.2 => /usr/lib/libdl.so.2 (0x00007fc9b80cd000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007fc9b7d15000)
    libncursesw.so.6 => /usr/lib/libncursesw.so.6 (0x00007fc9b7add000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fc9b851f000)
    libtinfo.so.6 => /usr/lib/libtinfo.so.6 (0x00007fc9b78b1000)

LD_PRELOAD

LD_PRELOAD is a list of additional shared objects that should be loaded before all other dynamic dependencies. When the loader resolves symbols, it sequentially walks through the list of dynamic shared objects and takes the first match. This makes it possible to override functions in other shared objects and change the behavior of the application completely.

Consider the following example

$ LD_PRELOAD=/usr/lib/libSegFault.so LD_TRACE_LOADED_OBJECTS=1 /bin/bash
    linux-vdso.so.1 (0x00007ffc73f61000)
    /usr/lib/libSegFault.so (0x00007f131c234000)
    libreadline.so.7 => /usr/lib/libreadline.so.7 (0x00007f131bfe6000)
    libdl.so.2 => /usr/lib/libdl.so.2 (0x00007f131bde2000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007f131ba2a000)
    libncursesw.so.6 => /usr/lib/libncursesw.so.6 (0x00007f131b7f2000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f131c439000)
    libtinfo.so.6 => /usr/lib/libtinfo.so.6 (0x00007f131b5c6000)

Here we have preloaded libSegFault, and it is listed in second place. In first place we have linux-vdso.so.1, which is a Virtual Dynamic Shared Object provided by the Linux kernel. The VDSO deserves its own separate blog post; it is a cool feature that maps kernel code into a process's context as a .text segment in a virtual library.

libSegFault.so

libSegFault.so is part of glibc [2] and comes with your toolchain. The library is for debugging purposes and is activated by preloading it at runtime. It does not actually override any functions, but registers signal handlers in a constructor (yes, you can execute code before main) for the specified signals. By default, only SIGSEGV (see signal(7)) is registered. The registered handlers print a backtrace for the application when the signal is delivered. See its implementation in [3].
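The constructor trick can be sketched like this (the flag stands in for the sigaction() calls that libSegFault really makes; names are made up):

```c
int handlers_installed = 0;

/* A function marked with the constructor attribute runs before main(),
 * so simply preloading a library is enough to get code like this
 * executed inside any dynamically linked program. */
__attribute__((constructor))
static void install_handlers(void)
{
    handlers_installed = 1;
}
```

By the time main() starts, handlers_installed is already 1, which is exactly how libSegFault gets its handlers in place without any cooperation from the program.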

Set the environment variable SEGFAULT_SIGNALS to explicitly select the signals you want to register a handler for.

/media/libsegfault.png

This is a useful feature for debugging. The best part is that you don't have to recompile your code.

libSegFault in action

Our application

Consider the following real-life application, taken directly from the local nuclear power plant:

void handle_uranium(char *rod)
{
    *rod = 0xAB;
}

void start_reactor()
{
    char *rod = 0x00;
    handle_uranium(rod);
}

int main()
{
    start_reactor();
}

The symptom

We are seeing a segmentation fault when operating on a particular uranium rod, but we don't know why.

Use libSegFault

Start the application with libSegFault preloaded and examine the dump:

$ LD_PRELOAD=/usr/lib/libSegFault.so ./powerplant
*** Segmentation fault
Register dump:

 RAX: 0000000000000000   RBX: 0000000000000000   RCX: 0000000000000000
 RDX: 00007ffdf6aba5a8   RSI: 00007ffdf6aba598   RDI: 0000000000000000
 RBP: 00007ffdf6aba480   R8 : 000055d2ad5e16b0   R9 : 00007f98534729d0
 R10: 0000000000000008   R11: 0000000000000246   R12: 000055d2ad5e14f0
 R13: 00007ffdf6aba590   R14: 0000000000000000   R15: 0000000000000000
 RSP: 00007ffdf6aba480

 RIP: 000055d2ad5e1606   EFLAGS: 00010206

 CS: 0033   FS: 0000   GS: 0000

 Trap: 0000000e   Error: 00000006   OldMask: 00000000   CR2: 00000000

 FPUCW: 0000037f   FPUSW: 00000000   TAG: 00000000
 RIP: 00000000   RDP: 00000000

 ST(0) 0000 0000000000000000   ST(1) 0000 0000000000000000
 ST(2) 0000 0000000000000000   ST(3) 0000 0000000000000000
 ST(4) 0000 0000000000000000   ST(5) 0000 0000000000000000
 ST(6) 0000 0000000000000000   ST(7) 0000 0000000000000000
 mxcsr: 1f80
 XMM0:  00000000000000000000000000000000 XMM1:  00000000000000000000000000000000
 XMM2:  00000000000000000000000000000000 XMM3:  00000000000000000000000000000000
 XMM4:  00000000000000000000000000000000 XMM5:  00000000000000000000000000000000
 XMM6:  00000000000000000000000000000000 XMM7:  00000000000000000000000000000000
 XMM8:  00000000000000000000000000000000 XMM9:  00000000000000000000000000000000
 XMM10: 00000000000000000000000000000000 XMM11: 00000000000000000000000000000000
 XMM12: 00000000000000000000000000000000 XMM13: 00000000000000000000000000000000
 XMM14: 00000000000000000000000000000000 XMM15: 00000000000000000000000000000000

Backtrace:
./powerplant(+0x606)[0x55d2ad5e1606]
./powerplant(+0x628)[0x55d2ad5e1628]
./powerplant(+0x639)[0x55d2ad5e1639]
/usr/lib/libc.so.6(__libc_start_main+0xea)[0x7f9852ec6f6a]
./powerplant(+0x51a)[0x55d2ad5e151a]

Memory map:

55d2ad5e1000-55d2ad5e2000 r-xp 00000000 00:13 14897704                   /home/marcus/tmp/segfault/powerplant
55d2ad7e1000-55d2ad7e2000 r--p 00000000 00:13 14897704                   /home/marcus/tmp/segfault/powerplant
55d2ad7e2000-55d2ad7e3000 rw-p 00001000 00:13 14897704                   /home/marcus/tmp/segfault/powerplant
55d2ada9c000-55d2adabd000 rw-p 00000000 00:00 0                          [heap]
7f9852c8f000-7f9852ca5000 r-xp 00000000 00:13 13977863                   /usr/lib/libgcc_s.so.1
7f9852ca5000-7f9852ea4000 ---p 00016000 00:13 13977863                   /usr/lib/libgcc_s.so.1
7f9852ea4000-7f9852ea5000 r--p 00015000 00:13 13977863                   /usr/lib/libgcc_s.so.1
7f9852ea5000-7f9852ea6000 rw-p 00016000 00:13 13977863                   /usr/lib/libgcc_s.so.1
7f9852ea6000-7f9853054000 r-xp 00000000 00:13 13975885                   /usr/lib/libc-2.26.so
7f9853054000-7f9853254000 ---p 001ae000 00:13 13975885                   /usr/lib/libc-2.26.so
7f9853254000-7f9853258000 r--p 001ae000 00:13 13975885                   /usr/lib/libc-2.26.so
7f9853258000-7f985325a000 rw-p 001b2000 00:13 13975885                   /usr/lib/libc-2.26.so
7f985325a000-7f985325e000 rw-p 00000000 00:00 0
7f985325e000-7f9853262000 r-xp 00000000 00:13 13975827                   /usr/lib/libSegFault.so
7f9853262000-7f9853461000 ---p 00004000 00:13 13975827                   /usr/lib/libSegFault.so
7f9853461000-7f9853462000 r--p 00003000 00:13 13975827                   /usr/lib/libSegFault.so
7f9853462000-7f9853463000 rw-p 00004000 00:13 13975827                   /usr/lib/libSegFault.so
7f9853463000-7f9853488000 r-xp 00000000 00:13 13975886                   /usr/lib/ld-2.26.so
7f9853649000-7f985364c000 rw-p 00000000 00:00 0
7f9853685000-7f9853687000 rw-p 00000000 00:00 0
7f9853687000-7f9853688000 r--p 00024000 00:13 13975886                   /usr/lib/ld-2.26.so
7f9853688000-7f9853689000 rw-p 00025000 00:13 13975886                   /usr/lib/ld-2.26.so
7f9853689000-7f985368a000 rw-p 00000000 00:00 0
7ffdf6a9b000-7ffdf6abc000 rw-p 00000000 00:00 0                          [stack]
7ffdf6bc7000-7ffdf6bc9000 r--p 00000000 00:00 0                          [vvar]
7ffdf6bc9000-7ffdf6bcb000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

At first glance, the information may feel overwhelming, but let's go through the most important lines.

The backtrace lists the call chain at the moment the signal was delivered to the application. The first entry is the top of the stack.

Backtrace:
./powerplant(+0x606)[0x55d2ad5e1606]
./powerplant(+0x628)[0x55d2ad5e1628]
./powerplant(+0x639)[0x55d2ad5e1639]
/usr/lib/libc.so.6(__libc_start_main+0xea)[0x7f9852ec6f6a]
./powerplant(+0x51a)[0x55d2ad5e151a]

Here we can see that the last executed instruction is at address 0x55d2ad5e1606. The tricky part is that this address is not an offset within the application binary, but a virtual address in the process's address space. In other words, we need to translate the address to an offset within the application's .text segment. If we look at the memory map, we see three entries for the powerplant application:

55d2ad5e1000-55d2ad5e2000 r-xp 00000000 00:13 14897704                   /home/marcus/tmp/segfault/powerplant
55d2ad7e1000-55d2ad7e2000 r--p 00000000 00:13 14897704                   /home/marcus/tmp/segfault/powerplant
55d2ad7e2000-55d2ad7e3000 rw-p 00001000 00:13 14897704                   /home/marcus/tmp/segfault/powerplant

Why three? Most ELF files (applications or libraries) have at least three memory-mapped sections:

  • .text, the executable code
  • .rodata, read-only data
  • .data, read/write data

With the help of the permissions, it is possible to figure out which mapping corresponds to which section.

The last mapping has rw- permissions and is probably our .data section, as it allows both reads and writes. The middle mapping has r-- and is a read-only mapping - probably our .rodata section. The first mapping has r-x, which is read-only and executable. This must be our .text section!

Now we can take the address from our backtrace and subtract the start address of our .text mapping: 0x55d2ad5e1606 - 0x55d2ad5e1000 = 0x606.
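For the curious, the same translation can be done programmatically with dladdr(3), which hands back the base address of the object that contains a given runtime address. This is just a sketch, and text_offset() is a made-up helper name:

```c
/* A sketch: translate a runtime address to a file offset with
 * dladdr(3), instead of reading /proc/<pid>/maps by hand.
 * text_offset() is a made-up helper name.
 */
#define _GNU_SOURCE
#include <dlfcn.h>

/* Translate a runtime address into an offset within the ELF file,
 * suitable to feed to addr2line. Returns 0 on failure. */
static unsigned long text_offset(void *addr)
{
    Dl_info info;

    if (!dladdr(addr, &info))
        return 0;

    /* runtime address - mapping base = offset in the file */
    return (unsigned long)((char *)addr - (char *)info.dli_fbase);
}
```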

Use addr2line to get the corresponding line in our source code:

$ addr2line -e ./powerplant -a 0x606
    0x0000000000000606
    /home/marcus/tmp/segfault/main.c:3

If we go back to the source code, we see that line 3 in main.c is

*rod = 0xAB;

Here we have it. Nothing more to say.

Conclusion

libSegFault.so has been a great help for a long time. The biggest benefit is that you don't have to recompile your application when you want to use it. However, you cannot get the line number from addr2line if the application is not compiled with debug symbols, but it is often not that hard to figure out the context from a disassembly of your application.

Embedded Linux Conference 2018

Embedded Linux Conference 2018

Ok, time for another conference. This time in Edinburgh, Scotland. My travel is limited to Edinburgh, but this city has a lot of things to see, including Edinburgh Castle, the Royal Botanic Garden, the clock that is always 3 minutes wrong [1] and lots more. A side note: yes, I've tried haggis, as it is a must-try thing, and so should you. But be prepared to buy a backup meal.

The conference this year has a few talks already on Sunday. I'm going to Michael Kerrisk's talk about cgroups (control groups) [2]. Michael is a great speaker as usual. His book, The Linux Programming Interface [3], is the only book you need about system programming, as it covers everything.

I listened to a talk about cgroups that I appreciated a lot.

Control groups (cgroups)

cgroups is a hierarchy of processes with several controllers applied to it. These controllers restrict resources such as memory usage and CPU utilisation for a certain group.

Worth mentioning is that there are currently two versions of cgroups, cgroupv1 and cgroupv2. These are documented in Documentation/cgroup-v1/* and Documentation/admin-guide/cgroup-v2.rst respectively. I've been using cgroupv1 in some system configurations and I can just say that I'm glad that we have a v2. There is no consistency between controllers in v1, and the support for multiple hierarchies is just messy, to mention just a few of the issues with v1. v2 is different. It has a design in mind (v1 was more of a design by implementation) and a set of design rules. Even better - it has maintainers that make sure the design rules are followed. With all that in mind, we hopefully won't end up with a v3... :-)

However, v2 does not yet have all the functionality that v1 provides, but it is on its way into the kernel. v1 and v2 can coexist though, as long as they do not use the same controllers.

The conference

I enjoyed the conference and met a lot of interesting people. It is really fun to put a face on those patches I have reviewed over email :-)

/media/edinburgh.jpg

Lund Linux Conference 2018

Lund Linux Conference 2018

It is just two weeks from now until the Lund Linux Conference (LLC) [1] begins! LLC is a two-day conference with the same layout as the bigger Linux conferences - just smaller, but just as nice.

There will be talks about PCIe, the serial device bus, security in cars and a few more topics. My highlight this year is to hear about XDP (eXpress Data Path) [2], which enables really fast packet processing with standard Linux. XDP has made great progress over the last six months and is a technically cool feature.

Here we are back in 2017:

/media/lund-linuxcon-2018.jpg

OOM-killer

OOM-killer

When the system is running out of memory, the Out-Of-Memory (OOM) killer picks a process to kill based on the current memory footprint. In case of OOM, the kernel calculates a badness score between 0 (never kill) and 1000 for each process in the system, and the process with the highest score is killed. A score of 0 is reserved for unkillable tasks such as the global init process (see [1]) or kernel threads (processes with the PF_KTHREAD flag set).

/media/oomkiller.jpg

The current score of a given process is exposed in procfs, see /proc/[pid]/oom_score, and may be adjusted by setting /proc/[pid]/oom_score_adj. The value of oom_score_adj is added to the score before it is used to determine which task to kill. The value may be set between OOM_SCORE_ADJ_MIN (-1000) and OOM_SCORE_ADJ_MAX (+1000). This is useful if you want to guarantee that a process is never selected by the OOM killer.
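As a quick illustration, the current score can be read from procfs like this (a minimal sketch, Linux only; read_oom_score() is a made-up helper name):

```c
/* Read this process's current badness score from procfs.
 * read_oom_score() is a made-up helper name for illustration. */
#include <stdio.h>

/* Returns the current oom_score for the calling process, or -1. */
static int read_oom_score(void)
{
    int score = -1;
    FILE *f = fopen("/proc/self/oom_score", "r");

    if (!f)
        return -1;
    if (fscanf(f, "%d", &score) != 1)
        score = -1;
    fclose(f);
    return score;
}
```

Writing a negative value to /proc/self/oom_score_adj works the same way, but lowering it requires CAP_SYS_RESOURCE.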

The calculation is simple (nowadays): if a task is using all of its allowed memory, the badness score is 1000. If it is using half of its allowed memory, the badness score is 500, and so on. By setting oom_score_adj to -1000, the badness score sums up to <= 0 and the task will never be killed by the OOM killer.

There is one more thing that affects the calculation: if the process is running with the capability CAP_SYS_ADMIN, it gets a 3% discount. But that is all there is to it.
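To illustrate the arithmetic, here is a toy model of the calculation described above (this is not kernel code, just a sketch of the logic in oom_badness(); all sizes are in pages):

```c
/* A toy model of the oom_badness() arithmetic - not kernel code. */
#define OOM_SCORE_ADJ_MIN (-1000)

long badness(long rss, long swap, long pagetables,
             long oom_score_adj, long totalpages, int is_admin)
{
    if (oom_score_adj == OOM_SCORE_ADJ_MIN)   /* never kill this task */
        return 0;

    /* The pages we would get back by killing the task */
    long points = rss + swap + pagetables;

    if (is_admin)                             /* CAP_SYS_ADMIN: 3% discount */
        points -= points * 3 / 100;

    /* Normalize oom_score_adj against the total usable memory */
    points += oom_score_adj * (totalpages / 1000);

    /* Never return 0 for an eligible task; 0 means unkillable */
    return points > 0 ? points : 1;
}
```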

The old implementation

Before v2.6.36, the calculation of the badness score tried to be smarter. Besides looking at the total memory usage (task->mm->total_vm), it also considered:

  • Whether the process creates a lot of children
  • Whether the process has been running for a long time, or has used a lot of CPU time
  • Whether the process has a low nice value
  • Whether the process is privileged (CAP_SYS_ADMIN or CAP_SYS_RESOURCE set)
  • Whether the process is making direct hardware access

At first glance, all these criteria look valid, but if you think about them a bit, there are a lot of pitfalls that make the selection not so fair. For example: a process that creates a lot of children and consumes some memory could be a leaky web server. Another process that fits the description is the session manager for your desktop environment, which naturally creates a lot of child processes.

The new implementation

This heuristic selection has evolved over time. Instead of looking at mm->total_vm for each task, the task's RSS (resident set size, [2]) and swap space are used. RSS and swap space give a better indication of the amount of memory we would be able to free if we chose this task. The drawback of using mm->total_vm is that it includes overcommitted memory (see [3] for more information), which is pages that the process has claimed but that have not been physically allocated.

The process is now only counted as privileged if CAP_SYS_ADMIN is set, not CAP_SYS_RESOURCE as before.

The code

The whole implementation of the OOM killer is located in mm/oom_kill.c. The function oom_badness() is called for each task in the system and returns the calculated badness score.

Let's go through the function.

unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
              const nodemask_t *nodemask, unsigned long totalpages)
{
    long points;
    long adj;

    if (oom_unkillable_task(p, memcg, nodemask))
        return 0;

Looking for unkillable tasks such as the global init process.

    p = find_lock_task_mm(p);
    if (!p)
        return 0;

    adj = (long)p->signal->oom_score_adj;
    if (adj == OOM_SCORE_ADJ_MIN ||
            test_bit(MMF_OOM_SKIP, &p->mm->flags) ||
            in_vfork(p)) {
        task_unlock(p);
        return 0;
    }

If /proc/[pid]/oom_score_adj is set to OOM_SCORE_ADJ_MIN (-1000), do not even consider this task.

    points = get_mm_rss(p->mm) + get_mm_counter(p->mm, MM_SWAPENTS) +
        atomic_long_read(&p->mm->nr_ptes) + mm_nr_pmds(p->mm);
    task_unlock(p);

Calculate a score based on RSS, page tables and used swap space.

    if (has_capability_noaudit(p, CAP_SYS_ADMIN))
        points -= (points * 3) / 100;

If it is a root process, give it a 3% discount. We are not mean people, after all.

    adj *= totalpages / 1000;
    points += adj;

Normalize and add the oom_score_adj value.

    return points > 0 ? points : 1;

Finally, never return 0 for an eligible task, as 0 is reserved for unkillable tasks.

}

Conclusion

The OOM logic is quite straightforward and seems to have been stable for a long time (v2.6.36 was released in October 2010). The reason I was looking at the code was that the behavior I saw when experimenting did not correspond to what was written in the man page for oom_score. It turned out that the man page was not updated when the new calculation was introduced back in 2010.

I have updated the man page, and the update is available in v4.14 of the Linux man-pages project [4].

commit 5753354a3af20c8b361ec3d53caf68f7217edf48
Author: Marcus Folkesson <marcus.folkesson@gmail.com>
Date:   Fri Nov 17 13:09:44 2017 +0100

    proc.5: Update description of /proc/<pid>/oom_score

    After Linux 2.6.36, the heuristic calculation of oom_score
    has changed to only consider used memory and CAP_SYS_ADMIN.

    See kernel commit a63d83f427fbce97a6cea0db2e64b0eb8435cd10.

    Signed-off-by: Marcus Folkesson <marcus.folkesson@gmail.com>
    Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

diff --git a/man5/proc.5 b/man5/proc.5
index 82d4a0646..4e44b8fba 100644
--- a/man5/proc.5
+++ b/man5/proc.5
@@ -1395,7 +1395,9 @@ Since Linux 2.6.36, use of this file is deprecated in favor of
 .IR /proc/[pid]/oom_score_adj .
 .TP
 .IR /proc/[pid]/oom_score " (since Linux 2.6.11)"
-.\" See mm/oom_kill.c::badness() in the 2.6.25 sources
+.\" See mm/oom_kill.c::badness() in pre 2.6.36 sources
+.\" See mm/oom_kill.c::oom_badness() after 2.6.36
+.\" commit a63d83f427fbce97a6cea0db2e64b0eb8435cd10
 This file displays the current score that the kernel gives to
 this process for the purpose of selecting a process
 for the OOM-killer.
@@ -1403,7 +1405,16 @@ A higher score means that the process is more likely to be
 selected by the OOM-killer.
 The basis for this score is the amount of memory used by the process,
 with increases (+) or decreases (\-) for factors including:
-.\" See mm/oom_kill.c::badness() in the 2.6.25 sources
+.\" See mm/oom_kill.c::badness() in pre 2.6.36 sources
+.\" See mm/oom_kill.c::oom_badness() after 2.6.36
+.\" commit a63d83f427fbce97a6cea0db2e64b0eb8435cd10
+.RS
+.IP * 2
+whether the process is privileged (\-);
+.\" More precisely, if it has CAP_SYS_ADMIN or (pre 2.6.36) CAP_SYS_RESOURCE
+.RE
+.IP
+Before kernel 2.6.36 the following factors were also used in the calculation of oom_score:
 .RS
 .IP * 2
 whether the process creates a lot of children using
@@ -1413,10 +1424,7 @@ whether the process creates a lot of children using
 whether the process has been running a long time,
 or has used a lot of CPU time (\-);
 .IP *
-whether the process has a low nice value (i.e., > 0) (+);
-.IP *
-whether the process is privileged (\-); and
-.\" More precisely, if it has CAP_SYS_ADMIN or CAP_SYS_RESOURCE
+whether the process has a low nice value (i.e., > 0) (+); and
 .IP *
 whether the process is making direct hardware access (\-).
 .\" More precisely, if it has CAP_SYS_RAWIO

Embedded Linux course in Linköping

Embedded Linux course in Linköping

I teach our Embedded Linux course on a regular basis, this time in Linköping. It's a fun course with interesting labs where you write your own Linux device driver for a custom board.

The course itself is quite moderate, but with talented participants we easily slip over into more interesting things like memory management, the MTD subsystem, how ftrace works internally and how to use it, different contexts, an introduction to perf and much more.

This time was no exception.

/media/embedded-linux-course.jpg

FIT vs legacy image format

FIT vs legacy image format

U-Boot supports several image formats when booting a kernel. However, a Linux system usually needs multiple files for booting. Such files may be the kernel itself, an initrd and a device tree blob.

A typical embedded Linux system has all these files in at least two or three different configurations. It is not uncommon to have:

  • Default configuration
  • Rescue configuration
  • Development configuration
  • Production configuration
  • ...

Just these four configurations may involve twelve different files. Or maybe the devicetree is shared between two configurations. Or maybe the initrd is... Or maybe the... you get the point. It has a fairly good chance of ending up quite messy.

This is a problem with the old legacy formats, and it is addressed in the FIT (Flattened Image Tree) format.

Let's first look at the old formats before we get into FIT.

zImage

The most well-known format for the Linux kernel is the zImage. The zImage contains a small header, followed by self-extracting code and finally the payload itself. U-Boot supports this format with the bootz command.

Image layout

zImage format:

  • Header
  • Decompressing code
  • Compressed data

uImage

When talking about U-Boot, the zImage is usually encapsulated in a file called uImage, created with the mkimage utility. Besides image data, the uImage also contains information such as OS type, loader information, compression type and so on. Both data and header are checksummed with CRC32.

The uImage format also supports multiple images. The zImage, initrd and devicetree blob may therefore be included in one single monolithic uImage. The drawbacks of this monolith are that it has no flexible indexing, no hash integrity and no support for security at all.

Image layout

uImage format:

  • Header
  • Header checksum
  • Data size
  • Data load address
  • Entry point address
  • Data CRC
  • OS, CPU
  • Image type
  • Compression type
  • Image name
  • Image data
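For reference, the legacy header is a fixed 64-byte structure, roughly as defined in U-Boot's include/image.h (field names taken from U-Boot):

```c
/* The 64-byte legacy uImage header, roughly as defined in U-Boot's
 * include/image.h. All multi-byte fields are stored big-endian. */
#include <stdint.h>

#define IH_MAGIC 0x27051956 /* image header magic number */
#define IH_NMLEN 32         /* image name length */

typedef struct image_header {
    uint32_t ih_magic;          /* image header magic number */
    uint32_t ih_hcrc;           /* image header CRC checksum */
    uint32_t ih_time;           /* image creation timestamp  */
    uint32_t ih_size;           /* image data size           */
    uint32_t ih_load;           /* data load address         */
    uint32_t ih_ep;             /* entry point address       */
    uint32_t ih_dcrc;           /* image data CRC checksum   */
    uint8_t  ih_os;             /* operating system          */
    uint8_t  ih_arch;           /* CPU architecture          */
    uint8_t  ih_type;           /* image type                */
    uint8_t  ih_comp;           /* compression type          */
    uint8_t  ih_name[IH_NMLEN]; /* image name                */
} image_header_t;
```

With only a single data CRC and a fixed set of fields, there is no room for multiple hashes, signatures or flexible indexing - exactly what FIT adds.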

FIT

The FIT (Flattened Image Tree) format has been around for a while, but based on my experience it is, for unknown reasons, rarely used in real systems. FIT is a tree structure, like Device Tree (before the proposed format change to YAML: https://lwn.net/Articles/730217/), and handles images of various types.

These images are then used in something called configurations. One image may be used in several configurations.

This format has many benefits compared to the others:

  • Better solutions for multi component images
    • Multiple kernels (productions, feature, debug, rescue...)
    • Multiple devicetrees
    • Multiple initrds
  • Better hash integrity of images. Supports different hash algorithms like
    • SHA1
    • SHA256
    • MD5
  • Support for signed images
    • Only boot verified images
    • Detect malware

Image Tree Source

The file describing the structure is called a .its (Image Tree Source) file. Here is an example:

/dts-v1/;

/ {
    description = "Marcus FIT test";
    #address-cells = <1>;

    images {
        kernel@1 {
            description = "My default kernel";
            data = /incbin/("./zImage");
            type = "kernel";
            arch = "arm";
            os = "linux";
            compression = "none";
            load = <0x83800000>;
            entry = <0x83800000>;
            hash@1 {
                algo = "md5";
            };
        };

        kernel@2 {
            description = "Rescue image";
            data = /incbin/("./zImage");
            type = "kernel";
            arch = "arm";
            os = "linux";
            compression = "none";
            load = <0x83800000>;
            entry = <0x83800000>;
            hash@1 {
                algo = "crc32";
            };
        };

        fdt@1 {
            description = "FDT for my cool board";
            data = /incbin/("./devicetree.dtb");
            type = "flat_dt";
            arch = "arm";
            compression = "none";
            hash@1 {
                algo = "crc32";
            };
        };


    };

    configurations {
        default = "config@1";

        config@1 {
            description = "Default configuration";
            kernel = "kernel@1";
            fdt = "fdt@1";
        };

        config@2 {
            description = "Rescue configuration";
            kernel = "kernel@2";
            fdt = "fdt@1";
        };

    };
};

This .its file has two kernel images (default and rescue) and one FDT image. Note that the default kernel is hashed with md5 and the other with crc32. It is also possible to use several hash functions per image.

It also specifies two configurations, default and rescue. The two configurations use different kernels (well, it is the same kernel in this case since both point to ./zImage...) but share the same FDT. It is easy to build up new configurations on demand.

The .its file is then passed to mkimage to generate a .itb (Image Tree Blob):

mkimage -f kernel.its kernel.itb

which is bootable from U-Boot.

Look at ./doc/uImage.FIT/ in the U-Boot source code for more examples of what a .its file can look like. It also contains examples of signed images, which are worth a look.

Boot from U-Boot

To boot from U-Boot, use the bootm command and specify the physical address for the FIT image and the configuration you want to boot.

Example of booting config@1 from the .its above:

bootm 0x80800000#config@1

Full example with bootlog:

U-Boot 2015.04 (Oct 05 2017 - 14:25:09)

CPU:   Freescale i.MX6UL rev1.0 at 396 MHz
CPU:   Temperature 42 C
Reset cause: POR
Board: Marcus Cool board
fuse address == 21bc400
serialnr-low : e1fe012a
serialnr-high : 243211d4
       Watchdog enabled
I2C:   ready
DRAM:  512 MiB
Using default environment

In:    serial
Out:   serial
Err:   serial
Net:   FEC1
Boot from USB for mfgtools
Use default environment for                              mfgtools
Run bootcmd_mfg: run mfgtool_args;bootz ${loadaddr} - ${fdt_addr};
Hit any key to stop autoboot:  0
=> bootm 0x80800000#config@1
## Loading kernel from FIT Image at 80800000 ...
   Using 'config@1' configuration
   Trying 'kernel@1' kernel subimage
     Description:  My cool kernel
     Type:         Kernel Image
     Compression:  uncompressed
     Data Start:   0x808000d8
     Data Size:    7175328 Bytes = 6.8 MiB
     Architecture: ARM
     OS:           Linux
     Load Address: 0x83800000
     Entry Point:  0x83800000
     Hash algo:    crc32
     Hash value:   f236d022
   Verifying Hash Integrity ... crc32+ OK
## Loading fdt from FIT Image at 80800000 ...
   Using 'config@1' configuration
   Trying 'fdt@1' fdt subimage
     Description:  FDT for my cool board
     Type:         Flat Device Tree
     Compression:  uncompressed
     Data Start:   0x80ed7e4c
     Data Size:    27122 Bytes = 26.5 KiB
     Architecture: ARM
     Hash algo:    crc32
     Hash value:   1837f127
   Verifying Hash Integrity ... crc32+ OK
   Booting using the fdt blob at 0x80ed7e4c
   Loading Kernel Image ... OK
   Loading Device Tree to 9ef86000, end 9ef8f9f1 ... OK

Starting kernel ...

One thing to think about

Since this FIT image tends to grow in size, it is a good idea to set CONFIG_SYS_BOOTM_LEN in the U-Boot configuration.

- CONFIG_SYS_BOOTM_LEN:
        Normally compressed uImages are limited to an
        uncompressed size of 8 MBytes. If this is not enough,
        you can define CONFIG_SYS_BOOTM_LEN in your board config file
        to adjust this setting to your needs.

Conclusion

I advocate using the FIT format because it solves many problems that can otherwise get really awkward. Just the bonus of programming one image in production instead of many is a great benefit. The configurations are well defined, and there is no chance that you will end up booting a kernel image with an incompatible devicetree or whatever.

Using FIT images should be part of the board bring-up activity, not the last thing to do before closing a project. The "use separate kernel/dts/initrds for now and then move on to FIT when the platform is stable" approach does not work. It is simply not going to happen, and it is a mistake.