What is libcamera and why should you use it?

Read out a picture from camera

Once upon a time, video devices were not that complex. To use a camera back then, your application could iterate through the /dev/video* devices, pick the camera you wanted and immediately start using it. You could query which pixel formats, frame rates, resolutions and other properties the camera supported. You could even easily change them if you wanted.

This still works for some cameras; basically every USB camera and most laptop cameras still work that way.

The problem, especially in embedded systems, is that there is no such thing as "the camera" anymore. The camera system is rather a complex pipeline of different image processing nodes that the image data traverses to be shaped the way you want. Even if the result of this pipeline ends up in a video device, you cannot configure things like cropping and resolution directly on that device as you used to. Instead, you have to use the media controller API to configure and link each of these nodes to build up your pipeline.

To show what it may look like, this is a graph that I had in a previous post [3]:

/media/media-ctl-graph.png

What is libcamera?

/media/libcamera-banner.png

This is how libcamera is described on its website [1]:

libcamera is an open source camera stack for many platforms with a core userspace library, and support from the Linux kernel APIs and drivers already in place.
It aims to control the complexity of embedded camera hardware by providing an intuitive API and method of separating untrusted vendor code from the open source core.

libcamera aims to encourage the development of new embedded camera applications by limiting the complexity that developers have to deal with.
The interface is designed around the way that modern embedded camera hardware works.

The first time I heard about libcamera was at the Embedded Linux Conference 2019, where Jacopo Mondi gave a talk [2] about the public API for the first stable libcamera release. I have been working with cameras in several embedded Linux products and know for sure how complex [3] these little beasts can be. The configuration also differs depending on which platform or camera you are using, as there is no common way to set up the image pipe. You will soon have special cases for all your platform variants in your application, which is not what we strive for.

libcamera is trying to solve this by providing one library that takes care of all that complexity for you.

For example, say you want to adjust a simple thing such as the contrast of an IMX219 camera module connected to a Raspberry Pi. To do that without libcamera, you first have to set up a proper image pipeline that takes the camera module and connects it to the several ISP (Image Signal Processing) blocks that your processor offers in order to get the right image format, resolution and so on. Somewhere in the middle of all this configuration, you realise that neither the camera module nor the ISPs support adjusting the contrast. Too bad. To achieve this you have to take the image, pass it to a self-written contrast algorithm, create a gamma curve that the IPA (Image Processing Algorithm) understands and actually set the gamma. Yes, the contrast is adjusted with a gamma curve for that particular camera on Raspberry Pi. (Have a look at the implementation of that IPA block [7] for Raspberry Pi.)

This is exactly the stuff libcamera understands and abstracts for the user. libcamera will figure out what graph it has to build depending on what you want to do and which processing operations are available at your various nodes. An application that uses libcamera for the video device can set the contrast in the same way for all cameras and platforms. After all, that is what you wanted.

Camera Stack

As the libcamera library is fully implemented in userspace and uses already existing kernel interfaces to communicate with the hardware, you need no extra underlying support in terms of separate drivers or kernel support.

libcamera itself exposes several APIs depending on how the application wants to interface with the camera. It even has a V4L2 compatibility layer that emulates a high-level V4L2 camera device to make a smooth transition for all those V4L2 applications out there.

/media/libcamera-layer.png

Read more about the camera stack in the libcamera documentation [4].

Conclusion

I really like this project and I think we need an open-source stack that supports many platforms. The situation we are in right now, with vendor-specific drivers, libraries and IPAs, is not sustainable at all. It takes too much effort to evaluate a few cameras from different vendors just because every vendor has its own way to control the camera, with its own closed-source and platform-specific layers. Been there, done that.

For those vendors that do not want to open-source their secret image processing algorithms, libcamera uses a plugin system for IPA modules which lets vendors keep their secrets but still be compatible with libcamera. All open-source modules are identified based on digital signatures, while closed-source modules are instead isolated inside a sandbox environment with restricted access to the system. A win-win concept.

The project itself is still quite young and needs more work to support more platforms and cameras, but the foundation is stable. Raspberry Pi is now a commonly used platform, both commercially and among hobbyists, and the fact that the Raspberry Pi Foundation has chosen libcamera as its primary camera system [8] must tell us something.

HID report descriptors and Linux

HID Devices

The USB HID (Human Interface Device) class covers the type of computer peripherals that humans interact with, such as keyboards, mice, game controllers and touchscreens. The protocol is probably one of the simplest protocols in the USB specification. Even if HID was originally written with USB in mind, it works with several other transport layers. Your mouse and keyboard probably use HID over USB, while the touchscreen in your smartphone could use HID over I2C. Even Bluetooth/BLE makes use of the same protocol for this type of device.

The protocol is popular because it is so simple. The HID descriptor can be stored in ROM and the peripheral can be implemented using only a small 8-bit MCU.

Despite how simple and well defined [1] the HID specification is, implementors still get it wrong, as we will see later on. Those non-compliance errors have to be fixed up in order to use the device. This is quite sad, as it requires a separate driver even though the device should be possible to handle in a generic way.

We will continue to focus on the HID implementation over USB.

HID Descriptors

The device has to identify itself, and that information is stored in segments of its ROM (Read Only Memory). These segments, or descriptors as they are called, describe what type of device it is and which interface (Interface Descriptor) it exposes. Such interfaces are grouped into classes, and every device belongs to a USB class. Such a class [2] could be Printer, Video, or the one we will focus on: the HID class.

/media/hid-1.png

The HID class device descriptor defines and identifies which other descriptors are present. Such other descriptors could be report descriptors and physical descriptors.

Report Descriptors

The report descriptor describes how the data that the device generates should be interpreted. For example, the report descriptor describes how to determine the button state of your mouse or the position of the touchscreen click on your smartphone.

Physical Descriptors

The physical descriptor, on the other hand, provides information about the physical layout of the device, e.g. which parts of your body interact with the device to activate certain functions, and how.

The big picture

There are of course more descriptors, but most of them are part of the USB specification rather than specific to HID devices.

/media/hid-2.png

There is much to say about the HID class, subclasses, interfaces and protocols, and we will not cover them all. But just to give you a hint of what they are:

HID class

USB devices are grouped into USB classes depending on what type of device it is and which transport requirements it has. For example, an audio device requires isochronous data pipes, which HID devices do not. HID devices have different and much simpler data transport requirements.

Subclasses

The only subclass for the HID class is the Boot Interface Subclass. It is a small subset of the report descriptor that is easier to parse for code that does not want to (or lacks the resources to) parse a full report descriptor. The BIOS is one example of such code: it wants to keep the complexity and footprint as small as possible, but still wants to be able to use a keyboard.

Protocol

The HID protocol field only has a meaning if the subclass is Boot Interface; it is then used to determine whether the device is a mouse or a keyboard. Otherwise the protocol is not used.

Interfaces

The interface is a way for the device and host to communicate. HID devices use either the Control pipe or the Interrupt pipe.

Control pipes are used for:

  • Receiving and responding to requests for USB control and class data
  • Transmitting data when asked by the HID class driver
  • Receiving data from the host

The interrupt pipe is used for:

  • Receiving asynchronous data from the device
  • Transmitting low-latency data to the device

Report descriptors

The information passed to and from the device is encapsulated in reports. These reports are organized data where the data layout is described in a report descriptor. This report descriptor is one of the first items that the host requests from the device and it describes how the data should be interpreted. For example, a mouse could have several buttons represented by one bit each and a wheel represented by a signed 8-bit value from -127 to +127. The report descriptor gives you details about which bits are mapped to which button and also which 8 bits should be used for the wheel.

All report descriptors are available to read out from sysfs. This is the report descriptor from my keyboard:

[15:15:02]marcus@goliat:~$ od -t x1 -Anone  /sys/bus/usb/devices/3-12.3.2.1/3-12.3.2.1:1.0/0003:045E:00DB.0009/report_descriptor
 05 01 09 06 a1 01 05 08 19 01 29 03 15 00 25 01
 75 01 95 03 91 02 09 4b 95 01 91 02 95 04 91 01
 05 07 19 e0 29 e7 95 08 81 02 75 08 95 01 81 01
 19 00 29 91 26 ff 00 95 06 81 00 c0
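To make a dump like this a bit less opaque, here is a minimal sketch in C of how the byte stream splits into items. It relies on the short item layout from the HID specification: the prefix byte carries the data size in bits 0-1, the item type in bits 2-3 and the tag in bits 4-7. The struct and function names are my own, and long items (prefix 0xFE) are simply not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One decoded short item (names are illustrative, not from any library). */
struct hid_short_item {
    uint8_t type;   /* 0 = Main, 1 = Global, 2 = Local, 3 = Reserved */
    uint8_t tag;    /* meaning depends on type, e.g. Global tag 0 = Usage Page */
    uint8_t size;   /* number of data bytes: 0, 1, 2 or 4 */
    uint32_t data;  /* little-endian payload, zero-extended */
};

/*
 * Decode the short item starting at buf[0].
 * Returns the number of bytes consumed, or 0 if the item is truncated
 * or is a long item, which this sketch does not handle.
 */
size_t hid_parse_short_item(const uint8_t *buf, size_t len,
                            struct hid_short_item *item)
{
    static const uint8_t sizes[4] = { 0, 1, 2, 4 };

    assert(buf != NULL && item != NULL);
    if (len == 0 || buf[0] == 0xFE)
        return 0;

    item->size = sizes[buf[0] & 0x03];
    item->type = (buf[0] >> 2) & 0x03;
    item->tag = (buf[0] >> 4) & 0x0F;
    item->data = 0;

    if (len < 1u + item->size)
        return 0;

    for (uint8_t i = 0; i < item->size; i++)
        item->data |= (uint32_t)buf[1 + i] << (8 * i);

    return 1u + item->size;
}

/* Walk a descriptor and print one line per item; returns the item count. */
size_t hid_dump_items(const uint8_t *desc, size_t len)
{
    struct hid_short_item item;
    size_t off = 0, count = 0;

    while (off < len) {
        size_t n = hid_parse_short_item(desc + off, len - off, &item);

        if (n == 0)
            break;
        printf("type=%u tag=0x%X size=%u data=0x%X\n", (unsigned)item.type,
               (unsigned)item.tag, (unsigned)item.size, (unsigned)item.data);
        off += n;
        count++;
    }
    return count;
}
```

Fed with the first two bytes of the dump, 05 01, the parser yields a Global item with tag 0 (Usage Page) and data 1, which is the Generic Desktop page. Mapping tags and data to readable names is then just a matter of table lookups.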

Here is another (parsed) example of what a report descriptor may look like:

0x05, 0x01,        // Usage Page (Generic Desktop Ctrls)
0x09, 0x04,        // Usage (Joystick)
0xA1, 0x01,        // Collection (Application)
0xA1, 0x00,        //   Collection (Physical)
0x09, 0x30,        //     Usage (X)
0x09, 0x31,        //     Usage (Y)
0x15, 0x00,        //     Logical Minimum (0)
0x26, 0xFF, 0x07,  //     Logical Maximum (2047)
0x35, 0x00,        //     Physical Minimum (0)
0x46, 0xFF, 0x00,  //     Physical Maximum (255)
0x75, 0x10,        //     Report Size (16)
0x95, 0x02,        //     Report Count (2)
0x81, 0x02,        //     Input (Data,Var,Abs,No Wrap,Linear,Preferred State,No Null Position)
0xC0,              //   End Collection
0x75, 0x08,        //   Report Size (8)
0x95, 0x03,        //   Report Count (3)
0x81, 0x03,        //   Input (Cnst,Var,Abs)
0xC0,              // End Collection

The descriptor describes a 2-axis (X/Y) joystick controller where each axis can have an absolute value between 0 and 2047.

Let us walk through the descriptor step by step.

0x05, 0x01,        // Usage Page (Generic Desktop Ctrls)

As there are many different types of devices, these are grouped into pages. The page tells us how the following (Usage) entry should be interpreted.

There are plenty of groups, as seen in the reference manual [3]:

/media/hid-page.png

Our joystick belongs to the Generic Desktop Ctrls group.

The next line is the usage:

0x09, 0x04,        // Usage (Joystick)

Mice, keyboards and gamepads are also examples of Generic Desktop Ctrls, as seen in the table below:

/media/hid-page.png

The next entry is the application collection:

0xA1, 0x01,        // Collection (Application)

The application collection makes a meaningful grouping of Input, Output and Feature items. Each such grouping has at least Report Size and Report Count defined to determine how big, in terms of bytes, the collection is.

The next entry is the physical collection:

0xA1, 0x00,        //   Collection (Physical)

This provides information about the part or parts of the human body used to activate the controls on the device. In other words: buttons and knobs.

Now to the more fun part:

0x09, 0x30,        //     Usage (X)
0x09, 0x31,        //     Usage (Y)
0x15, 0x00,        //     Logical Minimum (0)
0x26, 0xFF, 0x07,  //     Logical Maximum (2047)
0x35, 0x00,        //     Physical Minimum (0)
0x46, 0xFF, 0x00,  //     Physical Maximum (255)
0x75, 0x10,        //     Report Size (16)
0x95, 0x02,        //     Report Count (2)
0x81, 0x02,        //     Input (Data,Var,Abs,No Wrap,Linear,Preferred State,No Null Position)

Here we can see that there are two axes, X and Y, each represented by a value between 0 and 2047 (11 bits). A single axis occupies 16 bits and is of the type Input. That is pretty much all we need to know in order to parse this information.
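To show how little is needed once the descriptor is right, here is a sketch in C that decodes one input report according to the joystick descriptor above: two 16-bit little-endian fields followed by the three constant padding bytes, seven bytes in total since no Report ID is used. The names joy_state, joy_decode and joy_to_physical are made up for this example, and rejecting out-of-range values is my own choice rather than something the descriptor mandates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define JOY_REPORT_LEN   7     /* 2 axes x 16 bits + 3 constant bytes */
#define JOY_LOGICAL_MAX  2047  /* Logical Maximum from the descriptor */
#define JOY_PHYSICAL_MAX 255   /* Physical Maximum from the descriptor */

struct joy_state {
    uint16_t x;  /* logical value, 0..2047 */
    uint16_t y;  /* logical value, 0..2047 */
};

/* Returns 0 on success, -1 if the report is too short or out of range. */
int joy_decode(const uint8_t *report, size_t len, struct joy_state *out)
{
    uint16_t x, y;

    assert(report != NULL && out != NULL);
    if (len < JOY_REPORT_LEN)
        return -1;

    /* HID report fields are little-endian: low byte first. */
    x = (uint16_t)(report[0] | (report[1] << 8));
    y = (uint16_t)(report[2] | (report[3] << 8));

    if (x > JOY_LOGICAL_MAX || y > JOY_LOGICAL_MAX)
        return -1;

    out->x = x;
    out->y = y;
    return 0;
}

/* Map a logical value (0..2047) onto the physical range (0..255). */
unsigned joy_to_physical(uint16_t logical)
{
    return (unsigned)logical * JOY_PHYSICAL_MAX / JOY_LOGICAL_MAX;
}
```

In practice you never write this yourself on Linux, as the HID core does the same job generically from the report descriptor. That is exactly why a broken descriptor hurts so much.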

What is so hard then?

These report descriptors are not that hard to follow and there is no black magic around them. Despite that, many vendors do not get their report descriptors right and keep delivering a custom driver along with their product.

I used to build and fly tri- and quadcopters, and one way to get better at flying is to connect your radio transmitter to a simulator for training. The crashes are neither as fatal nor as costly that way...

I've never seen such a flight controller that actually follows the HID specification, and that makes them useless without a custom driver that can parse the mess. It is not uncommon that the actual reports from the device look good; it is just the report descriptor that is messed up.

In that case we can write a pretty small Linux kernel driver that only fixes up the report descriptor and then lets the HID layer create and manage the device in a generic way. This is what I did for the VRC2 and HID-PXRC drivers [4], which will be available in Linux 6.1.

Such a driver could be as simple as this (cut out from my VRC2 driver):

#include <linux/device.h>
#include <linux/hid.h>
#include <linux/module.h>

static __u8 vrc2_rdesc_fixed[] = {
    0x05, 0x01,        // Usage Page (Generic Desktop Ctrls)
    0x09, 0x04,        // Usage (Joystick)
    0xA1, 0x01,        // Collection (Application)
    0x09, 0x01,        //   Usage (Pointer)
    0xA1, 0x00,        //   Collection (Physical)
    0x09, 0x30,        //     Usage (X)
    0x09, 0x31,        //     Usage (Y)
    0x15, 0x00,        //     Logical Minimum (0)
    0x26, 0xFF, 0x07,  //     Logical Maximum (2047)
    0x35, 0x00,        //     Physical Minimum (0)
    0x46, 0xFF, 0x00,  //     Physical Maximum (255)
    0x75, 0x10,        //     Report Size (16)
    0x95, 0x02,        //     Report Count (2)
    0x81, 0x02,        //     Input (Data,Var,Abs,No Wrap,Linear,Preferred State,No Null Position)
    0xC0,              //   End Collection
    0x75, 0x08,        //   Report Size (8)
    0x95, 0x03,        //   Report Count (3)
    0x81, 0x03,        //   Input (Cnst,Var,Abs)
    0xC0,              // End Collection
};

static __u8 *vrc2_report_fixup(struct hid_device *hdev, __u8 *rdesc,
                unsigned int *rsize)
{
    hid_info(hdev, "fixing up VRC-2 report descriptor\n");
    *rsize = sizeof(vrc2_rdesc_fixed);
    return vrc2_rdesc_fixed;
}

static int vrc2_probe(struct hid_device *hdev, const struct hid_device_id *id)
{
    int ret;

    /*
     * The device gives us 2 separate USB endpoints.
     * One of those (the one with report descriptor size of 23) is just bogus so ignore it
     */
    if (hdev->dev_rsize == 23)
        return -ENODEV;

    ret = hid_parse(hdev);
    if (ret) {
        hid_err(hdev, "parse failed\n");
        return ret;
    }

    ret = hid_hw_start(hdev, HID_CONNECT_DEFAULT);
    if (ret) {
        hid_err(hdev, "hw start failed\n");
        return ret;
    }

    return 0;
}

static const struct hid_device_id vrc2_devices[] = {
    { HID_USB_DEVICE(USB_VENDOR_ID_VRC2, USB_DEVICE_ID_VRC2) },
    { /* sentinel */ }
};
MODULE_DEVICE_TABLE(hid, vrc2_devices);

static struct hid_driver vrc2_driver = {
    .name = "vrc2",
    .id_table = vrc2_devices,
    .report_fixup = vrc2_report_fixup,
    .probe = vrc2_probe,
};
module_hid_driver(vrc2_driver);

MODULE_AUTHOR("Marcus Folkesson <marcus.folkesson@gmail.com>");
MODULE_DESCRIPTION("HID driver for VRC-2 2-axis Car controller");
MODULE_LICENSE("GPL");

BPF for HID drivers

Benjamin Tissoires, one of the maintainers of the HID core layer, has posted his work [6] to introduce eBPF (extended Berkeley Packet Filter) support for HID devices, which is a really cool thing. As many devices just lack a proper report descriptor, eBPF lets you write such a fixup in userspace and simply load the program into the kernel. There are still some parts missing before we can see full support for this feature, but the main part is merged and will be available in 6.1.

See the LWN article [5] for further reading.

Conclusion

HID report descriptors have been a fun subject to dig into. It is still hard to see why different vendors find it so hard to follow the specification though.

I also have to thank Benjamin Tissoires for great help and support in understanding how the HID layer and HID devices work.

Industrial I/O and triggers

I've maintained a couple of IIO drivers (MCP3911 [4] and LTC1660 [5]) for some time now and it is time to give at least the MCP3911 a facelift.

This time the facelift includes:

  • Support for buffers
  • Support for triggers
  • Making the driver interrupt driven
  • Support for setting the oversampling ratio
  • Support for setting the PGA (Programmable Gain Amplifier)

It also cleans up the driver a bit by only using device-managed resources.

What is Industrial I/O?

Industrial I/O, or IIO [1], is a subsystem that exposes sensors and actuators in a common way to userspace. The subsystem supports a range of different sensors including ADCs, IMUs, pressure sensors, light sensors, accelerometers and more. Even actuators such as DACs and amplifiers have their place in the IIO subsystem.

The hwmon [2] subsystem provides an interface for a few types of sensors as well, but the framework lacks support for some use cases that IIO tries to cover, such as:

  • High speed sensors
  • Triggered sampling
  • Data buffering

In short, use hwmon for slow sensors and actuators, otherwise use IIO (preferred for new devices).

The IIO subsystem also provides a stable ABI for various userspace HALs which hwmon does not. libiio [3] is the official and preferred one for the IIO.

/media/iio.png

Sysfs

All IIO devices are exported through sysfs, where they can be configured and single-shot values can be read. For example, the raw value of the first channel of an ADC can be read out by:

cat /sys/bus/iio/devices/iio:device0/in_voltage0_raw

All other parameters, such as oversampling ratio and scaling value, are also exposed here.

Scaling value

The value you get from in_voltageX_raw is the raw value, which means it has to be converted in order to get something meaningful out of it.

To get the value in mV you have to take the scale and offset value into account:

Value in mV = (raw + offset) * scale

All these values are exposed by sysfs in in_voltage_scale and in_voltage_offset respectively.
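As a small sketch of how the formula above can be applied from userspace in C: one helper reads a single numeric sysfs attribute and another applies the conversion. Both iio_read_attr and iio_to_millivolts are hypothetical names of my own, and the paths in the usage note assume the device shows up as iio:device0.

```c
#include <assert.h>
#include <stdio.h>

/* Read one numeric sysfs attribute; returns 0 on success, -1 on failure. */
int iio_read_attr(const char *path, double *out)
{
    FILE *f;
    int ok;

    assert(path != NULL && out != NULL);

    f = fopen(path, "r");
    if (!f)
        return -1;

    ok = (fscanf(f, "%lf", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Apply the conversion: value in mV = (raw + offset) * scale. */
double iio_to_millivolts(double raw, double offset, double scale)
{
    return (raw + offset) * scale;
}
```

A caller would read in_voltage0_raw, in_voltage_offset and in_voltage_scale from /sys/bus/iio/devices/iio:device0/ with iio_read_attr() and pass the three numbers to iio_to_millivolts(). Not every driver exposes an offset attribute, so treating a missing file as an offset of 0 is a reasonable fallback.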

Triggers

Triggers can be both hardware and software based.

Examples of hardware-based triggers are:

  • GPIO-based interrupts

Examples of software-based triggers are:

  • sysfs - you can trigger a data poll from userspace
  • hrtimer - lets you specify the period, and a high-resolution timer will be created that triggers a data poll at the given frequency

CONFIG_IIO_SYSFS_TRIGGER

By enabling CONFIG_IIO_SYSFS_TRIGGER you can make use of the sysfs trigger:

# echo 0 > /sys/bus/iio/devices/iio_sysfs_trigger/add_trigger
# cat /sys/bus/iio/devices/iio_sysfs_trigger/trigger0/name
sysfstrig0

CONFIG_IIO_HRTIMER_TRIGGER

By enabling CONFIG_IIO_HRTIMER_TRIGGER you can make use of a timer-based trigger. The sampling frequency is given in Hz, so a 50 ms period corresponds to 20 Hz:

# mount -t configfs none /sys/kernel/config
# mkdir /sys/kernel/config/iio/triggers/hrtimer/my_50ms_trigger
# echo 20 > /sys/bus/iio/devices/trigger0/sampling_frequency

Make use of a trigger

As long as the device supports triggers, there will be a /sys/bus/iio/devices/iio:device0/trigger/current_trigger entry. All available triggers, both hardware and software based, are located in /sys/bus/iio/devices/triggerX.

One nice feature is that one trigger can be used for multiple devices.

In order to activate a trigger for a certain device, simply write the trigger name to the current_trigger entry:

# cat /sys/bus/iio/devices/trigger0/name > /sys/bus/iio/devices/iio:device0/trigger/current_trigger

The next step is to decide which channels you want to scan and enable them:

# echo 1 > /sys/bus/iio/devices/iio:device0/scan_elements/in_voltage0_en
# echo 1 > /sys/bus/iio/devices/iio:device0/scan_elements/in_voltage1_en
# echo 1 > /sys/bus/iio/devices/iio:device0/scan_elements/in_timestamp_en

And finally, start the sampling process:

# echo 1 > /sys/bus/iio/devices/iio:device0/buffer/enable

Now you will get the raw values for voltage0, voltage1 and the timestamp by reading from the /dev/iio:device0 device.

You will read out a stream of data from the device. Before applying the scale value to the raw data, the buffer data may be processed somehow depending on its format. The format for each channel is available as a sysfs entry as well:

# cat /sys/bus/iio/devices/iio:device0/scan_elements/in_voltage0_type
be:s24/32>>0

The buffer format for voltage0 means that each sample is stored big-endian (be) in 32 bits, does not need any shifting (>>0) and should be interpreted as a signed 24-bit value (s24).

Conclusion

The IIO subsystem is rather complex. The framework also supports events, which make it possible to trigger on specific threshold values. As the subsystem is optimized for performance and the triggers make it possible to read values at a given frequency or on a given event, it enables a lot more use cases than the older hwmon interface.

FYI, the patches for MCP3911 are currently up to be merged into mainline.

Mounting with systemd and udev

/media/systemd.png

Systemd has not always been my first choice as init system for embedded systems, but I cannot ignore that it has many good and handy things that other init systems don't. At the same time, that is just what I don't like about systemd: it does not follow the "Do one thing and do it well" philosophy that I like so much. I am very torn about it.

However, when trying to do some things with systemd the way you used to do them with other systems, you sometimes encounter difficulties. Mostly it is simply because there is another way to accomplish what you want, the "systemd way", which is usually a better and safer way, but sometimes you simply don't want to.

One such thing I encountered was mounting filesystems with udev. This used to work, but in v239 of systemd two separate directives were introduced that changed this default behavior.

units: switch from system call blacklist to whitelist

Commit ee8f26180d01e3ddd4e5f20b03b81e5e737657ae [1]

units: switch from system call blacklist to whitelist

This is generally the safer approach, and is what container managers
(including nspawn) do, hence let's move to this too for our own
services. This is particularly useful as this this means the new
@System-service system call filter group will get serious real-life
testing quickly.

This also switches from firing SIGSYS on unexpected syscalls to
returning EPERM. This would have probably been a better default anyway,
but it's hard to change that these days. When whitelisting system calls
SIGSYS is highly problematic as system calls that are newly introduced
to Linux become minefields for services otherwise.

Note that this enables a system call filter for udev for the first time,
and will block @clock, @mount and @swap from it. Some downstream
distributions might want to revert this locally if they want to permit
unsafe operations on udev rules, but in general this shiuld be mostly
safe, as we already set MountFlags=shared for udevd, hence at least
@mount won't change anything.

This patch changes the default filter behavior from a blacklist to a whitelist, and @mount is no longer allowed:

+ SystemCallFilter=@system-service @module @raw-io
+ SystemCallErrorNumber=EPERM

units: switch udev service to use PrivateMounts=yes

Commit b2e8ae7380d009ab9f9260a34e251ac5990b01ca [2]

units: switch udev service to use PrivateMounts=yes

Given that PrivateMounts=yes is the "successor" to MountFlags=slave in
unit files, let's make use of it for udevd.

What does systemd say about PrivateMounts? [3]

PrivateMounts=
Takes a boolean parameter. If set, the processes of this unit will be run in their own private file system (mount) namespace with all mount propagation from the processes towards the host's main file system namespace turned off. This means any file system mount points established or removed by the unit's processes will be private to them and not be visible to the host. However, file system mount points established or removed on the host will be propagated to the unit's processes. See mount_namespaces(7) for details on file system namespaces. Defaults to off.

When turned on, this executes three operations for each invoked process: a new CLONE_NEWNS namespace is created, after which all existing mounts are remounted to MS_SLAVE to disable propagation from the unit's processes to the host (but leaving propagation in the opposite direction in effect). Finally, the mounts are remounted again to the propagation mode configured with MountFlags=, see below.

File system namespaces are set up individually for each process forked off by the service manager. Mounts established in the namespace of the process created by ExecStartPre= will hence be cleaned up automatically as soon as that process exits and will not be available to subsequent processes forked off for ExecStart= (and similar applies to the various other commands configured for units). Similarly, JoinsNamespaceOf= does not permit sharing kernel mount namespaces between units, it only enables sharing of the /tmp/ and /var/tmp/ directories.

Other file system namespace unit settings — PrivateMounts=, PrivateTmp=, PrivateDevices=, ProtectSystem=, ProtectHome=, ReadOnlyPaths=, InaccessiblePaths=, ReadWritePaths=, … — also enable file system namespacing in a fashion equivalent to this option. Hence it is primarily useful to explicitly request this behaviour if none of the other settings are used.

This option is only available for system services, or for services running in per-user instances of the service manager when PrivateUsers= is enabled.

If PrivateMounts=true, the process gets its own mount namespace. As a result, the mounted filesystem is visible only to the process (udevd) itself and is not propagated to the rest of the system.

Conclusion

There are certainly reasons not to allow udev to mount filesystems, but if you still want to do it you have to revert these changes by modifying /lib/systemd/system/systemd-udevd.service with:

GPLv2 and GPLv3

Open Source

"Free as in freedom - not as in free beer". Free beer is nice, but freedom is even nicer.

I have been working with companies from different sectors including consumer electronics, military applications, automotive and aeronautics. One common question, regardless of sector, is "Can we really use Open Source in our product?". The answer is usually "Yes, you can, but...".

One common misunderstanding is to interpret Open Source as in free beer. This is kind of true for some Open Source, but it is nothing you can take for granted. The "rules" for how the code may be used are specified by its license.

Among those who think they have understood the difference, there is another common misunderstanding: that no Open Source software is free and that it does not belong in any commercial product. Both misunderstandings are of course wrong, but you have to make sure that you understand the licenses you are using.

Before you start to work with any source code (not only Open Source) you always have to take the license into consideration. If you do your homework you can avoid surprises and practical implications that could otherwise cause a delayed project or legal inconveniences.

In short, you have to know what you are doing, and that should not differ from other parts of your development.

Open Source Licenses

"Open source licenses are licenses that comply with the Open Source Definition — in brief, they allow software to be freely used, modified, and shared. To be approved by the Open Source Initiative (also known as the OSI), a license must go through the Open Source Initiative's license review process."

This text is taken from the Open Source Initiative webpage [4]. The OSI is an organization that works on defining criteria for Open Source and certifying licenses that comply with the OSD (Open Source Definition).

Open Source Definition

Many licenses [5] are certified and may have different requirements for their users, but they all comply with these "rules":

Free Redistribution

The license shall not restrict any party from selling or giving away the software as a component of an aggregate software distribution containing programs from several different sources. The license shall not require a royalty or other fee for such sale.

Source Code

The program must include source code, and must allow distribution in source code as well as compiled form. Where some form of a product is not distributed with source code, there must be a well-publicized means of obtaining the source code for no more than a reasonable reproduction cost, preferably downloading via the Internet without charge. The source code must be the preferred form in which a programmer would modify the program. Deliberately obfuscated source code is not allowed. Intermediate forms such as the output of a preprocessor or translator are not allowed.

Derived Works

The license must allow modifications and derived works, and must allow them to be distributed under the same terms as the license of the original software.

Integrity of The Author's Source Code

The license may restrict source-code from being distributed in modified form only if the license allows the distribution of "patch files" with the source code for the purpose of modifying the program at build time. The license must explicitly permit distribution of software built from modified source code. The license may require derived works to carry a different name or version number from the original software.

No Discrimination Against Persons or Groups

The license must not discriminate against any person or group of persons.

No Discrimination Against Fields of Endeavor

The license must not restrict anyone from making use of the program in a specific field of endeavor. For example, it may not restrict the program from being used in a business, or from being used for genetic research.

Distribution of License

The rights attached to the program must apply to all to whom the program is redistributed without the need for execution of an additional license by those parties.

License Must Not Be Specific to a Product

The rights attached to the program must not depend on the program's being part of a particular software distribution. If the program is extracted from that distribution and used or distributed within the terms of the program's license, all parties to whom the program is redistributed should have the same rights as those that are granted in conjunction with the original software distribution.

License Must Not Restrict Other Software

The license must not place restrictions on other software that is distributed along with the licensed software. For example, the license must not insist that all other programs distributed on the same medium must be open-source software.

License Must Be Technology-Neutral

No provision of the license may be predicated on any individual technology or style of interface.

GPL

GPL, or General Public License, is one of the most common Open Source Licenses you will find out there. At least version 2, GPLv2, is something you will encounter for sure if you intend to build an embedded Linux system as the kernel [6] itself is using this license.

GPLv2

So what do you need to do to comply with GPLv2? Basically, you need to provide the source code for all GPLv2-licensed code. Yes, that includes all your modifications too, and this part can seem scary at first glance.

But will you need to make any changes? Probably. If you want to run Linux on your system, you will most likely have to make some adaptations to the Linux kernel that are specific to your board. Those changes fall under the GPLv2 license and must be provided as well.

The license is stated as follows:

"The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. "

GPLv2, Sec.3 [2]

If any of those changes is your top-secret algorithm, then you have done it wrong by design anyway. Please note that no installation information is required at all, which makes GPLv2 more suitable for embedded devices.

Tivoization

The term Tivoization comes from TiVo, which ran a GPLv2-only Linux kernel but used hardware signature keys that made it possible to run only signed kernels. Even though TiVo did provide the kernel source code, TiVo customers could not build and install their own firmware.

The Free Software Foundation (FSF) found this objectionable, as it defeats one of the purposes of the GPLv2 license. So the FSF came up with GPLv3 to solve this.

GPLv3

/media/gplv3.png

(Yes, this logo is under Public Domain [7] )

One big difference between v2 and v3 is this part

" “Installation Information” for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source. The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made. "

GPLv3, Sec. 6 [1]

This states that "Installation Information" must be provided together with the source code. In short, you have to provide instructions for an end user to build and replace the GPLv3 parts of your product. There are also a few exceptions, but most of them are more or less hard to make any use of in a real-world product.

Exception 1

It is only required for "User Products". It is hard to say whether this is an exception or not, as most products that use GPLv3 are user products. But the license states that it only affects User Products. Consult your lawyer, as it is not entirely clear what a "User Product" really is.

Exception 2

It only applies if the device is sold or offered for long-term lease. As with all legal stuff, things are a bit unclear. Does remote access or temporary possession qualify, for example?

Please note that even a long-term lease requires you to provide installation information.

Exception 3

If the device is non-modifiable by anyone, you don't have to provide the information on how to reinstall or modify binaries.

If you want to be able to update your software, then you will probably need to provide the "installation information".

Exception 4

You can void any warranties if binaries are modified, but you can't prevent reinstallation of modified binaries.

Conclusion

There is a reason why an author of code chose a particular license, and it is important (for reasons of both principle and law) to respect that. Some licenses are more or less appropriate for specific products, but the general rule I follow is to avoid any GPLv3 (both GPLv3 and LGPLv3) licensed software in any embedded system, as it is hard to be fully compliant. The installation information is often something that companies want to keep for themselves, and rightly so.

What is my opinion about this? Well, I do like having the freedom to install whatever software I want in the products I own, but there are circumstances where I'm not sure it is a good idea when it comes to safety and liability. If I buy a second-hand car, I don't want the software for my airbag or braking system to have been "fixed" by some random guy. I think that overly restrictive licenses limit the use of Open Source, and that is somewhat counterproductive for Open Source itself.

V4L2 and media controller

The media infrastructure in the kernel is a giant beast handling many different types of devices involving different buses and electrical interfaces. Providing an interface to handle the complexity of the hardware is not an easy task. Most devices have multiple ICs with different communication protocols, so the device drivers tend to be very complex as well.

Video For Linux 2 (V4L2) is the interface for such media devices. V4L2 is the second version of V4L and is not really compatible with it. There is a compatibility mode, but its support is more often than not incomplete. The name Video4Linux is a counterpart to Video4Windows, but the two are not technically related at all.

Here is an example of what a system may look like (borrowed from the Linux kernel documentation)

/media/typical_media_device.svg

Media controller

System-on-Chip (SoC) devices often provide a wide range of hardware blocks that can be interconnected in a variety of ways to obtain the desired functionality. To configure these hardware blocks, the kernel provides the Media Controller API, which exposes detailed information about the media device and lets its blocks be interconnected in dynamic and complex ways at runtime, all from userspace.

Each hardware block, called an entity in the media controller framework, has one or more source and sink pads. The API lets the user link source pads to sink pads and set the format of the pads.

Here is the topology exported from my sabresd with an imx219 (camera module) connected:
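To give a feeling for what the Media Controller API looks like from userspace, here is a minimal sketch that walks through all entities of a media device with the MEDIA_IOC_ENUM_ENTITIES ioctl. The function name list_entities() is made up for this illustration, and the device node you pass in (typically something like /dev/media0) is an assumption about your system.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/media.h>

/* Hypothetical helper: print every entity of a media device.
 * Returns the number of entities found, or -1 if the device node
 * could not be opened. */
int list_entities(const char *devnode)
{
    struct media_entity_desc entity;
    int count = 0;
    int fd = open(devnode, O_RDONLY);

    if (fd < 0)
        return -1;

    memset(&entity, 0, sizeof(entity));
    /* MEDIA_ENT_ID_FLAG_NEXT asks for the first entity with a higher id. */
    entity.id = MEDIA_ENT_ID_FLAG_NEXT;

    /* The ioctl fails once there are no more entities to report. */
    while (ioctl(fd, MEDIA_IOC_ENUM_ENTITIES, &entity) == 0) {
        printf("entity %u: %s (%u pads, %u links)\n",
               entity.id, entity.name,
               (unsigned int)entity.pads, (unsigned int)entity.links);
        count++;
        entity.id |= MEDIA_ENT_ID_FLAG_NEXT;
    }

    close(fd);
    return count;
}
```

Calling list_entities("/dev/media0") on a board with a media device should print one line per entity. In practice the media-ctl tool does all of this (and much more) for you.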

/media/media-ctl-graph.png

Let's go through the entities in the picture. All these entities are of course specific to the iMX6 SoC. (Partly taken from the kernel documentation.)

imx219 1-0010

This is the camera sensor. The sensor is controlled with I2C commands and the data stream is over the MIPI CSI-2 interface. The name tells us that the sensor is connected to I2C bus 1. The device has the address 0x10.

The entity has one source pad.
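The naming of the sensor entity can be picked apart programmatically. The sketch below assumes the entity name has the form "<driver> <bus>-<address in hex>", as in "imx219 1-0010"; the function name parse_i2c_entity() is hypothetical and only serves as an illustration.

```c
#include <stdio.h>

/* Hypothetical helper: split an entity name such as "imx219 1-0010"
 * into driver name, I2C bus number and I2C address.
 * Returns 0 on success, -1 if the name does not match the pattern. */
int parse_i2c_entity(const char *entity, char driver[32],
                     int *bus, unsigned int *addr)
{
    /* "<driver> <bus>-<address in hex>" */
    if (sscanf(entity, "%31s %d-%x", driver, bus, addr) != 3)
        return -1;

    return 0;
}
```

For "imx219 1-0010" this gives the driver name imx219, bus 1 and address 0x10. Entities that are not I2C devices, such as imx6-mipi-csi2, do not match the pattern and make the function return -1.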

imx6-mipi-csi2

This is the MIPI CSI-2 receiver entity. It has one sink pad to receive the MIPI CSI-2 stream (usually from a MIPI CSI-2 camera sensor). It has four source pads, corresponding to the four MIPI CSI-2 demuxed virtual channel outputs. Multiple source pads can be enabled to independently stream from multiple virtual channels.

ipuX_csiY_mux

These are the video multiplexers. They have two or more sink pads to select from either camera sensors with a parallel interface, or from MIPI CSI-2 virtual channels from the imx6-mipi-csi2 entity. They have a single source pad that routes to a CSI (ipuX_csiY entities).

ipuX_csiY

These are the CSI entities. They have a single sink pad receiving from either a video mux or from a MIPI CSI-2 virtual channel as described above.

ipuX_vdic

The VDIC carries out motion compensated de-interlacing, with three motion compensation modes: low, medium, and high motion. The mode is specified with the menu control V4L2_CID_DEINTERLACING_MODE. The VDIC has two sink pads and a single source pad.

ipuX_ic_prp

This is the IC pre-processing entity. It acts as a router, routing data from its sink pad to one or both of its source pads.

The direct sink pad receives from an ipuX_csiY direct pad. With this link the VDIC can only operate in high motion mode.

ipuX_ic_prpenc

This is the IC pre-processing encode entity. It has a single sink pad from ipuX_ic_prp, and a single source pad. The source pad is routed to a capture device node, with a node name of the format "ipuX_ic_prpenc capture".

This entity performs the IC pre-process encode task operations: color-space conversion, resizing (downscaling and upscaling), horizontal and vertical flip, and 90/270 degree rotation. Flip and rotation are provided via standard V4L2 controls.

Like the ipuX_csiY IDMAC source, this entity also supports simple de-interlace without motion compensation, and pixel reordering.

ipuX_ic_prpvf

This is the IC pre-processing viewfinder entity. It has a single sink pad from ipuX_ic_prp, and a single source pad. The source pad is routed to a capture device node, with a node name of the format "ipuX_ic_prpvf capture".

This entity is identical in operation to ipuX_ic_prpenc, with the same resizing and CSC operations and flip/rotation controls. It will receive and process de-interlaced frames from the ipuX_vdic if ipuX_ic_prp is receiving from ipuX_vdic.

Capture video stream from sensor

In order to capture a video stream from the sensor we need to:

  1. Create links between the needed entities
  2. Configure pads to hold the correct image format

To do this, we use the media-ctl [1] tool.

Configure pads

We also need to configure each pad with the right format. This image sensor outputs raw Bayer data (SRGGB8).

export fmt=SRGGB8_1X8/640x480
media-ctl --set-v4l2 "'imx219 1-0010':0[fmt:$fmt field:none]"
media-ctl --set-v4l2 "'imx6-mipi-csi2':1[fmt:$fmt field:none]"
media-ctl --set-v4l2 "'ipu1_csi0_mux':5[fmt:$fmt field:none]"
media-ctl --set-v4l2 "'ipu1_csi0':1[fmt:$fmt field:none]"

Stream to framebuffer

Now a full pipeline is set up from the imx219 to the video0 device.

GStreamer is a handy multimedia framework that we can use to test the full chain:

gst-launch-1.0 -vvv v4l2src device=/dev/video0 io-mode=dmabuf blocksize=76800 ! "video/x-bayer,format=rggb,width=640,height=480,framerate=30/1" ! queue ! bayer2rgbneon ! videoconvert ! fbdevsink sync=false
/media/imx219.jpg

Parsing command line options

Parsing command line options is something almost every command or application needs to handle in some way, and there are too many home-made argument parsers out there. As so many programs need to parse options from the command line, this facility is encapsulated in a standard library function, getopt(3).

The GNU C library provides an even more sophisticated API for parsing the command line, argp(), which is described in the glibc manual [1]. However, this function is not portable.

There are also many libraries that provide such facilities, but let's stick to what glibc provides.

Command line options

A typical UNIX command takes options in the following form

command [option] arguments

The options have the form of a hyphen (-) followed by a unique character and a possible argument. If an option takes an argument, it may be separated from that argument by white space. When multiple options are specified, they can be grouped after a single hyphen, and the last option in the group may be the only one that takes an argument.

Example on single option

ls -l

Example on grouped options

ls -lI *hidden* .

In the example above, -l (long listing format) does not take an argument, but -I (ignore) takes *hidden* as its argument.

Long options

It is not unusual for a command to allow both a short (-I) and a long (--ignore) option syntax. A long option begins with two hyphens, and the option itself is identified by a word. If the option takes an argument, it may be separated from that argument by a =.

To parse such options, use the getopt_long(3) glibc function, or the (nonportable) argp().

Example using getopt_long()

getopt_long() is quite simple to use. First we create an array of struct option and define the following fields for each element:

  • name
    is the name of the long option.
  • has_arg
    is: no_argument (or 0) if the option does not take an argument; required_argument (or 1) if the option requires an argument; or optional_argument (or 2) if the option takes an optional argument.
  • flag
    specifies how results are returned for a long option. If flag is NULL, then getopt_long() returns val. (For example, the calling program may set val to the equivalent short option character.) Otherwise, getopt_long() returns 0, and flag points to a variable which is set to val if the option is found, but left unchanged if the option is not found.
  • val
    is the value to return, or to load into the variable pointed to by flag.

The last element of the array has to be filled with zeros.
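The flag field is probably the least obvious one, so here is a small sketch that uses both variants side by side. The option names --verbose and --name, the variable verbose and the function parse_flag_demo() are made up for this illustration.

```c
#include <getopt.h>
#include <stdio.h>

int verbose; /* written directly by getopt_long() through the flag pointer */

/* Hypothetical helper: returns 0 on success and -1 on an unknown option. */
int parse_flag_demo(int argc, char *argv[], const char **name)
{
    static struct option long_options[] = {
        /* flag != NULL: getopt_long() stores val (1) in verbose and returns 0 */
        {"verbose", no_argument,       &verbose, 1  },
        /* flag == NULL: getopt_long() returns val ('n') */
        {"name",    required_argument, NULL,     'n'},
        {0,         0,                 0,        0  }
    };
    int opt;

    verbose = 0;
    *name = NULL;
    optind = 1; /* restart scanning in case we are called more than once */

    while ((opt = getopt_long(argc, argv, "n:", long_options, NULL)) != -1) {
        switch (opt) {
        case 0:   /* an option with a flag pointer was handled for us */
            break;
        case 'n':
            *name = optarg;
            break;
        default:
            return -1;
        }
    }

    return 0;
}
```

With the arguments --verbose --name=rod the variable verbose is set to 1 without any code in the switch statement, while the name is picked up in the 'n' case as usual.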

The next step is to iterate through all options and take care of the arguments.

Example code

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <getopt.h>

struct arguments
{
    int a;
    int b;
    int c;
    int area;
    int perimeter;
};


void print_usage() {
    printf("Usage: triangle [Ap] -a num -b num -c num\n");
}

int main(int argc, char *argv[]) {
    int opt= 0;
    struct arguments arguments;

    /* Default values. */
    arguments.a = -1;
    arguments.b = -1;
    arguments.c = -1;
    arguments.area = 0;
    arguments.perimeter = 0;


    static struct option long_options[] = {
        {"area",      no_argument,       0,  'A' },
        {"perimeter", no_argument,       0,  'p' },
        {"hypotenuse",required_argument, 0,  'c' },
        {"opposite",  required_argument, 0,  'a' },
        {"adjecent",  required_argument, 0,  'b' },
        {0,           0,                 0,  0   }
    };

    int long_index =0;
    while ((opt = getopt_long(argc, argv,"Apa:b:c:",
                   long_options, &long_index )) != -1) {
        switch (opt) {
             case 'A':
                 arguments.area = 1;
                 break;
             case 'p':
                 arguments.perimeter = 1;
                 break;
             case 'a':
                 arguments.a = atoi(optarg);
                 break;
             case 'b':
                 arguments.b = atoi(optarg);
                 break;
             case 'c':
                 arguments.c = atoi(optarg);
                 break;
             default: print_usage();
                 exit(EXIT_FAILURE);
        }
    }


    if (arguments.a == -1 || arguments.b == -1 || arguments.c == -1) {
        print_usage();
        exit(EXIT_FAILURE);
    }

    if (arguments.area) {
        arguments.area = (arguments.a*arguments.b)/2;
        printf("Area: %d\n",arguments.area);
    }

    if (arguments.perimeter) {
        arguments.perimeter = arguments.a + arguments.b + arguments.c;
        printf("Perimeter: %d\n",arguments.perimeter);
    }

    return 0;
}

Example of usages

Full example with short options

[13:49:00]marcus@little:~/tmp/cmdline$ ./getopt  -Ap -a 3 -b 4 -c 5
Area: 6
Perimeter: 12

Missing -c option

[14:07:37]marcus@little:~/tmp/cmdline$ ./getopt  -Ap -a 3 -b 4
Usage: triangle [Ap] -a num -b num -c num

Full example with long options

[14:09:38]marcus@little:~/tmp/cmdline$ ./getopt  --area --perimeter --opposite 3 --adjecent 4 --hypotenuse 5
Area: 6
Perimeter: 12

Invalid options

[14:10:14]marcus@little:~/tmp/cmdline$ ./getopt  --area --perimeter --opposite 3 --adjecent 4 -j=3
./getopt: invalid option -- 'j'
Usage: triangle [Ap] -a num -b num -c num

Full example with mixed syntaxes

[14:09:38]marcus@little:~/tmp/cmdline$ ./getopt  -A --perimeter --opposite=3 -b4 -c 5
Area: 6
Perimeter: 12

Variants

getopt_long_only() is like getopt_long(), but '-' as well as "--" can indicate a long option. If an option that starts with '-' (not "--") doesn't match a long option, but does match a short option, it is parsed as a short option instead.
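A minimal sketch of this variant, reusing the --area and --perimeter options from the triangle example. The function parse_long_only() and the WANT_* bit names are invented for this illustration.

```c
#include <getopt.h>
#include <stdio.h>

#define WANT_AREA      (1 << 0)
#define WANT_PERIMETER (1 << 1)

/* Hypothetical helper: returns a bitmask of the requested calculations,
 * or -1 on an unknown option. */
int parse_long_only(int argc, char *argv[])
{
    static struct option long_options[] = {
        {"area",      no_argument, 0, 'A'},
        {"perimeter", no_argument, 0, 'p'},
        {0,           0,           0, 0  }
    };
    int opt;
    int wanted = 0;

    optind = 1; /* restart scanning in case we are called more than once */

    /* Same arguments as getopt_long(), but "-area" is accepted as well. */
    while ((opt = getopt_long_only(argc, argv, "Ap",
                                   long_options, NULL)) != -1) {
        switch (opt) {
        case 'A':
            wanted |= WANT_AREA;
            break;
        case 'p':
            wanted |= WANT_PERIMETER;
            break;
        default:
            return -1;
        }
    }

    return wanted;
}
```

Here both -area and --area select the area calculation, while -p is still handled as the short option.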

Example using argp()

argp() is more flexible and powerful than getopt() and friends, but it is not part of the POSIX standard and is therefore not portable between different POSIX-compatible operating systems. However, argp() provides a few interesting features that getopt() does not.

These features include automatically producing output in response to the ‘--help’ and ‘--version’ options, as described in the GNU coding standards. Using argp makes it less likely that programmers will neglect to implement these additional options or keep them up to date.

The implementation is pretty straightforward and similar to getopt(), with a few notes.

const char *argp_program_version = "Triangle 1.0";
const char *argp_program_bug_address = "<marcus.folkesson@combitech.se>";

These are used in the automatically generated output for the --help and --version options.

struct argp_option

This structure specifies a single option that an argp parser understands, as well as how to parse and document that option. It has the following fields:

  • const char *name
    The long name for this option, corresponding to the long option --name; this field may be zero if this option only has a short name. To specify multiple names for an option, additional entries may follow this one, with the OPTION_ALIAS flag set. See Argp Option Flags.
  • int key
    The integer key provided by the current option to the option parser. If key has a value that is a printable ASCII character (i.e., isascii (key) is true), it also specifies a short option ‘-char’, where char is the ASCII character with the code key.
  • const char *arg
    If non-zero, this is the name of an argument associated with this option, which must be provided (e.g., with the --name=value or -char value syntaxes), unless the OPTION_ARG_OPTIONAL flag (see Argp Option Flags) is set, in which case it may be provided.
  • int flags
    Flags associated with this option, some of which are referred to above. See Argp Option Flags.
  • const char *doc
    A documentation string for this option, for printing in help messages.

If both the name and key fields are zero, this string will be printed tabbed left from the normal option column, making it useful as a group header. This will be the first thing printed in its group. In this usage, it’s conventional to end the string with a : character.
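The group header case can be sketched like this. The header text, the struct sides and the function parse_sides() are made up for this illustration, and the ARGP_NO_EXIT flag is only used so that the function returns an error code instead of terminating the program on bad input.

```c
#include <argp.h>
#include <stdlib.h>

struct sides {
    int a;
    int b;
};

static struct argp_option demo_options[] = {
    /* name and key are both zero: doc is printed as a group header */
    {0,          0,   0,       0, "Triangle sides:"},
    {"opposite", 'a', "VALUE", 0, "Length of the opposite side"},
    {"adjacent", 'b', "VALUE", 0, "Length of the adjacent side"},
    {0}
};

static error_t demo_parse(int key, char *arg, struct argp_state *state)
{
    struct sides *s = state->input;

    switch (key) {
    case 'a':
        s->a = atoi(arg);
        break;
    case 'b':
        s->b = atoi(arg);
        break;
    default:
        return ARGP_ERR_UNKNOWN;
    }
    return 0;
}

static struct argp demo_argp = { demo_options, demo_parse, 0, "Group header demo" };

/* Hypothetical helper: returns 0 on success, an error code otherwise. */
int parse_sides(int argc, char **argv, struct sides *s)
{
    s->a = -1;
    s->b = -1;
    return argp_parse(&demo_argp, argc, argv, ARGP_NO_EXIT, 0, s);
}
```

In the --help output the two options should then be listed under the "Triangle sides:" header.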

Example code

Example code with a few more comments

#include <stdio.h>
#include <stdlib.h>
#include <argp.h>

const char *argp_program_version = "Triangle 1.0";
const char *argp_program_bug_address = "<marcus.folkesson@combitech.se>";

/* Program documentation. */
static char doc[] = "Triangle example";

/* A description of the arguments we accept. */
static char args_doc[] = "ARG1 ARG2";

/* The options we understand. */
static struct argp_option options[] = {
    {"area",        'A',    0,  0,  "Calculate area"},
    {"perimeter",   'p',    0,  0,  "Calculate perimeter"},
    {"hypotenuse",  'c',    "VALUE",  0,  "Specify hypotenuse of the triangle"},
    {"opposite",    'b',    "VALUE",  0,  "Specify opposite of the triangle"},
    {"adjecent",    'a',    "VALUE",  0,  "Specify adjecent of the triangle"},
    { 0 }
};

/* Used by main to communicate with parse_opt. */
struct arguments
{
    int a;
    int b;
    int c;
    int area;
    int perimeter;
};

/* Parse a single option. */
static error_t parse_opt (int key, char *arg, struct argp_state *state)
{
    struct arguments *arguments = (struct arguments*)state->input;

    switch (key) {
        case 'a':
            arguments->a = atoi(arg);
            break;
        case 'b':
            arguments->b = atoi(arg);
            break;
        case 'c':
            arguments->c = atoi(arg);
            break;
        case 'p':
            arguments->perimeter = 1;
            break;
        case 'A':
            arguments->area = 1;
            break;

        default:
            return ARGP_ERR_UNKNOWN;
    }
    return 0;
}

/* Our argp parser. */
static struct argp argp = { options, parse_opt, args_doc, doc };

int
main (int argc, char **argv)
{
    struct arguments arguments;

    /* Default values. */
    arguments.a = -1;
    arguments.b = -1;
    arguments.c = -1;
    arguments.area = 0;
    arguments.perimeter = 0;

    /* Parse our arguments; every option seen by parse_opt will
     * be reflected in arguments. */
    argp_parse (&argp, argc, argv, 0, 0, &arguments);


    if (arguments.a == -1 || arguments.b == -1 || arguments.c == -1) {
        exit(EXIT_FAILURE);
    }

    if (arguments.area) {
        arguments.area = (arguments.a*arguments.b)/2;
        printf("Area: %d\n",arguments.area);
    }

    if (arguments.perimeter) {
        arguments.perimeter = arguments.a + arguments.b + arguments.c;
        printf("Perimeter: %d\n",arguments.perimeter);
    }

    return EXIT_SUCCESS;
}

Example of usages

This application gives the same output as the getopt() version, with the following extra features:

The options --help, --usage and --version are automatically generated

[15:53:04]marcus@little:~/tmp/cmdline$ ./argp --help
Usage: argp [OPTION...] ARG1 ARG2
Triangle example

  -a, --adjecent=VALUE       Specify adjecent of the triangle
  -A, --area                 Calculate area
  -b, --opposite=VALUE       Specify opposite of the triangle
  -c, --hypotenuse=VALUE     Specify hypotenuse of the triangle
  -p, --perimeter            Calculate perimeter
  -?, --help                 Give this help list
      --usage                Give a short usage message
  -V, --version              Print program version

Mandatory or optional arguments to long options are also mandatory or optional
for any corresponding short options.

Report bugs to <marcus.folkesson@combitech.se>.

Version information

[15:53:08]marcus@little:~/tmp/cmdline$ ./argp --version
Triangle 1.0

Conclusion

Parsing command line options is simple. argp() provides a lot of features that I really appreciate.

When portability is not an issue, I always go for argp() since, besides the extra features, the interface is more appealing.

Embedded Linux Conference 2019

Here we go again! This trip got exciting even before it began. I checked my passport the day before we were to leave and noticed that it had expired. Ouch. Fortunately, I was able to get a temporary passport at the airport. I must admit that I do not travel that often and do not have these 'must-checks' in my muscle memory.

This time we were heading to Lyon in France. The weather is not the best, but at least it is not freezing cold as it is in Sweden at this time of the year.

The conference

The conference this year is as good as usual. Somehow, my focus has shifted from the technical talks to actually connecting and talking with people. Of course, I have a group of people that I always meet (it is mostly the same people who show up at the conferences, after all), but I have met far more people than I used to. Am I beginning to be social? Anyway, as said before, it is fun to put a face to the patches I've reviewed or got comments on.

The talks

I mostly go for the "heavy technical" talks, but the talk I appreciated most this year had a very low technical level. The talk was given by Gardena [1], a company that makes gardening tools. Yes, water hoses and stuff. They described their journey from a product family that historically had no software at all to a full-blown embedded Linux system, with all the legal implications that you can encounter with open source licenses. Gardena was sued for violating the GPL, which could have become a very costly story. What Gardena did was absolutely the best way to handle it, and it was really nice to hear about. The outcome is that Gardena now has a public GitHub account [2] containing the software for their Garden Gateway products [3]. Support for the SoC they are using is not only published, but also mainlined(!!).

Gardena hired contractors from Denx [4] to mainline U-Boot and the Linux kernel, and also hired the maintainer of the radio chip they were using. Thanks to this, all open parts of their product are mainlined, and Gardena even managed to get the radio certified, which has helped at least two other companies.

Hiring the right folks for the right tasks was really the best thing Gardena could do. The radio chip maintainer fixed their problem in 48 man-hours, something that could have taken Gardena months to fix. The estimated cost of all this mainlining work was only 10% of their budget, which is really nothing. It also made it possible for Gardena to put their products on the market in time. One big bonus is that maintenance is far easier when the code is mainlined.

This is also how we work at Combitech. Linux is not part of our customers' "core competence", and it really should not be. Linux is our core competence, which is why our customers let us take care of "our thing", in other words Linux development.

But why does all this make me so happy? First of all, the whole Open Source community is really a big thing to me. It has influenced both my career choices and my view on software. In fact, I'm not even sure that I would have enjoyed programming without Open Source.

So, Gardena, Keep up the good work!

/media/elce2019.jpg

libsegfault.so

The dynamic linker [1] in a Linux system uses several environment variables to customize its behavior. The most commonly used is probably LD_LIBRARY_PATH, which is a list of directories that it searches for libraries at execution time. Another variable I use quite often is LD_TRACE_LOADED_OBJECTS, which makes the program list its dynamic dependencies, just like ldd(1).

For example, consider the following output

$ LD_TRACE_LOADED_OBJECTS=1 /bin/bash
    linux-vdso.so.1 (0x00007ffece29e000)
    libreadline.so.7 => /usr/lib/libreadline.so.7 (0x00007fc9b82d1000)
    libdl.so.2 => /usr/lib/libdl.so.2 (0x00007fc9b80cd000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007fc9b7d15000)
    libncursesw.so.6 => /usr/lib/libncursesw.so.6 (0x00007fc9b7add000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fc9b851f000)
    libtinfo.so.6 => /usr/lib/libtinfo.so.6 (0x00007fc9b78b1000)

LD_PRELOAD

LD_PRELOAD is a list of additional shared objects that should be loaded before all other dynamic dependencies. When the loader resolves symbols, it walks sequentially through the list of dynamic shared objects and takes the first match. This makes it possible to override functions in other shared objects and change the behavior of the application completely.

Consider the following example

$ LD_PRELOAD=/usr/lib/libSegFault.so LD_TRACE_LOADED_OBJECTS=1 /bin/bash
    linux-vdso.so.1 (0x00007ffc73f61000)
    /usr/lib/libSegFault.so (0x00007f131c234000)
    libreadline.so.7 => /usr/lib/libreadline.so.7 (0x00007f131bfe6000)
    libdl.so.2 => /usr/lib/libdl.so.2 (0x00007f131bde2000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007f131ba2a000)
    libncursesw.so.6 => /usr/lib/libncursesw.so.6 (0x00007f131b7f2000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f131c439000)
    libtinfo.so.6 => /usr/lib/libtinfo.so.6 (0x00007f131b5c6000)

Here we have preloaded libSegFault and it is listed in second place. In the first place we have linux-vdso.so.1, which is a Virtual Dynamic Shared Object provided by the Linux kernel. The vDSO deserves its own separate blog post; it is a cool feature that maps kernel code into a process's context as a .text segment in a virtual library.

libSegFault.so

libSegFault.so is part of glibc [2] and comes with your toolchain. The library is intended for debugging and is activated by preloading it at runtime. It does not actually override any functions, but registers signal handlers in a constructor (yes, you can execute code before main) for specified signals. By default, only SIGSEGV (see signal(7)) is registered. These handlers print a backtrace for the application when the signal is delivered. See its implementation in [3].

Set the environment variable SEGFAULT_SIGNALS to explicitly select the signals you want to register a handler for.

/media/libsegfault.png

This is a useful feature for debugging. The best part is that you don't have to recompile your code.

libSegFault in action

Our application

Consider the following real-life application, taken directly from the local nuclear power plant:

void handle_uranium(char *rod)
{
    *rod = 0xAB;
}

void start_reactor()
{
    char *rod = 0x00;
    handle_uranium(rod);
}

int main()
{
    start_reactor();
}

The symptom

We are seeing a segmentation fault when operating on a particular uranium rod, but we don't know why.

Use libSegFault

Start the application with libSegFault preloaded and examine the dump:

$ LD_PRELOAD=/usr/lib/libSegFault.so ./powerplant
*** Segmentation fault
Register dump:

 RAX: 0000000000000000   RBX: 0000000000000000   RCX: 0000000000000000
 RDX: 00007ffdf6aba5a8   RSI: 00007ffdf6aba598   RDI: 0000000000000000
 RBP: 00007ffdf6aba480   R8 : 000055d2ad5e16b0   R9 : 00007f98534729d0
 R10: 0000000000000008   R11: 0000000000000246   R12: 000055d2ad5e14f0
 R13: 00007ffdf6aba590   R14: 0000000000000000   R15: 0000000000000000
 RSP: 00007ffdf6aba480

 RIP: 000055d2ad5e1606   EFLAGS: 00010206

 CS: 0033   FS: 0000   GS: 0000

 Trap: 0000000e   Error: 00000006   OldMask: 00000000   CR2: 00000000

 FPUCW: 0000037f   FPUSW: 00000000   TAG: 00000000
 RIP: 00000000   RDP: 00000000

 ST(0) 0000 0000000000000000   ST(1) 0000 0000000000000000
 ST(2) 0000 0000000000000000   ST(3) 0000 0000000000000000
 ST(4) 0000 0000000000000000   ST(5) 0000 0000000000000000
 ST(6) 0000 0000000000000000   ST(7) 0000 0000000000000000
 mxcsr: 1f80
 XMM0:  00000000000000000000000000000000 XMM1:  00000000000000000000000000000000
 XMM2:  00000000000000000000000000000000 XMM3:  00000000000000000000000000000000
 XMM4:  00000000000000000000000000000000 XMM5:  00000000000000000000000000000000
 XMM6:  00000000000000000000000000000000 XMM7:  00000000000000000000000000000000
 XMM8:  00000000000000000000000000000000 XMM9:  00000000000000000000000000000000
 XMM10: 00000000000000000000000000000000 XMM11: 00000000000000000000000000000000
 XMM12: 00000000000000000000000000000000 XMM13: 00000000000000000000000000000000
 XMM14: 00000000000000000000000000000000 XMM15: 00000000000000000000000000000000

Backtrace:
./powerplant(+0x606)[0x55d2ad5e1606]
./powerplant(+0x628)[0x55d2ad5e1628]
./powerplant(+0x639)[0x55d2ad5e1639]
/usr/lib/libc.so.6(__libc_start_main+0xea)[0x7f9852ec6f6a]
./powerplant(+0x51a)[0x55d2ad5e151a]

Memory map:

55d2ad5e1000-55d2ad5e2000 r-xp 00000000 00:13 14897704                   /home/marcus/tmp/segfault/powerplant
55d2ad7e1000-55d2ad7e2000 r--p 00000000 00:13 14897704                   /home/marcus/tmp/segfault/powerplant
55d2ad7e2000-55d2ad7e3000 rw-p 00001000 00:13 14897704                   /home/marcus/tmp/segfault/powerplant
55d2ada9c000-55d2adabd000 rw-p 00000000 00:00 0                          [heap]
7f9852c8f000-7f9852ca5000 r-xp 00000000 00:13 13977863                   /usr/lib/libgcc_s.so.1
7f9852ca5000-7f9852ea4000 ---p 00016000 00:13 13977863                   /usr/lib/libgcc_s.so.1
7f9852ea4000-7f9852ea5000 r--p 00015000 00:13 13977863                   /usr/lib/libgcc_s.so.1
7f9852ea5000-7f9852ea6000 rw-p 00016000 00:13 13977863                   /usr/lib/libgcc_s.so.1
7f9852ea6000-7f9853054000 r-xp 00000000 00:13 13975885                   /usr/lib/libc-2.26.so
7f9853054000-7f9853254000 ---p 001ae000 00:13 13975885                   /usr/lib/libc-2.26.so
7f9853254000-7f9853258000 r--p 001ae000 00:13 13975885                   /usr/lib/libc-2.26.so
7f9853258000-7f985325a000 rw-p 001b2000 00:13 13975885                   /usr/lib/libc-2.26.so
7f985325a000-7f985325e000 rw-p 00000000 00:00 0
7f985325e000-7f9853262000 r-xp 00000000 00:13 13975827                   /usr/lib/libSegFault.so
7f9853262000-7f9853461000 ---p 00004000 00:13 13975827                   /usr/lib/libSegFault.so
7f9853461000-7f9853462000 r--p 00003000 00:13 13975827                   /usr/lib/libSegFault.so
7f9853462000-7f9853463000 rw-p 00004000 00:13 13975827                   /usr/lib/libSegFault.so
7f9853463000-7f9853488000 r-xp 00000000 00:13 13975886                   /usr/lib/ld-2.26.so
7f9853649000-7f985364c000 rw-p 00000000 00:00 0
7f9853685000-7f9853687000 rw-p 00000000 00:00 0
7f9853687000-7f9853688000 r--p 00024000 00:13 13975886                   /usr/lib/ld-2.26.so
7f9853688000-7f9853689000 rw-p 00025000 00:13 13975886                   /usr/lib/ld-2.26.so
7f9853689000-7f985368a000 rw-p 00000000 00:00 0
7ffdf6a9b000-7ffdf6abc000 rw-p 00000000 00:00 0                          [stack]
7ffdf6bc7000-7ffdf6bc9000 r--p 00000000 00:00 0                          [vvar]
7ffdf6bc9000-7ffdf6bcb000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

At first glance, the information may feel overwhelming, but let's go through the most important lines.

The backtrace lists the call chain at the moment the signal was delivered to the application. The first entry is the top of the stack.

Backtrace:
./powerplant(+0x606)[0x55d2ad5e1606]
./powerplant(+0x628)[0x55d2ad5e1628]
./powerplant(+0x639)[0x55d2ad5e1639]
/usr/lib/libc.so.6(__libc_start_main+0xea)[0x7f9852ec6f6a]
./powerplant(+0x51a)[0x55d2ad5e151a]

Here we can see that the last executed instruction is at address 0x55d2ad5e1606. The tricky part is that this address is not an offset within the application binary, but a virtual address in the address space of the whole process. In other words, we need to convert the address to an offset within the application's .text segment. If we look at the memory map, we see three entries for the powerplant application:

55d2ad5e1000-55d2ad5e2000 r-xp 00000000 00:13 14897704                   /home/marcus/tmp/segfault/powerplant
55d2ad7e1000-55d2ad7e2000 r--p 00000000 00:13 14897704                   /home/marcus/tmp/segfault/powerplant
55d2ad7e2000-55d2ad7e3000 rw-p 00001000 00:13 14897704                   /home/marcus/tmp/segfault/powerplant

Why three? Most ELF files (applications or libraries) have at least three memory-mapped sections:

- .text, the executable code
- .rodata, read-only data
- .data, read/write data

With the help of the permissions, it is possible to figure out which mapping corresponds to which section.

The last mapping has rw- as permissions and is probably our .data section as it allows both write and read. The middle mapping has r-- and is a read only mapping - probably our .rodata section. The first mapping has r-x which is read-only and executable. This must be our .text section!
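The reasoning above can be written down as a small heuristic. This is only a sketch: the function name guess_section is made up for illustration, and the permissions give a likely answer, not a guaranteed one.

```python
def guess_section(perms: str) -> str:
    """Guess which section a mapping most likely holds.

    perms is the permission column of a memory map line, e.g. "r-xp".
    Heuristic only: it mirrors the reasoning in the text, nothing more.
    """
    readable = perms[0] == "r"
    writable = perms[1] == "w"
    executable = perms[2] == "x"

    if executable:
        return ".text"      # executable code
    if readable and writable:
        return ".data"      # read/write data
    if readable:
        return ".rodata"    # read-only data
    return "unknown"        # e.g. "---p" guard mappings


# The three permission strings of the powerplant mappings
for perms in ("r-xp", "r--p", "rw-p"):
    print(perms, "->", guess_section(perms))
```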

Now we can take the address from our backtrace and subtract the start address of our .text mapping: 0x55d2ad5e1606 - 0x55d2ad5e1000 = 0x606
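The same calculation can be scripted if you do it often. Below is a minimal sketch. The names Mapping, parse_map_line and offset_in_mapping are my own, and it assumes that the executable mapping starts at file offset 0, which is the case in the dump above. The result matches the +0x606 that the backtrace line already prints in parentheses.

```python
from typing import List, NamedTuple, Optional, Tuple


class Mapping(NamedTuple):
    start: int
    end: int
    perms: str
    path: str


def parse_map_line(line: str) -> Mapping:
    """Parse one line of the memory map into its address range, permissions and path."""
    fields = line.split()
    start_s, end_s = fields[0].split("-")
    path = fields[5] if len(fields) > 5 else ""  # anonymous mappings have no path
    return Mapping(int(start_s, 16), int(end_s, 16), fields[1], path)


def offset_in_mapping(address: int, mappings: List[Mapping]) -> Optional[Tuple[Mapping, int]]:
    """Return the mapping that contains address and the offset from its start."""
    for m in mappings:
        if m.start <= address < m.end:
            return m, address - m.start
    return None


# Three lines copied from the memory map above
SAMPLE_MAP = """\
55d2ad5e1000-55d2ad5e2000 r-xp 00000000 00:13 14897704 /home/marcus/tmp/segfault/powerplant
55d2ad7e1000-55d2ad7e2000 r--p 00000000 00:13 14897704 /home/marcus/tmp/segfault/powerplant
7f985325a000-7f985325e000 rw-p 00000000 00:00 0
"""

mappings = [parse_map_line(line) for line in SAMPLE_MAP.splitlines()]
found = offset_in_mapping(0x55D2AD5E1606, mappings)
if found is not None:
    mapping, offset = found
    print(mapping.path, mapping.perms, hex(offset))
```

The printed offset is the value to pass on to addr2line.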

Use addr2line to get the corresponding line in our source code:

$ addr2line -e ./powerplant -a 0x606
    0x0000000000000606
    /home/marcus/tmp/segfault/main.c:3

If we go back to the source code, we see that line 3 in main.c is

*rod = 0xAB;

Here we have it. Nothing more to say.

Conclusion

libSegFault.so has been a great help for a long time. The biggest benefit is that you don't have to recompile your application to use it. However, addr2line cannot give you the line number if the application is not compiled with debug symbols, but it is often not that hard to figure out the context from a disassembly of your application.

Embedded Linux Conference 2018


Ok, time for another conference. This time in Edinburgh, Scotland. My travel is limited to Edinburgh, but this city has a lot of things to see, including Edinburgh Castle, the Royal Botanic Garden, the clock that is always 3 minutes wrong [1] and lots more. A side note: yes, I've tried haggis as it is a must-try thing, and so should you. But be prepared to buy a backup meal.

The conference this year has a few talks already on Sunday. I'm going for Michael Kerrisk's talk about cgroups (control groups) [2]. Michael is a great speaker, as usual. His book, The Linux Programming Interface [3], is the only book you need about system programming as it covers everything.

I listened to the talk about cgroups and appreciated it a lot.

Control groups (cgroups)

cgroups is a hierarchy of processes with several controllers applied to it. These controllers restrict resources such as memory usage and CPU utilisation for a certain group.

Worth mentioning is that there are currently two versions of cgroups, cgroupv1 and cgroupv2. These are documented in Documentation/cgroup-v1/* and Documentation/admin-guide/cgroup-v2.rst respectively. I've been using cgroupv1 on some system configurations and I can just say that I'm glad that we have a v2. There is no consistency between controllers in v1 and the support for multiple hierarchies is just messy, just to mention a few issues with v1. v2 is different. It has a design in mind (v1 was more of a design by implementation) and a set of design rules. Even better - it has maintainers that make sure the design rules are followed. With all that in mind we hopefully won't end up with a v3... :-)

v2 does not yet have all the functionality that v1 provides, but it is on its way into the kernel. v1 and v2 can coexist, though, as long as they do not use the same controllers.
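If you are curious which versions are in use on a system, the mount table shows it: v1 hierarchies are mounted with the filesystem type cgroup and the v2 hierarchy with cgroup2. Here is a minimal sketch that sorts the mounts by version. The function name cgroup_mounts and the sample text are made up for illustration, and the sample is not taken from a real machine.

```python
from typing import Dict, List


def cgroup_mounts(mounts_text: str) -> Dict[str, List[str]]:
    """Split the cgroup mount points found in /proc/mounts content into v1 and v2."""
    result: Dict[str, List[str]] = {"v1": [], "v2": []}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # not a valid mount line
        mount_point, fs_type = fields[1], fields[2]
        if fs_type == "cgroup2":
            result["v2"].append(mount_point)
        elif fs_type == "cgroup":
            result["v1"].append(mount_point)
    return result


# Hypothetical /proc/mounts content with v1 and v2 mounted side by side
SAMPLE_MOUNTS = """\
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
cgroup2 /sys/fs/cgroup/unified cgroup2 rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
"""

print(cgroup_mounts(SAMPLE_MOUNTS))
```

On a real system you would pass in the content of /proc/mounts instead of the sample text.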

The conference

I enjoyed the conference and met a lot of interesting people. It is really fun to put a face to the patches I have reviewed over email :-)

/media/edinburgh.jpg