Contiguous Memory Allocator

Introduction

I find memory management to be one of the most fascinating subsystems in the Linux kernel, and I take every chance I get to talk about it. This post is inspired by a project I'm currently working on: an embedded Linux platform with a camera connected to the CSI-2 bus.

Before we dig into the problems we could trip over, let's talk briefly about how the kernel handles memory.

Memory subsystem

The memory management subsystem handles a wide spectrum of operations which all have an impact on system performance. The subsystem is therefore divided into several parts to sustain operational efficiency and optimize resource handling for different use cases.

Such parts include:

  • Page allocator
  • Buddy system
  • Kmalloc allocator
  • Slab caches
  • Vmalloc allocator
  • Contiguous memory allocator
  • ...

The smallest allocation unit of memory is a page frame. The Memory Management Unit (MMU) does a terrific job of arranging and mapping these page frames of the available physical memory into a virtual address space. Most allocations in the kernel are only virtually contiguous, which is fine for most use cases.

Some hardware/IP-blocks require physically contiguous memory to work, though. Direct Memory Access (DMA) transfers are one such case where the memory (often) needs to be physically contiguous. Many DMA controllers now support scatter-gather, which lets you hand-pick addresses to make the memory appear contiguous and then let the (IO)MMU do the rest.

For that to work, the hardware/IP-blocks actually have to do their memory accesses through the (IO)MMU, which is not always the case.

Multimedia devices such as GPUs or VPUs often require huge blocks of physically contiguous memory and (with exceptions, see Raspberry Pi 4 below) do not make use of the (IO)MMU.

Contiguous memory

In order to meet this requirement for big chunks of physically contiguous memory, we have to reserve it from the main memory during system boot.

Before CMA, we had to use the mem kernel parameter to limit how much of the system memory should be available to the allocators in the Linux system.

The memory outside this mem region is not touched by the system and can be remapped into a linear address space by the driver.

Here is the documentation for the mem kernel parameter [1]:

mem=nn[KMG]     [KNL,BOOT] Force usage of a specific amount of memory
                Amount of memory to be used in cases as follows:

                1 for test;
                2 when the kernel is not able to see the whole
                system memory;
                3 memory that lies after 'mem=' boundary is
                excluded from the hypervisor, then
                assigned to KVM guests.
                4 to limit the memory available for kdump kernel.

                [ARC,MICROBLAZE] - the limit applies only to low memory,
                high memory is not affected.

                [ARM64] - only limits memory covered by the linear
                mapping. The NOMAP regions are not affected.

                [X86] Work as limiting max address. Use together
                with memmap= to avoid physical address space collisions.
                Without memmap= PCI devices could be placed at addresses
                belonging to unused RAM.

                Note that this only takes effects during boot time since
                in above case 3, memory may need be hot added after boot
                if system memory of hypervisor is not sufficient.

The mem parameter has a few drawbacks: the driver needs details about where to find the reserved memory, and the memory lies unused whenever the driver is not making use of it.

Therefore, the Contiguous Memory Allocator (CMA) was introduced to manage these reserved memory areas.

The benefit of using CMA is that the reserved area is handled by the allocator algorithms instead of by the device driver itself. This lets both devices and the rest of the system allocate and use memory from the CMA area: through the page allocator for regular needs, and through the DMA allocation routines when DMA capabilities are needed.
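
As a minimal sketch (not taken from any particular driver), a driver that needs a large physically contiguous buffer can simply use the ordinary DMA allocation routines; with CONFIG_DMA_CMA enabled, a large allocation like this is typically backed by the CMA area:

#include <linux/device.h>
#include <linux/dma-mapping.h>
#include <linux/sizes.h>

/*
 * Sketch: allocate a 16 MiB physically contiguous buffer for 'dev',
 * which is assumed to be our driver's struct device.
 */
static int alloc_frame_buffer(struct device *dev)
{
    dma_addr_t dma_handle;
    void *cpu_addr;

    cpu_addr = dma_alloc_coherent(dev, SZ_16M, &dma_handle, GFP_KERNEL);
    if (!cpu_addr)
        return -ENOMEM;

    dev_info(dev, "allocated 16 MiB at bus address %pad\n", &dma_handle);

    /* hand dma_handle to the hardware, access the buffer through cpu_addr */
    return 0;
}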

A few words about Raspberry Pi

Raspberry Pi uses a configuration file (config.txt) that is read by the GPU to initialize the system. The configuration file has many tweakable parameters, and one of those is gpu_mem.

This parameter specifies how much memory (in megabytes) to reserve exclusively for the GPU. It works pretty much like the mem kernel command line parameter described above, with the very same drawbacks. The memory reserved for the GPU is not available to the ARM CPU and should be kept as low as your application can work with.

One big difference between the variants of the Raspberry Pi modules is that the Raspberry Pi 4 has a GPU with its own MMU, which allows the GPU to use memory that is dynamically allocated within Linux. gpu_mem can therefore be kept small on that platform.

The GPU is normally used for displays, 3D calculations, codecs and cameras. One important thing regarding the camera is that the default camera stack (libcamera) uses CMA memory to allocate buffers instead of the reserved GPU memory. In cases where the GPU is used only for the camera, gpu_mem can be kept small.

How much CMA is already reserved?

The easiest way to determine how much memory is reserved for CMA is to consult meminfo:

# grep Cma /proc/meminfo
CmaTotal:         983040 kB
CmaFree:          612068 kB

or look at the boot log:

# dmesg | grep CMA
[    0.000000] Reserved memory: created CMA memory pool at 0x0000000056000000, size 960 MiB

Reserve memory with CMA

/media/reserved.jpg

The CMA area is reserved during boot and there are a few ways to do this.

By device tree

This is the preferred way to define CMA areas.

This example is taken from the device tree bindings documentation [2]:

reserved-memory {
    #address-cells = <1>;
    #size-cells = <1>;
    ranges;

    /* global autoconfigured region for contiguous allocations */
    linux,cma {
        compatible = "shared-dma-pool";
        reusable;
        size = <0x4000000>;
        alignment = <0x2000>;
        linux,cma-default;
    };
};

By kernel command line

The CMA area size can also be specified on the kernel command line. There are tons of references out there stating that the command line parameter is overridden by the device tree, but that sounded weird to me, so I looked it up: the kernel command line overrides the device tree, not the other way around.

At least nowadays:

static int __init rmem_cma_setup(struct reserved_mem *rmem)
{
    ...
    if (size_cmdline != -1 && default_cma) {
        pr_info("Reserved memory: bypass %s node, using cmdline CMA params instead\n",
            rmem->name);
        return -EBUSY;
    }
    ...
}

Here is the documentation for the cma kernel parameter [1]:

cma=nn[MG]@[start[MG][-end[MG]]]
                [KNL,CMA]
                Sets the size of kernel global memory area for
                contiguous memory allocations and optionally the
                placement constraint by the physical address range of
                memory allocations. A value of 0 disables CMA
                altogether. For more information, see
                kernel/dma/contiguous.c
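
For example (size picked purely for illustration), adding cma=256M to the kernel command line reserves a 256 MiB CMA area, and the @start-end part of the syntax above can be appended to constrain where in physical memory it is placed.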

By kernel configuration

The kernel configuration can be used to set a fixed size, a min/max selection, or even a percentage of the available memory to reserve for the CMA area:

CONFIG_CMA
CONFIG_CMA_AREAS
CONFIG_DMA_CMA
CONFIG_DMA_PERNUMA_CMA
CONFIG_CMA_SIZE_MBYTES
CONFIG_CMA_SIZE_SEL_MBYTES
CONFIG_CMA_SIZE_SEL_PERCENTAGE
CONFIG_CMA_SIZE_SEL_MIN
CONFIG_CMA_SIZE_SEL_MAX
CONFIG_CMA_ALIGNMENT

Conclusion

As soon as we use camera devices with higher resolutions and do the image manipulation in the VPU/GPU, we almost always have to increase the CMA area size. Otherwise we will end up with errors like this:

cma_alloc: alloc failed, req-size: 8192 pages, ret: -12
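
In this example the failed request was for 8192 pages; with the common 4 KiB page size that is a single 32 MiB physically contiguous allocation (8192 * 4096 bytes), and ret: -12 is -ENOMEM.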

Audio and Embedded Linux

Brief

Last time I wrote kernel drivers for the ASoC (ALSA System on Chip) subsystem, the functionality was split up into these parts:

  • Platform class drivers that define the SoC audio interface for the actual CPU itself. This includes both the DAI (Digital Audio Interface) and any potential audio muxes (e.g. the i.MX6 has its AUDMUX).
  • CODEC class drivers that control the actual CODEC.
  • Machine drivers that are the magic glue between the SoC and the CODEC, connecting the two interfaces. Such a driver had to be written for each SoC-CODEC combination, and that does not scale very well.

Nowadays, most CODEC class drivers can instead be described with simple-audio-card [1] in a device tree, which completely replaces the machine drivers.

The goal of this post is to describe my work to set up a 20 W mono class-D audio amplifier with an i.MX8MM board.

General

The configuration of the CODEC is usually done over an I2C bus, even if other simple buses like SPI can be used as well. While the configuration is sent over this simple bus, the audio data is sent over a completely different bus.

Audio data can be transferred in many different formats such as AC97, PCM or I2S.

To abstract this bus and handle it in a common way, we will just call it the DAI, for Digital Audio Interface.

Different SoCs of course have different names for this as well. For example, Texas Instruments has its McASP, NXP uses SSI, Atmel SSC and so on. We will call it DAI throughout.

Serial audio formats

AC97

AC97 is a commonly found interface on many PC cards; it is not that popular in embedded devices though. It is a five wire interface with:

  • A reset line
  • SDATA_OUT for playback
  • SDATA_IN for capture
  • BCLK as bit clock, which is always driven by the CODEC
  • A SYNC line for frame synchronization

See the specification [4] for further reading.

I2S

I2S is a common 5-wire DAI often used in embedded systems. The TX (SDOUT) and RX (SDIN) lines are used for audio transmission, while the bit and frame clocks are used for synchronization.

The signals are:

  • Master clock or system clock, often referred to as MCLK, is the clock from which the other clocks are derived. It also clocks the CODEC.
  • Bit clock, often referred to as BCK or BCLK, varies depending on the sample rate.
  • Frame clock, often referred to as LRCLK (Left-Right Clock), FCLK (Frame Clock) or WCLK (Word Clock).
  • Audio out, SDOUT
  • Audio In, SDIN

The relationship between BCLK and LRCLK is

bclk = (sample rate) * Nchannels * (bit depth)
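
For example, a 44.1 kHz stereo stream with 16-bit samples gives bclk = 44100 * 2 * 16 = 1411200 Hz, a number we will run into again further down.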

Some CODECs are able to use BCLK as their only clock, leaving MCLK optional. The CODEC we will use supports this, and it is something we have to make use of due to HW constraints on the number of signals that fit in a connector.

This is an illustration of the timing on the I2S bus with 64 BCLKs per LRCLK. Borrowed from the datasheet [5]:

/media/i2s.jpg

I2S can be used with TDM format timing to support more audio channels on the same I2S bus. The timing will then look like this [5]:

/media/i2s-tdm.jpg

PCM

PCM is a 4 wire interface that is quite similar to I2S. Same same but different.

Clocks

We have several clocks: the bit clock, the frame clock and the master clock. It is not written in stone which endpoint of the DAI should generate these clocks; it is up to us to decide.

Either the SoC or the CODEC generates some or all of the clocks, and the endpoint generating a clock is called the clock master for it (e.g. bit clock master or frame clock master).

It is often easiest to let the CODEC generate all clocks, but some SoCs have specialized audio PLLs for this. In our case, the SoC will be the clock master.

The Hardware

The SoC

The board we are going to use is an evaluation board for an i.MX8MM module [2]. The CPU module exposes two I2S buses and we are going to use one of them.

/media/sm2simx8m.jpg

The CODEC

The CODEC we will use is the TAS5720L [3] from Texas Instruments, which has been supported in mainline since v4.6.

/media/tas5720l.jpg

The TAS5720L Serial Audio Interface (SAIF) supports a variety of audio formats including I2S, left-justified and right-justified. It also supports the time division multiplexed (TDM) format, which is capable of transporting up to 8 channels of audio data on a single bus.

It uses I2C as configuration interface.

We will use I2S with TDM as DAI and I2C as configuration interface.

The Software

As we have mostly got rid of the machine drivers and can describe the CODEC bindings using the device tree, the setup is mostly an exercise in device tree writing rather than in C.

The device tree node to setup the sound card is simple-audio-card [6].

SAI node

The Synchronous Audio Interface (SAI) module is the HW block of the i.MX8 SoC that is used to generate the digital audio.

We are going to use the SAI5 interface as it is routed out from the sm2s-imx8mm module. The node is already properly configured in an include (.dtsi) file, so we only have to enable it:

&sai5 {
    status = "okay";
};

CODEC node

The TAS5720L is connected to the I2C3 bus and responds to slave address 0x6c. Besides the compatible and reg properties, the node also requires a phandle to a 3V3 supply that powers the digital circuitry and a phandle to the supply that powers the Class-D amp and analog part.

The hardware does not have such controllable supplies so we have to create fixed regulators for that:

/ {
    reg_audio_p: regulator-audio-pvdd {
        compatible = "regulator-fixed";
        regulator-name = "audio power";
        pinctrl-names = "default";
        regulator-min-microvolt = <12000000>;
        regulator-max-microvolt = <12000000>;
    };

    reg_audio_d: regulator-audio-dvdd {
        compatible = "regulator-fixed";
        regulator-name = "audio digital";
        pinctrl-names = "default";
        regulator-min-microvolt = <3300000>;
        regulator-max-microvolt = <3300000>;
    };
};

And the device node for the CODEC itself:

&i2c3 {

    tas5720: tas5720@6c {
            #sound-dai-cells = <0>;
            reg = <0x6c>;
            compatible = "ti,tas5720";

            dvdd-supply = <&reg_audio_d>;
            pvdd-supply = <&reg_audio_p>;
    };
};

Sound node

Now it is time to setup the sound node!

First we have to specify which audio format we intend to use by setting simple-audio-card,format to i2s.

We also have to setup the two DAIs (CPU & CODEC) that we are going to use.

This is done by creating sub-nodes that refer to the SAI module node and the CODEC node through the sound-dai property respectively.

These sub-nodes are referred to when assigning frame-master and bitclock-master in the sound node. As we want the SoC to generate both the frame and bit clocks, we set cpudai as the clock master for both.

/ {
    sound-tas5720 {
        compatible = "simple-audio-card";
        simple-audio-card,name = "tas5720-audio";
        simple-audio-card,format = "i2s";
        simple-audio-card,frame-master = <&cpudai>;
        simple-audio-card,bitclock-master = <&cpudai>;

        cpudai: simple-audio-card,cpu {
            sound-dai = <&sai5>;
            clocks = <&clk IMX8MM_CLK_SAI5_ROOT>;

        };

        simple-audio-card,codec {
            sound-dai = <&tas5720>;
            clocks = <&clk IMX8MM_CLK_SAI5_ROOT>;
        };
    };
};

Sound test

Now we should have everything in place!

Let's use speaker-test, which is part of alsa-utils [8], to test our setup.

root@imx8board:~# speaker-test

speaker-test 1.2.5.1

Playback device is default
Stream parameters are 44000Hz, S16_LE, 1 channels
Using 16 octaves of pink noise
[   12.257438] fsl-sai 30050000.sai: failed to derive required Tx rate: 1411200

That did not turn out well.

Debug clock signals

Let's look at what our clock tree looks like:

root@imx8board:~# cat /sys/kernel/debug/clk/clk_summary
    ...
    audio_pll2_ref_sel                0        0        0    24000000          0     0  50000
       audio_pll2                     0        0        0   361267200          0     0  50000
          audio_pll2_bypass           0        0        0   361267200          0     0  50000
             audio_pll2_out           0        0        0   361267200          0     0  50000
    audio_pll1_ref_sel                0        0        0    24000000          0     0  50000
       audio_pll1                     0        0        0   393216000          0     0  50000
          audio_pll1_bypass           0        0        0   393216000          0     0  50000
             audio_pll1_out           0        0        0   393216000          0     0  50000
                sai5                  0        0        0    24576000          0     0  50000
                   sai5_root_clk       0        0        0    24576000          0     0  50000
    ...

The sai5 clock is running at 24576000 Hz, and indeed, it is hard to find a working clock divider to get 1411200 Hz.

audio_pll2 @ 361267200 looks better: 361267200 / 1411200 = 256, an exact integer ratio!

We then need to reparent the sai5 clock; this is also done in the device tree:

&sai5 {
    status = "okay";
    assigned-clock-parents = <&clk IMX8MM_AUDIO_PLL2_OUT>;
    assigned-clock-rates = <11289600>;
};
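
The requested rate of 11289600 Hz is 256 * 44100 (and 8 * 1411200), and 361267200 / 11289600 = 32, so both the bit and frame clocks can now be derived with integer dividers.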

Here is our new clock tree:

root@imx8board:~# cat /sys/kernel/debug/clk/clk_summary
    ...
    audio_pll2_ref_sel                0        0        0    24000000          0     0  50000
       audio_pll2                     0        0        0   361267200          0     0  50000
          audio_pll2_bypass           0        0        0   361267200          0     0  50000
             audio_pll2_out           0        0        0   361267200          0     0  50000
                sai5                  0        0        0    11289600          0     0  50000
                   sai5_root_clk       0        0        0    11289600          0     0  50000
    ...

We can see that the frequency is right and also that we now derive our clock from audio_pll2_out instead of audio_pll1.

The speaker-test software is also happier:

root@imx8board:~# speaker-test

speaker-test 1.2.5.1

Playback device is default
Stream parameters are 44000Hz, S16_LE, 1 channels
Using 16 octaves of pink noise
Rate set to 44000Hz (requested 44000Hz)
Buffer size range from 3840 to 5760
Period size range from 1920 to 1920
Using max buffer size 5760
Periods = 4
was set period_size = 1920
was set buffer_size = 5760
 0 - Front Left

Great!

Use BCLK as MCLK

Due to my hardware constraints, I need to use the bit clock as the master clock. If we look in the datasheet [5]:

/media/tas5720-1.png

If the BCLK to LRCLK ratio is 64, we can tie MCLK directly to our BCLK!

We already know our BCLK: it is 1411200 Hz, and the frame clock (LRCLK) is the same as the sample rate (44.1 kHz). We can verify that with an oscilloscope.

Bitclock:

/media/bitclock1.png

Frameclock:

/media/frameclock.png

That is not a ratio of 64.

There is not much to do about the frame clock; it will stick to the sample rate. If we make use of TDM though, we can make the bit clock run faster with the same frame clock!

Let's add 2 TDM slots @ 32-bit width:

/ {
    sound-tas5720 {
        compatible = "simple-audio-card";
        simple-audio-card,name = "tas5720-audio";
        simple-audio-card,format = "i2s";
        simple-audio-card,frame-master = <&cpudai>;
        simple-audio-card,bitclock-master = <&cpudai>;

        cpudai: simple-audio-card,cpu {
            sound-dai = <&sai5>;
            clocks = <&clk IMX8MM_CLK_SAI5_ROOT>;
            dai-tdm-slot-num = <2>;
            dai-tdm-slot-width = <32>;
        };

        simple-audio-card,codec {
            sound-dai = <&tas5720>;
            clocks = <&clk IMX8MM_CLK_SAI5_ROOT>;
        };
    };
};

Verify the bitclock:

/media/bitclock1.png

Let's calculate: 2820000 / 44000 ≈ 64! We have reached our goal!
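
This also matches the TDM configuration: with 2 slots of 32 bits each, bclk = 44100 * 2 * 32 = 2822400 Hz, which is the roughly 2.82 MHz measured above and exactly 64 times the 44.1 kHz frame clock.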

Final device tree setup

This is what the final device tree looks like:

/ {
    sound-tas5720 {
        compatible = "simple-audio-card";
        simple-audio-card,name = "tas5720-audio";
        simple-audio-card,format = "i2s";
        simple-audio-card,frame-master = <&cpudai>;
        simple-audio-card,bitclock-master = <&cpudai>;

        cpudai: simple-audio-card,cpu {
            sound-dai = <&sai5>;
            clocks = <&clk IMX8MM_CLK_SAI5_ROOT>;
            dai-tdm-slot-num = <2>;
            dai-tdm-slot-width = <32>;
        };

        simple-audio-card,codec {
            sound-dai = <&tas5720>;
            clocks = <&clk IMX8MM_CLK_SAI5_ROOT>;
        };
    };

    reg_audio_p: regulator-audio-pvdd {
        compatible = "regulator-fixed";
        regulator-name = "audio power";
        pinctrl-names = "default";
        regulator-min-microvolt = <12000000>;
        regulator-max-microvolt = <12000000>;
    };

    reg_audio_d: regulator-audio-dvdd {
        compatible = "regulator-fixed";
        regulator-name = "audio digital";
        pinctrl-names = "default";
        regulator-min-microvolt = <3300000>;
        regulator-max-microvolt = <3300000>;
    };

};

&i2c3 {

    tas5720: tas5720@6c {
            #sound-dai-cells = <0>;
            reg = <0x6c>;
            compatible = "ti,tas5720";

            dvdd-supply = <&reg_audio_d>;
            pvdd-supply = <&reg_audio_p>;
    };
};

&sai5 {
    status = "okay";
    assigned-clock-parents = <&clk IMX8MM_AUDIO_PLL2_OUT>;
    assigned-clock-rates = <11289600>;
};

Conclusion

simple-audio-card is a flexible way to describe the audio routing, and I strongly prefer it over writing a machine driver for each SoC-CODEC combination.

My example here is kept to a minimum; you probably want to add widgets and routing as well.

simple-audio-card does support rather complex setups with multiple DAI links, amplifiers and such. See the device tree bindings [6] for further reading.

Debug kernel with KGDB

What is KGDB?

KGDB is intended to be used as a source-level debugger on a running Linux kernel. It works with GDB and allows the user to inspect memory and variables, set up breakpoints, and step through lines and instructions. Pretty much what all application developers are used to, but for the kernel itself.

Almost every embedded Linux system has a serial port available, and that is all you need to connect GDB to your kernel.

One thing to keep in mind, as with all debugging, is that anything related to timing will be messed up. That becomes pretty obvious when you pause a running kernel that is keeping up communication with all its hardware. Especially if you have any hardware watchdogs enabled...

Compile the kernel with support for KGDB

There are a few kernel options that you have to enable in order to use KGDB:

  • CONFIG_KGDB to enable remote debugging.
  • CONFIG_KGDB_SERIAL_CONSOLE lets you share a serial console with GDB.
  • CONFIG_FRAME_POINTER is used to produce more reliable stack backtraces by inserting code that preserves the frame information in registers or on the stack.
  • CONFIG_KALLSYMS_ALL to make sure that all symbols are loaded into the kernel image (i.e. symbols from all sections).
  • CONFIG_MAGIC_SYSRQ to be able to send SysRq requests. More about this below.

KGDBOC

KGDB over console, or kgdboc, lets you use a console port as the debugging port. If we only have one serial port available, we can split the console and GDB communication using agent-proxy [2].

Agent-proxy

To split a serial port into console and GDB channels we can use agent-proxy. Download and compile agent-proxy:

git clone http://git.kernel.org/pub/scm/utils/kernel/kgdb/agent-proxy.git
cd agent-proxy
make

Launch agent-proxy

agent-proxy 4440^4441 0 /dev/ttyS0,115200

If your hardware does not support the line-break sequence, you have to add the -s003 option. You will find out pretty soon if it is needed: if your target continues to run after you send a break, then you should try adding it. In other words:

agent-proxy 4440^4441 0 /dev/ttyS0,115200 -s003

Where ttyS0 is the serial port on your host.

This will create two TCP sockets, one for the serial console and one for GDB, listening on ports 4440 and 4441 respectively.

Connect to the serial console with your favorite client (socat, netcat, telnet...)

telnet localhost 4440

Setup kgdboc with kernel arguments

kgdboc can be used early in the boot process if it is compiled into the kernel as a built-in (not as a module), by providing the kgdboc arguments on the kernel command line.

Add kgdboc=<tty-device>,[baud] to your kernel command line, e.g.

kgdboc=ttyS0,115200

Where ttyS0 is the serial port on the target.

The kgdbwait argument stops the kernel execution and enters the kernel debugger as early as possible. This lets you connect to the kernel with GDB before it continues to boot.
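
Put together, a command line for debugging from early boot, sharing one serial port between console and GDB, could look something like this (illustrative):

console=ttyS0,115200 kgdboc=ttyS0,115200 kgdbwait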

See kernel parameters [1] for more information.

Setup kgdboc with kernel module

If kgdboc is not compiled as a built-in but as a module, you provide the same arguments when loading the module:

modprobe kgdboc kgdboc=ttyS0,115200

Setup kgdboc at runtime using sysfs

It is also possible to enable kgdboc by echoing the parameters into its sysfs entry:

echo ttyS0 > /sys/module/kgdboc/parameters/kgdboc

Connect GDB to a running kernel

Stop execution and wait for debugger

We have to stop the execution of the kernel in order to connect with GDB.

If kgdbwait is provided as a boot argument, the kernel will stop its execution and wait.

Otherwise we have to trigger it manually by using SysRq-G. This requires that CONFIG_MAGIC_SYSRQ is enabled in your kernel config.

Your favorite serial application probably has some keyboard combination to send SysRq requests (GNU Screen has "CTRL+A b" for example); otherwise you can use procfs to send the trigger:

echo g > /proc/sysrq-trigger

Connect with GDB

Start GDB and provide the vmlinux from your kernel root directory. Remember that you have to use the GDB that came with your toolchain. I always use the -tui flag to start with a nice terminal user interface:

aarch64-linux-gnu-gdb -tui ./vmlinux

Now, if you have a separate serial port for GDB, you can connect to it directly:

(gdb) set remotebaud 115200
(gdb) target remote /dev/ttyS0

If you are using agent-proxy, you should connect to port 4441 instead:

(gdb) target remote localhost:4441

Now you are able to set breakpoints, watch variables and use GDB as you are used to.

/media/kgdb.jpg

One tip is to set a breakpoint at ksys_sync

(gdb) b ksys_sync

This lets you use the sync command as a trigger to enter debug mode: continue execution in GDB, and the next time you run sync on the target the breakpoint hits and you are back in the debugger.

What is libcamera and why should you use it?

Read out a picture from camera

Once upon a time, video devices were not that complex. To use a camera back then, your application software could iterate through the /dev/video* devices, pick the camera it wanted and immediately start using it. You could query which pixel formats, frame rates, resolutions and other properties were supported by the camera, and you could even easily change them if you wanted.

This still works for some cameras; basically every USB camera and most laptop cameras still work that way.

The problem, especially in embedded systems, is that there is no such thing as "the camera" anymore. The camera system is rather a complex pipeline of different image processing nodes that the image data traverses to be shaped as you want. Even if the result of this pipeline ends up in a video device, you cannot configure things like cropping, resolution etc. directly on that device as you are used to. Instead, you have to use the media controller API to configure and link each of these nodes to build up your pipeline.

To show what it may look like, this is a graph from a previous post of mine [3]:

/media/media-ctl-graph.png

What is libcamera?

/media/libcamera-banner.png

This is how libcamera is described on their website [1]

libcamera is an open source camera stack for many platforms with a core userspace library, and support from the Linux kernel APIs and drivers already in place.
It aims to control the complexity of embedded camera hardware by providing an intuitive API and method of separating untrusted vendor code from the open source core.

libcamera aims to encourage the development of new embedded camera applications by limiting the complexity that developers have to deal with.
The interface is designed around the way that modern embedded camera hardware works.

The first time I heard about libcamera was at the Embedded Linux Conference 2019, where Jacopo Mondi had a talk [2] about the public API for the first stable libcamera release. I have been working with cameras in several embedded Linux products and know for sure how complex [3] these little beasts can be. The configuration also differs depending on which platform or camera you are using, as there is no common way to set up the image pipe. You will soon have special cases for all your platform variants in your application, which is not what we strive for.

libcamera tries to solve this by providing one library that takes care of all that complexity for you.

For example, say you want to adjust a simple thing like the contrast of an IMX219 camera module connected to a Raspberry Pi. To do that without libcamera, you first have to set up a proper image pipeline that takes the camera module and connects it to the several ISP (Image Signal Processing) blocks that your processor offers, in order to get the right image format, resolution and so on. Somewhere in the middle of all this configuring, you realise that neither the camera module nor the ISPs have support for adjusting the contrast. Too bad. To achieve it you have to take the image, pass it to a self-written contrast algorithm, create a gamma curve that the IPA (Image Processing Algorithm) understands and actually set the gamma. Yes, the contrast is adjusted with a gamma curve for that particular camera on the Raspberry Pi. (Have a look at the implementation of that IPA block [7] for Raspberry Pi.)

This is exactly the stuff libcamera understands and abstracts away for the user. libcamera will figure out what graph it has to build depending on what you want to do and which processing operations are available at your various nodes. An application that uses libcamera simply sets the contrast, and that works for all cameras and platforms. After all, that is what you wanted.

Camera Stack

As the libcamera library is fully implemented in userspace and uses already existing kernel interfaces to communicate with the hardware, you need no extra underlying support in terms of separate drivers or kernel patches.

libcamera itself exposes several APIs depending on how the application wants to interface with the camera. It even has a V4L2 compatibility layer that emulates a high-level V4L2 camera device to make a smooth transition for all those V4L2 applications out there.

/media/libcamera-layer.png

Read more about the camera stack in the libcamera documentation [4].

Conclusion

I really like this project and I think we need an open-source stack that supports many platforms. The vendor-specific drivers/libraries/IPAs situation we are in right now is not sustainable at all. It takes too much effort to evaluate a few cameras from different vendors, just because every vendor has its own way to control the camera with its own closed-source and platform-specific layers. Been there, done that.

For those vendors that do not want to open-source their secret image processing algorithms, libcamera uses a plugin system for IPA modules which lets vendors keep their secrets but still be compatible with libcamera. Open-source modules are identified by digital signatures, while closed-source modules are instead isolated inside a sandbox environment with restricted access to the system. A win-win concept.

The project itself is still quite young and needs more work to support more platforms and cameras, but the foundation is stable. The Raspberry Pi is now a commonly used platform, both commercially and for hobby projects, and the fact that the Raspberry Pi Foundation has chosen libcamera as their primary camera stack [8] must tell us something.

HID report descriptors and Linux

HID Devices

The USB HID (Human Interface Device) class covers the type of computer peripherals that humans interact with, such as keyboards, mice, game controllers and touchscreens. The protocol is probably one of the simplest in the USB specification. Even though HID was originally written with USB in mind, it works over several other transport layers. Your mouse and keyboard probably use HID over USB, the touchscreen in your smartphone may use HID over I2C, and Bluetooth/BLE makes use of the same protocol for these types of devices.

The protocol is popular because it is so simple. The HID descriptor can be stored in ROM and the peripheral can be implemented using only a small 8-bit MCU.

Despite how simple and well defined [1] the HID specification is, implementors still get it wrong, as we will see later on. Those non-compliant descriptors have to be fixed up in order to use the device. That is quite sad, as it requires a separate driver even though HID should be possible to handle in a completely generic way.

We will continue to focus on the HID implementation over USB.

HID Descriptors

The device has to identify itself, and that information is stored in segments of its ROM (Read Only Memory). These segments, or descriptors as they are called, describe what type of device it is and which interfaces (Interface Descriptors) it exposes. Such interfaces are grouped into classes, and every device belongs to one USB class. Such a class [2] could be Printer, Video, or the one we will focus on: the HID class.

/media/hid-1.png

The HID class descriptor defines and identifies which other descriptors are present. Such other descriptors can be report descriptors and physical descriptors.

Report Descriptors

The report descriptor describes how the data that the device generates should be interpreted. For example, the report descriptor describes how to determine the button state of your mouse or the position of the touchscreen click on your smartphone.

Physical Descriptors

The physical descriptor, on the other hand, provides information about the physical layout of the device, e.g. which part of your body interacts with the device, and how, to activate certain functions.

The big picture

There are of course more descriptors, but most of them are part of the USB specification rather than specific to HID devices.

/media/hid-2.png

There is much to say about the HID class, subclasses, interfaces and protocols, and we will not cover them all. But just to give you a hint of what they are:

HID class

USB devices are grouped into USB classes depending on what type of device they are and what transport requirements they have. For example, an Audio device requires isochronous data pipes, which HID devices do not; HID devices have different and much simpler data transport requirements.

Subclasses

The only subclass of the HID class is the Boot Interface Subclass. It defines a small subset of the report descriptor that is easier to parse for code that does not want to (or does not have the resources to) parse a full report descriptor. A BIOS is one example of such code: it wants to keep the complexity and footprint as small as possible, but still be able to use a keyboard.

Protocol

The HID protocol field only has a meaning if the subclass is the Boot Interface Subclass; it is then used to determine whether the device is a mouse or a keyboard. Otherwise the protocol is not used.

Interfaces

The interface is the way the device and host communicate. HID devices use either the control pipe or the interrupt pipe.

Control pipes are used for:

  • Receiving and responding to requests for USB control and class data
  • Transmitting data when polled by the HID class driver
  • Receiving data from the host

The interrupt pipes are used for:

  • Receiving asynchronous data from the device
  • Transmitting low-latency data to the device

Report descriptors

The information passed to and from the device is encapsulated in reports. These reports are organized data whose layout is described in a report descriptor. The report descriptor is one of the first items that the host requests from the device, and it describes how the data should be interpreted. For example, a mouse could have several buttons represented by one bit each and a wheel represented by a signed 8-bit value. The report descriptor gives you details about which bits are mapped to which button and which 8 bits should be used for the wheel.

All report descriptors can be read out from sysfs. This is the report descriptor from my keyboard:

[15:15:02]marcus@goliat:~$ od -t x1 -Anone  /sys/bus/usb/devices/3-12.3.2.1/3-12.3.2.1:1.0/0003:045E:00DB.0009/report_descriptor
 05 01 09 06 a1 01 05 08 19 01 29 03 15 00 25 01
 75 01 95 03 91 02 09 4b 95 01 91 02 95 04 91 01
 05 07 19 e0 29 e7 95 08 81 02 75 08 95 01 81 01
 19 00 29 91 26 ff 00 95 06 81 00 c0

Here is another (parsed) example of what a report descriptor may look like:

0x05, 0x01,        // Usage Page (Generic Desktop Ctrls)
0x09, 0x04,        // Usage (Joystick)
0xA1, 0x01,        // Collection (Application)
0xA1, 0x00,        //   Collection (Physical)
0x09, 0x30,        //     Usage (X)
0x09, 0x31,        //     Usage (Y)
0x15, 0x00,        //     Logical Minimum (0)
0x26, 0xFF, 0x07,  //     Logical Maximum (2047)
0x35, 0x00,        //     Physical Minimum (0)
0x46, 0xFF, 0x00,  //     Physical Maximum (255)
0x75, 0x10,        //     Report Size (16)
0x95, 0x02,        //     Report Count (2)
0x81, 0x02,        //     Input (Data,Var,Abs,No Wrap,Linear,Preferred State,No Null Position)
0xC0,              //   End Collection
0x75, 0x08,        //   Report Size (8)
0x95, 0x03,        //   Report Count (3)
0x81, 0x03,        //   Input (Cnst,Var,Abs)
0xC0,              // End Collection

The descriptor describes a 2-axis (X/Y) joystick controller where each axis can have an absolute value between 0 and 2047.

Let's walk through the descriptor step by step.

0x05, 0x01,        // Usage Page (Generic Desktop Ctrls)

As there are many different types of devices, these are grouped into usage pages. The page determines how the following Usage entry should be interpreted.

There are plenty of pages, as seen in the reference manual [3]:

/media/hid-page.png

Our joystick belongs to the Generic Desktop Ctrls page.

The next line is the usage:

0x09, 0x04,        // Usage (Joystick)

Mice, keyboards and gamepads are also examples of Generic Desktop Ctrls, as seen in the table below:

/media/hid-page.png

The next entry is the application collection:

0xA1, 0x01,        // Collection (Application)

The application collection makes a meaningful grouping of Input, Output and Feature items. Each such grouping has at least Report Size and Report Count defined to determine how big, in bytes, the collection is.

The next entry is the physical collection:

0xA1, 0x00,        //   Collection (Physical)

This provides information about the part or parts of the human body used to activate the controls on the device. In other words: buttons and knobs.

Now to the more fun part:

0x09, 0x30,        //     Usage (X)
0x09, 0x31,        //     Usage (Y)
0x15, 0x00,        //     Logical Minimum (0)
0x26, 0xFF, 0x07,  //     Logical Maximum (2047)
0x35, 0x00,        //     Physical Minimum (0)
0x46, 0xFF, 0x00,  //     Physical Maximum (255)
0x75, 0x10,        //     Report Size (16)
0x95, 0x02,        //     Report Count (2)
0x81, 0x02,        //     Input (Data,Var,Abs,No Wrap,Linear,Preferred State,No Null Position)

Here we can see that there are two axes, X and Y, each represented by a value between 0 and 2047 (11 bits). A single axis occupies 16 bits and is of the type Input. That is pretty much all we need to know in order to parse this information.
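
Together with the constant 3-byte padding declared further down, the resulting input report is 7 bytes long. As a sketch (the struct and field names are mine, and I assume a little-endian host since HID multi-byte fields are little-endian), the report could be mapped in C like this:

#include <stdint.h>

/* Hypothetical mapping of the 7-byte input report described above */
struct joystick_report {
    uint16_t x;      /* Usage (X): 16 bits of storage, logical value 0..2047 */
    uint16_t y;      /* Usage (Y): 16 bits of storage, logical value 0..2047 */
    uint8_t  pad[3]; /* Report Size (8) x Report Count (3), constant padding */
} __attribute__((packed));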

What is so hard then?

These report descriptors are not that hard to follow and there is no black magic around them. Despite that, many vendors do not get their report descriptors right and keep delivering a custom driver along with their product.

I used to build and fly tri- and quadcopters, and one way to get better at flying is to connect your radio transmitter to a simulator and train there. The crashes are not as fatal or costly that way...

I've never seen such a controller that actually follows the HID specification, and that makes them useless without a custom driver that can parse the mess. It is not uncommon that the actual reports from the device look good; it is just the report descriptor that is messed up.

In that case we can write a pretty small Linux kernel driver that only fixes up the report descriptor and then lets the HID layer create and manage the device in a generic way. This is what I did for the VRC2 and HID-PXRC drivers [4], which will be available in Linux 6.1.

Such a driver can be as simple as this (cut out from my VRC2 driver):

static __u8 vrc2_rdesc_fixed[] = {
    0x05, 0x01,        // Usage Page (Generic Desktop Ctrls)
    0x09, 0x04,        // Usage (Joystick)
    0xA1, 0x01,        // Collection (Application)
    0x09, 0x01,        //   Usage (Pointer)
    0xA1, 0x00,        //   Collection (Physical)
    0x09, 0x30,        //     Usage (X)
    0x09, 0x31,        //     Usage (Y)
    0x15, 0x00,        //     Logical Minimum (0)
    0x26, 0xFF, 0x07,  //     Logical Maximum (2047)
    0x35, 0x00,        //     Physical Minimum (0)
    0x46, 0xFF, 0x00,  //     Physical Maximum (255)
    0x75, 0x10,        //     Report Size (16)
    0x95, 0x02,        //     Report Count (2)
    0x81, 0x02,        //     Input (Data,Var,Abs,No Wrap,Linear,Preferred State,No Null Position)
    0xC0,              //   End Collection
    0x75, 0x08,        //   Report Size (8)
    0x95, 0x03,        //   Report Count (3)
    0x81, 0x03,        //   Input (Cnst,Var,Abs)
    0xC0,              // End Collection
};

static __u8 *vrc2_report_fixup(struct hid_device *hdev, __u8 *rdesc,
                unsigned int *rsize)
{
    hid_info(hdev, "fixing up VRC-2 report descriptor\n");
    *rsize = sizeof(vrc2_rdesc_fixed);
    return vrc2_rdesc_fixed;
}

static int vrc2_probe(struct hid_device *hdev, const struct hid_device_id *id)
{
    int ret;

    /*
     * The device gives us 2 separate USB endpoints.
     * One of those (the one with report descriptor size of 23) is just bogus so ignore it
     */
    if (hdev->dev_rsize == 23)
        return -ENODEV;

    ret = hid_parse(hdev);
    if (ret) {
        hid_err(hdev, "parse failed\n");
        return ret;
    }

    ret = hid_hw_start(hdev, HID_CONNECT_DEFAULT);
    if (ret) {
        hid_err(hdev, "hw start failed\n");
        return ret;
    }

    return 0;
}

static const struct hid_device_id vrc2_devices[] = {
    { HID_USB_DEVICE(USB_VENDOR_ID_VRC2, USB_DEVICE_ID_VRC2) },
    { /* sentinel */ }
};
MODULE_DEVICE_TABLE(hid, vrc2_devices);

static struct hid_driver vrc2_driver = {
    .name = "vrc2",
    .id_table = vrc2_devices,
    .report_fixup = vrc2_report_fixup,
    .probe = vrc2_probe,
};
module_hid_driver(vrc2_driver);

MODULE_AUTHOR("Marcus Folkesson <marcus.folkesson@gmail.com>");
MODULE_DESCRIPTION("HID driver for VRC-2 2-axis Car controller");
MODULE_LICENSE("GPL");

BPF for HID drivers

Benjamin Tissoires, one of the maintainers of the HID core layer, has posted his work [6] to introduce eBPF (extended Berkeley Packet Filter) support for HID devices, which is a really cool thing. As many devices just lack a proper report descriptor, eBPF lets you write such a fixup in userspace and simply load the program into the kernel. There are still some parts missing before we see full support for this feature, but the main part is merged and will be available in 6.1.

See the LWN article [5] for further reading.

Conclusion

HID report descriptors have been a fun subject to dig into. It is still hard to see why so many vendors have such a hard time following the specification though.

I also have to thank Benjamin Tissoires for great help and support in understanding how the HID layer and HID devices work.

Industrial I/O and triggers

I've maintained a couple of IIO-drivers (MCP3911 [4] and LTC1660 [5]) for some time now and it is time to give at least the MCP3911 a face-lift.

This time the face lift includes support for:

  • Buffers
  • Triggers
  • Make the driver interrupt driven
  • Add support for setting Oversampling Ratio
  • Add support for setting PGA (Pre Gain Amplifier)

I will also clean it up a bit by using only device-managed resources.

What is Industrial I/O?

Industrial I/O, or IIO [1], is a subsystem that exposes sensors and actuators to userspace in a common way. The subsystem supports a range of different sensors including ADCs, IMUs, pressure sensors, light sensors, accelerometers and more. Even actuators such as DACs and amplifiers have their place in the IIO subsystem.

The hwmon [2] subsystem provides an interface for a few types of sensors as well, but that framework lacks support for some use cases that IIO tries to solve, such as:

  • High speed sensors
  • Triggered sampling
  • Data buffering

In short, use hwmon for slow sensors and actuators, otherwise use IIO (preferred for new devices).

The IIO subsystem also provides a stable ABI for various userspace HALs, which hwmon does not. libiio [3] is the official and preferred one for IIO.

/media/iio.png

Sysfs

All IIO devices are exported through sysfs, where they can be configured and single-shot values can be read. For example, the raw value of the first channel of an ADC can be read out with:

cat /sys/bus/iio/devices/iio:device0/in_voltage0_raw

All other parameters, such as the oversampling ratio and scaling value, are also exposed here.

Scaling value

The value you get from in_voltageX_raw is the raw value; it has to be converted in order to get something meaningful out of it.

To get the value in mV you have to take the scale and offset value into account:

Value in mV = (raw + offset) * scale

These values are exposed by sysfs as in_voltage_scale and in_voltage_offset respectively.
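
For example (numbers picked purely for illustration): if in_voltage0_raw reads 1000, in_voltage_offset is 0 and in_voltage_scale is 0.5, the channel is measuring (1000 + 0) * 0.5 = 500 mV.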

Triggers

Triggers can be both hardware and software based.

Examples of hardware-based triggers are:

  • GPIO-based interrupts

Examples of software-based triggers are:

  • sysfs - you can trigger a data poll from userspace
  • hrtimer - lets you specify the period, and a high-resolution timer is created that triggers a data poll at the given frequency

CONFIG_IIO_SYSFS_TRIGGER

By enabling CONFIG_IIO_SYSFS_TRIGGER you can make use of the sysfs trigger:

# echo 0 > /sys/bus/iio/devices/iio_sysfs_trigger/add_trigger
# cat /sys/bus/iio/devices/iio_sysfs_trigger/trigger0/name
sysfstrig0

CONFIG_IIO_HRTIMER_TRIGGER

By enabling CONFIG_IIO_HRTIMER_TRIGGER you can make use of a timer-based trigger:

# mount -t configfs none /sys/kernel/config
# mkdir /sys/kernel/config/iio/triggers/hrtimer/my_50ms_trigger
# echo 2000 > /sys/bus/iio/devices/trigger0/sampling_frequency

Make use of a trigger

As long as the device supports triggers, there will be a /sys/bus/iio/devices/iio:device0/trigger/current_trigger entry. All available triggers, both hardware and software based, show up as /sys/bus/iio/devices/triggerX.

One nice feature is that one trigger can be used for multiple devices.

In order to activate a trigger for a certain device, simply write the trigger name to the current_trigger entry:

# cat /sys/bus/iio/devices/trigger0/name > /sys/bus/iio/devices/iio:device0/trigger/current_trigger

The next step is to decide which channels you want to scan and enable them:

# echo 1 > /sys/bus/iio/devices/iio:device0/scan_elements/in_voltage0_en
# echo 1 > /sys/bus/iio/devices/iio:device0/scan_elements/in_voltage1_en
# echo 1 > /sys/bus/iio/devices/iio:device0/scan_elements/in_timestamp_en

And finally, start the sampling process

# echo 1 > /sys/bus/iio/devices/iio:device0/buffer/enable

Now you will get the raw values for voltage0, voltage1 and the timestamp by reading from the /dev/iio:device0 device.

You will read out a stream of data from the device. Before applying the scale value to the raw data, the buffer data may need to be processed depending on its format. The format for each channel is available as a sysfs entry as well:

# cat /sys/bus/iio/devices/iio:device0/scan_elements/in_voltage0_type
be:s24/32>>0

The buffer format for voltage0 means that each sample is stored big-endian, is 32 bits wide, does not need any shifting and should be interpreted as a signed 24-bit value.
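
As a small illustration (a hypothetical userspace helper, not part of libiio), one such be:s24/32>>0 sample read from /dev/iio:device0 could be converted like this:

#include <stdint.h>
#include <endian.h>

/*
 * Convert one raw be:s24/32>>0 sample into a signed integer:
 * big-endian storage, 32 bits wide, no shift, 24 valid bits
 * that must be sign-extended.
 */
static int32_t sample_to_s24(uint32_t raw)
{
    uint32_t v = be32toh(raw) >> 0;  /* ">>0": no shift needed */

    v &= 0x00ffffff;                 /* keep the 24 valid bits */
    if (v & 0x00800000)              /* sign-extend from bit 23 */
        v |= 0xff000000;

    return (int32_t)v;
}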

Conclusion

The IIO subsystem is rather complex. The framework also supports events, which make it possible to trigger on specific threshold values. As the subsystem is optimized for performance, and the triggers make it possible to read values at a given frequency or on a given event, it covers a lot more use cases than the older hwmon interface.

FYI, the patches for the MCP3911 are currently queued to be merged into mainline.

GPLv2 and GPLv3

Open Source

"Free as in freedom - not as in free beer". Free beer is nice, but freedom is even nicer.

I have been working with companies from different sectors including consumer electronics, military applications, automotive and aeronautics. One common question, regardless of sector, is "Can we really use Open Source in our product?". The answer is usually Yes, you can, but...

One common misunderstanding is to interpret Open Source as in free beer. This is kind of true for some Open Source, but it is nothing you can take for granted. The "rules" for how the code may be used are specified by its license.

Among those who think they have understood the difference, there is another common misunderstanding: that no Open Source software is free and that it does not belong in any commercial product. Both misunderstandings are of course wrong, but you have to make sure that you understand the licenses you are using.

Before you start to work with any source code (not only Open Source) you always have to take the license into consideration. If you do your homework you can avoid surprises and practical complications that otherwise can cause a delayed project or legal inconveniences.

In short, you have to know what you are doing, and that should not differ from other parts of your development.

Open Source Licenses

"Open source licenses are licenses that comply with the Open Source Definition — in brief, they allow software to be freely used, modified, and shared. To be approved by the Open Source Initiative (also known as the OSI), a license must go through the Open Source Initiative's license review process."

This text is taken from the webpage of the Open Source Initiative [4], an organization that defines criteria for Open Source and certifies licenses that comply with the OSD (Open Source Definition).

Open Source Definition

Many licenses [5] are certified and may put different requirements on their users, but they all comply with these "rules":

Free Redistribution

The license shall not restrict any party from selling or giving away the software as a component of an aggregate software distribution containing programs from several different sources. The license shall not require a royalty or other fee for such sale.

Source Code

The program must include source code, and must allow distribution in source code as well as compiled form. Where some form of a product is not distributed with source code, there must be a well-publicized means of obtaining the source code for no more than a reasonable reproduction cost, preferably downloading via the Internet without charge. The source code must be the preferred form in which a programmer would modify the program. Deliberately obfuscated source code is not allowed. Intermediate forms such as the output of a preprocessor or translator are not allowed.

Derived Works

The license must allow modifications and derived works, and must allow them to be distributed under the same terms as the license of the original software.

Integrity of The Author's Source Code

The license may restrict source-code from being distributed in modified form only if the license allows the distribution of "patch files" with the source code for the purpose of modifying the program at build time. The license must explicitly permit distribution of software built from modified source code. The license may require derived works to carry a different name or version number from the original software.

No Discrimination Against Persons or Groups

The license must not discriminate against any person or group of persons.

No Discrimination Against Fields of Endeavor

The license must not restrict anyone from making use of the program in a specific field of endeavor. For example, it may not restrict the program from being used in a business, or from being used for genetic research.

Distribution of License

The rights attached to the program must apply to all to whom the program is redistributed without the need for execution of an additional license by those parties.

License Must Not Be Specific to a Product

The rights attached to the program must not depend on the program's being part of a particular software distribution. If the program is extracted from that distribution and used or distributed within the terms of the program's license, all parties to whom the program is redistributed should have the same rights as those that are granted in conjunction with the original software distribution.

License Must Not Restrict Other Software

The license must not place restrictions on other software that is distributed along with the licensed software. For example, the license must not insist that all other programs distributed on the same medium must be open-source software.

License Must Be Technology-Neutral

No provision of the license may be predicated on any individual technology or style of interface.

GPL

GPL, or the General Public License, is one of the most common Open Source licenses you will find out there. At least version 2, GPLv2, is something you will encounter for sure if you intend to build an embedded Linux system, as the kernel [6] itself uses this license.

GPLv2

So what do you need to do to comply with GPLv2 code? Basically, you need to provide the source code for all GPLv2-licensed code you ship. Yes, that includes all your modifications too, and this part can seem scary at first glance.

But will you need to make any changes? Probably. If you want to run Linux on your system you will likely have to make some adaptations to the Linux kernel specific to your board; those changes fall under the GPLv2 license and have to be provided as well.

The license is stated as follows:

"The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. "

GPLv2, Sec.3 [2]

If any of those changes contain your top-secret algorithm, then you have done it wrong by design anyway. Please note that no installation information is required at all, which makes GPLv2 more suitable for embedded devices.

Tivoization

The term comes from TiVo, which ran a GPLv2-only Linux kernel but used HW signature keys that made it possible to only run signed kernels. Even though TiVo did provide the kernel code, TiVo customers could not build and install their own firmware.

The Free Software Foundation (FSF) found this objectionable, as it violates one of the purposes the GPLv2 license had, so the FSF ended up creating GPLv3 to address it.

GPLv3

/media/gplv3.png

(Yes, this logo is under Public Domain [7] )

One big difference between v2 and v3 is this part

" “Installation Information” for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source. The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made. "

GPLv3, Sec. 6 [1]

This states that "Installation Information" must be provided together with the source code; in short, you have to provide instructions for an end user to build and replace the GPLv3 parts of your product. There are also a few exceptions, but most of them are more or less hard to make any use of in a real-world product.

Exception 1

It is only required for "User Products". It is hard to say if this is an exception or not, as most products that use GPLv3 are user products, but the license states that it only affects User Products. Consult your lawyer, as it is not entirely clear what a "User Product" really is.

Exception 2

Only if device is sold or offered for long-term lease. As with all lega stuff, things are a bit unclear. Does remote access or temporary possessions qualify for example?

Please note that even a long-term lease requires you to provide the installation information.

Exception 3

If the software is non-modifiable by anyone, you do not have to provide information on how to reinstall or modify the binaries.

If you want to be able to update your software, then you will probably need to provide the "installation information".

Exception 4

You can void any warranties if the binaries are modified, but you cannot prevent reinstallation of modified binaries.

Conclusion

There is a reason why the author of a piece of code chose a particular license, and it is important (both as a matter of principle and for legal reasons) to respect that. Some licenses are more or less appropriate for specific products, but the general rule I follow is to avoid any GPLv3 (both GPLv3 and LGPLv3) licensed software in any embedded system, as it is hard to be fully compliant. The installation information is often something that companies want to keep to themselves, with all rights.

What is my opinion about this? Well, I do like to have the freedom to install whatever software I want in the products I own, but there are circumstances where I'm not sure it is a good idea when it comes to safety and liability. If I buy a second-hand car, I don't want the software for my airbag or braking system to have been "fixed" by some random guy. I think that open source has limited use under overly restrictive licenses, and that is somewhat counterproductive for open source itself.

ath10k QCA6584 and Wireless network stack

ath10k QCA6584 and Wireless network stack

ATH10K is the mac80211 wireless driver for the Qualcomm Atheros QCA988x family of chips, and I'm currently working [1] with the QCA6584 chip, which is an automotive-grade radio chip with PHY support for the abgn+ac modes. The connection interface to the chip is SDIO, which is barely supported for now, but my friend and kernel hacker, Erik Strömdahl [2], has got his hands dirty and is currently working on it. There has been some progress; the chip is now able to scan, connect, send and receive data. There are still some issues with the link speed, but that is coming.

He is also the reason why I got interested in the networking part of the kernel, which is quite... big.

Even the wireless networking subsystem alone is quite big, and the first thing you meet when you start to dig is a bunch of terms thrown in your face. I will try to briefly describe a few of these terms that are fundamental to wireless communication.

In this post we will discuss the right side of this figure:

/media/wireless-stack.png

IEEE 802.11

We will see 802.11 a lot of times, so the first thing is to know where these numbers come from. IEEE 802.11 is a set of specifications for implementing wireless networking over several frequency bands. The specifications cover layer 1 (Physical) and layer 2 (Data link) of the OSI model [3].

The Linux kernel MAC subsystem registers an ieee80211-compliant hardware device with

int ieee80211_register_hw(struct ieee80211_hw *hw)

found in .../net/mac80211/main.c
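For illustration, here is a minimal sketch of how a Soft MAC driver typically allocates and registers such a device. The mydrv_* names and the private struct are made up for the example and are not taken from ath10k:

#include <net/mac80211.h>

struct mydrv_priv {
        u32 flags;      /* driver-private state would live here */
};

/* The mandatory callbacks (.tx, .start, .stop, ...) would be filled in here. */
static const struct ieee80211_ops mydrv_ops = {
};

static int mydrv_register(struct device *dev)
{
        struct ieee80211_hw *hw;
        int err;

        /* Allocate the ieee80211_hw together with our private data area. */
        hw = ieee80211_alloc_hw(sizeof(struct mydrv_priv), &mydrv_ops);
        if (!hw)
                return -ENOMEM;

        SET_IEEE80211_DEV(hw, dev);

        /* Describe supported bands, interface modes etc. on hw->wiphy here. */

        err = ieee80211_register_hw(hw);
        if (err)
                ieee80211_free_hw(hw);

        return err;
}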

The Management Layer (MLME)

One more thing that we need to cover is the management layer, since all other layers somehow depend on it.

There are three components in the 802.11 management architecture:

- The Physical Layer Management Entity (PLME)
- The System Management Entity (SME)
- The MAC Layer Management Entity (MLME)

The Management layer assists you in several ways. For instance, it handles things such as scanning, authentication, beacons, associations and much more.

Scanning

Scanning is simply looking for other 802.11-compliant devices in the air. There are two types of scanning: passive and active.

Passive scanning

When performing a passive scan, the radio listens passively for beacons, without transmitting packets, as it moves from channel to channel and records all devices it receives beacons from. The higher frequency bands in the ieee802.11a standard do not allow you to transmit anything unless you have heard an Access Point (AP) beacon. Passive scanning is therefore the only way to become aware of the surroundings.

Active scanning

Active scanning, on the other hand, means transmitting Probe Request (IEEE80211_STYPE_PROBE_REQ) management packets. This type of scanning also walks from channel to channel, sending a probe request management packet on each channel.

These requests are handled by ieee80211_send_probe_req() in .../net/mac80211/util.c:

void ieee80211_send_probe_req(struct ieee80211_sub_if_data *sdata,
                  const u8 *src, const u8 *dst,
                  const u8 *ssid, size_t ssid_len,
                  const u8 *ie, size_t ie_len,
                  u32 ratemask, bool directed, u32 tx_flags,
                  struct ieee80211_channel *channel, bool scan)

Authentication

The authentication procedure sends a management frame of the authentication type (IEEE80211_STYPE_AUTH). There is not only one type of authentication but plenty of them. The ieee80211 specification only specifies one mandatory authentication type: Open System authentication (WLAN_AUTH_OPEN). Another common authentication type is Shared Key authentication (WLAN_AUTH_SHARED_KEY).

These management frames are handled by ieee80211_send_auth() in .../net/mac80211/util.c:

void ieee80211_send_auth(struct ieee80211_sub_if_data *sdata,
             u16 transaction, u16 auth_alg, u16 status,
             const u8 *extra, size_t extra_len, const u8 *da,
             const u8 *bssid, const u8 *key, u8 key_len, u8 key_idx,
             u32 tx_flags)

Open system authentication

This is the simplest type of authentication: all clients that request authentication will be authenticated. No security is involved at all.

Shared key authentication

In this type of authentication the client and the AP use a shared key, also known as a Wired Equivalent Privacy (WEP) key.

Association

Association is started when the station sends a management frame of the type IEEE80211_STYPE_ASSOC_REQ. In the kernel code this is handled by ieee80211_send_assoc() in .../net/mac80211/mlme.c:

static void ieee80211_send_assoc(struct ieee80211_sub_if_data *sdata)

Reassociation

When the station is roaming, i.e. moving between APs within an ESS (Extended Service Set), it also sends a reassociation request to the new AP, of the type IEEE80211_STYPE_REASSOC_REQ. Association and reassociation have so much in common that both are handled by ieee80211_send_assoc().

MAC (Medium Access Control)

All ieee80211 devices need to implement the Management Layer (MLME), but the implementation can live in device hardware or in software. These devices are divided into Full MAC devices (hardware implementation) and Soft MAC devices (software implementation). Most devices today are Soft MAC devices.

The MAC layer can be further broken down into two pieces: Upper MAC and Lower MAC. The upper part of the MAC handles the management aspects (all that we covered in the MLME section above), and the lower part handles time-critical operations such as ACKing received packets.

Linux only handles the upper part of the MAC; the lower part is operated in device hardware. What we can see in the figure is that the MAC layer separates data packets from configuration/management packets. The data packets are forwarded to the network device and travel the same path through the network layer as data packets from all other types of network devices.

The Linux wireless subsystem consists of two major parts, of which mac80211, covered above, is one. cfg80211 is the other major part.

CFG80211

cfg80211 is a configuration management service for mac80211-compliant devices. Both Full MAC and Soft MAC devices need to implement operations to be compatible with the cfg80211 configuration interface in order to let userspace applications configure the device.
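To give a feeling for what that interface looks like, here is a rough sketch of a driver hooking a couple of operations into cfg80211. The mydrv_* names are invented for the example, and a real driver implements many more callbacks:

#include <net/cfg80211.h>

static int mydrv_scan(struct wiphy *wiphy,
                      struct cfg80211_scan_request *request)
{
        /* Hand the scan request over to the device/firmware here. The
         * driver must later report the result with cfg80211_scan_done(). */
        return 0;
}

static int mydrv_connect(struct wiphy *wiphy, struct net_device *dev,
                         struct cfg80211_connect_params *sme)
{
        /* Ask the device to authenticate and associate with sme->ssid. */
        return 0;
}

static const struct cfg80211_ops mydrv_cfg80211_ops = {
        .scan    = mydrv_scan,
        .connect = mydrv_connect,
};

The ops structure is then handed to wiphy_new() when the wiphy is created.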

The configuration may be done with one of two interfaces, wext and nl80211.

Wireless Extension, WEXT (Legacy)

This is the legacy and ugly way to configure wireless devices. It is still supported, but only for backward compatibility reasons. Users of this configuration interface are the wireless-tools (iwconfig, iwlist).

nl80211

nl80211, on the other hand, is a newer netlink interface intended to replace the Wireless Extensions (wext) interface. Typical users of this interface are iw and wpa_supplicant.

Conclusion

The whole network stack of the Linux kernel is really complex and optimized for high throughput with low latencies. In this post we only covered what wireless device support adds to the stack, which is mainly the mac80211 layer that handles all device management, and the cfg80211 layer that configures the MAC layer. Packets to wireless devices are divided into data packets and configuration/management packets. The data packets follow the same path as for all network devices, and the management packets go to the cfg80211 layer.

Linux driver for PhoenixRC adapter

Linux driver for PhoenixRC adapter

Update: Michael Larabel on Phoronix has written a post [3] about this driver. Go ahead and read it as well!

A few years ago I used to build multirotors, mostly quadcopters and tricopters. It is a fun hobby; both building and flying are incredibly satisfying. The first multirotors I built were nicely made with CNC-cut parts. They looked really nice and robust. However, with more than 100 crashes under my belt, the last ones were made out of sticks and a food box. Easy to repair and just as fun to fly.

This hobby requires practice, and even if the most fun way to practice is by flying, it is really time-consuming. A more time-efficient way to practice is with a simulator, so I bought PhoenixRC [1], which is a flight simulator. It comes with a USB adapter that lets you connect and fly with your own RC controller. I did not run the simulator that much, though. PhoenixRC is Windows software and there was no Linux driver for the adapter. The only instance of Windows I had was on a separate disk that lay on the shelf, and switching disks in your laptop each time you want to fly is simply not going to happen.

This New Year's Eve (2017), my wife became ill and I got some time to myself. Far down in a box I found the adapter and plugged it into my Linux computer. Still no driver.

Reverse engineering the adapter

The reverse engineering was quite simple. It turns out that the adapter has only one configuration, one interface and one endpoint of interrupt-IN type. This simply means that it has a unidirectional communication path, initiated by submitting an interrupt URB (USB Request Block) to the device. If you are not familiar with what configurations, interfaces and endpoints are in terms of USB, please google the USB specification.

The data returned by the URB was only 8 bytes. After some testing with my RC controller I got the following mapping between the data and the channels on the controller:

data[0] = channel 1
data[1] = ? (Possibly a switch)
data[2] = channel 2
data[3] = channel 3
data[4] = channel 4
data[5] = channel 5
data[6] = channel 6
data[7] = channel 7
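As a sketch of how those bytes can be fetched (the pxrc_* names and the exact setup are illustrative assumptions, not necessarily how the final driver looks), the driver fills an interrupt URB for the single IN endpoint and picks the report apart in the completion handler:

#include <linux/usb.h>

#define PXRC_REPORT_SIZE 8

static void pxrc_usb_irq(struct urb *urb)
{
        u8 *data = urb->transfer_buffer;

        if (urb->status == 0 && urb->actual_length == PXRC_REPORT_SIZE) {
                /* data[0] and data[2]..data[7] carry the channel values. */
        }

        /* Resubmit the URB so that we keep receiving reports. */
        usb_submit_urb(urb, GFP_ATOMIC);
}

static int pxrc_start_io(struct usb_device *udev,
                         struct usb_endpoint_descriptor *ep,
                         struct urb *urb, u8 *buf)
{
        usb_fill_int_urb(urb, udev,
                         usb_rcvintpipe(udev, ep->bEndpointAddress),
                         buf, PXRC_REPORT_SIZE, pxrc_usb_irq, NULL,
                         ep->bInterval);

        return usb_submit_urb(urb, GFP_KERNEL);
}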

So I created a device driver that registered an input device with the following events:

Channel   Event
1         ABS_X
2         ABS_Y
3         ABS_RX
4         ABS_RY
5         ABS_RUDDER
6         ABS_THROTTLE
7         ABS_MISC
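A minimal sketch of that registration could look as follows; the device name and the 0-255 axis range are assumptions based on the one-byte-per-channel report:

#include <linux/input.h>

static int pxrc_register_input(struct device *parent, struct input_dev **out)
{
        struct input_dev *input;
        int err;

        input = input_allocate_device();
        if (!input)
                return -ENOMEM;

        input->name = "PhoenixRC Flight Controller Adapter";
        input->dev.parent = parent;

        /* One absolute axis per channel, assuming 8-bit values. */
        input_set_abs_params(input, ABS_X, 0, 255, 0, 0);
        input_set_abs_params(input, ABS_Y, 0, 255, 0, 0);
        input_set_abs_params(input, ABS_RX, 0, 255, 0, 0);
        input_set_abs_params(input, ABS_RY, 0, 255, 0, 0);
        input_set_abs_params(input, ABS_RUDDER, 0, 255, 0, 0);
        input_set_abs_params(input, ABS_THROTTLE, 0, 255, 0, 0);
        input_set_abs_params(input, ABS_MISC, 0, 255, 0, 0);

        err = input_register_device(input);
        if (err) {
                input_free_device(input);
                return err;
        }

        *out = input;
        return 0;
}

static void pxrc_report(struct input_dev *input, const u8 *data)
{
        input_report_abs(input, ABS_X, data[0]);
        input_report_abs(input, ABS_Y, data[2]);
        input_report_abs(input, ABS_RX, data[3]);
        input_report_abs(input, ABS_RY, data[4]);
        input_report_abs(input, ABS_RUDDER, data[5]);
        input_report_abs(input, ABS_THROTTLE, data[6]);
        input_report_abs(input, ABS_MISC, data[7]);
        input_sync(input);
}

In a sketch like this, pxrc_report() would be called from the URB completion handler shown earlier.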

Using a simulator

Heli-X [2] is an excellent cross-platform flight simulator that runs perfectly on Linux. I have now spent several hours with a Goblin 700 helicopter and it is just as fun as I remembered.

/media/phoenixrc.jpg

Available in Linux kernel 4.17

Of course, all the code has been submitted to the Linux kernel and should be merged in v4.17.

get_maintainers and git send-email

get_maintainers and git send-email

Many people, like me, prefer email as a communication channel, especially for patches. GitHub, Gerrit and all the other "nice" and "user-friendly" tools that try to "help" you manage your submissions simply do not fit my workflow.

As you may already know, all patches to the Linux kernel are submitted by email. scripts/get_maintainer.pl (see [1] for more info about the process) is a handy tool that takes a patch as input and gives back a bunch of email addresses. These email addresses are usually passed to git send-email [2] for submission.

I have used various scripts to make the output from get_maintainer.pl fit git send-email, but I was not completely satisfied until I found the --to-cmd and --cc-cmd parameters of git send-email:

--to-cmd=<command>
 Specify a command to execute once per patch file which should generate patch file specific "To:" entries. Output of this command must be single email address per line. Default is the value of sendemail.tocmd configuration value.
--cc-cmd=<command>
 Specify a command to execute once per patch file which should generate patch file specific "Cc:" entries. Output of this command must be single email address per line. Default is the value of sendemail.ccCmd configuration value.

I'm very pleased with these parameters. All I have to do is put these extra lines into my ~/.gitconfig (or use git config):

[sendemail.linux]
    tocmd = "`pwd`/scripts/get_maintainer.pl --nogit --nogit-fallback --norolestats --nol"
    cccmd = "`pwd`/scripts/get_maintainer.pl --nogit --nogit-fallback --norolestats --nom"

To submit a patch, I just type:

git send-email --identity=linux ./0001-my-fancy-patch.patch

and let --to and --cc be populated automatically.