Capture a picture with V4L2

Brief

As we have seen before, cameras in Linux can be a complex [1] story, and you have to watch every step you take to get it right. libcamera [2] does a great job of simplifying this in a platform-independent way and should be used whenever possible.

But not all cameras have such a complex pipeline. Some cameras (e.g. web cameras) are "self-contained": the image data goes straight from the camera to the user application, without any detours through different IP blocks for image processing on the way.

/media/camera-sketch.png

The V4L2 framework is perfectly suited to those simple cameras.

When I searched around for a simple example application that explained the necessary steps to capture images from a camera, I simply could not find what I was looking for. This is my attempt to provide what I failed to find.

V4L2 user space API

Video devices are represented by character devices in a Linux system. The devices show up as /dev/video* and support the following operations:

  • open() - Open a video device
  • close() - Close a video device
  • ioctl() - Send ioctl commands to the device
  • mmap() - Map memory to a driver allocated buffer
  • read() - Read from video device
  • write() - Write to the device

The V4L2 API basically relies on a very large set of IOCTL commands to configure properties and behavior of the camera. The whole API is available from the following header:

#include <linux/videodev2.h>

Here is a list of the most common IOCTL commands:

  • VIDIOC_QUERYCAP - Query a list of the supported capabilities. Always query the capabilities to ensure that the camera supports the buffer mode you intend to use.
  • VIDIOC_ENUM_FMT - Enumerate supported image formats.
  • VIDIOC_G_FMT - Get the current image format.
  • VIDIOC_S_FMT - Set a new image format.
  • VIDIOC_REQBUFS - Request a number of buffers that can later be memory mapped by the user application. The application should always check the actual number granted, as the driver may allocate more or fewer buffers than requested.
  • VIDIOC_QUERYBUF - Get buffer information for those buffers earlier requested by VIDIOC_REQBUFS. The information could then be passed to the mmap() system call in order to map that buffer to user space.
  • VIDIOC_QBUF - Queue one of the requested buffers to make it available for the driver to fill with image data. Once the buffer is filled, it's no longer available for new data and should be dequeued by the user.
  • VIDIOC_DQBUF - Dequeue a filled buffer. The command will block if no buffer is available, unless O_NONBLOCK was passed to open().
  • VIDIOC_STREAMON - Turn on streaming. Queued buffers will be filled as soon as data is available.
  • VIDIOC_STREAMOFF - Turn off streaming. This command also flushes the buffer queue.

Buffer management

The V4L2 core maintains two buffer queues internally: one queue (referred to as IN) for incoming (camera->driver) image data and one (referred to as OUT) for outgoing (driver->user) image data.

Buffers are put into the IN queue via the VIDIOC_QBUF command. Once a buffer is filled, it is moved from the IN queue to the OUT queue, where the data is available to the user.

Whenever the user wants to dequeue a buffer with VIDIOC_DQBUF and a buffer is available, it's taken from the OUT queue and handed to the user application. If no buffer is available, the dequeue operation waits until a buffer is filled and available, unless the file descriptor was opened with O_NONBLOCK.

Video data can be pushed to userspace in a few different ways:

  • Read I/O - simply perform a read() operation and do not mess with buffers
  • User pointer - the user application allocates buffers and provides them to the driver
  • DMA buf - mostly used for mem2mem devices
  • mmap - let the driver allocate buffers and mmap(2) them to userspace

This post will *only* focus on memory-mapped (mmap) buffers!

Typical workflow

We will follow these steps in order to acquire frames from the camera:

/media/v4l2-workflow.png

Query capabilities

VIDIOC_QUERYCAP is used to query the supported capabilities. Most interesting is to verify that the device supports the streaming mode (V4L2_CAP_STREAMING) we want to work with. It's also good manners to verify that what we have opened actually is a capture device (V4L2_CAP_VIDEO_CAPTURE) and nothing else.

The V4L2 API uses a struct v4l2_capability that is passed to the IOCTL. This structure is defined as follows:

/**
  * struct v4l2_capability - Describes V4L2 device caps returned by VIDIOC_QUERYCAP
  *
  * @driver:           name of the driver module (e.g. "bttv")
  * @card:     name of the card (e.g. "Hauppauge WinTV")
  * @bus_info:         name of the bus (e.g. "PCI:" + pci_name(pci_dev) )
  * @version:          KERNEL_VERSION
  * @capabilities: capabilities of the physical device as a whole
  * @device_caps:  capabilities accessed via this particular device (node)
  * @reserved:         reserved fields for future extensions
  */
struct v4l2_capability {
    __u8    driver[16];
    __u8    card[32];
    __u8    bus_info[32];
    __u32   version;
    __u32   capabilities;
    __u32   device_caps;
    __u32   reserved[3];
};

The v4l2_capability.capabilities field is decoded as follows:

/* Values for 'capabilities' field */
#define V4L2_CAP_VIDEO_CAPTURE              0x00000001  /* Is a video capture device */
#define V4L2_CAP_VIDEO_OUTPUT               0x00000002  /* Is a video output device */
#define V4L2_CAP_VIDEO_OVERLAY              0x00000004  /* Can do video overlay */
#define V4L2_CAP_VBI_CAPTURE                0x00000010  /* Is a raw VBI capture device */
#define V4L2_CAP_VBI_OUTPUT         0x00000020  /* Is a raw VBI output device */
#define V4L2_CAP_SLICED_VBI_CAPTURE 0x00000040  /* Is a sliced VBI capture device */
#define V4L2_CAP_SLICED_VBI_OUTPUT  0x00000080  /* Is a sliced VBI output device */
#define V4L2_CAP_RDS_CAPTURE                0x00000100  /* RDS data capture */
#define V4L2_CAP_VIDEO_OUTPUT_OVERLAY       0x00000200  /* Can do video output overlay */
#define V4L2_CAP_HW_FREQ_SEEK               0x00000400  /* Can do hardware frequency seek  */
#define V4L2_CAP_RDS_OUTPUT         0x00000800  /* Is an RDS encoder */

/* Is a video capture device that supports multiplanar formats */
#define V4L2_CAP_VIDEO_CAPTURE_MPLANE       0x00001000
/* Is a video output device that supports multiplanar formats */
#define V4L2_CAP_VIDEO_OUTPUT_MPLANE        0x00002000
/* Is a video mem-to-mem device that supports multiplanar formats */
#define V4L2_CAP_VIDEO_M2M_MPLANE   0x00004000
/* Is a video mem-to-mem device */
#define V4L2_CAP_VIDEO_M2M          0x00008000

#define V4L2_CAP_TUNER                      0x00010000  /* has a tuner */
#define V4L2_CAP_AUDIO                      0x00020000  /* has audio support */
#define V4L2_CAP_RADIO                      0x00040000  /* is a radio device */
#define V4L2_CAP_MODULATOR          0x00080000  /* has a modulator */

#define V4L2_CAP_SDR_CAPTURE                0x00100000  /* Is a SDR capture device */
#define V4L2_CAP_EXT_PIX_FORMAT             0x00200000  /* Supports the extended pixel format */
#define V4L2_CAP_SDR_OUTPUT         0x00400000  /* Is a SDR output device */
#define V4L2_CAP_META_CAPTURE               0x00800000  /* Is a metadata capture device */

#define V4L2_CAP_READWRITE              0x01000000  /* read/write systemcalls */
#define V4L2_CAP_STREAMING              0x04000000  /* streaming I/O ioctls */
#define V4L2_CAP_META_OUTPUT                0x08000000  /* Is a metadata output device */

#define V4L2_CAP_TOUCH                  0x10000000  /* Is a touch device */

#define V4L2_CAP_IO_MC                      0x20000000  /* Is input/output controlled by the media controller */

#define V4L2_CAP_DEVICE_CAPS            0x80000000  /* sets device capabilities field */

Example code on how to use VIDIOC_QUERYCAP:

void query_capabilites(int fd)
{
    struct v4l2_capability cap;

    if (-1 == ioctl(fd, VIDIOC_QUERYCAP, &cap)) {
        perror("Query capabilities");
        exit(EXIT_FAILURE);
    }

    if (!(cap.capabilities & V4L2_CAP_VIDEO_CAPTURE)) {
        fprintf(stderr, "Device is no video capture device\n");
        exit(EXIT_FAILURE);
    }

    if (!(cap.capabilities & V4L2_CAP_READWRITE)) {
        fprintf(stderr, "Device does not support read i/o\n");
    }

    if (!(cap.capabilities & V4L2_CAP_STREAMING)) {
        fprintf(stderr, "Device does not support streaming i/o\n");
    }
}

Capabilities can also be read out with v4l2-ctl:

marcus@goliat:~$ v4l2-ctl -d /dev/video4  --info
Driver Info:
    Driver name      : uvcvideo
    Card type        : USB 2.0 Camera: USB Camera
    Bus info         : usb-0000:00:14.0-8.3.1.1
    Driver version   : 6.0.8
    Capabilities     : 0x84a00001
        Video Capture
        Metadata Capture
        Streaming
        Extended Pix Format
        Device Capabilities
    Device Caps      : 0x04200001
        Video Capture
        Streaming
        Extended Pix Format

Set format

Once we know for sure that the device is a capture device and supports the mode we want to use, the next step is to set up the video format. Otherwise the application could receive video frames in a format it cannot deal with.

Supported formats can be queried with VIDIOC_ENUM_FMT and the current video format can be read out with VIDIOC_G_FMT.

The current format can also be fetched with v4l2-ctl:

marcus@goliat:~$ v4l2-ctl -d /dev/video4  --get-fmt-video
Format Video Capture:
    Width/Height      : 320/240
    Pixel Format      : 'YUYV' (YUYV 4:2:2)
    Field             : None
    Bytes per Line    : 640
    Size Image        : 153600
    Colorspace        : sRGB
    Transfer Function : Rec. 709
    YCbCr/HSV Encoding: ITU-R 601
    Quantization      : Default (maps to Limited Range)
    Flags             :

The v4l2_format struct is defined as follows:

/**
 * struct v4l2_format - stream data format
 * @type:   enum v4l2_buf_type; type of the data stream
 * @pix:    definition of an image format
 * @pix_mp: definition of a multiplanar image format
 * @win:    definition of an overlaid image
 * @vbi:    raw VBI capture or output parameters
 * @sliced: sliced VBI capture or output parameters
 * @raw_data:       placeholder for future extensions and custom formats
 * @fmt:    union of @pix, @pix_mp, @win, @vbi, @sliced, @sdr, @meta
 *          and @raw_data
 */
struct v4l2_format {
    __u32    type;
    union {
        struct v4l2_pix_format              pix;     /* V4L2_BUF_TYPE_VIDEO_CAPTURE */
        struct v4l2_pix_format_mplane       pix_mp;  /* V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE */
        struct v4l2_window          win;     /* V4L2_BUF_TYPE_VIDEO_OVERLAY */
        struct v4l2_vbi_format              vbi;     /* V4L2_BUF_TYPE_VBI_CAPTURE */
        struct v4l2_sliced_vbi_format       sliced;  /* V4L2_BUF_TYPE_SLICED_VBI_CAPTURE */
        struct v4l2_sdr_format              sdr;     /* V4L2_BUF_TYPE_SDR_CAPTURE */
        struct v4l2_meta_format             meta;    /* V4L2_BUF_TYPE_META_CAPTURE */
        __u8        raw_data[200];                   /* user-defined */
    } fmt;
};

To set a format, populate the v4l2_format.type field with the relevant buffer type and fill in the matching member of the fmt union. Note that the driver may adjust the values you request (e.g. clamp the resolution), so the application should read back the structure after VIDIOC_S_FMT returns to see what it actually got.

Example code on how to use VIDIOC_S_FMT:

int set_format(int fd) {
    struct v4l2_format format = {0};
    format.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    format.fmt.pix.width = 320;
    format.fmt.pix.height = 240;
    format.fmt.pix.pixelformat = V4L2_PIX_FMT_YUYV;
    format.fmt.pix.field = V4L2_FIELD_NONE;
    int res = ioctl(fd, VIDIOC_S_FMT, &format);
    if(res == -1) {
        perror("Could not set format");
        exit(1);
    }
    return res;
}

Request buffers

Once we are done with the format preparations, the next step is to allocate buffers so we have somewhere to store the images.

This is exactly what the VIDIOC_REQBUFS ioctl does for you. The command takes a struct v4l2_requestbuffers as argument:

struct v4l2_requestbuffers {
    __u32                   count;
    __u32                   type;           /* enum v4l2_buf_type */
    __u32                   memory;         /* enum v4l2_memory */
    __u32                   capabilities;
    __u8                    flags;
    __u8                    reserved[3];
};

Some of these fields must be populated before we can use it:

  • v4l2_requestbuffers.count - Should be set to the number of memory buffers to allocate. It's important to set a number high enough so that frames won't be dropped due to lack of queued buffers. The driver decides what the minimum number is. The application should always check this field on return, as the driver could grant a different number of buffers than the application actually requested.
  • v4l2_requestbuffers.type - As we are going to use a camera device, set this to V4L2_BUF_TYPE_VIDEO_CAPTURE.
  • v4l2_requestbuffers.memory - Set the streaming method. Available values are V4L2_MEMORY_MMAP, V4L2_MEMORY_USERPTR and V4L2_MEMORY_DMABUF.

Example code on how to use VIDIOC_REQBUFS:

int request_buffer(int fd, int count) {
    struct v4l2_requestbuffers req = {0};
    req.count = count;
    req.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    req.memory = V4L2_MEMORY_MMAP;
    if (-1 == ioctl(fd, VIDIOC_REQBUFS, &req))
    {
        perror("Requesting Buffer");
        exit(1);
    }
    return req.count;
}

Query buffer

After the buffers are allocated by the kernel, we have to query the offset of each allocated buffer in order to mmap() it. Note that the offset is not a physical address but a "cookie" that the driver translates in its mmap() handler.

The VIDIOC_QUERYBUF ioctl works with the struct v4l2_buffer:

/**
 * struct v4l2_buffer - video buffer info
 * @index:  id number of the buffer
 * @type:   enum v4l2_buf_type; buffer type (type == *_MPLANE for
 *          multiplanar buffers);
 * @bytesused:      number of bytes occupied by data in the buffer (payload);
 *          unused (set to 0) for multiplanar buffers
 * @flags:  buffer informational flags
 * @field:  enum v4l2_field; field order of the image in the buffer
 * @timestamp:      frame timestamp
 * @timecode:       frame timecode
 * @sequence:       sequence count of this frame
 * @memory: enum v4l2_memory; the method, in which the actual video data is
 *          passed
 * @offset: for non-multiplanar buffers with memory == V4L2_MEMORY_MMAP;
 *          offset from the start of the device memory for this plane,
 *          (or a "cookie" that should be passed to mmap() as offset)
 * @userptr:        for non-multiplanar buffers with memory == V4L2_MEMORY_USERPTR;
 *          a userspace pointer pointing to this buffer
 * @fd:             for non-multiplanar buffers with memory == V4L2_MEMORY_DMABUF;
 *          a userspace file descriptor associated with this buffer
 * @planes: for multiplanar buffers; userspace pointer to the array of plane
 *          info structs for this buffer
 * @m:              union of @offset, @userptr, @planes and @fd
 * @length: size in bytes of the buffer (NOT its payload) for single-plane
 *          buffers (when type != *_MPLANE); number of elements in the
 *          planes array for multi-plane buffers
 * @reserved2:      drivers and applications must zero this field
 * @request_fd: fd of the request that this buffer should use
 * @reserved:       for backwards compatibility with applications that do not know
 *          about @request_fd
 *
 * Contains data exchanged by application and driver using one of the Streaming
 * I/O methods.
 */
struct v4l2_buffer {
    __u32                   index;
    __u32                   type;
    __u32                   bytesused;
    __u32                   flags;
    __u32                   field;
    struct timeval          timestamp;
    struct v4l2_timecode    timecode;
    __u32                   sequence;

    /* memory location */
    __u32                   memory;
    union {
        __u32           offset;
        unsigned long   userptr;
        struct v4l2_plane *planes;
        __s32               fd;
    } m;
    __u32                   length;
    __u32                   reserved2;
    union {
        __s32               request_fd;
        __u32               reserved;
    };
};

The structure contains a lot of fields, but in our mmap() example, we only need to fill out a few:

  • v4l2_buffer.type - Buffer type, we use V4L2_BUF_TYPE_VIDEO_CAPTURE.
  • v4l2_buffer.memory - Memory method, still go for V4L2_MEMORY_MMAP.
  • v4l2_buffer.index - As we have probably requested multiple buffers and want to mmap() each of them, we have to distinguish the buffers somehow. The index field is a buffer id ranging from 0 to v4l2_requestbuffers.count - 1.

Example code on how to use VIDIOC_QUERYBUF:

int query_buffer(int fd, int index, unsigned char **buffer) {
    struct v4l2_buffer buf = {0};
    buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    buf.memory = V4L2_MEMORY_MMAP;
    buf.index = index;
    int res = ioctl(fd, VIDIOC_QUERYBUF, &buf);
    if(res == -1) {
        perror("Could not query buffer");
        return -1;
    }

    *buffer = (unsigned char*)mmap(NULL, buf.length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, buf.m.offset);
    if (*buffer == MAP_FAILED) {
        perror("mmap");
        return -1;
    }
    return buf.length;
}

Queue buffers

Before the buffers can be filled with data, they have to be enqueued. Enqueuing locks the memory pages used so that they cannot be swapped out during usage. The buffers remain locked until they are dequeued, the device is closed, or streaming is turned off.

VIDIOC_QBUF takes the same argument as VIDIOC_QUERYBUF and has to be populated the same way.

Example code on how to use VIDIOC_QBUF:

int queue_buffer(int fd, int index) {
    struct v4l2_buffer bufd = {0};
    bufd.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    bufd.memory = V4L2_MEMORY_MMAP;
    bufd.index = index;
    if(-1 == ioctl(fd, VIDIOC_QBUF, &bufd))
    {
        perror("Queue Buffer");
        return 1;
    }
    return bufd.bytesused;
}

Start stream

Finally, all preparations are done and we are ready to start the stream! VIDIOC_STREAMON basically informs the V4L2 layer that it can start acquiring video frames and use the queued buffers to store them.

Example code on how to use VIDIOC_STREAMON:

int start_streaming(int fd) {
    unsigned int type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    if(ioctl(fd, VIDIOC_STREAMON, &type) == -1){
        perror("VIDIOC_STREAMON");
        exit(1);
    }
    return 0;
}

Dequeue buffer

Once buffers are filled with video data, those are ready to be dequeued and consumed by the application. This ioctl will be blocking (unless O_NONBLOCK is used) until a buffer is available.

As soon as the buffer is dequeued and processed, the application has to queue the buffer back immediately so that the driver can fill it with new frames. This is usually part of the application's main loop.

VIDIOC_DQBUF works similar to VIDIOC_QBUF but it populates the v4l2_buffer.index field with the index number of the buffer that has been dequeued.

Example code on how to use VIDIOC_DQBUF:

int dequeue_buffer(int fd) {
    struct v4l2_buffer bufd = {0};
    bufd.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    bufd.memory = V4L2_MEMORY_MMAP;
    if(-1 == ioctl(fd, VIDIOC_DQBUF, &bufd))
    {
        perror("DeQueue Buffer");
        return -1;
    }
    return bufd.index;
}

Stop stream

Once we are done with the video capturing, we can stop the stream. This unlocks all enqueued buffers and stops capturing frames.

Example code on how to use VIDIOC_STREAMOFF:

int stop_streaming(int fd) {
    unsigned int type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    if(ioctl(fd, VIDIOC_STREAMOFF, &type) == -1){
        perror("VIDIOC_STREAMOFF");
        exit(1);
    }
    return 0;
}

Full example

It's not the most beautiful example, but it's at least something to work with.

#include <stdio.h>
#include <stdlib.h>

#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
#include <sys/mman.h>
#include <sys/ioctl.h>
#include <sys/select.h>
#include <linux/videodev2.h>

#define NBUF 3

void query_capabilites(int fd)
{
    struct v4l2_capability cap;

    if (-1 == ioctl(fd, VIDIOC_QUERYCAP, &cap)) {
        perror("Query capabilities");
        exit(EXIT_FAILURE);
    }

    if (!(cap.capabilities & V4L2_CAP_VIDEO_CAPTURE)) {
        fprintf(stderr, "Device is no video capture device\n");
        exit(EXIT_FAILURE);
    }

    if (!(cap.capabilities & V4L2_CAP_READWRITE)) {
        fprintf(stderr, "Device does not support read i/o\n");
    }

    if (!(cap.capabilities & V4L2_CAP_STREAMING)) {
        fprintf(stderr, "Device does not support streaming i/o\n");
        exit(EXIT_FAILURE);
    }
}

int queue_buffer(int fd, int index) {
    struct v4l2_buffer bufd = {0};
    bufd.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    bufd.memory = V4L2_MEMORY_MMAP;
    bufd.index = index;
    if(-1 == ioctl(fd, VIDIOC_QBUF, &bufd))
    {
        perror("Queue Buffer");
        return 1;
    }
    return bufd.bytesused;
}
int dequeue_buffer(int fd) {
    struct v4l2_buffer bufd = {0};
    bufd.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    bufd.memory = V4L2_MEMORY_MMAP;
    if(-1 == ioctl(fd, VIDIOC_DQBUF, &bufd))
    {
        perror("DeQueue Buffer");
        return -1;
    }
    return bufd.index;
}


int start_streaming(int fd) {
    unsigned int type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    if(ioctl(fd, VIDIOC_STREAMON, &type) == -1){
        perror("VIDIOC_STREAMON");
        exit(EXIT_FAILURE);
    }
    return 0;
}

int stop_streaming(int fd) {
    unsigned int type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    if(ioctl(fd, VIDIOC_STREAMOFF, &type) == -1){
        perror("VIDIOC_STREAMOFF");
        exit(EXIT_FAILURE);
    }
    return 0;
}

int query_buffer(int fd, int index, unsigned char **buffer) {
    struct v4l2_buffer buf = {0};
    buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    buf.memory = V4L2_MEMORY_MMAP;
    buf.index = index;
    int res = ioctl(fd, VIDIOC_QUERYBUF, &buf);
    if(res == -1) {
        perror("Could not query buffer");
        return -1;
    }

    *buffer = (unsigned char*)mmap(NULL, buf.length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, buf.m.offset);
    if (*buffer == MAP_FAILED) {
        perror("mmap");
        return -1;
    }
    return buf.length;
}

int request_buffer(int fd, int count) {
    struct v4l2_requestbuffers req = {0};
    req.count = count;
    req.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    req.memory = V4L2_MEMORY_MMAP;
    if (-1 == ioctl(fd, VIDIOC_REQBUFS, &req))
    {
        perror("Requesting Buffer");
        exit(EXIT_FAILURE);
    }
    return req.count;
}

int set_format(int fd) {
    struct v4l2_format format = {0};
    format.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    format.fmt.pix.width = 320;
    format.fmt.pix.height = 240;
    format.fmt.pix.pixelformat = V4L2_PIX_FMT_YUYV;
    format.fmt.pix.field = V4L2_FIELD_NONE;
    int res = ioctl(fd, VIDIOC_S_FMT, &format);
    if(res == -1) {
        perror("Could not set format");
        exit(EXIT_FAILURE);
    }
    return res;
}

int main() {
    unsigned char *buffer[NBUF];
    int fd = open("/dev/video4", O_RDWR);
    if (fd < 0) {
        perror("open");
        exit(EXIT_FAILURE);
    }
    int size;
    int index;
    int nbufs;

    query_capabilites(fd);
    set_format(fd);
    nbufs = request_buffer(fd, NBUF);
    if (nbufs > NBUF) {
        fprintf(stderr, "Increase NBUF to at least %i\n", nbufs);
        exit(1);
    }

    for (int i = 0; i < NBUF; i++) {
        /* Assume all sizes are equal.. */
        size = query_buffer(fd, i, &buffer[i]);
        queue_buffer(fd, i);
    }

    start_streaming(fd);
    fd_set fds;
    FD_ZERO(&fds);
    FD_SET(fd, &fds);
    struct timeval tv = {0};
    tv.tv_sec = 2;
    int r = select(fd+1, &fds, NULL, NULL, &tv);
    if(-1 == r){
        perror("Waiting for Frame");
        exit(1);
    } else if (0 == r) {
        fprintf(stderr, "Timeout waiting for frame\n");
        exit(1);
    }

    index = dequeue_buffer(fd);
    int file = open("output.raw", O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (file < 0) {
        perror("open output");
        exit(1);
    }
    write(file, buffer[index], size);

    stop_streaming(fd);

    close(file);
    close(fd);

    return 0;
}

V4L2 and media controller

The media infrastructure in the kernel is a giant beast that handles many different types of devices involving different buses and electrical interfaces. Providing an interface to handle the complexity of the hardware is not an easy task. Most devices consist of multiple ICs with different communication protocols... so the device drivers tend to be very complex as well.

Video For Linux 2 (V4L2) is the interface for such media devices. V4L2 is the second version of V4L and is not really compatible with it; there is a compatibility mode, but the support is more often than not incomplete. The name Video4Linux is a counterpart to Video4Windows, but the two are not technically related at all.

Here is an example of what a system may look like (borrowed from the Linux kernel documentation)

/media/typical_media_device.svg

Media controller

System-on-Chip (SoC) devices often provide a wide range of hardware blocks that can be interconnected in a variety of ways to obtain the desired functionality. To configure these hardware blocks, the kernel provides the Media Controller kernel API, which exposes detailed information about the media device and lets the blocks be interconnected in a dynamic and complex way at runtime, all from userspace.

Each hardware block, called an entity, in the media controller framework has one or more source and sink pads. The API lets the user link source pads to sink pads and set the format of pads.

Here is the topology exported from my sabresd with an imx219 (camera module) connected:

/media/media-ctl-graph.png

Let's go through the entities in the picture. All these entities are of course specific to the iMX6 SoC. (Partly taken from the kernel documentation.)

imx219 1-0010

This is the camera sensor. The sensor is controlled with I2C commands and the image data is streamed over the MIPI CSI-2 interface. The name tells us that the sensor is connected to I2C bus 1 and that the device has address 0x10.

The entity has one source pad.

imx6-mipi-csi2

This is the MIPI CSI-2 receiver entity. It has one sink pad to receive the MIPI CSI-2 stream (usually from a MIPI CSI-2 camera sensor). It has four source pads, corresponding to the four MIPI CSI-2 demuxed virtual channel outputs. Multiple source pads can be enabled to independently stream from multiple virtual channels.

ipuX_csiY_mux

These are the video multiplexers. They have two or more sink pads to select from either camera sensors with a parallel interface or from MIPI CSI-2 virtual channels from the imx6-mipi-csi2 entity. They have a single source pad that routes to a CSI (the ipuX_csiY entities).

ipuX_csiY

These are the CSI entities. They have a single sink pad receiving from either a video mux or from a MIPI CSI-2 virtual channel as described above.

ipuX_vdic

The VDIC carries out motion compensated de-interlacing, with three motion compensation modes: low, medium, and high motion. The mode is specified with the menu control V4L2_CID_DEINTERLACING_MODE. The VDIC has two sink pads and a single source pad.

ipuX_ic_prp

This is the IC pre-processing entity. It acts as a router, routing data from its sink pad to one or both of its source pads.

The direct sink pad receives from an ipuX_csiY direct pad. With this link the VDIC can only operate in high motion mode.

ipuX_ic_prpenc

This is the IC pre-processing encode entity. It has a single sink pad from ipuX_ic_prp, and a single source pad. The source pad is routed to a capture device node, with a node name of the format "ipuX_ic_prpenc capture".

This entity performs the IC pre-process encode task operations: color-space conversion, resizing (downscaling and upscaling), horizontal and vertical flip, and 90/270 degree rotation. Flip and rotation are provided via standard V4L2 controls.

Like the ipuX_csiY IDMAC source, this entity also supports simple de-interlace without motion compensation, and pixel reordering.

ipuX_ic_prpvf

This is the IC pre-processing viewfinder entity. It has a single sink pad from ipuX_ic_prp, and a single source pad. The source pad is routed to a capture device node, with a node name of the format "ipuX_ic_prpvf capture".

This entity is identical in operation to ipuX_ic_prpenc, with the same resizing and CSC operations and flip/rotation controls. It will receive and process de-interlaced frames from the ipuX_vdic if ipuX_ic_prp is receiving from ipuX_vdic.

Capture video stream from sensor

In order to capture a video stream from the sensor we need to:

  1. Create links between the needed entities
  2. Configure pads to hold the correct image format

To do this, we use the media-ctl [1] tool.
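
Create links

Step 1 is to wire up the entities. The commands below are only a sketch: the pad numbers are assumptions for this particular board and must be verified against the actual graph (media-ctl --print-topology) before use:

```shell
# Illustrative links for the sensor -> CSI-2 receiver -> mux -> CSI pipe.
# Pad numbers are assumptions; check them with: media-ctl --print-topology
media-ctl -l "'imx219 1-0010':0 -> 'imx6-mipi-csi2':0[1]"
media-ctl -l "'imx6-mipi-csi2':1 -> 'ipu1_csi0_mux':1[1]"
media-ctl -l "'ipu1_csi0_mux':5 -> 'ipu1_csi0':0[1]"
media-ctl -l "'ipu1_csi0':1 -> 'ipu1_csi0 capture':0[1]"
```

The [1] flag marks the link as enabled; a link set with [0] exists in the graph but carries no data.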

Configure pads

We also need to configure each pad to the right format. This image sensor outputs in raw Bayer format (SRGGB8).

export fmt=SRGGB8_1X8/640x480
media-ctl --set-v4l2 "'imx219 1-0010':0[fmt:$fmt field:none]"
media-ctl --set-v4l2 "'imx6-mipi-csi2':1[fmt:$fmt field:none]"
media-ctl --set-v4l2 "'ipu1_csi0_mux':5[fmt:$fmt field:none]"
media-ctl --set-v4l2 "'ipu1_csi0':1[fmt:$fmt field:none]"
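
The configuration can be read back to verify that it stuck; the entity and pad below mirror the last command above:

```shell
# Read back the format on the CSI source pad we just configured
media-ctl --get-v4l2 "'ipu1_csi0':1"
# Or dump the whole graph with link and format state
media-ctl --print-topology
```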

Stream to framebuffer

Now a full pipe is created from the imx219 sensor to the video0 device.

GStreamer is a handy multimedia framework that we can use to test the full chain:

gst-launch-1.0 -vvv v4l2src device=/dev/video0 io-mode=dmabuf blocksize=76800 ! "video/x-bayer,format=rggb,width=640,height=480,framerate=30/1" ! queue ! bayer2rgbneon ! videoconvert ! fbdevsink sync=false
/media/imx219.jpg