Skip flashing unused blocks with UUU

TL;DR: UUU now supports (or will shortly support) block maps for flashing images. Use it. It will shorten your flashing time *a lot.*

It will soon be time to manufacture circuit boards for a project I'm currently working on. After manufacturing, the boards will need firmware, but how do we flash it in the most efficient way? The board is based on an i.MX 8 chip from NXP and uses an eMMC as its storage medium.

Overall, short iteration times are nice to have, as they benefit the developers' daily development cycles. On the production line, however, the time it takes to flash a device with its firmware is not just "nice to have", it's critical. We do not want to have a (another?) bottleneck in the production line, so we have to make some effort to make flashing as fast as possible.

The Scenario

The scenario we have in front of us is that:

  • We want to write a full disk image to the eMMC in production
  • Most space in the disk image is reserved for empty partitions
  • The disk image is quite big (~12GB)
  • Only ~2GB of the disk image is "real data" (not just chunks of zeros)
  • Flashing the firmware to the eMMC takes ~25min, which is unacceptable

Sparse files are a very familiar concept to me, and the .wic file created by the Yocto project is indeed a sparse file. But the flashing tool (UUU) did not seem to support writing sparse images without expanding them (that is not completely true, as we will see).

To speed up the production line, I thought of several different solutions including:

  • Do not include empty partitions in the .wic file, create those at the first boot instead
  • Only use UUU to boot the board and then flash the disk image using dfu-util
  • ...

But each of them has its own drawbacks. It would be best if we could stick to a uniform way of flashing everything we need, and UUU will be needed regardless.

Before we go any further, let's look at what a sparse file is.

Sparse images

A sparse image [2] is a type of file that contains metadata to describe empty sections (let's call them holes) instead of filling them up with zeros (which take up actual space). The advantage is that sparse files only take as much space as they actually need.

For example, if the logical file size is, say, 128 GiB and the areas with real data add up to only 8 GiB, the physical file size will be only 8 GiB plus metadata. The allocated size is significantly reduced compared to a full 128 GiB image.

[Figure: sparse-file.png]

Most of the filesystems in Linux (btrfs, XFS, ext[234], ...) and even Windows (NTFS) handle sparse files transparently to the user. This means that the filesystem will not allocate more disk space than necessary. Reading from a hole just returns zeros, and the filesystem allocates blocks only when someone actually writes to the hole.

Think of it as something similar to the Copy-On-Write (COW) feature that many modern filesystems implement.

An example

It can be a little bit confusing to figure out the file size of a sparse file.

For example, ls reports 14088667136 bytes, which is quite a big file!

$ ls -alrt image.wic
-rw-r--r-- 1 marcus marcus 14088667136 12 dec 11.08 image.wic

But du only reports 1997888 blocks of 1 KiB each (about 1.9 GiB)?!

$ du  image.wic
1997888    image.wic

To get the real occupied size, we need to print the allocated size in blocks (-s):

$ ls -alrts image.wic
1997888 -rw-r--r-- 1 marcus marcus 14088667136 12 dec 11.08 image.wic

If you want to see it with your own eyes, you can easily create a 2 GiB file with the truncate command:

$ truncate  -s  2G image.sparse
$ stat image.sparse 
  File: image.sparse
    Size: 2147483648	Blocks: 0          IO Block: 4096   regular file

As you can see, the file size is 2 GiB, but it occupies zero blocks on the disk. The file consists only of one big chunk of zeros, which is described in the metadata.
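The same comparison can be made programmatically. Below is a minimal Python sketch, assuming a Linux filesystem with sparse-file support. The helper names sizes and make_sparse and the file name are made up for this illustration, and the sketch relies on st_blocks being counted in 512-byte units, as stat(2) defines it on Linux.

```python
import os
import tempfile


def sizes(path):
    """Return (logical_size, allocated_size) in bytes for path."""
    st = os.stat(path)
    # st_size is the logical size; st_blocks counts the 512-byte units
    # that are actually allocated on disk (Linux stat(2) semantics).
    return st.st_size, st.st_blocks * 512


def make_sparse(path, size):
    """Create a file with the given logical size without writing any data."""
    with open(path, "wb") as f:
        f.truncate(size)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "image.sparse")
        make_sparse(path, 2 * 1024**3)  # same as: truncate -s 2G
        logical, allocated = sizes(path)
        print(f"logical: {logical} bytes, allocated: {allocated} bytes")
```

On a filesystem that supports holes, the allocated size should come out far smaller than the logical size, which is the same gap that ls and du show above.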

Handle the holes with care

It's easy to expand a sparse file by accident if you are not careful. The most common applications used to copy files have a --sparse option to preserve the holes. If it is not used, the application will allocate blocks and fill them with zeros. For some applications the --sparse behavior is the default, e.g., see the relevant part of the manpage for cp(1):

--sparse=WHEN
              control creation of sparse files. See below

 ...
 By default, sparse SOURCE files are detected by a crude heuristic and the corresponding DEST file is made sparse as well.
 That  is  the  behavior  selected  by --sparse=auto.
 Specify --sparse=always to create a sparse DEST file whenever the SOURCE file contains a long enough sequence of zero bytes.
 Use --sparse=never to inhibit creation of sparse files.

Other applications do not enable it by default, e.g. rsync(1):

--sparse, -S
        Try to handle sparse files efficiently so they take up less space on the destination.  If combined with --inplace the file created might not end  up  with
        sparse  blocks  with some combinations of kernel version and/or filesystem type.  If --whole-file is in effect (e.g. for a local copy) then it will always
        work because rsync truncates the file prior to writing out the updated version.

        Note that versions of rsync older than 3.1.3 will reject the combination of --sparse and --inplace.

Compressing and then decompressing a file usually expands a sparse file as well.
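To make "preserving the holes" concrete, here is a minimal Python sketch of a hole-aware copy. It is not how cp or rsync are implemented; it only illustrates the idea using lseek(2) with SEEK_DATA and SEEK_HOLE, which are assumed to be available (Linux, Python 3). The function names data_extents and sparse_copy are made up for this example.

```python
import errno
import os


def data_extents(fd, size):
    """Yield (start, end) byte ranges of fd that contain data (not holes)."""
    offset = 0
    while offset < size:
        try:
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as e:
            if e.errno == errno.ENXIO:  # no more data after this offset
                return
            raise
        # The end of the file counts as an implicit hole.
        end = os.lseek(fd, start, os.SEEK_HOLE)
        yield start, end
        offset = end


def sparse_copy(src_path, dst_path, chunk=1024 * 1024):
    """Copy src to dst, writing only the data extents and skipping holes."""
    src = os.open(src_path, os.O_RDONLY)
    dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        size = os.fstat(src).st_size
        os.ftruncate(dst, size)  # same logical size, nothing written yet
        for start, end in data_extents(src, size):
            offset = start
            while offset < end:
                buf = os.pread(src, min(chunk, end - offset), offset)
                if not buf:
                    break
                # pwrite may write less than requested; advance by what
                # was actually written and re-read the rest.
                offset += os.pwrite(dst, buf, offset)
    finally:
        os.close(src)
        os.close(dst)
```

On a filesystem without hole support, SEEK_DATA simply reports the whole file as data, so the copy is still correct, just not sparse.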

bmap-tools

bmap-tools [4] is a handy tool for creating block maps (bmaps) for a file and then using that information to copy the file to a medium in a more efficient way.

The advantages of bmap-tools compared to e.g. dd are (as they state on their GitHub page):

  • Faster. Depending on various factors, like write speed, image size, how full is the image, and so on, bmaptool was 5-7 times faster than dd in the Tizen IVI project.
  • Integrity. bmaptool verifies data integrity while flashing, which means that possible data corruptions will be noticed immediately.
  • Usability. bmaptool can read images directly from the remote server, so users do not have to download images and save them locally.
  • Protects user's data. Unlike dd, if you make a mistake and specify a wrong block device name, bmaptool will less likely destroy your data because it has protection mechanisms which, for example, prevent bmaptool from writing to a mounted block device.

The tool comes with two subcommands, create and copy. The create subcommand generates the block map, and the copy subcommand copies a file to a given destination.

A typical usage of dd could be to write an image (bigimage) to an SD-card (/dev/sdX):

 $ bzcat bigimage | dd of=/dev/sdX bs=1M conv=fsync

The dd command will copy all data, including the zeros for the empty partitions. Instead of copying all data with dd, we could generate a block map file with metadata that describes all the "empty" sections:

bmaptool create bigimage > bigimage.bmap

The bmap file is a human-readable XML file that lists all the mapped block ranges along with a checksum for each range:

<?xml version="1.0" ?>
<!-- This file contains the block map for an image file, which is basically
     a list of useful (mapped) block numbers in the image file. In other words,
     it lists only those blocks which contain data (boot sector, partition
     table, file-system metadata, files, directories, extents, etc). These
     blocks have to be copied to the target device. The other blocks do not
     contain any useful data and do not have to be copied to the target
     device.

     The block map an optimization which allows to copy or flash the image to
     the image quicker than copying of flashing the entire image. This is
     because with bmap less data is copied: <MappedBlocksCount> blocks instead
     of <BlocksCount> blocks.

     Besides the machine-readable data, this file contains useful commentaries
     which contain human-readable information like image size, percentage of
     mapped data, etc.

     The 'version' attribute is the block map file format version in the
     'major.minor' format. The version major number is increased whenever an
     incompatible block map format change is made. The minor number changes
     in case of minor backward-compatible changes. -->

<bmap version="2.0">
    <!-- Image size in bytes: 13.1 GiB -->
    <ImageSize> 14088667136 </ImageSize>

    <!-- Size of a block in bytes -->
    <BlockSize> 4096 </BlockSize>

    <!-- Count of blocks in the image file -->
    <BlocksCount> 3439616 </BlocksCount>

    <!-- Count of mapped blocks: 1.9 GiB or 14.5%     -->
    <MappedBlocksCount> 499471  </MappedBlocksCount>

    <!-- Type of checksum used in this file -->
    <ChecksumType> sha256 </ChecksumType>

    <!-- The checksum of this bmap file. When it's calculated, the value of
         the checksum has be zero (all ASCII "0" symbols).  -->
    <BmapFileChecksum> 19337986514b3952866af5fb80054c40166f608f42efbf0551956e80678aba46 </BmapFileChecksum>

    <!-- The block map which consists of elements which may either be a
         range of blocks or a single block. The 'chksum' attribute
         (if present) is the checksum of this blocks range. -->
    <BlockMap>
        <Range chksum="a53e6dcf984243a7d87a23fed5d28f3594c72a83081e90e296694256cfc26395"> 0-2 </Range>
        <Range chksum="a60a80a89aa832b3480da91fe0c75e77e24b29c1de0fcdffb9d2b50102013ff5"> 8-452 </Range>
        <Range chksum="7b07df777cd3441ac3053c2c39be52baef308e4befabd500d0c72ef4ab7c5565"> 2048-12079 </Range>
        <Range chksum="cf6ac0e4aec163a228bad0ab85722e011dd785fc8d047dc4d0f86e886fa6684d"> 24576-24902 </Range>
        <Range chksum="7521285590601370cc063cc807237eaf666f879d84d4fcae001026a7bb5a7eff"> 24904-24905 </Range>
        <Range chksum="9661e72b75fe483d53d10585bff79316ea38f15c48547b99da3e0c8b38634ceb"> 24920-26173 </Range>
        <Range chksum="93d69200bd59286ebf66ba57ae7108cc46ef81b3514b1d45864d4200cf4c4182"> 40824-57345 </Range>
        <Range chksum="56ed03479c8d200325e5ef6c1c88d8f86ef243bf11394c6a27faa8f1a98ab30e"> 57656-122881 </Range>
        <Range chksum="cad5eeff8c937438ced0cba539d22be29831b0dae986abdfd3edc8ecf841afdb"> 123192-188417 </Range>
        <Range chksum="88992bab7f6a4d06b55998e74e8a4635271592b801d642c1165528d5e22f23ff"> 188728-253953 </Range>
        <Range chksum="f5332ff218a38bf58e25b39fefc8e00f5f95e01928d0a5edefa29340f4a24b42"> 254264-319489 </Range>
        <Range chksum="714c8373aa5f7b55bd01d280cb182327f407c434e1dcc89f04f9d4c522c24522"> 319800-509008 </Range>
        <Range chksum="b1ec6c807ea2f47e0d937ac0032693a927029ff5d451183a5028b8af59fb3dd2"> 548864 </Range>
        <Range chksum="554448825ffae96d000e4b965ef2c53ae305a6734d1756c72cd68d6de7d2a8b0"> 548867 </Range>
        <Range chksum="f627ca4c2c322f15db26152df306bd4f983f0146409b81a4341b9b340c365a16"> 660672-660696 </Range>
        <Range chksum="fa01901c7f34cdb08467000c3b52a4927e208a6c128892018c0f5039fedb93c0"> 661504-661569 </Range>
        <Range chksum="ee3b76f97d2b039924e60726a165f688c93c87add940880a5d6a4a62fe4f7876"> 661573 </Range>
        <Range chksum="db3aa8c65438e58fc0140b81da36ec37c36fa94549b43ad32d36e206a9ec7819"> 661577-661578 </Range>
        <Range chksum="726f800d4405552f2406fb07e158a3966a89006f9bf474104c822b425aed0f18"> 662593-662596 </Range>
        <Range chksum="e9085dd345859b72dbe86b9c33ce44957dbe742ceb5221330ad87d83676af6cb"> 663552-663553 </Range>
        <Range chksum="1f3660636b045d5fc31af89f6e276850921d34e3e008a2f5337f1a1372b61114"> 667648-667649 </Range>
        <Range chksum="b3f3b8a0818aeade05b5e7af366663130fdc502c63a24a236bb54545d548d458"> 671744-671745 </Range>
        <Range chksum="b4a759d879bc4a47e8b292a5a02565d551350e9387227016d8aed62f3a668a1d"> 675840-675841 </Range>
        <Range chksum="1ff00c7f6feca0b42a957812832b89e7839c6754ca268b06b3c28ba6124c36fd"> 679936-679937 </Range>
        <Range chksum="9bbedd9ac9ab58f06abc2cdc678ac5ddf77c117f0004139504d2aaa321c7475f"> 694272 </Range>
        <Range chksum="c8defb770c630fa1faca3d8c2dff774f7967a0b9902992f313fc8984a9bb43a8"> 696320-698368 </Range>
        <Range chksum="9d5fad916ab4a1f7311397aa3f820ae29ffff4c3937b43a49a0a2a62bf8d6846"> 712704-712705 </Range>
        <Range chksum="48f179784918cf9bf366201edc1585a243f892e6133aa296609486734398be7f"> 716800-716801 </Range>
        <Range chksum="cc0dd36d4891ae52c5626759b21edc4894c76ba145da549f85316646f00e8f2d"> 727040 </Range>
        <Range chksum="955be1419c8df86dc56151737426c53f5ba9df523f66d2b6056d3b2028e5240d"> 759808 </Range>
        <Range chksum="ab6cafa862c5a250b731e41e373ad49767535f6920297376835f04a9e81de023"> 759811 </Range>
        <Range chksum="0ec8e7c7512276ad5fabe76591aeda7ce6c3e031b170454f800a00b330eea2de"> 761856-761857 </Range>
        <Range chksum="de2f256064a0af797747c2b97505dc0b9f3df0de4f489eac731c23ae9ca9cc31"> 789488-789503 </Range>
        <Range chksum="8c8853e6be07575b48d3202ba3e9bb4035b025cd7b546db7172e82f97bd53b1c"> 790527-791555 </Range>
        <Range chksum="918fc7f7a7300f41474636bf6ac4e7bf18e706a341f5ee75420625b85c66c47e"> 791571 </Range>
        <Range chksum="c25cb25875f56f2832b2fcd7417b7970c882cc73ffd6e8eaa7b9b09cbf19f172"> 791587 </Range>
        <Range chksum="52aec994d40f31ce57d57391a8f6db6c23393dfca6d53b99fb8f0ca4faad0253"> 807971-807976 </Range>
        <Range chksum="8ffcde1a642eb739a05bde4d74c73430d5fb4bcffe4293fed7b35e3ce9fe16df"> 823296-823298 </Range>
        <Range chksum="1d21f93952243d3a6ba80e34edcac9a005ad12c82dec79203d9ae6e0c2416043"> 888832-888834 </Range>
        <Range chksum="cdb117909634c406611acef34bedc52de88e8e8102446b3924538041f36a6af9"> 954368-954370 </Range>
        <Range chksum="18945d38f9ac981a16fb5c828b42da4ce86c43fe7b033b7c69b21835c4a3945b"> 1019904-1019906 </Range>
        <Range chksum="1ad3cc0ed83dad0ea03b9fcc4841332361ded18b8e29c382bfbb142a962aef81"> 1085440-1085442 </Range>
        <Range chksum="c19fc180da533935f50551627bd3913d56ff61a3ecf6740fbc4be184c5385bc9"> 1314816 </Range>
        <Range chksum="b045e6499699f9dea08702e15448f5da3e4736f297bbe5d91bb899668d488d22"> 1609728-1609730 </Range>
        <Range chksum="a25229feee84521831f652f3ebcbb23b86192c181ef938ed0af258bfde434945"> 1675264-1675266 </Range>
        <Range chksum="741a9ded7aee9d4812966c66df87632d3e1fde97c68708a10c5f71b9d76b0adf"> 1839104-1839105 </Range>
        <Range chksum="d17c50d57cfb768c3308eff3ccc11cf220ae41fd52fe94de9c36a465408dd268"> 1871872-1888255 </Range>
        <Range chksum="c19fc180da533935f50551627bd3913d56ff61a3ecf6740fbc4be184c5385bc9"> 2363392 </Range>
        <Range chksum="1747cefe9731aa9a0a9e07303199f766beecaef8fd3898d42075f833950a3dd2"> 2396160-2396162 </Range>
        <Range chksum="c19fc180da533935f50551627bd3913d56ff61a3ecf6740fbc4be184c5385bc9"> 2887680 </Range>
        <Range chksum="ad7facb2586fc6e966c004d7d1d16b024f5805ff7cb47c7a85dabd8b48892ca7"> 2887695 </Range>
        <Range chksum="de2f256064a0af797747c2b97505dc0b9f3df0de4f489eac731c23ae9ca9cc31"> 3411952-3411967 </Range>
        <Range chksum="7d501c17f812edb240475aa0f1ee9e15c4c8619df82860b2bd008c20bb58dc6d"> 3414000 </Range>
        <Range chksum="f71e4b72a8f3d4d8d4ef3c045e4a35963b2f5c0e61c9a31d3035c75a834b502c"> 3414016-3414080 </Range>
        <Range chksum="98033bc2b57b7220f131ca9a5daa6172d87ca91ff3831a7e91b153329bf52eb1"> 3414082-3414084 </Range>
        <Range chksum="b5a2b3aaf6dcfba16895e906a5d98d6167408337a3b259013fa3057bafe80918"> 3414087 </Range>
        <Range chksum="81b766d472722dbc326e921a436e16c0d7ad2c061761b2baeb3f6723665997c5"> 3414886-3414890 </Range>
        <Range chksum="2250b13e06cc186d3a6457a16824a3c9686088a1c0866ee4ebe8e36dfe3e17d7"> 3416064 </Range>
        <Range chksum="a82ec47f112bff20b91d748aebc769b2c9003b7eb5851991337586115f31da62"> 3420160 </Range>
        <Range chksum="99dcc3517a44145d057b63675b939baa7faf6b450591ebfccd2269a203cab194"> 3424256 </Range>
        <Range chksum="f5a94286ab129fb41eea23bf4195ca85e425148b151140dfa323405e6c34b01c"> 3426304-3427328 </Range>
        <Range chksum="d261d607d8361715c880e288ccb5b6602af50ce99e13362e8c97bce7425749d2"> 3428352 </Range>
        <Range chksum="d6ff528a483c3a6c581588000d925f38573e7a8fd84f6bb656b19e61cb36bd06"> 3432448 </Range>
        <Range chksum="de2f256064a0af797747c2b97505dc0b9f3df0de4f489eac731c23ae9ca9cc31"> 3439600-3439615 </Range>
    </BlockMap>
</bmap>
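Since the bmap file is plain XML, it is easy to inspect with a few lines of code. The following is a small Python sketch (not part of bmap-tools) that reads the block size, the block count and the ranges, and sums up the number of mapped blocks. The function names parse_bmap and mapped_blocks are made up for this example, and only the elements shown in the file above are assumed to exist.

```python
import xml.etree.ElementTree as ET


def parse_bmap(xml_text):
    """Return (block_size, blocks_count, ranges) from bmap XML text.

    ranges is a list of (first_block, last_block, chksum) tuples.
    """
    root = ET.fromstring(xml_text)
    block_size = int(root.findtext("BlockSize").strip())
    blocks_count = int(root.findtext("BlocksCount").strip())
    ranges = []
    for elem in root.find("BlockMap").findall("Range"):
        # A range is either "first-last" or a single block number.
        first, _, last = elem.text.strip().partition("-")
        ranges.append((int(first), int(last or first), elem.get("chksum")))
    return block_size, blocks_count, ranges


def mapped_blocks(ranges):
    """Count the blocks covered by the ranges (both ends are inclusive)."""
    return sum(last - first + 1 for first, last, _ in ranges)


if __name__ == "__main__":
    # A shortened, made-up bmap with three ranges, for illustration only.
    example = """<bmap version="2.0">
        <BlockSize> 4096 </BlockSize>
        <BlocksCount> 3439616 </BlocksCount>
        <BlockMap>
            <Range chksum="aa"> 0-2 </Range>
            <Range chksum="bb"> 8-452 </Range>
            <Range chksum="cc"> 548864 </Range>
        </BlockMap>
    </bmap>"""
    block_size, blocks_count, ranges = parse_bmap(example)
    mapped = mapped_blocks(ranges)
    print(f"{mapped} of {blocks_count} blocks mapped "
          f"({mapped * block_size} bytes)")
```

For a consistent bmap file, the sum over all ranges should equal the value in <MappedBlocksCount>.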

Once the bmap file is generated, we can copy the file to, e.g., our SD-card (/dev/sdX) using the copy subcommand:

bmaptool copy bigimage /dev/sdX

In my case, this was 6 times faster compared to dd.
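The speedup comes from the fact that only the mapped ranges are read and written. A rough Python sketch of that idea is shown below. It is a simplified illustration, not bmaptool's actual implementation: the function name bmap_copy is made up, the ranges are assumed to be (first_block, last_block, sha256_hex) tuples taken from the <Range> elements, and the checksum is checked after a range has been written.

```python
import hashlib
import os


def bmap_copy(image_path, dest_path, block_size, ranges, chunk=1024 * 1024):
    """Copy only the mapped block ranges from image_path to dest_path.

    ranges is an iterable of (first_block, last_block, sha256_hex) tuples.
    A ValueError is raised if a range does not match its checksum.
    """
    src = os.open(image_path, os.O_RDONLY)
    dst = os.open(dest_path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        for first, last, chksum in ranges:
            offset = first * block_size
            end = (last + 1) * block_size  # ranges are inclusive
            digest = hashlib.sha256()
            while offset < end:
                buf = os.pread(src, min(chunk, end - offset), offset)
                if not buf:
                    break  # the image may end inside its last block
                digest.update(buf)
                done = 0
                while done < len(buf):  # pwrite may write less than asked
                    done += os.pwrite(dst, buf[done:], offset + done)
                offset += len(buf)
            if chksum and digest.hexdigest() != chksum:
                raise ValueError(f"checksum mismatch in blocks {first}-{last}")
    finally:
        os.close(src)
        os.close(dst)
```

Everything outside the listed ranges is never touched, so the time spent is proportional to the mapped data rather than to the full image size.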

Yocto

Yocto supports OpenEmbedded Kickstart (.wks) files to describe how to create a disk image.

A .wks file could look like this:

part u-boot --source rawcopy --sourceparams="file=${IMAGE_BOOTLOADER}"   --no-table    --align ${IMX_BOOT_SEEK}
part /boot  --source bootimg-partition                        --fstype=vfat --label boot --active --align 8192 --size 64
part        --source rootfs                                   --fstype=ext4 --label root    --align 8192 --size 1G
part                                                          --fstype=ext4 --label overlay --align 8192 --size 500M
part /mnt/data                                                --fstype=ext4 --label data --align 8192 --size 10G
bootloader --ptable msdos

The format is rather self-explanatory. Pay some attention to the --size parameters. Here we create several (empty) partitions which are actually quite big.

The resulting image occupies 14088667136 bytes in total, but only 1997888 KiB (about 1.9 GiB) is non-sparse data.

Yocto is able to generate a block map file for the resulting .wic file using bmaptool. Just specify that you want to generate a wic.bmap image among your other fs types:

IMAGE_FSTYPES = "wic.gz wic.bmap"

UUU

The Universal Update Utility (UUU) [1] is an image deployment tool for Freescale/NXP i.MX chips. It allows downloading and executing code via the Serial Download Protocol (SDP) when the chip is booted into manufacturing mode.

The tool lets you flash most of the common types of storage devices, such as NAND, NOR, eMMC and SD-cards.

That is good, but the USB-OTG port that is used for data transfer is only USB 2.0 (Hi-Speed), which limits the transfer speed to 480 Mbit/s.

A rather common case is to write a full disk image to a storage device using the tool. A full disk image contains, as you may have guessed, a full disk: this includes all unpartitioned space, the partition table and all partitions, both those that contain data and any empty ones. In other words, if you have 128 MiB of storage, your disk image will occupy 128 MiB, and 128 MiB of data will be downloaded via SDP.

Support for sparse files could be handy here.

UUU claims to support sparse images through its raw2sparse type, but that is simply a lie: the code that detects continuous chunks of the same value is actually commented out [3] for some unknown reason:

//int type = is_same_value(data, pheader->blk_sz) ? CHUNK_TYPE_FILL : CHUNK_TYPE_RAW;
int type = CHUNK_TYPE_RAW;
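To illustrate what the commented-out line would have achieved, here is a small Python sketch of that kind of detection. It is not UUU's code: the function names only mirror the C snippet above, the 4-byte word size is an assumption, and the chunk types are plain strings for readability. A block that is just one value repeated can be described as a "fill" chunk instead of being transferred as raw data.

```python
CHUNK_TYPE_RAW = "RAW"
CHUNK_TYPE_FILL = "FILL"


def is_same_value(block, word=4):
    """True if block consists of one word-sized value repeated throughout."""
    if not block or len(block) % word != 0:
        return False
    return block == block[:word] * (len(block) // word)


def classify(data, blk_sz):
    """Return the chunk type for each blk_sz-sized block of data."""
    types = []
    for off in range(0, len(data), blk_sz):
        block = data[off:off + blk_sz]
        types.append(CHUNK_TYPE_FILL if is_same_value(block) else CHUNK_TYPE_RAW)
    return types


if __name__ == "__main__":
    zeros = b"\0" * 4096            # an "empty" block
    payload = bytes(range(256)) * 16  # a block with varying content
    print(classify(zeros + payload + zeros, 4096))
```

With the detection disabled, every block is treated as RAW, so all the zero-filled blocks of the empty partitions are sent over USB anyway.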

Pending pull request

Luckily enough, there is a pull request [5] that adds support for using bmap files with UUU!

The PR is currently (2023-12-11) not merged, but I hope it will be soon. I've tested it out and it works like a charm.

EDIT: The PR was merged 2023-12-12, one day after I wrote this blog entry :-)

With this PR you can provide a bmap file to the flash command via the -bmap <image> option in your .uuu script, e.g.:

uuu_version 1.2.39

SDPS: boot -f ../imx-boot

FB: ucmd setenv emmc_dev 2
FB: ucmd setenv mmcdev ${emmc_dev}
FB: ucmd mmc dev ${emmc_dev}
FB: flash -raw2sparse -bmap rawimage.wic.bmap all rawimage.wic
FB: done

It cut the flashing time down to a sixth!

As this is a rather new feature in the UUU tool, I would like to promote it and thank dnbazhenov @GitHub for implementing this.
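That figure is in line with a back-of-the-envelope estimate. The Python sketch below takes the image size, block size and mapped block count from the bmap file above and the ~25 min it took to flash the full image, and assumes (as a simplification) that throughput is constant and that per-range overhead is negligible. The function name estimate is made up for this example.

```python
def estimate(image_bytes, mapped_bytes, full_flash_seconds):
    """Return (throughput_bytes_per_s, mapped_flash_seconds, speedup)."""
    # Effective throughput observed when flashing the whole image.
    throughput = image_bytes / full_flash_seconds
    # Time needed if only the mapped blocks are transferred at that rate.
    mapped_seconds = mapped_bytes / throughput
    return throughput, mapped_seconds, image_bytes / mapped_bytes


if __name__ == "__main__":
    image_bytes = 14088667136     # <ImageSize>
    mapped_bytes = 499471 * 4096  # <MappedBlocksCount> * <BlockSize>
    throughput, seconds, speedup = estimate(image_bytes, mapped_bytes, 25 * 60)
    print(f"throughput: {throughput / 1e6:.1f} MB/s")
    print(f"mapped data only: {seconds / 60:.1f} min")
    print(f"expected speedup: {speedup:.1f}x")
```

Under these assumptions the expected speedup is roughly 6.9 times, which is in the same ballpark as what I measured.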