Lund Linux Conference 2018

Lund Linux Conference 2018

It is just two weeks from now til the Lund Linux Conference (LLC) [1] begins! LLC is a two-day conference with the same layout as the bigger Linux conferences - just smaller, but just as nice.

There will be talks about PCIe, The serial device bus, security in cars and a few more topics. My highlights this year is to hear about the XDP (eXpress Data Path) [2] to get really fast packet processing with standard Linux. For the last six months, XDP has had great progress and is a technically cool feature.

Here we are back in 2017:

/media/lund-linuxcon-2018.jpg

OOM-killer

OOM-killer

When the system is running out of memory, the Out-Of-Memory (OOM) killer picks a process to kill based on the current memory footprint. In case of OOM, we will calculate a badness score between 0 (never kill) and 1000 for each process in the system. The process with the highest score will be killed. A score of 0 is reserved for unkillable tasks such as the global init process (see [1]) or kernel threads (processes with PF_KTHREAD flag set).

/media/oomkiller.jpg

The current score of a given process is exposed in procfs, see /proc/[pid]/oom_score, and may be adjusted by setting /proc/[pid]/oom_score_adj. The value of oom_score_adj is added to the score before it is used to determine which task to kill. The value may be set between OOM_SCORE_ADJ_MIN (-1000) and OOM_SCORE_DJ_MAX (+1000). This is useful if you want to guarantee that a process never is selected by the OOM killer.

The calculation is simple (nowadays), if a task is using all its allowed memory, the badness score will be calculated to 1000. If it is using half of its allowed memory, the badness score is calculated to 500 and so on. By setting oom_score_adj to -1000, the badness score sums up to <=0 and the task will never be killed by OOM.

There is one more thing that affects the calculation; if the process is running with the capability CAP_SYS_ADMIN, it gets a 3% discount, but that is simply it.

The old implementation

Before v2.6.36, the calculation of badness score tried to be smarter, besides looking for the total memory usage (task->mm->total_vm), it also considered: - Whether the process creates a lot of children - Whether the process has been running for a long time, or has used a lot of CPU time - Whether the process has a low nice value - Whether the process is privileged (CAP_SYS_ADMIN or CAP_SYS_RESOURCE set) - Whether the process is making direct hardware access

At first glance, all these criteria looks valid, but if you think about it a bit, there is a lot of pitfalls here which makes the selection not so fair. For example: A process that creates a lot of children and consumes some memory could be a leaky webserver. Another process that fits into the description is your session manager for your desktop environment which naturally creates a lot of child processes.

The new implementation

This heuristic selection has evolved over time, instead of looking on mm->total_vm for each task, the task's RSS (resident set size, [2]) and swap space is used instead. RSS and Swap space gives a better indication of the amount that we will be able to free if we chose this task. The drawback with using mm->total_vm is that it includes overcommitted memory ( see [3] for more information ) which is pages that the process has claimed but has not been physically allocated.

The process is now only counted as privileged if CAP_SYS_ADMIN is set, not CAP_SYS_RESOURCE as before.

The code

The whole implementation of OOM killer is located in mm/oom_kill.c. The function oom_badness() will be called for each task in the system and returns the calculated badness score.

Let's go through the function.

unsigned long oom_badness(struct task_struct *p, struct mem_cgroup *memcg,
              const nodemask_t *nodemask, unsigned long totalpages)
{
    long points;
    long adj;

    if (oom_unkillable_task(p, memcg, nodemask))
        return 0;

Looking for unkillable tasks such as the global init process.

p = find_lock_task_mm(p);
if (!p)
    return 0;

adj = (long)p->signal->oom_score_adj;
if (adj == OOM_SCORE_ADJ_MIN ||
        test_bit(MMF_OOM_SKIP, &p->mm->flags) ||
        in_vfork(p)) {
    task_unlock(p);
    return 0;
}

If proc/[pid]/oom_score_adj is set to OOM_SCORE_ADJ_MIN (-1000), do not even consider this task

points = get_mm_rss(p->mm) + get_mm_counter(p->mm, MM_SWAPENTS) +
    atomic_long_read(&p->mm->nr_ptes) + mm_nr_pmds(p->mm);
task_unlock(p);

Calculate a score based on RSS, pagetables and used swap space

if (has_capability_noaudit(p, CAP_SYS_ADMIN))
    points -= (points * 3) / 100;

If it is root process, give it a 3% discount. We are no mean people after all

adj *= totalpages / 1000;
points += adj;

Normalize and add the oom_score_adj value

return points > 0 ? points : 1;

At last, never return 0 for an eligible task as it is reserved for non killable tasks

}

Conclusion

The OOM logic is quite straightforward and seems to have been stable for a long time (v2.6.36 was released in october 2010). The reason why I was looking at the code was that I did not think the behavior I saw when experimenting corresponds to what was written in the man page for oom_score. It turned out that the manpage was not updated when the new calculation was introduced back in 2010.

I have updated the manpage and it is available in v4.14 of the Linux manpage project [4].

commit 5753354a3af20c8b361ec3d53caf68f7217edf48
Author: Marcus Folkesson <marcus.folkesson@gmail.com>
Date:   Fri Nov 17 13:09:44 2017 +0100

    proc.5: Update description of /proc/<pid>/oom_score

    After Linux 2.6.36, the heuristic calculation of oom_score
    has changed to only consider used memory and CAP_SYS_ADMIN.

    See kernel commit a63d83f427fbce97a6cea0db2e64b0eb8435cd10.

    Signed-off-by: Marcus Folkesson <marcus.folkesson@gmail.com>
    Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>

diff --git a/man5/proc.5 b/man5/proc.5
index 82d4a0646..4e44b8fba 100644
--- a/man5/proc.5
+++ b/man5/proc.5
@@ -1395,7 +1395,9 @@ Since Linux 2.6.36, use of this file is deprecated in favor of
 .IR /proc/[pid]/oom_score_adj .
 .TP
 .IR /proc/[pid]/oom_score " (since Linux 2.6.11)"
-.\" See mm/oom_kill.c::badness() in the 2.6.25 sources
+.\" See mm/oom_kill.c::badness() in pre 2.6.36 sources
+.\" See mm/oom_kill.c::oom_badness() after 2.6.36
+.\" commit a63d83f427fbce97a6cea0db2e64b0eb8435cd10
 This file displays the current score that the kernel gives to
 this process for the purpose of selecting a process
 for the OOM-killer.
@@ -1403,7 +1405,16 @@ A higher score means that the process is more likely to be
 selected by the OOM-killer.
 The basis for this score is the amount of memory used by the process,
 with increases (+) or decreases (\-) for factors including:
-.\" See mm/oom_kill.c::badness() in the 2.6.25 sources
+.\" See mm/oom_kill.c::badness() in pre 2.6.36 sources
+.\" See mm/oom_kill.c::oom_badness() after 2.6.36
+.\" commit a63d83f427fbce97a6cea0db2e64b0eb8435cd10
+.RS
+.IP * 2
+whether the process is privileged (\-);
+.\" More precisely, if it has CAP_SYS_ADMIN or (pre 2.6.36) CAP_SYS_RESOURCE
+.RE
+.IP
+Before kernel 2.6.36 the following factors were also used in the calculation of oom_score:
 .RS
 .IP * 2
 whether the process creates a lot of children using
@@ -1413,10 +1424,7 @@ whether the process creates a lot of children using
 whether the process has been running a long time,
 or has used a lot of CPU time (\-);
 .IP *
-whether the process has a low nice value (i.e., > 0) (+);
-.IP *
-whether the process is privileged (\-); and
-.\" More precisely, if it has CAP_SYS_ADMIN or CAP_SYS_RESOURCE
+whether the process has a low nice value (i.e., > 0) (+); and
 .IP *
 whether the process is making direct hardware access (\-).
 .\" More precisely, if it has CAP_SYS_RAWIO

Embedded Linux course in Linköping

Embedded Linux course in Linköping

I tech in our Embedded Linux course on a regular basis, this time in Linköping. It's a fun course with interesting labs where you will write your own linux device driver for a custom board.

The course itself is quite moderate, but with talented participants, we easily slip over to more interesting things like memory management, the MTD subsystem, how ftrace works internally and how to use it, different contexts, introduce perf and much more.

This time was no exception.

/media/embedded-linux-course.jpg

FIT vs legacy image format

FIT vs legacy image format

U-Boot supports several image formats when booting a kernel. However, a Linux system usually need multiple files for booting. Such files may be the kernel itself, an initrd and a device tree blob.

A typical embedded Linux system have all these files in at least two-three different configurations. It is not uncommon to have

  • Default configuration
  • Rescue configuration
  • Development configuration
  • Production configuration
  • ...

Only these four configurations may involve twelve different files. Or maybe the devicetree is shared between two configurations. Or maybe the initrd is... Or maybe the..... you got the point. It has a farily good chance to end up quite messy.

This is a problem with the old legacy formats and is somehow addressed in the FIT (Flattened Image Tree) format.

Lets first look at the old formats before we get into FIT.

zImage

The most well known format for the Linux kernel is the zImage. The zImage contains a small header followed by a self-extracting code and finally the payload itself. U-Boot support this format by the bootz command.

Image layout

zImage format
Header
Decompressing code
Compressed Data

uImage

When talking about U-Boot, the zImage is usually encapsulated in a file called uImage created with the mkimage utility. Beside image data, the uImage also contains information such as OS type, loader information, compression type and so on. Both data and header is checksumed with CRC32.

The uImage format also supports multiple images. The zImage, initrd and devicetree blob may therefor be included in one single monolithic uImage. Drawbacks with this monolith is that it has no flexible indexing, no hash integrity and no support for security at all.

Image layout

uImage format
Header
Header checksum
Data size
Data load address
Entry point address
Data CRC
OS, CPU
Image type
Compression type
Image name
Image data

FIT

The FIT (Flattened Image Tree) has been around for a while but is for unknown reason rarely used in real systems, based on my experience. FIT is a tree structure like Device Tree (before they change format to YAML: https://lwn.net/Articles/730217/) and handles images of various types.

These images is then used in something called configurations. One image may be a used in several configurations.

This format has many benifits compared to the other:

  • Better solutions for multi component images
    • Multiple kernels (productions, feature, debug, rescue...)
    • Multiple devicetrees
    • Multiple initrds
  • Better hash integrity of images. Supports different hash algorithms like
    • SHA1
    • SHA256
    • MD5
  • Support signed images
    • Only boot verified images
    • Detect malware

Image Tree Source

The file describing the structure is called .its (Image Tree Source). Here is an example of a .its file

/dts-v1/;

/ {
    description = "Marcus FIT test";
    #address-cells = <1>;

    images {
        kernel@1 {
            description = "My default kenel";
            data = /incbin/("./zImage");
            type = "kernel";
            arch = "arm";
            os = "linux";
            compression = "none";
            load = <0x83800000>;
            entry = <0x83800000>;
            hash@1 {
                algo = "md5";
            };
        };

        kernel@2 {
            description = "Rescue image";
            data = /incbin/("./zImage");
            type = "kernel";
            arch = "arm";
            os = "linux";
            compression = "none";
            load = <0x83800000>;
            entry = <0x83800000>;
            hash@1 {
                algo = "crc32";
            };
        };

        fdt@1 {
            description = "FDT for my cool board";
            data = /incbin/("./devicetree.dtb");
            type = "flat_dt";
            arch = "arm";
            compression = "none";
            hash@1 {
                algo = "crc32";
            };
        };


    };

    configurations {
        default = "config@1";

        config@1 {
            description = "Default configuration";
            kernel = "kernel@1";
            fdt = "fdt@1";
        };

        config@2 {
            description = "Rescue configuration";
            kernel = "kernel@2";
            fdt = "fdt@1";
        };

    };
};

This .its file has two kernel images (default and rescue) and one FDT image. Note that the default kernel is hashed with md5 and the other with crc32. It is also possible to use several hash-functions per image.

It also specifies two configuration, Default and Rescue. The two configurations is using different kernels (well, it is the same kernel in this case since both is pointing to ./zImage..) but share the same FDT. It is easy to build up new configurations on demand.

The .its file is then passed to mkimage to generate an itb (Image Tree Blob):

mkimage -f kernel.its kernel.itb

which is bootable from U-Boot.

Look at ./doc/uImage.FIT/ in the U-Boot source code for more examples on how a .its-file can look like. It also contains examples on signed images which is worth a look.

Boot from U-Boot

To boot from U-Boot, use the bootm command and specify the physical address for the FIT image and the configuration you want to boot.

Example on booting config@1 from the .its above:

bootm 0x80800000#config@1

Full example with bootlog:

U-Boot 2015.04 (Oct 05 2017 - 14:25:09)

CPU:   Freescale i.MX6UL rev1.0 at 396 MHz
CPU:   Temperature 42 C
Reset cause: POR
Board: Marcus Cool board
fuse address == 21bc400
serialnr-low : e1fe012a
serialnr-high : 243211d4
       Watchdog enabled
I2C:   ready
DRAM:  512 MiB
Using default environment

In:    serial
Out:   serial
Err:   serial
Net:   FEC1
Boot from USB for mfgtools
Use default environment for                              mfgtools
Run bootcmd_mfg: run mfgtool_args;bootz ${loadaddr} - ${fdt_addr};
Hit any key to stop autoboot:  0
=> bootm 0x80800000#config@1
## Loading kernel from FIT Image at 80800000 ...
   Using 'config@1' configuration
   Trying 'kernel@1' kernel subimage
     Description:  My cool kernel
     Type:         Kernel Image
     Compression:  uncompressed
     Data Start:   0x808000d8
     Data Size:    7175328 Bytes = 6.8 MiB
     Architecture: ARM
     OS:           Linux
     Load Address: 0x83800000
     Entry Point:  0x83800000
     Hash algo:    crc32
     Hash value:   f236d022
   Verifying Hash Integrity ... crc32+ OK
## Loading fdt from FIT Image at 80800000 ...
   Using 'config@1' configuration
   Trying 'fdt@1' fdt subimage
     Description:  FDT for my cool board
     Type:         Flat Device Tree
     Compression:  uncompressed
     Data Start:   0x80ed7e4c
     Data Size:    27122 Bytes = 26.5 KiB
     Architecture: ARM
     Hash algo:    crc32
     Hash value:   1837f127
   Verifying Hash Integrity ... crc32+ OK
   Booting using the fdt blob at 0x80ed7e4c
   Loading Kernel Image ... OK
   Loading Device Tree to 9ef86000, end 9ef8f9f1 ... OK

Starting kernel ...

One thing to think about

Since this FIT image tends to grow in size, it is a good idea to set CONFIG_SYS_BOOTM_LEN in the U-Boot configuration.

- CONFIG_SYS_BOOTM_LEN:
        Normally compressed uImages are limited to an
        uncompressed size of 8 MBytes. If this is not enough,
        you can define CONFIG_SYS_BOOTM_LEN in your board config file
        to adjust this setting to your needs.

Conclusion

I advocate to use the FIT format because it solves many problems that can get really awkward. Just the bonus to program one image in production instead of many is a great benifit. The configurations is well defined and there is no chance that you will end up booting one kernel image with an incompatible devicetree or whatever.

Use FIT images should be part of the bring-up-board activity, not the last thing todo before closing a project. The "Use seperate kernel/dts/initrds for now and then move on to FIT when the platform is stable" does not work. It simple not gonna happend, and it is a mistake.

Magic SysRq

Magic SysRq

Every kernel-hacker should knows about the magic sysrq already, so this post is kind of unnecessary. To enable the magic, make sure CONFIG_MAGIC_SYSRQ is set in your Kernel hacking tab.

I use this feature... a lot. Mostly for set loglevel and reboot the system, but it is also very useful when debugging. So, how to use it?

As everybody is using GNU Screen (what else) as their TTY terminal, the keyboard combination is: ctrl+A B. And here the magic begins! This combination simply sends a SysRq keycode to the target system.

To get information about available commands, press ctrl+A B h. h as in help.:

[669510.910125] SysRq : HELP : loglevel(0-9) reBoot Crash terminate-all-tasks(E) memory-full-oom-kill(F) kill-all-tasks(I) thaw-filesystems(J) saK show-memory-usage(M) nice-all-RT-tasks(N) powerOff show-registers(P) show-all-timers(Q) unRaw Sync show-task-states(T) Unmount show-blocked-tasks(W) dump-ftrace-buffer(Z)
And here we go! My favorites are:
0-9, set debug level
b, reboot the system (ingrained...)
p, show a register dump ( useful if the system hangs)
g, switch console to KGDB (I love KGDB)
1, Show all timers (good when looking for power-thieves)

The best part is that Magic SysRq works in allmost every situation, even if the system is frozen.

For further details see the kernel source file Documentation/sysrq.txt

Take control of your Buffalo Linkstation NAS

Take control of your Buffalo Linkstation NAS

I finally bought a NAS for all of my super-important stuff. It became a Buffalo Linkstation LS200, most because of the price ($300 for 4TB). It supports all of the standard protocols such as FTP, SAMBA, ATP and so on.

However, it would be really useful to use some sane protocols like sftp so you could use rsync for your backup scripts.

Bring a big coffee mug and let the hacking begin....

I knew that the NAS was based on the ARM architecture and supports a whole set of high level protocols, so one qualified guess is that there lives a little penguin in the box.

Lets start with download the latest firmware from the Buffalo webpage. When the firmware is unzipped we have these files:

marcus@tuxie:~/shared/buffalo\$ ls -al
total 780176
drwxr-xr-x  4 marcus marcus      4096 Jun 13 00:06 .
drwxr-xr-x 18 marcus marcus      4096 Jun 12 23:27 ..
-rw-r--r--  1 marcus marcus 190409979 Jun 13 00:06 hddrootfs.img
drwxr-xr-x  2 marcus marcus      4096 Apr  8 13:25 img
-rw-r--r--  1 marcus marcus  12602325 Apr  1 15:25 initrd.img
-rw-r--r--  1 marcus marcus       656 Apr  1 15:25 linkstation_version.ini
-rw-r--r--  1 marcus marcus       198 Apr  1 15:25 linkstation_version.txt
-rw-r--r--  1 marcus marcus 205568610 Jun 12 23:13 LS200_series_FW_1.44.zip
-rw-r--r--  1 marcus marcus    350104 Apr  1 15:25 LSUpdater.exe
-rw-r--r--  1 marcus marcus       327 Apr  1 15:25 LSUpdater.ini
-rw-r--r--  1 marcus marcus    674181 Apr  1 15:25 u-boot.img
-rw-r--r--  1 marcus marcus   2861933 Apr  1 15:25 uImage.img
-rw-r--r--  1 marcus marcus      4880 Apr  8 13:23 update.html

Ok, u-boot.img, uImage.img, initrd.img and hddrootfs.img tells us that I have a black little penguin cage in front of me. First of all, find out what kind of file these *.img files really are.:

marcus@tuxie:~/shared/buffalo$ file ./hddrootfs.img
./hddrootfs.img: Zip archive data, at least v2.0 to extract

Really? It is just an zip-file. Lets extract it then.:

marcus@tuxie:~/shared/buffalo$ unzip hddrootfs.img
Archive:  hddrootfs.img
[hddrootfs.img] hddrootfs.buffalo.updated password:

Of course, it is protected with a password. I will let my old friend John the Ripper take a look at it (I guess I had a great luck, the brute force attack only took 2.5 hours). The password for the file is: aAhvlM1Yp7_2VSm6BhgkmTOrCN1JyE0C5Q6cB3oBB

marcus@tuxie:~/shared/buffalo$ unzip hddrootfs.img
Archive:  hddrootfs.img
[hddrootfs.img] hddrootfs.buffalo.updated password:
  inflating: hddrootfs.buffalo.updated

Terrific. We got a hddrootfs.buffalo.updated file. What is it anyway?:

marcus@tuxie:~/shared/buffalo$ file hddrootfs.buffalo.updated
hddrootfs.buffalo.updated: gzip compressed data, was "rootfs.tar", from Unix, last modified: Tue Apr  1 08:24:05 2014, max compression

It is just a gzip compressed tar archive, couldn't be better! Extract it.:

marcus@tuxie:~/shared/buffalo$ mkdir rootfs
marcus@tuxie:~/shared/buffalo$ tar -xz --numeric-owner -f hddrootfs.buffalo.updated  -C ./rootfs/
marcus@tuxie:~/shared/buffalo$ ls -1l rootfs/
total 80
drwxr-xr-x  2 root root 4096 Apr  1 08:23 bin
-rwxr-xr-x  1 root root 1140 Feb  3 07:51 chroot.sh
drwxr-xr-x  2 root root 4096 Apr  1 08:23 debugtool
drwxr-xr-x  5 root root 4096 Apr  1 08:23 dev
drwxr-xr-x 33 root root 4096 Jun 12 23:17 etc
drwxr-xr-x  4 root root 4096 Apr  1 08:23 home
drwxr-xr-x  9 root root 4096 Apr  1 08:23 lib
drwxr-xr-x  3 root root 4096 Feb  3 07:51 mnt
drwxr-xr-x  2 root root 4096 Apr  1 07:35 opt
-rwxr-xr-x  1 root root 2741 Feb  3 07:51 prepare.sh
drwxr-xr-x  2 root root 4096 Apr  1 07:35 proc
drwxr-xr-x  3 root root 4096 Apr  1 08:23 root
drwxr-xr-x  3 root root 4096 Apr  1 08:22 run
drwxr-xr-x  2 root root 4096 Apr  1 08:23 sbin
drwxr-xr-x  2 root root 4096 Apr  1 07:35 sys
-rwxr-xr-x  1 root root 3751 Feb  3 07:51 test.sh
drwxrwxrwt  3 root root 4096 Apr  1 08:23 tmp
drwxr-xr-x 11 root root 4096 Apr  1 08:23 usr
drwxr-xr-x  9 root root 4096 Apr  1 08:22 var
drwxrwxrwx  6 root root 4096 Apr  1 07:53 www

Here we go!

Modify the root filesystem

First of all, I really would like to have SSH access to the box, and I found that there is a SSH daemon in here (/usr/bin/sshd), but why is it not activated? Take a look in one of the scripts that seems to be related to ssh:

marcus@tuxie:~/shared/buffalo/rootfs$ head -n 20 etc/init.d/sshd.sh
#!/bin/sh
[ -f /etc/nas_feature ] && . /etc/nas_feature
SSHD_DSA=/etc/ssh_host_dsa_key
SSHD_RSA=/etc/ssh_host_rsa_key
SSHD_KEY=/etc/ssh_host_key
SSHD=`which sshd`
if [ "${SSHD}" = "" -o ! -x ${SSHD} ] ; then
 echo "sshd is not supported on this platform!!!"
fi
if [ "${SUPPORT_SFTP}" = "0" ] ; then
        echo "Not support sftp on this model." > /dev/console
        exit 0
fi
umask 000

What about the second if-statement? SUPPORT_SFTP? And what is the /etc/nas_feature file? It does not exist in the package. Is it auto generated at boot? Anyway, I remove the second statement, it seems evil. So, if this starts up the ssh daemon, we would like to login as root, uncomment PermitRootLogin in sshd_config:

marcus@tuxie:~/shared/buffalo/rootfs$ sed -i 's/#PermitRootLogin/ PermitRootLogin/' etc/sshd_config

Then copy your public rsa-key to /root/.ssh/authorized_keys. If you don't have a key, generate it with:

marcus@tuxie:~/shared/buffalo/rootfs$ ssh-keygen

Copy the key to target:

marcus@tuxie:~/shared/buffalo/rootfs$ mkdir ./root/.ssh
marcus@tuxie:~/shared/buffalo/rootfs$ cat ~/.ssh/id_rsa.pub > ./root/.ssh/authorized_keys

This will let you to login without give any password.

Lets try to re-pack the whole thing.:

marcus@tuxie:~/shared/buffalo/rootfs$ mv ../hddrootfs.buffalo.updated{,.old}
marcus@tuxie:~/shared/buffalo/rootfs$ tar -czf ../hddrootfs.buffalo.updated *
marcus@tuxie:~/shared/buffalo/rootfs$ cd ..
marcus@tuxie:~/shared/buffalo$ zip -e hddrootfs.img hddrootfs.buffalo.updated

I encrypt the file with the same password as before, I dare not think about what happens if I don't.

Time to update firmware

I execute the LSUpdater.exe from a virtual Windows machine and hold my thumbs... The update process takes about 8 minutes and is a real pain, would it brick my NAS..?

After a while the power LED is indicating that the NAS is up and running. Wow. Quick! Do a portscan!:

marcus@tuxie:~/shared/buffalo$ sudo nmap -sS 10.0.0.4
[sudo] password for marcus:
Starting Nmap 5.21 ( http://nmap.org ) at 2014-06-13 11:59 CEST
Nmap scan report for nas (10.0.0.4)
Host is up (0.00049s latency).
Not shown: 990 closed ports
PORT      STATE SERVICE
21/tcp    open  ftp
22/tcp    open  ssh
80/tcp    open  http
139/tcp   open  netbios-ssn
443/tcp   open  https
445/tcp   open  microsoft-ds
548/tcp   open  afp
873/tcp   open  rsync
8873/tcp  open  unknown
22939/tcp open  unknown
MAC Address: DC:FB:02:EB:06:A8 (Unknown)
Nmap done: 1 IP address (1 host up) scanned in 0.27 seconds

And there it is! The SSH daemon is running on port 22.:

marcus@tuxie:~$ ssh admin@10.0.0.4
admin@10.0.0.4's password:
[admin@LS220D6A8 ~]$ ls /
bin/        debugtool/  home/       mnt/        proc/       sbin/       tmp/        www/
boot/       dev/        lib/        opt/        root/       sys/        usr/
chroot.sh*  etc/        lost+found/ prepare.sh* run/        test.sh*    var/
[admin@LS220D6A8 ~]$

It is just beautiful!

Wait, what about the /etc/nas_features file?

[admin@LS220D6A8 ~]$ cat /etc/nas_feature
DEFAULT_LANG=english
DEFAULT_CODEPAGE=CP437
REGION_CODE=EU
PRODUCT_CAPACITY="040"
PID=0x0000300D
SERIES_NAME="LinkStation"
PRODUCT_SERIES="LS200"
PRODUCT_NAME="LS220D(SANJO)"
SUPPORT_NTFS_WRITE=on
NTFS_DRIVER="tuxera"
SUPPORT_DIRECT_COPY=on
SUPPORT_RAID=on
SUPPORT_RAID_DEGRADE=off
SUPPORT_FAN=on
SUPPORT_AUTOIP=on
SUPPORT_NEW_DISK_AUTO_REBUILD=off
SUPPORT_2STEP_INSPECTION=off
SUPPORT_RESYNC_DELAY=off
SUPPORT_PRINTER_SERVER=on
SUPPORT_ITUNES_SERVER=on
SUPPORT_DLNA_SERVER=on
SUPPORT_NAS_FIREWALL=off
SUPPORT_IPV6=off
SUPPORT_DHCPS=off
SUPPORT_UPNP=off
SUPPORT_SLIDE_POWER_SWITCH=on
SUPPORT_BITTORRENT=on
BITTORRENT_CLIENT="transmission"
SUPPORT_USER_QUOTA=on
SUPPORT_GROUP_QUOTA=on
SUPPORT_ACL=on
SUPPORT_TIME_MACHINE=on
SUPPORT_SLEEP_TIMER=on
SUPPORT_AD_NT_DOMAIN=on
SUPPORT_RAID0=1
SUPPORT_RAID1=1
SUPPORT_RAID5=0
SUPPORT_RAID6=0
SUPPORT_RAID10=0
SUPPORT_RAID50=0
SUPPORT_RAID60=0
SUPPORT_RAID51=0
SUPPORT_RAID61=0
SUPPORT_NORAID=0
SUPPORT_RAID_REBUILD=1
SUPPORT_AUTH_EXTERNAL=1
SUPPORT_SAMBA_DFS=0
SUPPORT_LINKDEREC_ANALOG=0
SUPPORT_LINKDEREC_DIGITAL=0
SUPPORT_WEBAXS=1
SUPPORT_UPS_SERIAL=0
SUPPORT_UPS_USB=0
SUPPORT_UPS_RECOVER=0
SUPPORT_NUT=0
SUPPORT_SYSLOG_FORWARD=0
SUPPORT_SYSLOG_DOWNLOAD=0
SUPPORT_SHUTDOWN_FROMWEB=0
SUPPORT_REBOOT_FROMWEB=1
SUPPORT_IMHERE=0
SUPPORT_POWER_INTERLOCK=1
SUPPORT_SMTP_AUTH=1
NEED_MICONMON=off
ROOTFS_FS=ext3
USERLAND_FS=xfs
NASFEAT_VM_WRITEBACK=default
NASFEAT_VM_EXPIRE=default
MAX_DISK_NUM=2
MAX_USBDISK_NUM=1
MAX_ARRAY_NUM=1
DEV_BOOT=md0
DEV_ROOTFS1=md1
DEV_SWAP1=md2
SDK_VERSION=2
DEVICE_NETWORK_PRIMARY=eth1
DEVICE_NETWORK_SECONDARY=
DEVICE_NETWORK_NUM=1
DEVICE_HDD1_LINK=disk1_6
DEVICE_HDD2_LINK=disk2_6
DEVICE_HDD3_LINK=disk3_6
DEVICE_HDD4_LINK=disk4_6
DEVICE_HDD5_LINK=disk5_6
DEVICE_HDD6_LINK=disk6_6
DEVICE_HDD7_LINK=disk7_6
DEVICE_HDD8_LINK=disk8_6
DEVICE_HDD_BASE_EDP=md100
DEVICE_HDD1_EDP=md101
DEVICE_HDD2_EDP=md102
DEVICE_HDD3_EDP=md103
DEVICE_HDD4_EDP=md104
DEVICE_HDD5_EDP=md105
DEVICE_HDD6_EDP=md106
DEVICE_HDD7_EDP=md107
DEVICE_HDD8_EDP=md108
DEVICE_MD1_REAL=md10
DEVICE_MD2_REAL=md20
DEVICE_MD3_REAL=md30
DEVICE_MD4_REAL=md40
DEVICE_USB1_LINK=usbdisk1_1
DEVICE_USB2_LINK=usbdisk2_1
DEVICE_USB3_LINK=usbdisk1_5
DEVICE_USB4_LINK=usbdisk2_5
MOUNT_GLOBAL=/mnt
MOUNT_LVM_BASE=/mnt/lvm
MOUNT_HDD1=/mnt/disk1
MOUNT_HDD2=/mnt/disk2
MOUNT_HDD3=/mnt/disk3
MOUNT_HDD4=/mnt/disk4
MOUNT_HDD5=/mnt/disk5
MOUNT_HDD6=/mnt/disk6
MOUNT_HDD7=/mnt/disk7
MOUNT_HDD8=/mnt/disk8
MOUNT_ARRAY1=/mnt/array1
MOUNT_ARRAY2=/mnt/array2
MOUNT_ARRAY3=/mnt/array3
MOUNT_ARRAY4=/mnt/array4
MOUNT_USB1=/mnt/usbdisk1
MOUNT_USB2=/mnt/usbdisk2
MOUNT_USB3=/mnt/usbdisk3
MOUNT_USB4=/mnt/usbdisk4
MOUNT_USB5=/mnt/usbdisk5
MOUNT_MC_BASE=/mnt/mediacartridge
SUPPORT_MC_VER=1
SUPPORT_INTERNAL_DISK_APPEND=0
STORAGE_TYPE=HDD
BODY_COLOR=NORMAL
SUPPORT_MICON=0
SUPPORT_LCD=0
SUPPORT_USER_QUOTA_SOFT=0
SUPPORT_GROUP_QUOTA_SOFT=0
SUPPORT_NFS=0
SUPPORT_LVM=0
SUPPORT_OFFLINEFILE=0
SUPPORT_HIDDEN_SHARE=0
SUPPORT_HOT_SWAP=0
SUPPORT_LCD_LED=0
SUPPORT_ALERT=0
SUPPORT_PORT_TRUNKING=0
SUPPORT_REPLICATION=0
SUPPORT_USER_GROUP_CSV=0
SUPPORT_SFTP=0
SUPPORT_SERVICE_MAPPING=0
SUPPORT_SSLKEY_IMPORT=1
SUPPORT_SLEEPTIMER_DATE=0
SUPPORT_TERA_SEARCH=0
SUPPORT_SECURE_BOOT=0
SUPPORT_PACKAGE_UPDATE=0
SUPPORT_HDD_SPINDOWN=0
SUPPORT_DISK_ENCRYPT=0
SUPPORT_FTPS=1
SUPPORT_CLEANUP_ALL_TRASHBOX=0
SUPPORT_WAKEUP_BY_REBOOT=1
SUPPORT_DTCP_IP=0
SUPPORT_MYSQL=0
SUPPORT_APACHE=0
SUPPORT_PHP=0
SUPPORT_UPS_STANDBY=0
SUPPORT_HIDDEN_RAID_MENU=0
SUPPORT_ISCSI=0
SUPPORT_ISCSI_TYPE=
DEFAULT_WORKINGMODE=
MAX_LVM_VOLUME_NUM=0
MAX_ISCSI_VOLUME_NUM=0
INTERNAL_SCSI_TYPE=multi-host
SUPPORT_ELIMINATE_ADLIMIT=
USB_TREE_TYPE=
SUPPORT_WOL=0
WOL_TYPE=
SUPPORT_HARDLINK_BACKUP=0
SUPPORT_SNMP=0
SUPPORT_EDP=1
SUPPORT_POCKETU=0
SUPPORT_MC=0
SUPPORT_FOFB=0
SUPPORT_EDP_PLUS=0
SUPPORT_INIT_SW=1
SUPPORT_SQUEEZEBOX=0
SUPPORT_OL_UPDATE=1
SUPPORT_AMAZONS3=0
SUPPORT_SURVEILLANCE=0
SUPPORT_WAFS=0
SUPPORT_SUGARSYNC=0
SUPPORT_INTERNAL_DISK_REMOVE=1
SUPPORT_SETTING_RECOVERY_USB=0
SUPPORT_PASSWORD_RECOVERY_USB=0
SUPPORT_AV=
SUPPORT_TUNEUP_RAID=on
SUPPORT_FLICKRFS=0
SUPPORT_WOL_INT=1
TUNE=0
SUPPORT_EDP_PLUS=0
SUPPORT_SXUPTP=0
SUPPORT_SQUEEZEBOX=0
SUPPORT_EYEFI=0
SUPPORT_OL_UPDATE=1
SUPPORT_INIT_SW=1
SUPPORT_USB=1
SUPPORT_WAFS=0
SUPPORT_INFO_LED=1
SUPPORT_ALARM_LED=1
POWER_SWITCH_TYPE=none
SUPPORT_FUNC_SW=1
DLNA_SERVER="twonky"
SUPPORT_TRANSCODER=0
SUPPORT_LAYOUT_SWITCH=1
SUPPORT_UTILITY_DOWNLOAD=1
SUPPORT_BT_CLOUD=0
DEFAULT_VALUE_DLNA=1
DEFAULT_VALUE_BT_CLOUD=0
SUPPORT_EXCLUSION_LED_POWER_INFO_ERROR=1
DEFAULT_DLNA_SERVICE=off
SUPPORT_MOBILE_WEBUI=1
SUPPORT_SHUTDOWN_DEPEND_ON_SW=1
SUPPORT_SPARE_DISK=0

It seems that the files is generated. It also has the SUPPORT_SFTP config that we saw in sshd.sh.

What about the kernel

In the current vanilla kernel, there is devicetrees that seems to be for the Buffalo linkstation.:

marcus@tuxie:~/marcus/linux/linux$ grep -i buffalo arch/arm/boot/dts/*.dts
arch/arm/boot/dts/kirkwood-lschlv2.dts: model = "Buffalo Linkstation LS-CHLv2";
arch/arm/boot/dts/kirkwood-lschlv2.dts: compatible = "buffalo,lschlv2", "buffalo,lsxl", "marvell,kirkwood-88f6281", "marvell,kirkwood";
arch/arm/boot/dts/kirkwood-lsxhl.dts:   model = "Buffalo Linkstation LS-XHL";
arch/arm/boot/dts/kirkwood-lsxhl.dts:   compatible = "buffalo,lsxhl", "buffalo,lsxl", "marvell,kirkwood-88f6281", "marvell,kirkwood";

The device tree seems to match the current CPU:

[admin@LS220D6A8 ~]$ cat /proc/cpuinfo
Processor : Marvell PJ4Bv7 Processor rev 1 (v7l)
BogoMIPS : 795.44
Features : swp half thumb fastmult vfp edsp vfpv3 vfpv3d16 tls
CPU implementer : 0x56
CPU architecture: 7
CPU variant : 0x1
CPU part : 0x581
CPU revision : 1
Hardware : Marvell Armada-370
Revision : 0000
Serial  : 0000000000000000

Also, the kernel is not tainted indicating that there is no out-of-tree modules.:

[admin@LS220D6A8 ~]$ cat /proc/sys/kernel/tainted
0

It could therefor be possible to compile a custom kernel with support for more USB-devices that may be plugged into the NAS.

Other tips

The LSUpdater.exe will refuse to update if the same version of software is already on the target. This means that you cannot upload the same firmware again... unless you change the version!

Together with the uploader application, there is a linkstation_version.ini file that contains information about each of the *.img. The first thing I tried was just to increase the VERSION by one. This makes the LSUpdater.exe move a little forward, It stops complain about the same version, instead it complains about that this firmware is already on the target. However, I needed to change the timestamp of each binary (increased the day by one), then it updates the firmware without any problem.

marcus@tuxie:~/shared/buffalo$ cat linkstation_version.ini
[COMMON]
VERSION=1.44-0.36
BOOT=0.20
KERNEL=2014/04/01 14:33:46
INITRD=2014/04/01 14:35:10
ROOTFS=2014/04/01 15:23:41
FILE_BOOT = u-boot.img
FILE_KERNEL = uImage.img
FILE_INITRD = initrd.img
FILE_ROOTFS = hddrootfs.img

[TARGET_INFO1]
PID=0x0000001D
FILE_KERNEL=uImage.img
KERNEL=2014/04/01 14:33:46
FILE_BOOT_APPLY=u-boot.img
BOOT=0.20
BOOT_UP_CMD=""
[TARGET_INFO2]
PID=0x0000300D
FILE_KERNEL=uImage.img
KERNEL=2014/04/01 14:33:46
FILE_BOOT_APPLY=u-boot.img
BOOT=0.20
BOOT_UP_CMD=""
[TARGET_INFO3]
PID=0x0000300E
FILE_KERNEL=uImage.img
KERNEL=2014/04/01 14:33:46
FILE_BOOT_APPLY=u-boot.img
BOOT=0.20
BOOT_UP_CMD=""

NAT with Linux

NAT with Linux

To share an internet connection may sometimes be very practical when working with embedded devices. The network may have restrictions/authentications that stops you from plug in your device into the network of the big company you are working for.

But what about creating your own network and use your computer as NAT (Network Address Translation)? I was surprised how easy it is to set up your Linux host as a NAT, it is just a few command lines.

OK, here is the setup on Host: - eth0 has ip address 192.168.1.50 and is connected to the company network - eth1 has ip address 10.2.234.1 ans is connected to the target

Setup on Target: - eth0 has ip address 10.2.234.100 ans is connected to host

First of all, we need to setup a default gateway on our target, do this as you allways do - with route.:

Target$ route add default gw 10.2.234.1 eth0

Next, we need to create a post-routing rule in the to the NAT table that masquerades all traffic to the eth0 interface. iptables is your friend:

Host$ sudo iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

Thats it! Well, allmost. We just need to enable ip-forwarding.:

Host$ echo 1 | sudo tee /proc/sys/net/ipv4/ip_forward

-ENOENT, but believe me, it's there

-ENOENT, but believe me, it's there

Almost every ELF-file in a Linux environment is dynamically linked, and the operating system has to locate all dynamic libraries in order to execute the file. To its help, it has the runtime dynamic linker, whose only job is to interpret the ELF file format, load the shared objects with unresolved references, and at last, execute and pass over the control to the ELF file.

The expected path to the dynamic linker is hardcoded into the ELF itself, so if the linker is missing, it leads to an obvious but maybe confusing error.

If I want to run an executable that has its path to a dynamic linker that simple does not exists, we got this:

root@cgtqmx6:/# ./01_SimpleTriangle -sh: ./01_SimpleTriangle: No such file or directory

Hue? But the file does exist! OK, lets find out which system calls is made.:

root@cgtqmx6:/# ./01_SimpleTriangle -sh: ./01_SimpleTriangle: No such file or directoryroot@cgtqmx6:/# strace ./01_SimpleTriangle execve("./01_SimpleTriangle", ["./01_SimpleTriangle"], [/* 15 vars */]) = -1 ENOENT (No such file or directory)write(2, "strace: exec: No such fil40strace: exec: No such file or directory) = 40exit_group(1)                           = ?+++ exited with 1 +++

Not much valuable information. Strace calls execve that says the file does not exist. We are talking about the file, but which file is it? As told before, the runtime dynamic linker is always executed first to load libraries for unresolved symbols. So one qualified guess is that it is the dynamic linker that is missing.

Readelf is a handy tool to read out information from an ELF file such as sections, which architecture it is compiled for and so on. It also shows us the expected path to the dynamic linker.:

root@cgtqmx6:/# readelf -l 01_SimpleTriangle Elf file type is EXEC (Executable file)Entry point 0x9288There are 8 program headers, starting at offset 52
Program Headers:  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align  EXIDX          0x0022ec 0x0000a2ec 0x0000a2ec 0x00088 0x00088 R   0x4  PHDR           0x000034 0x00008034 0x00008034 0x00100 0x00100 R E 0x4  INTERP         0x000134 0x00008134 0x00008134 0x00013 0x00013 R   0x1    >>>>>>  [Requesting program interpreter: /lib/ld-linux.so.3] <<<<<<<  LOAD           0x000000 0x00008000 0x00008000 0x02378 0x02378 R E 0x8000         0x002378 0x00012378 0x00012378 0x002a0 0x00358 RW  0x8000  DYNAMIC        0x002384 0x00012384 0x00012384 0x00118 0x00118 RW  0x4  NOTE           0x000148 0x00008148 0x00008148 0x00020 0x00020 R   0x4  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4
 Section to Segment mapping:  Segment Sections...   00     .ARM.exidx    01        02     .interp    03     .interp .note.ABI-tag .hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .ARM.extab .ARM.exidx .eh_frame    04     .init_array .fini_array .jcr .dynamic .got .data .bss    05     .dynamic    06     .note.ABI-tag    07

Here we go! The expected path is /lib/ld-linux.so.3. Our system is currently using /lib/ld-linux-armhf.so.3, so lets create a symbolic link for now.:

ln -s /lib/ld-linux-armhf.so.3 /lib/ld-linux.so.3

And finally, the program is now executing!:

root@cgtqmx6:/# ./01_SimpleTriangle root@cgtqmx6:/#

Anyway, who decide which runtime dynamic linker to use? It is decided in the linker-stage, usually by ld or gcc and may vary between different toolchains,

Modules with parameters

Modules with parameters

Everybody knows that modules can take parameters, either via /sys/modules/<module>/parameters or via cmdline to the kernel, but how are these parameters created?

Parameters without callbacks

The Linux kernel provides the module_param() macro. The syntax is:

module_param(name, type, perm)

Which will simply create the module parameter and expose it as an entry in /sys/modules/<module>/parameters.

Code example

int debug_flag;
module_param(debug_flag, bool, S_IRUSR | S_IWUSR | S_IRGRP)
MODULE_PARM_DESC(debug_flag, "Set to 1 if debug should be enabled, 0 otherwise");

MODULE_PARM_DESC() is a short description of the parameter. Modinfo will read the description and present it for you.

Parameters with callbacks

Sometimes it may be useful to actually notify the driver that the value of a parameter has changed, which not the regular module_param() macro does.

module_param_cb is the way to go. The macro takes two callbacks functions, set and get, that is called when the user (or kernel if in cmdline) interact with the parameters. This is done by passing a struct kernel_param_ops to the macro. The syntax is:

module_param_cb(name, ops, arg, perm)
The module_param_cb is not heavily used in the kernel if we look in the drivers:
[06:40:35]marcus@tuxie:~/kernel$ git grep module_param_cb drivers/
drivers/acpi/sysfs.c:module_param_cb(debug_layer, &param_ops_debug_layer, &acpi_dbg_layer, 0644);
drivers/acpi/sysfs.c:module_param_cb(debug_level, &param_ops_debug_level, &acpi_dbg_level, 0644);
drivers/char/ipmi/ipmi_watchdog.c:module_param_cb(action, &param_ops_str, action_op, 0644);
drivers/char/ipmi/ipmi_watchdog.c:module_param_cb(preaction, &param_ops_str, preaction_op, 0644);
drivers/char/ipmi/ipmi_watchdog.c:module_param_cb(preop, &param_ops_str, preop_op, 0644);

In fact, there are just 5 entries, don't ask me why, I think the macro is terrific. The interface is really simple, just fill the kernel_param_ops struct and pass it to the module_param_cb macro. I think the code is quite self-explained, so I just post an example taken from drivers/acpi/sysfs.c.

Code example

static int param_get_debug_level(char *buffer, const struct kernel_param *kp)
{
 int result = 0;
 int i;
 result = sprintf(buffer, "%-25s\tHex        SET\n", "Description");
 for (i = 0; i < ARRAY_SIZE(acpi_debug_levels); i++) {
  result += sprintf(buffer + result, "%-25s\t0x%08lX [%c]\n",
      acpi_debug_levels[i].name,
      acpi_debug_levels[i].value,
      (acpi_dbg_level & acpi_debug_levels[i].value)
      ? '*' : ' ');
 }
 result +=
     sprintf(buffer + result, "--\ndebug_level = 0x%08X (* = enabled)\n",
      acpi_dbg_level);
 return result;
}

static struct kernel_param_ops param_ops_debug_level = {
 .set = param_set_uint,
 .get = param_get_debug_level,
};
module_param_cb(debug_level, &param_ops_debug_level, &acpi_dbg_level, 0644);

There is also a set of standard set/get-functions (the code above use param_set_uint for example). These are called param_(set|get)_XXX where XXX is byte, short, int, long and so on.

Take a look in include/linux/moduleparam.h for further reading!

Terminate a hanging SSH session

Terminate a hanging SSH session

It may be very frustrating when SSH sessions just hangs because the target is power cycling or something. Lucky for you there is a "secret" escape sequence that allows you to terminate the session (and a few other things).

The escape sequence is <enter>~X where X is a command letter. To see all available key sequences, type <enter>~?. Example output:

marcus@Ilos:~$ ~?
Supported escape sequences:
  ~.  - terminate connection (and any multiplexed sessions)
  ~B  - send a BREAK to the remote system
  ~C  - open a command line
  ~R  - Request rekey (SSH protocol 2 only)
  ~^Z - suspend ssh
  ~#  - list forwarded connections
  ~&  - background ssh (when waiting for connections to terminate)
  ~?  - this message
  ~~  - send the escape character by typing it twice
(Note that escapes are only recognized immediately after newline.)

<enter>~. is my favorit. It terminates the connection and keep your mood cheerful.