ToBe MyOwnKernel

OS from Scratch

This article describes the steps to build a bare-metal OS environment, for ARM and x86, from scratch, without any underlying OS.

It covers the topics involved in software development, from hardware bring-up and toolchain and environment tuning to several key OS concepts.

There are two versions of this tutorial, one covering the x86 architecture and one covering the ARM architecture.

Preparation
Development host

This tutorial requires a Linux development host, such as:

Linux Mint http://www.linuxmint.com
Ubuntu http://www.ubuntu.com
Ubuntu (Mate Flavor) https://ubuntu-mate.org
Debian https://www.debian.org
Fedora https://getfedora.org/en/

Microsoft Windows users may run this Linux system in a virtual machine. Popular virtualization environments include:

Oracle VirtualBox https://www.virtualbox.org/
VMware Player https://www.vmware.com/products/player/

This article assumes Linux Mint running in a VirtualBox virtual machine; however, any other combination should be usable with minimal changes.

Environment settings

The following (optional) settings are recommended:

sudo su
# So that sudo is not asking for password
echo "user ALL = NOPASSWD: ALL" >> /etc/sudoers
exit

# For VirtualBox users only, allows user "user" to access host shared directories
sudo groupadd vboxsf
sudo adduser `id -un` vboxsf

Install some tools:

# Geany editor / IDE
sudo apt-get install geany
sudo apt-get install gtk+-2.0 intltool
wget http://download.geany.org/geany-1.26.tar.bz2
wget http://plugins.geany.org/geany-plugins/geany-plugins-1.26.tar.bz2
tar xvjf geany-1.*.tar.bz2
cd geany-1*
./configure --prefix=/home/bertrand/local/geany
make
make install
cd ../geany-plugins*/debugger
./configure --prefix=/home/bertrand/local/geany
make
sudo make install
wget http://mirrors.kernel.org/ubuntu/pool/main/v/vte/libvte-dev_0.28.2-5ubuntu1_amd64.deb
sudo dpkg -i libvte-dev_0.28.2-5ubuntu1_amd64.deb
./configure --prefix=/home/bertrand/local/geany
make
sudo make install
cp /usr/local/lib/geany/debugger.so /home/bertrand/local/geany/lib/geany/


# Meld diff/merge utility
sudo apt-get install meld

On Linux Mint, to set Google as the default search engine, follow this link:
https://addons.mozilla.org/en-US/firefox/addon/google-default/?src=search

When the development tree must be portable, a filesystem can be created inside a file:

# To create a 1GB portable file system (that can be stored on a USB key)
dd if=/dev/zero of=/media/usbkey/linux.ext4 bs=1024 count=1M
mkfs.ext4 -F /media/usbkey/linux.ext4
mkdir ~/dev
sudo mount -o loop,rw /media/usbkey/linux.ext4 ~/dev
Installing a toolchain

Bare-metal development requires a cross-toolchain to build software. It includes the compiler (C, C++), system libraries ('libc') and binary utilities (assembler, ...) targeted at bare-metal software development.

Note: The toolchain must be configured with the correct components (architecture, library, ABI, ...), so even if compilation host and target share the same CPU architecture (typically x86), a cross-toolchain is nonetheless required (the default OS 'gcc' toolchain also includes OS-specific components and libraries).

Compiler Naming Convention
Compilers are named after a "target triplet" which, despite its name, can have up to four elements: Arch-Vendor-OS-Environment.
For ARM, we use arm-none-eabi-gcc: Arch=arm, Vendor=n.a., OS=none (i.e. bare metal) and Env=eabi (ARM Embedded Application Binary Interface).
For x86, we use i686-elf-gcc, whereas the native compiler reports x86_64-linux-gnu when queried with 'gcc -dumpmachine'.
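
The naming scheme can be explored with a small helper. The sketch below is only an illustration: triplet_split, TRIPLET_MAX_FIELDS and TRIPLET_FIELD_LEN are hypothetical names that belong to no toolchain, and the code merely splits a name on '-' without trying to decide which field is the vendor, the OS or the environment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TRIPLET_MAX_FIELDS 4   /* Arch-Vendor-OS-Environment */
#define TRIPLET_FIELD_LEN  32  /* hypothetical limit, including the final '\0' */

/* Split a target name on '-'. Returns the number of fields found, or -1
 * when there are more than four fields or a field is empty or too long. */
int triplet_split(const char *triplet,
                  char fields[TRIPLET_MAX_FIELDS][TRIPLET_FIELD_LEN])
{
    int count = 0;
    const char *p = triplet;

    assert(triplet != NULL);
    for (;;) {
        const char *dash = strchr(p, '-');
        size_t len = dash ? (size_t)(dash - p) : strlen(p);

        if (count == TRIPLET_MAX_FIELDS || len == 0 || len >= TRIPLET_FIELD_LEN)
            return -1;
        memcpy(fields[count], p, len);
        fields[count][len] = '\0';
        count++;
        if (dash == NULL)
            return count;
        p = dash + 1;
    }
}

/* Example: the compiler names used in this article have 2 or 3 fields. */
int triplet_example(void)
{
    char f[TRIPLET_MAX_FIELDS][TRIPLET_FIELD_LEN];

    assert(triplet_split("arm-none-eabi", f) == 3);   /* arm, none, eabi */
    assert(strcmp(f[0], "arm") == 0 && strcmp(f[2], "eabi") == 0);
    assert(triplet_split("i686-elf", f) == 2);        /* i686, elf */
    return triplet_split("x86_64-linux-gnu", f);      /* x86_64, linux, gnu */
}
```

Calling triplet_example() from a host program exercises the three names quoted above; the field count alone shows that not every "triplet" carries all four elements.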

More details about toolchain configuration can be found at osdev.org.

Toolchains can be downloaded from:

OSDev.org prebuilt toolchains http://wiki.osdev.org/GCC_Cross-Compiler#Prebuilt_Toolchains
GCC ARM Embedded project https://launchpad.net/gcc-arm-embedded
Mentor Sourcery CodeBench http://www.mentor.com/embedded-software/sourcery-tools/sourcery-codebench/editions/lite-edition
GCC (GNU Compiler Collection), sources for x86 and ARM https://gcc.gnu.org/
Clang/LLVM compiler, sources for x86 and ARM http://clang.llvm.org/

Note: Installing a toolchain from sources is beyond the scope of this article, so we will use pre-built GCC.

The GCC toolchain will be used (Clang should be compatible). We need one version for ARM and two for x86, to cover the 32-bit and 64-bit versions of the architecture (GCC can be configured to support both through the -m32 and -m64 options, but this is not always enabled). Here is how to install pre-compiled toolchains for both ARM and x86:

# Download toolchains
wget https://launchpad.net/gcc-arm-embedded/5.0/5-2015-q4-major/+download/gcc-arm-none-eabi-5_2-2015q4-20151219-linux.tar.bz2
wget http://newos.org/toolchains/i686-elf-5.2.0-Linux-x86_64.tar.xz
wget http://newos.org/toolchains/x86_64-elf-5.2.0-Linux-x86_64.tar.xz
# On x86-64 you need 32-bits libs for launchpad toolchain, so enter either:
sudo apt-get install ia32-libs
# or just: sudo apt-get install libc6:i386 libncurses5:i386
# For a x86 host (32-bit), that repository can be used:
# git clone https://github.com/rm-hull/barebones-toolchain.git
# GCC 6.3 ARM toolchain
# wget https://releases.linaro.org/components/toolchain/binaries/latest/arm-eabi/gcc-linaro-6.3.1-2017.05-x86_64_arm-eabi.tar.xz

# Install toolchains
cd ~/local
tar xvjf ~/gcc-arm-none-eabi-*.tar.bz2
tar xvJf ~/i686-elf-*.tar.xz
tar xvJf ~/x86_64-elf-*.tar.xz
( mv gcc-arm-none-eabi* ~/local/gcc; cp -R -n i686-elf-*/* x86_64-elf-*/* ~/local/gcc; rm -rf i686-elf-* x86_64-elf-*; sudo chown -R root:root ~/local/gcc )

# Add this to .bashrc
echo 'export PATH=$PATH:~/local/gcc/bin' >> ~/.bashrc 

# Check the toolchain
arm-none-eabi-gcc --version
arm-none-eabi-gcc (GNU Tools for ARM Embedded Processors) 4.9.3 20150303 (release) [ARM/embedded-4_9-branch revision 221220]
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

i686-elf-gcc --version
i686-elf-gcc (GCC) 4.9.2
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

x86_64-elf-gcc --version
x86_64-elf-gcc (GCC) 4.9.1
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

x86_64-elf-gcc -dumpmachine; gcc -dumpmachine
x86_64-elf
x86_64-linux-gnu
CPU architecture
ARM architecture
x86 architecture

The Intel IA-32 architecture defines 16 registers:

  • Development platforms
    ARM VersatilePB QEMU model

    Platform description

    The ARM Versatile Platform Baseboard is a development system for the ARM926EJ-S CPU. It includes several peripherals for application prototyping:
    • ARM926EJ-S CPU based on the ARMv5 architecture (with MMU)
    • an ARM Color LCD Controller (CLCDC) for display
    • 3x UART for input / output
    • a TransDimension OTG243 USB2 controller
    • a Xilinx Spartan PCI32 controller
    • an SMSC LAN91C111 Fast Ethernet network interface

    This board is well supported by the QEMU emulator. QEMU allows software for the board to be tested without the real hardware, and it also provides extra debug capabilities.

    Note: This tutorial is only intended to run on the QEMU emulated VersatilePB model; the real hardware will require additional system initialisation.

    VersatilePB booting

    A JTAG debugger is used to initialize the real hardware and to program the flash memory. Once done, the ARM CPU directly executes the code stored in flash.

    Embedded systems like VersatilePB generally execute a bootloader (that can reside in ROM, flash or both) that initializes the hardware (SDRAM, ...) before loading the OS from an available storage location (local filesystem, network, ...).

    When using QEMU, a kernel image (in raw or ELF format) is passed to the emulator and executed. Software on VersatilePB should normally start at address 0x10000 (the GCC linker needs to be configured to use the correct address). However, when loading an ELF file, QEMU automatically uses the correct entry point.

    Raspberry Pi Computer

    Platform description

    The Raspberry Pi is a low-cost, credit-card-sized computer board based on an ARM SoC. It includes:
    • Single-core 700MHz ARM1176JZF-S CPU (ARMv6) for Raspberry Pi
    • Quad-core 900MHz ARM Cortex-A7 CPU (ARMv7) for Raspberry Pi 2
    • Broadcom VideoCore IV GPU with 1080p support via HDMI
    • 1x to 4x USB ports
    • 1x Fast Ethernet port (Models B/B+ only)
    • Several peripherals (I2C, SPI, GPIO, ...)

    RaspberryPi booting

    The Raspberry Pi SoC (BCM2835/6) boots in multiple stages from an SD card:

    In order to boot another kernel, the binary image can simply be added to the SD card.

    Intel x86

    Platform description

    The x86 platform is an architecture that evolved from the Intel 8086, released in 1978. This is the CPU family used in the IBM PC, and the architecture evolved in parallel with PC improvements as well as with its main OSes (MS-DOS, Windows). Several other vendors, such as AMD, have also produced x86 CPUs. The major evolutions of the x86 architecture are:
    • Intel 8086: The original architecture, 16-bit only (family also includes 8088, 80186, 80286).
    • i386, i686: i386 adds 32-bit support and an MMU (note: 80286 also had an MMU), i486 adds L1 cache and FPU on chip, and the Pentium family adds instruction extensions (MMX, SSE, ...) and SMP multi-core support.
    • AMD Athlon 64, Intel Core 2: introduce 64-bit extensions.

    x86 software can be run on any PC. For ease of development, a PC can boot from a USB key or CD image (and even from a floppy if available and if the image is small enough). QEMU also has extensive support for the x86 architecture.

    Note: only recent evolutions of the x86 architecture will be covered in this tutorial: either i686 or x86-64.

    CPU Modes

    x86 architecture can run in several modes:

    Kernel described here will run in Protected mode (for 32-bit version) or Long mode (for 64-bit variant)

    Startup Sequence

    Master Boot Record, 1 sector (512 bytes):
      Offset  Content
      0       Bootstrap Code (446 bytes)
      446     Partition #1
      462     Partition #2
      478     Partition #3
      494     Partition #4
      510     55h AAh
    Each partition entry holds: Flag, (CHS) Start, Type, (CHS) End, (LBA) Start, (LBA) Size.

    FAT Partition: FAT Boot Sector, 1 sector (512 bytes), then FAT1, FAT2, Root Dir (FAT12/16) and Clusters (which hold the Root Dir on FAT32).
    FAT Boot Sector:
      Offset  Content
      11-12   bytesPerSect
      13      sectPerClust
      14-15   reservedSect
      16      nbFAT
      17-18   nbRootEntries (FAT12/16 only)
      19-20   nbSect
      21      mediaType
      22-23   sectPerFAT (FAT12/16 only)
      24-25   sectPerTrack
      26-27   nbHeads
      28-31   hiddenSect
      32-35   nbSect
      36-39   sectPerFAT (FAT32 only)
      44-47   rootDirClust (FAT32 only)
      90      Bootstrap Code (420 bytes)
      510     55h AAh
    FAT entries chain the clusters of a file: next, next, ..., FFFF.
    Root Directory Table entry (example: FILE.EXT):
      Offset  Content
      0-7     File
      8-10    Ext
      11      Attr
      20-21   1stClust High (FAT32 only)
      26-27   1stClust Low
      28-31   Size

    Memory map:
      Conventional Memory Area (640 kB)
        0 0000h   IVT, Interrupt Vector Table (1 kB)
        0 0400h   BDA, Bios Data Area (256 B)
        0 0500h   RAM (Usable)
        0 7C00h   MBR Sector, copied by BIOS (512 B)
        0 7E00h   RAM (Usable)
        8 0000h   RAM (Usable), if available (>512k RAM)
        9 FC00h   EBDA, Extended Bios Data Area (1 kB)
      Upper Memory Area (384 kB)
        A 0000h   Video RAM
        C 0000h   Video BIOS
        C 8000h   Reserved
        F 0000h   BIOS ROM
      Extended Memory Area (not available in Real Mode)
        10 0000h  HMA, High Memory Area

    Boot sequence:
      Hardware: Power On Reset, Devices
      ROM, Real Mode: BIOS / UEFI system initialization @(FFFF:0000)
      RAM, Real Mode: Bootloader @(0000:7C00), then Stage 2/3 bootloader
      Privileged (Ring 0), Protected/Long Mode: Operating System (Kernel, Device Drivers)
      User (Ring 3), Protected/Long Mode: Applications (Userspace programs)
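
The boot addresses above are written as segment:offset pairs, which is how Real Mode addresses memory. As a rough sketch (real_mode_linear is a hypothetical helper name, and the code assumes the A20 line is disabled so that addresses wrap at 1 MiB), the conversion to a linear address could look like this:

```c
#include <assert.h>
#include <stdint.h>

/* Real-mode address translation: linear = segment * 16 + offset.
 * Assumption: the A20 line is disabled, so the result wraps at 1 MiB
 * like on the original 8086 (20 address lines). */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    uint32_t linear = ((uint32_t)segment << 4) + offset;

    return linear & 0xFFFFFu;
}

/* The two addresses named in the boot sequence.
 * Returns the linear address of the boot sector. */
uint32_t real_mode_example(void)
{
    assert(real_mode_linear(0xFFFF, 0x0000) == 0xFFFF0u); /* BIOS entry point */
    assert(real_mode_linear(0x07C0, 0x0000) == 0x07C00u); /* same byte as 0000:7C00 */
    return real_mode_linear(0x0000, 0x7C00);
}
```

Note that several segment:offset pairs designate the same byte: 0000:7C00 and 07C0:0000 both resolve to linear address 0x7C00.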
    1 sector = 512 bytes
    1 track (= [Cyl, Head]) = x sectors (1..63, typ 63)
    heads (tracks/cylinder) = 0..255, typ 255
    ..1024 cylinders
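
With this geometry, a (cylinder, head, sector) address maps to a linear sector number (LBA). The following is a minimal sketch of the usual conversion formula; the struct and function names are hypothetical, and the geometry values are passed in by the caller rather than read from a real disk.

```c
#include <assert.h>
#include <stdint.h>

/* Disk geometry. Sectors are numbered from 1, heads and cylinders from 0. */
struct chs_geometry {
    uint32_t heads_per_cylinder;  /* typically 255 */
    uint32_t sectors_per_track;   /* typically 63 */
};

/* LBA = (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1) */
uint32_t chs_to_lba(const struct chs_geometry *g,
                    uint32_t cylinder, uint32_t head, uint32_t sector)
{
    assert(sector >= 1 && sector <= g->sectors_per_track);
    assert(head < g->heads_per_cylinder);
    return (cylinder * g->heads_per_cylinder + head) * g->sectors_per_track
           + (sector - 1);
}

/* Example with the typical 255 heads / 63 sectors geometry.
 * Returns the LBA of the first sector of cylinder 1. */
uint32_t chs_example(void)
{
    const struct chs_geometry g = { 255, 63 };

    assert(chs_to_lba(&g, 0, 0, 1) == 0);   /* very first sector (the MBR) */
    assert(chs_to_lba(&g, 0, 0, 2) == 1);   /* sector right after the MBR */
    assert(chs_to_lba(&g, 0, 1, 1) == 63);  /* first sector of the next head */
    return chs_to_lba(&g, 1, 0, 1);
}
```

Because sectors start at 1 while LBA starts at 0, CHS=0,0,2 is LBA 1, the sector that immediately follows the MBR.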
    
    //FLG CHS_START         TYPE  CHS_STOP          LBA_START               LBA_LEN
    0x80, 0x00, 0x02, 0x00, 0x0B, 0x00, 0x1f, 0x00, 0x01, 0x00, 0x00, 0x00, 0x1d, 0x00, 0x00, 0x00
    BOOT  CHS=0,0,2         FAT32 CHS=0,0,31        1                       30 (30*255
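
To make the byte layout concrete, here is a hedged sketch that decodes one 16-byte partition entry. The type and function names (mbr_partition, mbr_decode_partition, ...) are hypothetical; the field positions follow the FLG / CHS_START / TYPE / CHS_STOP / LBA_START / LBA_LEN order shown above, with LBA values stored in little-endian order.

```c
#include <assert.h>
#include <stdint.h>

struct chs_addr {
    uint16_t cylinder;
    uint8_t  head;
    uint8_t  sector;
};

struct mbr_partition {
    int             bootable;   /* flag byte is 0x80 */
    uint8_t         type;       /* partition type, e.g. 0x0B */
    struct chs_addr chs_start;
    struct chs_addr chs_end;
    uint32_t        lba_start;
    uint32_t        lba_size;
};

/* CHS is packed on 3 bytes: head, sector (bits 0-5) plus the two
 * high bits of the cylinder (bits 6-7), then the cylinder low byte. */
static struct chs_addr decode_chs(const uint8_t *p)
{
    struct chs_addr a;

    a.head = p[0];
    a.sector = (uint8_t)(p[1] & 0x3F);
    a.cylinder = (uint16_t)(((p[1] & 0xC0) << 2) | p[2]);
    return a;
}

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

struct mbr_partition mbr_decode_partition(const uint8_t raw[16])
{
    struct mbr_partition part;

    assert(raw != 0);
    part.bootable  = (raw[0] == 0x80);
    part.chs_start = decode_chs(&raw[1]);
    part.type      = raw[4];
    part.chs_end   = decode_chs(&raw[5]);
    part.lba_start = read_le32(&raw[8]);
    part.lba_size  = read_le32(&raw[12]);
    return part;
}

/* Decode the entry listed above. Returns the LBA size field (0x1d). */
uint32_t mbr_example(void)
{
    static const uint8_t raw[16] = {
        0x80, 0x00, 0x02, 0x00, 0x0B, 0x00, 0x1f, 0x00,
        0x01, 0x00, 0x00, 0x00, 0x1d, 0x00, 0x00, 0x00
    };
    struct mbr_partition part = mbr_decode_partition(raw);

    assert(part.bootable && part.type == 0x0B);
    assert(part.chs_start.head == 0 && part.chs_start.sector == 2);
    assert(part.chs_end.cylinder == 0 && part.chs_end.sector == 31);
    assert(part.lba_start == 1);
    return part.lba_size;
}
```

The four entries of a real MBR would be decoded by calling mbr_decode_partition on the 16-byte slices at offsets 446, 462, 478 and 494 of the sector.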
    
    Hello World (ASM)
    Building the software

    The code

    The Assembly code for our application is given below:

     arch-arm/HelloWorld.S
            .file "HelloWorld.S"
            .global _start
            .text
            .code 32
    
            @ Register  definition, for ARM VersatilePB with PL011 UART
            .equ    UART0_BASE, 0x101f1000
            .equ    UARTDR,     0x0
    
    _start:
            @ Display 'str' to the UART:
            ldr     r0, =(str-1)      @ R0 is string pointer
            ldr     r1, =UART0_BASE   @ R1 is UART base register
    1:      ldrb    r2, [r0, #1]!     @ Get next character
            cmp     r2, #0            @ ... until end of string
            beq     .                 @ We are done.
            str     r2, [r1, #UARTDR] @ Print the character to UART
            b       1b                @ Loop back
    
    str:    .asciz  "Hello world!\n"
    
            .end
    

    This simple code copies each byte of the string to the UART Data Register. Note: the ARM CPU starts in 32-bit ARM mode, hence the .code 32 directive.

     arch-x86/HelloWorld.S
            .file "HelloWorld.S"
            .global _start
    
            .text
            .arch i386
            .code16
            .org 0
    
    _start:
            # Clear screen
            movw $0x0003,%ax       # BIOS 10h Function 00h Mode 3 (Set Video Mode)
            int $0x10              # Call BIOS Interrupt 10h
    
            # Display string
            xorw %ax, %ax          # The BIOS reads the string at ES:BP,
            movw %ax, %es          # ... so make sure ES is 0
            movw $0x1301,%ax       # BIOS 10h Function 13h Mode 1 (Write String)
            movw $0x000f, %bx      # Page 0 (BH), White (0xF) character on Black (0) background (BL)
            movw $len, %cx         # String size
            movw $0x0c22, %dx      # Screen position x=34 (0x22), y=12 (0x0c)
            movw $str, %bp         # Message pointer
            int $0x10              # Call BIOS Interrupt 10h
    
            jmp   .
    
    str:    .ascii  "Hello world!"
    len=    . - str
    
            # Insert Magic Number at the end of MBR so that BIOS boots this volume.
            .org 510
            .byte 0x55, 0xaa
    
            .end
    

    This simple code clears the screen and then displays the string using a BIOS function. Note: the x86 BIOS boots from the MBR sector of the selected drive (the first 512 bytes of that drive). This sector is loaded at address 0x7c00 and executed. The last 2 bytes of a valid boot sector must be 0x55, 0xAA.
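
The signature rule is easy to check from a host tool before writing an image to a disk. The sketch below is one possible way to do it; boot_sector_is_valid and BOOT_SECTOR_SIZE are hypothetical names, and the function only tests the two magic bytes, not the boot code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BOOT_SECTOR_SIZE 512u

/* Returns 1 when the buffer holds at least one sector whose bytes
 * 510 and 511 are 0x55 and 0xAA, 0 otherwise. */
int boot_sector_is_valid(const uint8_t *sector, size_t len)
{
    if (sector == NULL || len < BOOT_SECTOR_SIZE)
        return 0;
    return sector[510] == 0x55 && sector[511] == 0xAA;
}

/* Example: an all-zero sector is rejected until the magic number is added. */
int boot_sector_example(void)
{
    uint8_t sector[BOOT_SECTOR_SIZE] = { 0 };

    assert(!boot_sector_is_valid(sector, sizeof sector));
    sector[510] = 0x55;
    sector[511] = 0xAA;
    return boot_sector_is_valid(sector, sizeof sector);
}
```

In the assembly listing, the same two bytes are produced by the .org 510 and .byte 0x55, 0xaa directives.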

    Building process

    It is now time to build the software. The assembler and linker are run separately to detail each step of the build process; note, however, that the GCC compiler driver can invoke the linker automatically.

    Build steps are:

    # Assemble HelloWorld program (ARM)
    arm-none-eabi-as -mcpu=arm926ej-s -o HelloWorld-arm.o arch-arm/HelloWorld.S
    
    # Link program at address 0x10000 (VersatilePB start address)
    arm-none-eabi-ld -Ttext=0x10000 -o HelloWorld-arm.elf HelloWorld-arm.o
    
    # Generate raw binary file
    arm-none-eabi-objcopy -O binary HelloWorld-arm.elf HelloWorld-arm.raw
    
    # Assemble HelloWorld program (x86)
    i686-elf-as -o HelloWorld-x86.o arch-x86/HelloWorld.S
    
    # Link program at address 0x7c00 (Boot code location in RAM)
    i686-elf-ld -Ttext=0x7c00 -o HelloWorld-x86.elf HelloWorld-x86.o
    
    # Generate raw binary file
    i686-elf-objcopy -O binary HelloWorld-x86.elf HelloWorld-x86.raw
    

    Closer look at the objects

    Several utilities help to analyse the output files:

    Executing the program

    Preparation

    ARM VersatilePB images will be emulated with QEMU.

    Qemu is an emulator that can execute code from a different architecture in a virtual machine. It is available at http://www.qemu.org.

    To install Qemu:

    sudo apt-get install qemu-system

    To compile Qemu:

    # Retrieve Qemu sources
    wget https://download.qemu.org/qemu-6.2.0.tar.xz
    tar xvf qemu-6.2.0.tar.xz
    cd qemu-*
    
    # Install tools required to build Qemu
    sudo apt-get -o Acquire::ForceIPv4=true install g++ autoconf automake libtool flex bison python3-pip
    pip3 install ninja
    export PATH=$PATH:$HOME/.local/bin
    # ... or add this to rc file
    
    # Mandatory libs
    sudo apt-get install libglib2.0-dev libfdt-dev libpixman-1-dev zlib1g-dev
    
    # Important libs
    sudo apt-get install libcap-ng-dev libsdl2-dev
    
    # Other libs
    sudo apt-get install libaio-dev libbluetooth-dev libbrlapi-dev libbz2-dev \
     libcap-dev libcap-ng-dev libcurl4-gnutls-dev libgtk-3-dev \
     libibverbs-dev libjpeg8-dev libncurses5-dev libnuma-dev librbd-dev librdmacm-dev \
     libsasl2-dev libseccomp-dev libsnappy-dev libssh2-1-dev \
     libvde-dev libvdeplug-dev libvte-2.91-dev libxen-dev liblzo2-dev valgrind xfslibs-dev \
     libnfs-dev libiscsi-dev libxml2-dev libsdl2-image-dev
    
    # Build and install Qemu
    ./configure --prefix=$HOME/.local --target-list="arm-softmmu x86_64-softmmu i386-softmmu"
    make -j4
    make install
    
    # Older versions
    wget http://wiki.qemu-project.org/download/qemu-2.3.0.tar.bz2
    sudo apt-get -o Acquire::ForceIPv4=true install g++ zlib1g-dev autoconf automake libtool libcap-ng-dev flex bison libglib2.0-dev libsdl-dev
    tar xvjf qemu-*.tar.bz2 
    cd qemu-*
    ./configure --prefix=$HOME/dev/qemu --target-list="arm-softmmu x86_64-softmmu i386-softmmu"
    make
    make install

    Run the program

    Qemu comes with different emulation options:

    qemu: utility that runs a userspace application inside a virtual machine.
    qemu-arm: does the same, but for an application compiled for the ARM architecture.
    qemu-system: runs a full native system (so that you can boot a full OS on the virtual host).
    qemu-system-arm: finally, runs a full system on a different (emulated) architecture.

    In our scope, we will use qemu-system-arm:

    # Emulate the ARM program, <CTRL-A> X to exit
    QEMU_AUDIO_DRV=none qemu-system-arm -M versatilepb -m 128M -nographic -kernel HelloWorld-arm.raw
    Hello world!
    QEMU: Terminated

    # Emulate the x86 program (the message appears in the QEMU display window)
    qemu-system-i386 -drive file=HelloWorld-x86.raw,format=raw
    

    To terminate emulation, press <CTRL-A> then X

    About the options:

    -M versatilepb: defines the target machine (use -M ? for a list of supported targets)
    -m 128M: defines the memory size
    -nographic: disables the video output, and provides the serial interface on the host stdio
    -kernel HelloWorld-arm.raw: defines the application to run (Note: QEMU could also load the ELF file)
    QEMU_AUDIO_DRV=none: disables the audio interface (avoiding a warning message on some systems)

    Note: many other options are available. Use qemu-system-arm --help for a list.

    Hello World (C)
    Objective

    Run the same application, but written in C. Also enable debug functionality (GDB).

    Building the software

    The code

    The C code for our application is given below:

     HelloWorld.c
    /*
     * HelloWorld.c - Print out Hello World on Arm system
     */
    
    // Define UART Data Register (For Qemu ARM VersatilePB target) 
    volatile unsigned int * const Uart_DR = (unsigned int *)0x101f1000;
    
    // Main C function: Just print HelloWorld
    void _start() {
      char *s = "Hello world!\n";
    
      // Copy the string to UART Data Register
      while (*s != '\0')
        *Uart_DR = (unsigned int)(*s++);
    
      while (1) ;
    }
    

    This code is a simple example that lacks some important initialisation needed to run more complex C programs.

    Create the Makefile

    Let's now automate the build process using the following Makefile:

     Makefile
    # Makefile - Compile HelloWorld C program
    
    # Settings for Cross-Compiler
    # ---------------------------------------------------------------------------
    PREFIX ?= arm-none-eabi-
    CC = $(PREFIX)gcc
    AS = $(PREFIX)as
    LD = $(PREFIX)ld
    
    CFLAGS  += -mcpu=arm926ej-s -Os
    LDFLAGS += -Ttext=0x10000 -nostartfiles
    
    # Define targets
    # ---------------------------------------------------------------------------
    
    # default target
    all: HelloWorld
    
    # To clean all generated file
    clean:
      $(RM) HelloWorld *.o *.elf *.raw *.map *~
    
    # To start Qemu with "make qemu" (and re-build as necessary)
    qemu: HelloWorld
      @-echo "*** Running qemu *** <CTRL-a> x to quit"
      QEMU_AUDIO_DRV=none qemu-system-arm -M versatilepb -m 128M -nographic -kernel $?
    
    .PHONY: clean qemu
    

    The makefile automates the compilation process using rules:

    This makefile relies on an implicit rule to compile a C file from source such as:

    %: %.c
      $(CC) $(CFLAGS) $(LDFLAGS) -o $@ $^
    

    Building (make)

    The C file is now compiled using the command 'make'.

    # Build the HelloWorld application
    make
    arm-none-eabi-gcc -mcpu=arm926ej-s -Os  -Ttext=0x10000 -nostartfiles  HelloWorld.c   -o HelloWorld
    
    # To start Qemu
    make qemu
    *** Running qemu *** <CTRL-a> x to quit
    QEMU_AUDIO_DRV=none qemu-system-arm -M versatilepb -m 128M -nographic -kernel HelloWorld
    Hello world!
    QEMU: Terminated
    Debugging the Software

    QEMU CPU execution traces

    QEMU can trace CPU execution state to a log file to help debugging code execution.

    It can be helpful to instruct the linker to output a map file listing the symbols: use the '-Map file.map' linker option (or '-Wl,-Map,file.map' when invoking LD from GCC) and optionally '--cref' for a cross reference.

    make CFLAGS="-Wl,-Map,HelloWorld.map -Wl,--cref -mcpu=arm926ej-s -Os"
    arm-none-eabi-gcc -Wl,-Map,HelloWorld.map -Wl,--cref -mcpu=arm926ej-s -Os  -Ttext=0x10000 -nostartfiles  HelloWorld.c   -o HelloWorld
    
    cat HelloWorld.map

    Execution trace logs are generated when QEMU is invoked with the options -d in_asm,cpu -D file.log -singlestep.

    QEMU_AUDIO_DRV=none qemu-system-arm -M versatilepb -m 128M -nographic -kernel HelloWorld -d in_asm,cpu -D HelloWorld.log -singlestep
    Hello world!
      *** Press <CTRL-A> x ***
    cat HelloWorld.log
    R00=00000000 R01=00000000 R02=00000000 R03=00000000
    R04=00000000 R05=00000000 R06=00000000 R07=00000000
    R08=00000000 R09=00000000 R10=00000000 R11=00000000
    R12=00000000 R13=00000000 R14=00000000 R15=00010000
    PSR=400001d3 -Z-- A svc32
    ----------------
    IN: _start
    0x00010000:  e59f2018      ldr  r2, [pc, #24]   ; 0x10020
    
    R00=00000000 R01=00000000 R02=0001002b R03=00000000
    R04=00000000 R05=00000000 R06=00000000 R07=00000000
    R08=00000000 R09=00000000 R10=00000000 R11=00000000
    R12=00000000 R13=00000000 R14=00000000 R15=00010004
    PSR=400001d3 -Z-- A svc32
    ----------------
    IN: _start
    0x00010004:  e5f23001      ldrb r3, [r2, #1]!
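
The PSR value in each trace record packs the condition flags, the interrupt masks and the CPU mode. As an illustration only (cpsr_fields, cpsr_decode and cpsr_mode_name are hypothetical names, not QEMU functions), the value 400001d3 from the log above can be decoded on the host like this:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct cpsr_fields {
    int n, z, c, v;       /* condition flags, bits 31..28 */
    int irq_disabled;     /* I bit, bit 7 */
    int fiq_disabled;     /* F bit, bit 6 */
    int thumb;            /* T bit, bit 5 */
    unsigned mode;        /* mode field, bits 4..0 */
};

struct cpsr_fields cpsr_decode(uint32_t psr)
{
    struct cpsr_fields f;

    f.n = (psr >> 31) & 1;
    f.z = (psr >> 30) & 1;
    f.c = (psr >> 29) & 1;
    f.v = (psr >> 28) & 1;
    f.irq_disabled = (psr >> 7) & 1;
    f.fiq_disabled = (psr >> 6) & 1;
    f.thumb = (psr >> 5) & 1;
    f.mode = psr & 0x1Fu;
    return f;
}

/* Short names of the classic ARM processor modes. */
const char *cpsr_mode_name(unsigned mode)
{
    switch (mode) {
    case 0x10: return "usr";
    case 0x11: return "fiq";
    case 0x12: return "irq";
    case 0x13: return "svc";
    case 0x17: return "abt";
    case 0x1B: return "und";
    case 0x1F: return "sys";
    default:   return "unknown";
    }
}

/* Decode the PSR value shown in the trace. Returns the mode field. */
unsigned cpsr_example(void)
{
    struct cpsr_fields f = cpsr_decode(0x400001d3u);

    assert(!f.n && f.z && !f.c && !f.v);                   /* "-Z--" */
    assert(f.irq_disabled && f.fiq_disabled && !f.thumb);  /* IRQ/FIQ masked, ARM state */
    assert(strcmp(cpsr_mode_name(f.mode), "svc") == 0);    /* "svc32" */
    return f.mode;
}
```

The decoded fields match the textual summary QEMU prints next to the value: only the Z flag is set, the CPU executes ARM (not Thumb) instructions, and it runs in Supervisor mode.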
    

    Debugging with GDB

    QEMU integrates a GDB server that allows execution and debugging via an external debugger.

    To install GDB and DDD:

    sudo apt-get install gdb-multiarch ddd

    You must compile the code with the -ggdb or -gstabs option to allow symbolic debugging.

    # Allowing symbolic debug (in Makefile)
    CFLAGS   += -ggdb
    CXXFLAGS += -ggdb
    ASFLAGS  += -ggdb
    

    Run QEMU with the following options: -s (listen for a GDB connection on the default TCP port, 1234) and -S (do not start the CPU at startup).

    QEMU_AUDIO_DRV=none qemu-system-arm -M versatilepb -m 128M -kernel kernel.raw -nographic -S -s

    Run the ddd/gdb debugger (Note: DDD automatically starts the GDB client). Optionally provide an initialisation script.

     ddd --debugger gdb-multiarch --command gdb.init
    (gdb) set arch arm
    The target architecture is assumed to be arm
    (gdb) symbol-file kernel.elf
    (gdb) target remote localhost:1234
    (gdb) break c_entry
    (gdb) cont
    

    Some gdb commands

    Create breakpoint: break at c_entry, or at given address

    (gdb) break c_entry
    (gdb) b *0x7c1d

    Run until breakpoint is reached

    (gdb) cont

    Single-step instructions: run 10 instructions

    (gdb) stepi 10

    Dump memory: Dump 10 words (hex) at address 0x10000

    (gdb) x /10xw 0x10000

    Set variable: set PC to 0x10000

    (gdb) set variable $pc = 0x10000

    Disassemble: show the raw bytes and instructions between 0x8000 and 0x8010

    (gdb) disas /r 0x8000,0x8010
    Hello World (C++)
    Objective

    Create another version of the HelloWorld application this time in C++. Also review initialisation requirements and linker configuration.

    C++ code, simple version

    The source

     HelloWorld.cpp
    /*
     * HelloWorld.cpp
     *
     * Print out Hello World on Arm system
     */
    
    // -- Declarations ------------------------------------------------------------
    
    #define UART_BASE 0x101f1000
    #define UART_DR   0
    
    int main();
    
    class Uart { // Uart class declaration
      volatile unsigned int * const regs;
    public:
      Uart(); // Constructor
      friend Uart& operator<< (Uart &uart, const char * str); // print operator
    };
    
    // -- Functions ---------------------------------------------------------------
    
    // Program entry: setup the stack and calls main function, must be first
    extern "C" void __attribute__((naked)) _start() {
      asm("ldr sp, =0x20000");
      main();
      while (1) ;
    }
    
    // Uart Constructor
    Uart::Uart() : regs((unsigned int *)UART_BASE) {}
    
    // Copy a string to UART Data Register
    Uart& operator<< (Uart &uart, const char * str) {
      while(*str != '\0')
        uart.regs[UART_DR] = (unsigned int)(*str++);
      return uart;
    }
    
    // Main function: Print string to Uart
    int main() {
      Uart uart;
      uart << "Hello World!\n";
      return 0;
    }
    

    Makefile

    C++ options should be added to the Makefile:

    This code must use the ARM 32 bit instruction set (.code 32) since ARM always starts in ARM mode

    i686-elf-g++ -c kernel.c++ -o kernel.o -ffreestanding -O2 -Wall -Wextra -fno-exceptions -fno-rtti

     Makefile
    CXX = $(PREFIX)g++
    CXXFLAGS = -mcpu=$(CPU) -Os -fno-rtti -fno-exceptions -fno-use-cxa-atexit

    The C++ flags ensure that GCC generates code that does not rely upon any standard library features.

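    Compiler options are:

    -Os: optimises the code for size
    -fno-rtti: does not generate run-time type information
    -fno-exceptions: disables exception handling support
    -fno-use-cxa-atexit: does not register the destructors of static objects through __cxa_atexit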

    Entry point and low level initialisation

    A C/C++ program usually starts at the main function, but some prior initialisation is needed when running from scratch. At the bare minimum, the stack pointer must be set.

    This is done in the _start function (the default entry point for the GCC linker). Note that this function is declared as 'naked' since it cannot be treated as a standard C function.
    That initialisation will be extended in the next chapters.

    More comprehensive C/C++ environment

    Initialisation

    We will now extend the initialisation code and isolate it in specific files (after all, this is why we use a Makefile). The initialisation will include the following:

    More info on GCC initialization can be found here.

    The startup code

    Startup code typically goes to an assembly file:

     startup.s (Assembly startup file)
    /*
     * startup.s - Startup code for ARM
     */
    
    .global _reset_handler
    
      /* Code section */
      .text
      .code 32
    
    /* Startup code. Do basic system initialisation.*/
    _reset_handler:
      /* Setup stack pointer */
      LDR sp, =__stack_top
    
      /* Enable L1 cache */
      mov r0, #0
      mcr p15, 0, r0, c7, c7, 0 ;@ invalidate caches
      mcr p15, 0, r0, c8, c7, 0 ;@ invalidate tlb
      mrc p15, 0, r0, c1, c0, 0
      orr r0,r0,#0x1000 ;@ instruction
      orr r0,r0,#0x0004 ;@ data
      mcr p15, 0, r0, c1, c0, 0
    
      /* Clear ZI area (BSS) */
      LDR r1, =__bss_start
      LDR r2, =__bss_end
      MOV r3, #0
    1:
      CMP r1, r2
      STMLTIA r1!, {r3}
      BLT 1b
    
      /* C++ init (call static constructors) */
      BL do_init_array
    
      /* Branch to C code */
      BL main
    
      /* C++ finalisation (call static destructors) */
      BL do_fini_array
    
      /* Done */
      B .
    
      .end
    
     HelloWorld.ld (Linker script)
      .text : {
        *(.text)
        *(.text*)
    
        PROVIDE (_init = .);
        *crti.o(.init)
        *(.init)
        *crtn.o(.init)
    
        PROVIDE (_fini = .);
        *crti.o(.fini)
        *(.fini)
        *crtn.o(.fini)
      }
    
      .preinit_array : {
        PROVIDE_HIDDEN (__preinit_array_start = .);
        KEEP (*(SORT(.preinit_array.*)))
        KEEP (*(.preinit_array*))
        PROVIDE_HIDDEN (__preinit_array_end = .);
      }
    
      .init_array : {
        PROVIDE_HIDDEN (__init_array_start = .);
        KEEP (*(SORT(.init_array.*)))
        KEEP (*(.init_array*))
        PROVIDE_HIDDEN (__init_array_end = .);
      }
    
      .fini_array : {
        PROVIDE_HIDDEN (__fini_array_start = .);
        KEEP (*(.fini_array*))
        KEEP (*(SORT(.fini_array.*)))
        PROVIDE_HIDDEN (__fini_array_end = .);
      }
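
The startup code calls do_init_array, whose body is not listed here. A plausible sketch, under the assumption that it simply walks the table delimited by __init_array_start and __init_array_end, is given below. run_init_range and init_func_t are hypothetical names, and the table is simulated with a local array so that the logic can be tried on the host.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*init_func_t)(void);

/* Call every function pointer in [start, end), in table order.
 * Returns the number of functions called. */
int run_init_range(init_func_t *start, init_func_t *end)
{
    int called = 0;

    for (init_func_t *f = start; f != end; ++f) {
        if (*f != NULL) {
            (*f)();
            called++;
        }
    }
    return called;
}

/*
 * On the target, do_init_array might then be written as follows
 * (assumption: the linker script provides both symbols):
 *
 *   extern init_func_t __init_array_start[], __init_array_end[];
 *   void do_init_array(void) {
 *       run_init_range(__init_array_start, __init_array_end);
 *   }
 */

/* Host-side example with two fake constructors that record their order. */
static int order_log[2];
static int order_len;

static void ctor_first(void)  { order_log[order_len++] = 1; }
static void ctor_second(void) { order_log[order_len++] = 2; }

int init_array_example(void)
{
    init_func_t table[] = { ctor_first, ctor_second };
    int called;

    order_len = 0;
    called = run_init_range(table, table + 2);
    assert(order_log[0] == 1 && order_log[1] == 2);
    return called;
}
```

The same helper could serve for the .preinit_array and .fini_array tables, since each of them is also delimited by a start and an end symbol in the linker script.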
    
    EFI bios

    sudo apt-get install gnu-efi

    QEMU EFI/UEFI support:
    Prepare a UEFI FAT filesystem that boots a kernel:

    copy kernel to ./hd/efi/boot/kernel
    run: qemu-system-i386 -bios bios/efi32.fd -drive file=fat:hd-efi,format=raw
       qemu-system-x86_64 -bios bios/efi64.fd -drive file=fat:hd-efi,format=raw
    # If this fails, enter in the EFI shell: 'fs0:', 'cd efi/boot', 'bootia32.efi'
    
    Syslinux can boot an mboot image on an EFI system:
        cp $SYSLINUX/efi32/efi/syslinux.efi bootia32.efi
        #cp $SYSLINUX/efi64/efi/syslinux.efi bootx64.efi
        cp $SYSLINUX/efi32/com32/elflink/ldlinux/ldlinux.e32 .
        #cp $SYSLINUX/efi64/com32/elflink/ldlinux/ldlinux.e64 .
        cp $SYSLINUX/efi32/com32/mboot/mboot.c32 .
        cp $SYSLINUX/efi32/com32/lib/libcom32.c32 .
    
        echo "DEFAULT mboot.c32 /os.elf" > syslinux.cfg
    
    http://www.rodsbooks.com/efi-programming/prepare.html

    Networking
    Objective
    Networking in Qemu
    
    Client: if=eth0, MAC=00:10:20:00:00:01, IP=192.168.1.10/24, GW=192.168.1.1, DNS[]=212.27.40.241,212.27.40.240
    GW:              MAC=00:30:40:00:00:ff, IP=192.168.1.1
    DHCP:            MAC=00:30:40:00:00:fe, IP=192.168.1.2
    DNS:                                    IP=212.27.40.241
    
    > net.setHwAddr(if=eth0, MAC=00:10:20:00:00:01)
    > net.ifconfig(if=eth0, IP=192.168.1.10/24, GW=192.168.1.1, DNS[]=212.27.40.241)
    > net.autoconfig(if=eth0, MAC=00:10:20:00:00:01)
      > bootp.dhcp_discover(if=eth0, hostname="netclient")
        > udp.send(src=68[BOOTPC], dst=67[BOOTPS], *)
          > ip.send(src=0.0.0.0, dst=255.255.255.255, prot=17[UDP], *)
            > mac.send   (src=00:10:20:00:00:01, dst=ff:ff:ff:ff:ff:ff, type=0x0800[IP], *)
            > mac.receive(src=00:30:40:00:00:fe, dst=ff:ff:ff:ff:ff:ff, type=0x0800[IP])
          > ip.receive(src=192.168.1.2, dst=255.255.255.255, prot=17[UDP])
        > udp.receive(src=67[BOOTPS], dst=68[BOOTPC])
      > bootp.receive(dhcp_offer, IP=192.168.1.10/24, GW=192.168.1.1, DNS[]=212.27.40.241,212.27.40.240)
      > bootp.dhcp_request(if=eth0, DhcpSrv=192.168.1.2, IP=192.168.1.10, hostname="netclient")
        > udp.send(src=68[BOOTPC], dst=67[BOOTPS], *)
          > ip.send(src=0.0.0.0, dst=255.255.255.255, prot=17[UDP], *)
            > mac.send   (src=00:10:20:00:00:01, dst=ff:ff:ff:ff:ff:ff, type=0x0800[IP], *)
            > mac.receive(src=00:30:40:00:00:fe, dst=ff:ff:ff:ff:ff:ff, type=0x0800[IP])
          > ip.receive(src=192.168.1.2, dst=255.255.255.255, prot=17[UDP])
        > udp.receive(src=67[BOOTPS], dst=68[BOOTPC])
      > bootp.receive(dhcp_ack, IP=192.168.1.10/24, GW=192.168.1.1, DNS[]=212.27.40.241,212.27.40.240)
    
    > http.get("http://www.google.com/index.html");
      > net.resolve("www.google.com")
        > dns.query("www.google.com A IN")
          > udp.send(src=*, dst=53[DNS], *)
            > ip.send(src=IP, dst=212.27.40.241[DNS.0], prot=17[UDP], *)
              > arp.request(ip=192.168.1.1[GW])
                > mac.send(src=00:10:20:00:00:01, dst=ff:ff:ff:ff:ff:ff, type=0x0806[ARP], *)
                > mac.receive(src=00:30:40:00:00:ff, dst=00:10:20:00:00:01, type=0x0806[ARP])
              > arp.response(arp[192.168.1.1] = 00:30:40:00:00:ff)
              > mac.send(src=00:10:20:00:00:01, dst=00:30:40:00:00:ff[GW], type=0x0800[IP], *)
              > mac.receive(src=00:30:40:00:00:ff[GW], dst=00:10:20:00:00:01, type=0x0800[IP])
            > ip.receive(src=212.27.40.241[DNS.0], dst=IP, prot=17[UDP])
          > udp.receive(src=53[DNS], dst=*)
        > dns.receive(response, www.google.com = "A IN 173.194.45.55")
    
      > tcpCnx = tcp.open("173.194.45.55", 80)
      > tcpCnx.send(src=*, dst=80[http], SYN, "")
        > ip.send(src=IP, dst=173.194.45.55[www], prot=6[TCP], *)
          > mac.send(src=00:10:20:00:00:01, dst=00:30:40:00:00:ff[GW], type=0x0800[IP], *)
          > mac.receive(src=00:30:40:00:00:ff[GW], dst=00:10:20:00:00:01, type=0x0800[IP])
        > ip.receive(src=173.194.45.55[www], dst=IP, prot=6[TCP])
      > tcpCnx.receive(src=80[http], dst=*, SYN+ACK)
      > tcpCnx.send("", ACK)
        > ip.send()
          > mac.send()
      -- tcpCnx = established ---------------------------
      > tcpCnx.send("GET /index.html HTTP/1.1", PSH+ACK)
        > ip.send()
          > mac.send()
          > mac.receive()
        > ip.receive()
      > tcpCnx.receive("", ACK)
      > tcpCnx.receive("HTTP/1.1 200 OK ...", ACK)
      > tcpCnx.send("", ACK)
      > tcpCnx.receive("...", PSH+ACK)
      > tcpCnx.send("", ACK)
    
      -- tcpCnx = teardown ---------------------------
      > tcpCnx.receive("", FIN+ACK)
      > tcpCnx.send("", ACK)
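The layering shown in the DHCP discover part of the trace above (BOOTP over UDP over IP over Ethernet) can be sketched in Python. This is an illustration only, not the tutorial's kernel code: the helper names (`mac_frame`, `ip_packet`, `udp_packet`, `checksum16`) are hypothetical, the TTL of 64 is an arbitrary choice, and 300 zero bytes stand in for the BOOTP message that the trace abbreviates as `*`.

```python
import struct

def checksum16(data: bytes) -> int:
    """One's-complement sum over 16-bit big-endian words (Internet checksum)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def mac_bytes(mac: str) -> bytes:
    return bytes.fromhex(mac.replace(":", ""))

def ip_bytes(ip: str) -> bytes:
    return bytes(int(part) for part in ip.split("."))

def udp_packet(src_port: int, dst_port: int, payload: bytes) -> bytes:
    # Checksum 0 means "not computed", which is allowed for UDP over IPv4.
    return struct.pack("!HHHH", src_port, dst_port, 8 + len(payload), 0) + payload

def ip_packet(src: str, dst: str, proto: int, payload: bytes) -> bytes:
    # Version 4, header length 5 words, no options, TTL 64 (assumed value).
    header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(payload),
                         0, 0, 64, proto, 0, ip_bytes(src), ip_bytes(dst))
    csum = checksum16(header)
    return header[:10] + struct.pack("!H", csum) + header[12:] + payload

def mac_frame(src: str, dst: str, ethertype: int, payload: bytes) -> bytes:
    return mac_bytes(dst) + mac_bytes(src) + struct.pack("!H", ethertype) + payload

# Placeholder for the BOOTP/DHCP message (the "*" in the trace).
bootp_payload = bytes(300)

frame = mac_frame("00:10:20:00:00:01", "ff:ff:ff:ff:ff:ff", 0x0800,
                  ip_packet("0.0.0.0", "255.255.255.255", 17,
                            udp_packet(68, 67, bootp_payload)))
print(len(frame), frame[:14].hex())
```

Each function only adds its own header in front of the payload it is given, which mirrors the nesting of `udp.send`, `ip.send` and `mac.send` in the trace.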
    
    
    SYN>     (   0) Seq=   0          fd858d8c 00000000
    >SYN+ACK (   0) Seq=   0 Ack=   1                   b90644d8 fd858d8d
    ACK>     (   0) Seq=   1 Ack=   1 fd858d8d b90644d9
    
    PSH,ACK> ( 129) Seq=   1 Ack=   1 fd858d8d b90644d9
    >ACK     (   0) Seq=   1 Ack= 130                   b90644d9 fd858e0e
    >ACK     (1448) Seq=   1 Ack= 130                   b90644d9 fd858e0e
    ACK>     (   0) Seq= 130 Ack=1449 fd858e0e b9064a81
    >PSH,ACK (  23) Seq=1449 Ack= 130                   b9064a81 fd858e0e
    ACK>     (   0) Seq= 130 Ack=1472 fd858e0e b9064a98
    
    >FIN,ACK (   0) Seq=1472 Ack= 130                   b9064a98 fd858e0e
    ACK>     (   0) Seq= 130 Ack=1473 fd858e0e b9064a99
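
The relation between the relative Seq/Ack values and the raw 32-bit numbers in the table above can be checked with a few lines of Python. This is a sketch under the usual TCP rules (SYN and FIN each consume one sequence number, payload consumes one per byte); the function names are illustrative.

```python
MOD = 1 << 32  # TCP sequence numbers wrap at 32 bits

CLIENT_ISN = 0xFD858D8C  # raw value sent with the client SYN
SERVER_ISN = 0xB90644D8  # raw value sent with the server SYN+ACK

def relative(raw: int, isn: int) -> int:
    """Relative sequence number, as displayed in the Seq=/Ack= columns."""
    return (raw - isn) % MOD

def next_seq(seq: int, payload_len: int, syn: bool = False, fin: bool = False) -> int:
    """Sequence number expected after a segment: payload bytes plus 1 for SYN or FIN."""
    return (seq + payload_len + int(syn) + int(fin)) % MOD

client = next_seq(CLIENT_ISN, 0, syn=True)    # after SYN
client = next_seq(client, 129)                # after the 129-byte request
server = next_seq(SERVER_ISN, 0, syn=True)    # after SYN+ACK
server = next_seq(server, 1448)               # first response segment
server = next_seq(server, 23)                 # second response segment
server = next_seq(server, 0, fin=True)        # after FIN

print(hex(client), relative(client, CLIENT_ISN))
print(hex(server), relative(server, SERVER_ISN))
```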
    

    Overview

    Qemu includes support for networking development: several emulated machines include an Ethernet interface. The Qemu ARM VersatilePB machine emulates an SMSC LAN91C111 Ethernet interface, like the original board.

    More information on the LAN91C111 can be obtained from the SMSC website: http://www.smsc.com/Products/Ethernet_and_Embedded_Networking/Ethernet_Controllers/LAN91C111.

    Note: The following implementation focuses on the Qemu emulated device and is not suitable for real VersatilePB hardware, since the emulation is only partial.

    Networking support

    We will focus on user mode networking support in Qemu. Alternatives such as a TAP interface allow better integration with the emulation host, but are more complex to set up and require root privileges. In user mode, Qemu implements its own virtual LAN and acts as the DHCP server and gateway for the emulated guest. This allows the guest system to reach the host network, but the guest itself is not directly accessible from that network (similar to a device behind a NAT gateway).

    The Virtual LAN network is created with "-net user" Qemu option. Additionally, included servers and services (DHCP, DNS, BOOTP/TFTP, SAMBA) can be configured.

    The guest LAN network interface is created with the "-net nic" Qemu option. It also supports several options to fine tune the interface (address, name ...).

    Note: The network is implicitly created on supporting machines, so "-net user -net nic" only needs to be specified when additional configuration is needed.

    QEMU_AUDIO_DRV=none qemu-system-arm -M versatilepb -m 128M -serial stdio -net user -net nic -kernel kernel.raw
    
    # Configuration:
    -net user[,vlan=n][,name=str][,net=addr[/mask]][,host=addr][,restrict=on|off]
             [,hostname=host][,dhcpstart=addr][,dns=addr][,tftp=dir][,bootfile=f]
             [,hostfwd=rule][,guestfwd=rule][,smb=dir[,smbserver=addr]]
                    connect the user mode network stack to VLAN 'n', configure its
                    DHCP server and enabled optional services
    
    -net nic[,vlan=n][,macaddr=mac][,model=type][,name=str][,addr=str][,vectors=v]
                    create a new Network Interface Card and connect it to VLAN 'n'
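
When none of the "-net user" options above are given, Qemu documents the following defaults: guest network 10.0.2.0/24, host (gateway) at the 2nd address, DNS at the 3rd, and DHCP leases starting at the 15th. The sketch below derives those addresses for any "net=" value; the function name `user_net_layout` is hypothetical and the code only models the documented defaults, not Qemu's actual implementation.

```python
import ipaddress

def user_net_layout(net: str = "10.0.2.0/24") -> dict:
    """Default addresses of Qemu user mode networking for a given guest network."""
    base = ipaddress.ip_network(net).network_address
    return {
        "host": str(base + 2),        # gateway / host as seen by the guest
        "dns": str(base + 3),         # built-in DNS forwarder
        "dhcpstart": str(base + 15),  # first address handed out by DHCP
    }

print(user_net_layout())
print(user_net_layout("192.168.1.0/24"))
```

These defaults explain why a guest configured through DHCP typically ends up with 10.0.2.15 and a gateway at 10.0.2.2.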
    

    Port redirection

    Since the emulated network is separated from the host network by a gateway, the guest cannot be reached from the LAN by default. Port redirection can be configured with the "-redir" Qemu option.
    Note: As many "-redir" options as needed can be specified.
    Note: Root privileges may be needed to redirect registered ports (ports lower than 1024).
    Note: Some ports may not be available (such as system ports, or ports that are already bound).

    To redirect host port 8080 to the guest HTTP port (so that a browser on the host can access the guest web server at http://localhost:8080), use:

    QEMU_AUDIO_DRV=none qemu-system-arm -M versatilepb -m 128M -serial stdio -net user -net nic -redir tcp:8080::80 -kernel kernel.raw

    Note: Newer Qemu versions provide the "hostfwd" option of "-net user" for the same purpose:

    qemu -net user,hostfwd=tcp:127.0.0.1:8080-:80
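
The "hostfwd" rule has the form `proto:[hostaddr]:hostport-[guestaddr]:guestport`. As a small sketch, the rule strings can be generated from Python when Qemu is launched from a script; the helper names `hostfwd` and `needs_root` are hypothetical and not part of Qemu.

```python
def hostfwd(host_port: int, guest_port: int, proto: str = "tcp",
            host_addr: str = "", guest_addr: str = "") -> str:
    """Build one hostfwd rule; empty addresses let Qemu pick its defaults."""
    return f"hostfwd={proto}:{host_addr}:{host_port}-{guest_addr}:{guest_port}"

def needs_root(host_port: int) -> bool:
    """Binding host ports below 1024 usually requires root privileges."""
    return host_port < 1024

rules = [hostfwd(8080, 80, host_addr="127.0.0.1"), hostfwd(2222, 22)]
net_option = "user," + ",".join(rules)
print("-net", net_option)
```

Several rules can be combined in a single "-net user" option, separated by commas.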

    Network Traffic analysing

    Qemu can dump the network traffic of a particular interface to a file. The file uses the pcap format, so it can be parsed with tcpdump or Wireshark.

    QEMU_AUDIO_DRV=none qemu-system-arm -M versatilepb -m 128M -serial stdio -net user -net nic -redir tcp:8080::80 -net dump -kernel kernel.raw
    /usr/sbin/tcpdump -A -nr qemu-vlan0.pcap
    reading from file qemu-vlan0.pcap, link-type EN10MB (Ethernet)
    01:00:04.963452 ARP, Request who-has 10.0.2.15 tell 10.0.2.2, length 28
    01:00:04.980145 ARP, Request who-has 10.0.2.2 tell 10.0.2.15, length 50
    01:00:04.980169 ARP, Reply 10.0.2.2 is-at 52:55:0a:00:02:02, length 50
    01:00:05.004532 IP 192.168.0.17.59440 > 10.0.2.15.80: Flags [S], seq 640001, win 8760, options [mss 1460], length 0

    # Live capture: feed the dump to Wireshark through a named pipe
    mkfifo live.pcap
    wireshark -k -i live.pcap &
    QEMU_AUDIO_DRV=none qemu-system-arm -M versatilepb -m 128M -serial stdio -redir tcp:8080::80 -net user -net nic -net dump,file=live.pcap -kernel kernel.raw
    rm live.pcap

    # You may want to reconfigure wireshark to capture as a non-root user:
    sudo dpkg-reconfigure wireshark-common ; sudo adduser kerneldev wireshark

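
The pcap file written by "-net dump" is simple enough to read without tcpdump. The following sketch parses the classic pcap layout (24-byte global header, then a 16-byte record header in front of each packet). It handles only the microsecond-resolution magic number, and the function name `read_pcap` is an assumption made for this example.

```python
import struct

def read_pcap(data: bytes):
    """Return (linktype, [((ts_sec, ts_usec), packet_bytes), ...])."""
    magic = struct.unpack("<I", data[:4])[0]
    if magic == 0xA1B2C3D4:
        endian = "<"          # file written in little-endian order
    elif magic == 0xD4C3B2A1:
        endian = ">"          # file written in big-endian order
    else:
        raise ValueError("not a classic pcap file")
    # version major/minor, timezone offset, sigfigs, snaplen, link type
    _vmaj, _vmin, _tz, _sig, _snaplen, linktype = struct.unpack(
        endian + "HHiIII", data[4:24])
    packets = []
    offset = 24
    while offset + 16 <= len(data):
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", data[offset:offset + 16])
        offset += 16
        packets.append(((ts_sec, ts_usec), data[offset:offset + incl_len]))
        offset += incl_len
    return linktype, packets

# Tiny in-memory capture: one 4-byte "packet", link type 1 (Ethernet)
sample = (struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
          + struct.pack("<IIII", 1, 500, 4, 4) + b"abcd")
print(read_pcap(sample))
```

To inspect a real capture, pass the file contents instead, for example `read_pcap(open("qemu-vlan0.pcap", "rb").read())`.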
    buildapps

    The toolchain (newlib) needs a few functions (system call stubs) to work correctly without an underlying OS: http://sourceware.org/newlib/libc.html#Syscalls
    Also: https://launchpadlibrarian.net/126639247/readme.txt

    STARTUP_DEFS=-D__NO_SYSTEM_INIT

    more /opt/gcc-arm-none-eabi-4_7-2012q4/share/gcc-arm-none-eabi/samples/ldscripts/sections.ld

    nasm -o test.bin test.s
    objdump -D -b binary -mi8086 test.bin
    dd bs=1 skip=$((0x55)) if=test.bin of=test.bin.32
    objdump -D -b binary -mi386 test.bin.32
    dd bs=1 skip=$((0x10b)) if=test.bin of=test.bin.64
    objdump -D -b binary -mi386:x86-64 test.bin.64
    
    Creating a CPU
    A B CNT RES DP SP PC FLG ALU Exec. Unit IR DBus IBus Clk Rst

    Instructions:

    LDR[u][size][cond] reg               [ptr] -> reg
    STR[u][size][cond] reg               reg -> [ptr]
    MOV[u][size][cond] reg_src, reg_dst  reg_src -> reg_dst
    SWP[u][size][cond] reg_src, reg_dst  reg_src <-> reg_dst
    MOV[u][size][cond] #imm16, reg       imm16 -> reg
    
    ADD[u][size][cond][!]                R = A + B
    SUB[u][size][cond][!]                R = A - B
    # MUL[u][size][cond]                 R = A * B
    # DIV[u][size][cond]                 R = A / B
    INCR[size] reg
    DECR[size] reg
    
    NOT[size][cond][!]
    OR[size][cond][!]
    AND[size][cond][!]
    #NOR[size][cond][!]
    #NAND[size][cond][!]
    #XOR[size][cond][!]
    CMP[size][cond]
    
    SHL[size][cond][!] reg
    SHR[u][size][cond][!] reg
    ROL[size][cond][!] reg
    ROR[size][cond][!] reg
    
    JMP[cond] reg                       PC + imm16 -> PC
    JMP[cond] #imm16                    PC + imm16 -> PC
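
A few of the instructions listed above can be modelled with a very small interpreter. This is only a sketch under several assumptions that the notes do not specify: instructions are Python tuples rather than encoded words, `MOVI` is a made-up name for the `MOV #imm16, reg` form, registers are 16 bits wide, PC counts instructions and is incremented before a `JMP` offset is applied, and the `[u][size][cond][!]` modifiers are ignored.

```python
MASK = 0xFFFF  # assumed 16-bit registers

def run(program, max_steps=1000):
    """Execute a list of instruction tuples and return the final register file."""
    regs = {name: 0 for name in ("A", "B", "CNT", "RES", "DP", "SP", "PC", "FLG")}
    steps = 0
    while 0 <= regs["PC"] < len(program) and steps < max_steps:
        op, *args = program[regs["PC"]]
        regs["PC"] += 1                      # fetch, then advance PC
        if op == "MOVI":                     # imm16 -> reg
            imm, dst = args
            regs[dst] = imm & MASK
        elif op == "MOV":                    # reg_src -> reg_dst
            src, dst = args
            regs[dst] = regs[src]
        elif op == "ADD":                    # R = A + B
            regs["RES"] = (regs["A"] + regs["B"]) & MASK
        elif op == "SUB":                    # R = A - B
            regs["RES"] = (regs["A"] - regs["B"]) & MASK
        elif op == "INCR":
            regs[args[0]] = (regs[args[0]] + 1) & MASK
        elif op == "DECR":
            regs[args[0]] = (regs[args[0]] - 1) & MASK
        elif op == "JMP":                    # PC + imm16 -> PC
            regs["PC"] += args[0]
        else:
            raise ValueError(f"unknown instruction {op}")
        steps += 1
    return regs

program = [
    ("MOVI", 40, "A"),
    ("MOVI", 2, "B"),
    ("ADD",),
    ("JMP", 1),             # skip the next instruction
    ("MOVI", 0, "RES"),     # never executed
    ("MOV", "RES", "CNT"),
    ("INCR", "CNT"),
]
final = run(program)
print(final["RES"], final["CNT"], final["PC"])
```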
    
    
    By Bertrand Tognoli
    2022-01-05