Tutorial on Setting Up Your Toolchain

STM32 Example

Overview

This tutorial will walk you through the process of moving away from vendor-specific IDEs (like STM32CubeIDE) and transitioning to a modern, portable build and debug setup using open-source tools. The following components will be covered:

CMake and Ninja for build generation
GCC as a compiler toolchain
Make and Python for scripting
CPM.cmake for package management
OpenOCD and GDB for debugging
Full integration into Visual Studio Code (VSCode)

The goal is to establish a reusable and consistent embedded development workflow across microcontrollers and platforms.

Use STM32CubeIDE to generate project skeleton

Even though the goal is to become independent of vendor IDEs such as STM32CubeIDE, the IDE provides a convenient GUI to initialize clocks and peripherals and to generate vendor-specific boilerplate such as:

Linker scripts
Startup files
HAL/CMSIS initialization

Open STM32CubeIDE.
Create a new STM32 project: File > New > STM32 Project.
Choose your MCU (e.g. STM32F446RE) or development board.
Configure clocks, peripherals, middleware, and project settings as desired.
Generate code via Project > Generate Code.

This will create a folder structure similar to the following (details depend on your actual board and MCU):

📁 root/
├── 📁 Core/
│   ├── 📁 Inc/                       # Header files
│   │   ├── main.h                    # Main application header
│   │   ├── stm32f4xx_hal_conf.h      # HAL configuration macros
│   │   └── stm32f4xx_it.h            # Interrupt service routine declarations
│   ├── 📁 Src/                       # Source files
│   │   ├── main.c                    # Application entry point
│   │   ├── stm32f4xx_hal_msp.c       # HAL initialization (MSP - MCU Support Package)
│   │   ├── stm32f4xx_it.c            # Interrupt service routines
│   │   ├── syscalls.c                # System stubs (e.g., `_kill`, `_write`, `_read`)
│   │   ├── sysmem.c                  # Defines `_sbrk()` for dynamic memory (malloc)
│   │   └── system_stm32f4xx.c        # System clock and configuration setup
│   └── 📁 Startup/
│       └── startup_stm32f446retx.s   # MCU startup code and vector table
├── 📁 Drivers/                       # Vendor-supplied HAL and CMSIS drivers
└── STM32F446RETX_FLASH.ld            # Linker script defining memory layout

These files will serve as the foundation of the project. Later you might want to write your own linker scripts, but this is a good starting point.

Create your own build system

Instead of building the project with the IDE, we will set up a modern CMake-based build system.

To separate the application code from the MCU setup, auto-generated files, and initialization, it is good practice to create a sub-folder called Application/ that contains your application code and to let the main function call into it, while the vendor files remain mostly untouched in Core/ and Drivers/.

Another approach is to make your application a standalone library that only works with interfaces and abstract device drivers. The vendor-generated root folder, plus some glue logic, then provides concrete devices like pins, timers, and communication peripherals to your application. That way you get fully portable applications that do not depend on a particular microcontroller or board (just in case another chip shortage comes along).

Create the following CMakeLists.txt in the root folder:

cmake_minimum_required(VERSION 3.16)
 
# define project name and declare that we are going to use assembly (ASM), C and C++ (CXX)
project(ProjectName ASM C CXX)
 
# include the package manager 
include(CPM.cmake)
 
##########################################################################
#                               Libraries
##########################################################################
 
# Add the application code as a standalone library
add_subdirectory(Application)
 
##########################################################################
#           Get HAL, CMSIS and auto generated Core-Files
##########################################################################
 
# Get all autogenerated files
# HAL, CMSIS, and Core-Files
file(GLOB_RECURSE Core_Src_Files "Core/Src/*.c")
file(GLOB_RECURSE HAL_Src_Files "Drivers/*.c")
set(CORE_SRC_FILES
    Core/Startup/startup_stm32f446retx.s
    ${Core_Src_Files}
    ${HAL_Src_Files}
)
set(CORE_INC_DIRS
    Core/Inc
    Drivers/CMSIS/Include
    Drivers/CMSIS/Device/ST/STM32F4xx/Include
    Drivers/STM32F4xx_HAL_Driver/Inc
    Drivers/STM32F4xx_HAL_Driver/Inc/Legacy
)
 
##########################################################################
#                           Define project files
##########################################################################
 
# create an executable that can be flashed to the MCU with the name `main` (or a different name)
# pass it the source files that should be compiled
add_executable(main 
    ${CORE_SRC_FILES}
)
 
# Add the include search directories to the `main` target
target_include_directories(main
    PUBLIC 
        ${CORE_INC_DIRS}
)
 
# Optional: Fiber provides some very useful and efficient hardware drivers. 
# To make use of them pass the include directories to fiber as well.
target_include_directories(fiber PUBLIC ${CORE_INC_DIRS})
 
# link the actual application to the main target
target_link_libraries(main PUBLIC App)
 
# define the output name to be an `*.elf` file used for flashing.
set_target_properties(main PROPERTIES OUTPUT_NAME "main.elf")

Then, create a separate toolchain file toolchain.cmake to define cross-compilation settings for the ARM Cortex-M MCU:

# Compile generic code that runs bare metal and not within an encapsulating operating system
# https://cmake.org/cmake/help/latest/variable/CMAKE_SYSTEM_NAME.html
set(CMAKE_SYSTEM_NAME Generic)
 
# Define the processor architecture; for the STM32F4 it is `arm`
# https://cmake.org/cmake/help/latest/variable/CMAKE_SYSTEM_PROCESSOR.html
set(CMAKE_SYSTEM_PROCESSOR arm)
 
# Set the compiler executables for cross compilation. Names without an extension are looked up on the PATH
# on every host OS (on Windows the `.exe` suffix is resolved automatically).
# Optional: use full paths if the tools are not on the system's PATH.
set(CMAKE_C_COMPILER "arm-none-eabi-gcc")
set(CMAKE_CXX_COMPILER "arm-none-eabi-g++")
set(CMAKE_ASM_COMPILER "arm-none-eabi-gcc")
set(CMAKE_OBJCOPY "arm-none-eabi-objcopy")
set(CMAKE_SIZE "arm-none-eabi-size")
 
# Let CMake's compiler checks build a static library instead of an executable, because a bare-metal test executable usually cannot be linked during these checks.
set(CMAKE_TRY_COMPILE_TARGET_TYPE STATIC_LIBRARY)
 
 
###########################################
#       Global Options for all targets
###########################################
 
# Set target architecture and cpu specific parameters
#
# Description of the compile options:
#   https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html
#
# PM0214 STM32 Cortex®-M4 MCUs and MPUs programming manual 10.0 
#   https://www.st.com/resource/en/programming_manual/pm0214-stm32-cortexm4-mcus-and-mpus-programming-manual-stmicroelectronics.pdf
#
# RM0390 STM32F446xx advanced Arm®-based 32-bit MCUs 6.0 
#   https://www.st.com/resource/en/reference_manual/rm0390-stm32f446xx-advanced-armbased-32bit-mcus-stmicroelectronics.pdf
 
set(CPU_FLAGS 
    "-mcpu=cortex-m4"       # generate code for the Cortex-M4 core
    "-mthumb"               # use the Thumb instruction set | Cortex-M cores only support Thumb, not the classic 32-bit ARM instruction set
    "-mfpu=fpv4-sp-d16"     # select the floating-point unit: FPv4, single precision, 16 double-word registers
    "-mfloat-abi=hard"      # generate FPU instructions and use the FPU-specific calling convention
)
 
add_compile_options(
    ${CPU_FLAGS}
 
    # use small c library implementation
    --specs=nano.specs 
 
    # Place functions and data in separate sections to allow linker garbage collection.
    -ffunction-sections  
    -fdata-sections      
 
    # do not store the frame pointer and make room for another register. Enables optimisations for performance.
    -fomit-frame-pointer  
)
 
# Add link options to all targets
add_link_options(
    # pass the CPU flags to the linker so it knows how to link correctly
    ${CPU_FLAGS}
    
    # add the linker script that the IDE provided
    -Wl,-T${CMAKE_SOURCE_DIR}/STM32F446RETX_FLASH.ld
    
    # tell the linker to garbage collect unused sections (dead code and data)
    -Wl,--gc-sections 
)
 
# Add libraries that can be used by all files
link_libraries(
    nosys   # Minimal system stubs for functions like `write()`, `read()`, `_exit()`, `sbrk()`, etc.
    gcc     # Compiler runtime support for operations not directly supported by hardware.
    c_nano  # A reduced-size version of the standard C library 
    m       # Provides functions like `sin()`, `cos()`, `sqrt()`, `pow()`, etc.
)
 
# Add CPU specific definitions - especially for the auto generated vendor files
add_compile_definitions(
    USE_HAL_DRIVER
    STM32F446xx
)

This sets up the proper compiler, target architecture, flags, and memory configuration for your STM32F4 MCU.

Now set up your Application/CMakeLists.txt. The application commonly lives either in a sub-folder (as in this tutorial) or as a separate library in a separate repository.

# Include the package manager from: https://github.com/cpm-cmake/CPM.cmake
include(cmake/CPM.cmake)
 
# Optional: Enable system call stubs for freestanding/bare-metal for Fiber
set(FIBER_USE_SYS_STUBS ON CACHE BOOL "" FORCE)
 
# Add the Fiber library
CPMAddPackage("gh:TobiasWallner/Fiber#main")
 
# create the actual application. 
# It will contain generic code that can then be used by many boards and mcus
add_library(App STATIC
    app.cpp
)
 
# link fiber to the application
target_link_libraries(App PRIVATE fiber)
 
# optionally: if `FIBER_USE_SYS_STUBS ON`. !!! Has to link PUBLIC !!!
target_link_libraries(App PUBLIC fiber_sys_stubs)

Create a Makefile for easy CLI (Command Line Interface) builds

Instead of typing long CMake commands, create a Makefile to simplify your workflow. Note that recipe lines in a Makefile must be indented with a tab character, not spaces.

If you are on Windows and do not have make available, you can get it like so:

Download MSYS2 from their homepage: https://www.msys2.org/
Open the MSYS2 shell
Update the package manager pacman
pacman -Syu
Install make
pacman -S make
Add the MSYS2 make directory C:\msys64\usr\bin to the PATH so you can use it in your host system's terminals
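
Since the workflow relies on several command-line tools being reachable from the terminal, a quick check can save some confusion later. The following is a minimal sketch, not part of the original setup: the tool list is taken from the names used in this tutorial, and the helper names find_tools and report are hypothetical.

```python
import shutil

# Tool names as they appear in this tutorial; adjust the list to your setup.
REQUIRED_TOOLS = [
    "cmake",
    "ninja",
    "make",
    "arm-none-eabi-gcc",
    "arm-none-eabi-gdb",
    "openocd",
    "STM32_Programmer_CLI",
]


def find_tools(names):
    """Map each tool name to its resolved path, or None if it is not on the PATH."""
    return {name: shutil.which(name) for name in names}


def report(found):
    """Return one status line per tool."""
    return [
        f"{'ok     ' if path else 'MISSING'}  {name}  {path or ''}".rstrip()
        for name, path in found.items()
    ]


if __name__ == "__main__":
    print("\n".join(report(find_tools(REQUIRED_TOOLS))))
```

Any tool reported as MISSING needs to be installed or its directory added to the PATH before the Makefile targets that use it will work.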

For this particular Makefile I chose Ninja as the generator, which tells CMake to create Ninja build files. You can get Ninja from here: https://ninja-build.org/

For flashing, the Makefile uses the STM32_Programmer_CLI. You can download it from here: https://www.st.com/en/development-tools/stm32cubeprog.html or use the one provided by the vendor IDE.

Additionally, this Makefile uses a custom python script (introduced later) that calculates the used FLASH and RAM.

#####################################
#           Variables
#####################################
 
STM32_Programmer_CLI = STM32_Programmer_CLI.exe
 
gcc_size = arm-none-eabi-size
gcc_nm = arm-none-eabi-nm
 
build_dir = build
config = Release
target = main
test_target = test
linker_script = STM32F446RETX_FLASH.ld
 
#####################################
#           Commands
#####################################
 
.PHONY: help
help: 
    $(info Make Commands:)
    $(info --------------)
    $(info build .......... builds the 'target' with the 'config')
    $(info flash .......... builds the 'target' with the 'config' and then flashes the binary to the micro-controller)
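    $(info test ........... convenience command that executes flash with the 'test_target')
    $(info openocd ........ builds the 'target' and starts OpenOCD with 'openocd.cfg')
    $(info gdb ............ starts GDB with the '.elf' file of the 'target')
    $(info clean .......... cleans the 'Release' and 'Debug' builds)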
    $(info )
    $(info Optional Arguments:)
    $(info ---------------)
    $(info build_dir= ...... sets the build directory. Default: $(build_dir))
    $(info config= ........ sets the build type: 'Release' or 'Debug'. Default: $(config))
    $(info target= ........ sets the target/CMake executable. Default: $(target))
    $(info test_target= ... sets the target/CMake executable used by 'test'. Default: $(test_target))
 
# CMAKE Setup
# -----------
.PHONY: setup_cmake
setup_cmake:
    $(info )
    $(info ================================ [build_dir: ./$(build_dir)/ | target: $(target) | config: $(config)] ================================)
    $(info )
    cmake -S . -B $(build_dir) -G "Ninja Multi-Config" -DCMAKE_TOOLCHAIN_FILE=toolchain.cmake -DFIBER_COMPILE_TESTS=ON
    
# build commands
# --------------
 
# build run target
.PHONY: build
build: setup_cmake
    cmake --build $(build_dir) --target $(target) --config $(config)
    $(gcc_size) --format=berkeley "$(build_dir)/$(config)/$(target).elf" > "$(build_dir)/$(config)/$(target)_size.txt"
    python stats.py -l $(linker_script) -s "$(build_dir)/$(config)/$(target)_size.txt"
    $(gcc_nm) --size --print-size --radix=d "$(build_dir)/$(config)/$(target).elf" > "$(build_dir)/$(config)/$(target)_fsize.txt"
 
# flash run target
.PHONY: flash
flash: build
    ${STM32_Programmer_CLI} -c port=SWD -d "$(build_dir)/$(config)/$(target).elf" -v -rst
    
# flash test target (use $(MAKE) so that flags and variables are passed on to the sub-make)
.PHONY: test
test: build
    @$(MAKE) flash target=$(test_target)
 
# start the OpenOCD GDB server (the prerequisite has to be an existing target)
.PHONY: openocd
openocd: build
    openocd -f openocd.cfg
 
.PHONY: gdb
gdb:
    arm-none-eabi-gdb "$(build_dir)/$(config)/$(target).elf"
 
 
.PHONY: clean
clean:
    cmake --build $(build_dir) --target clean --config Release
    cmake --build $(build_dir) --target clean --config Debug

This lets you run commands like:

make build to build the project
make flash to flash the binary
make test to build and flash the test target
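
The build target also writes a per-symbol size listing (for the defaults: build/Release/main_fsize.txt) using arm-none-eabi-nm. A small sketch of how that listing could be used to find the largest symbols is shown here. It assumes each line has the layout "value size type name" or "size type name" with decimal numbers, and the sample data as well as the helper names parse_nm_sizes and largest_symbols are purely illustrative.

```python
def parse_nm_sizes(text):
    """Parse nm size listing lines into (name, size_in_bytes) pairs.

    Assumed layouts (decimal radix): 'value size type name' or 'size type name'.
    """
    symbols = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 4:
            _, size, _, name = fields
        elif len(fields) == 3:
            size, _, name = fields
        else:
            continue  # skip lines that do not fit either layout
        if size.isdigit():
            symbols.append((name, int(size)))
    return symbols


def largest_symbols(text, count=10):
    """Return the `count` largest symbols, biggest first."""
    return sorted(parse_nm_sizes(text), key=lambda item: item[1], reverse=True)[:count]


# Illustrative sample, not taken from a real build.
SAMPLE = """\
0134217728 0000000024 T SystemInit
0134218000 0000000120 T main
0134219000 0000000512 T HAL_Init
0000000004 B uwTick
"""

if __name__ == "__main__":
    for name, size in largest_symbols(SAMPLE, count=3):
        print(f"{size:8d}  {name}")
```

To use it on a real build, read the generated text file and pass its content to largest_symbols instead of SAMPLE.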

You will probably also want this Python script, stats.py, which calculates the FLASH and RAM usage of your application:

import re
import argparse
 
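def parse_bytes(size_str):
    """Convert a size such as '1024', '512K' or '1.5M' to a number of bytes."""
    size_str = size_str.strip().upper()
    units = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}
 
    num = ""
    unit = ""
    for char in size_str:
        if char.isdigit() or char == ".":
            num += char
        else:
            unit += char
 
    if not num:
        raise ValueError(f"Invalid size format: {size_str}")
    if unit and unit not in units:
        # fail loudly instead of silently treating e.g. '128KB' as 128 bytes
        raise ValueError(f"Unknown size unit in: {size_str}")
 
    num = float(num)  # Support decimal sizes like 1.5M
 
    return int(num * units.get(unit, 1))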
 
# get arguments
parser = argparse.ArgumentParser(
                    prog='stats',
                    description='Provides statistics from binary files')
 
parser.add_argument('-l', '--linker_script', required=True, help='Path to the linker script')
parser.add_argument('-s', '--size_file', required=True, help='Path to the size file')
 
args = parser.parse_args()
print(f"args: {args}")
 
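# read linker script
with open(args.linker_script, "r") as linker_script:
    linker_script_str = linker_script.read()
        
# extract the RAM size from the memory region line that starts with `RAM` (so that e.g. `CCMRAM` is not matched)
regex_ram = r'^\s*RAM\b.*LENGTH\s*=\s*([\w.]+)'
match = re.search(regex_ram, linker_script_str, re.MULTILINE)
if match is None:
    raise ValueError("No RAM region with a LENGTH found in the linker script")
ram_size = parse_bytes(match.group(1))/1024
 
# extract the flash size from the memory region line that starts with `FLASH`
regex_flash = r'^\s*FLASH\b.*LENGTH\s*=\s*([\w.]+)'
match = re.search(regex_flash, linker_script_str, re.MULTILINE)
if match is None:
    raise ValueError("No FLASH region with a LENGTH found in the linker script")
flash_size = parse_bytes(match.group(1))/1024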
 
# read size file
with open(args.size_file, "r") as size_file:
    lines = size_file.readlines()
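    # the second line holds the values; limit the split so that a file name with spaces stays in one piece
    text_str, data_str, bss_str, dec_str, hex_str, filename = lines[1].split(maxsplit=5)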
 
# sizes converted to kB
text_size = parse_bytes(text_str)/1024.
data_size = parse_bytes(data_str)/1024.
bss_size = parse_bytes(bss_str)/1024.
dec_size = parse_bytes(dec_str)/1024.
 
ram_used = bss_size + data_size
flash_used = (text_size + data_size)
 
 
print("\n--------------------------- Memory Usage in kB: ---------------------------")
print(f"filename: {filename}")
print(f".text: {text_size:.1f}, .data: {data_size:.1f}, .bss: {bss_size:.1f}")
print(f"Ram:   .bss + .data  = {ram_used:.1f}\tof\t{ram_size:.0f}\t{ram_used*100/ram_size :.1f}%")
print(f"Flash: .text + .data = {flash_used:.1f}\tof\t{flash_size:.0f}\t{flash_used*100/flash_size :.1f}%\n")
print("------------------------------------ Legend -------------------------------")
print(".text (Code Segment): Contains the compiled machine code")
print(".data (Initialized Data Segment): Holds initialized global and static variables.")
print(".bss (Uninitialized Data Segment): Contains uninitialized global and static variables.\n")
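
To make the arithmetic of the script easier to follow, here is a condensed sketch of the same calculation on a single line of Berkeley-format size output. The numbers are illustrative only and not measured from a real build, the function name memory_usage is hypothetical, and the 128 kB RAM and 512 kB flash sizes are assumed values that would normally come from the linker script.

```python
def memory_usage(size_line, ram_bytes, flash_bytes):
    """Compute RAM and flash usage from one data line of Berkeley-format `size` output."""
    text, data, bss = (int(field) for field in size_line.split()[:3])
    ram_used = data + bss      # .data and .bss both occupy RAM at run time
    flash_used = text + data   # the initial values of .data are stored in flash
    return {
        "ram_used": ram_used,
        "ram_percent": 100 * ram_used / ram_bytes,
        "flash_used": flash_used,
        "flash_percent": 100 * flash_used / flash_bytes,
    }


# Illustrative line in the column order: text data bss dec hex filename
line = "20480 1024 3072 24576 6000 build/Release/main.elf"

# Assumed memory sizes; stats.py reads them from the linker script instead.
usage = memory_usage(line, ram_bytes=128 * 1024, flash_bytes=512 * 1024)

print(f"RAM:   {usage['ram_used']} B ({usage['ram_percent']:.1f}%)")
print(f"Flash: {usage['flash_used']} B ({usage['flash_percent']:.1f}%)")
```

Note that .data is counted twice on purpose: its initial values live in flash and are copied to RAM during startup.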

Setup VSCode for Development + Debugging

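Although the Makefile above allows for an easy build process from the terminal, VSCode offers some nice-to-have features, such as stepping through your code in the debugger.

To enable those features, we will use OpenOCD as the GDB server that connects to the MCU through the debug probe. The launch configuration uses the cortex-debug type, which is provided by the Cortex-Debug extension for VSCode.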

Place the following config files in a .vscode/ folder:

Debug Configuration .vscode/launch.json

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "STM32 Debug",
            "type": "cortex-debug",
            "request": "attach",
            "servertype": "openocd",
            "configFiles": ["openocd.cfg"],
            "cwd": "${workspaceFolder}",
            "executable": "${workspaceFolder}/build/Debug/main.elf",
            "gdbTarget": "localhost:3333",
            "showDevDebugOutput": "raw",
            "runToEntryPoint": "main",
            "preLaunchTask": "flash-debug"
        }
    ]
}

Build Task for Debugging .vscode/tasks.json

{
    "version": "2.0.0",
    "tasks": [
        {
            "label": "flash-debug",
            "type": "shell",
            "command": "make",
            "args": ["flash", "config=Debug"],   // Calls the `flash` target in the Makefile with the Debug configuration
            "group": {
                "kind": "build",
                "isDefault": true
            },
            "problemMatcher": ["$gcc"]
        }
    ]
}

Also create an OpenOCD configuration file named openocd.cfg in the project root (the Makefile and launch.json refer to it by that name) to define your probe and target:

source [find "interface/stlink.cfg"]
source [find "target/stm32f4x.cfg"]
 
reset_config srst_only
init
reset init
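
If the debugger fails to connect, it can help to check whether a GDB server is actually listening. The following is a minimal sketch that assumes the server is reachable on localhost:3333, the address used as gdbTarget in the launch configuration. The function name gdb_server_reachable is hypothetical, and the check only tests that something accepts TCP connections on that port, not that it is OpenOCD.

```python
import socket


def gdb_server_reachable(host="localhost", port=3333, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    state = "reachable" if gdb_server_reachable() else "not reachable"
    print(f"GDB server on localhost:3333 is {state}")
```

Run it in a second terminal while make openocd is active.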

You now have a clean and modern embedded development setup that is portable, works across operating systems, avoids vendor lock-in, integrates fully with tooling like GDB, OpenOCD, and VSCode, and is easy to integrate into a CI/CD system on, for example, GitHub or GitLab.

This approach helps you scale your codebase, reuse your setup across projects, and work just like you would in a modern software engineering environment.

To implement CI/CD with automatic testing on every push to a GitHub repository:

Create a folder called .github (has to be the exact name).
Create a folder called .github/workflows (has to be the exact name).
Create a workflow file, e.g. test.yml.
Write a workflow that compiles your project and executes the tests. Since CMake already does most of the work, test.yml can be fairly small.

.github/workflows/test.yml

name: test
 
on:
  push:
  pull_request:
 
jobs:
  test:
    runs-on: ubuntu-latest
 
    env:
      BUILD_TYPE: Release
 
    steps:
    - uses: actions/checkout@v4
 
    - name: Install dependencies
      run: |
        sudo apt update
        sudo apt install -y ninja-build cmake g++
 
    - name: Set up compiler
      run: |
          echo "CC=gcc" >> $GITHUB_ENV
          echo "CXX=g++" >> $GITHUB_ENV
 
    - name: Configure with CMake
      run: cmake -S . -B build -G "Ninja Multi-Config"
 
    - name: Build
      run: cmake --build build --config "$BUILD_TYPE"

    - name: Run CTest
      run: ctest --test-dir build -C "$BUILD_TYPE" -V