Tutorial on setting up your toolchain
STM32 Example
Overview
This tutorial will walk you through the process of moving away from vendor-specific IDEs (like STM32CubeIDE) and transitioning to a modern, portable build and debug setup using open-source tools. The following components will be covered:
- CMake and Ninja for build generation
- GCC as a compiler toolchain
- Make and Python for scripting
- CPM.cmake for package management
- OpenOCD and GDB for debugging
- Full integration into Visual Studio Code (VSCode)
The goal is to establish a reusable and consistent embedded development workflow across microcontrollers and platforms.
Use STM32CubeIDE to generate project skeleton
Even though our goal is to become independent of vendor IDEs such as STM32CubeIDE, the IDE provides a convenient GUI to initialize clocks and peripherals and to generate vendor-specific boilerplate such as:
- Linker scripts
- Startup files
- HAL/CMSIS initialization
- Open STM32CubeIDE.
- Create a new STM32 project: File > New > STM32 Project
- Choose your MCU (e.g. STM32F446RE) or development board.
- Configure clocks, peripherals, middleware, and project settings as desired.
- Generate code via Project > Generate Code.
This will create a folder structure similar to the following (depending on your actual board and MCU):
📁 root/
├── 📁 Core/
│   ├── 📁 Inc/                          # Header files
│   │   ├── main.h                       # Main application header
│   │   ├── stm32f4xx_hal_conf.h         # HAL configuration macros
│   │   └── stm32f4xx_it.h               # Interrupt service routine declarations
│   ├── 📁 Src/                          # Source files
│   │   ├── main.c                       # Application entry point
│   │   ├── stm32f4xx_hal_msp.c          # HAL initialization (MSP - MCU Support Package)
│   │   ├── stm32f4xx_it.c               # Interrupt service routines
│   │   ├── syscalls.c                   # System stubs (e.g., `_kill`, `_write`, `_read`)
│   │   ├── sysmem.c                     # Defines `_sbrk()` for dynamic memory (malloc)
│   │   └── system_stm32f4xx.c           # System clock and configuration setup
│   └── 📁 Startup/
│       └── startup_stm32f446retx.s      # MCU startup code and vector table
├── 📁 Drivers/                          # Vendor-supplied HAL and CMSIS drivers
└── STM32F446RETX_FLASH.ld               # Linker script defining memory layout
These files will serve as the foundation of the project. Later you might want to write your own linker scripts, but this is a good starting point.
Create your own build system
Instead of building the project with the IDE, we will set up a modern CMake-based build system.
To separate the application code from the MCU setup, auto-generated files, and initialisation code, it is good practice to create a sub-folder called Application/ that contains your application code and to let the main function call into it, while the vendor files remain mostly untouched in Core/ and Drivers/.
Another approach is to make your application a standalone library that only works with interfaces and abstract device drivers. The vendor-generated root folder, plus some glue logic, then provides concrete devices like pins, timers, and communication peripherals to your application. That way you have fully portable applications that do not depend on a specific microcontroller or board (just in case a new chip shortage comes along).
Create the following CMakeLists.txt in the root folder:
cmake_minimum_required(VERSION 3.16)
# define project name and declare that we are going to use assembly (ASM), C and C++ (CXX)
project(ProjectName ASM C CXX)
# include the package manager
include(CPM.cmake)
##########################################################################
# Libraries
##########################################################################
# Add the application code as a standalone library
add_subdirectory(Application)
##########################################################################
# Get HAL, CMSIS and auto generated Core-Files
##########################################################################
# Get all autogenerated files
# HAL, CMSIS, and Core-Files
file(GLOB_RECURSE Core_Src_Files "Core/Src/*.c")
file(GLOB_RECURSE HAL_Src_Files "Drivers/*.c")
set(CORE_SRC_FILES
Core/Startup/startup_stm32f446retx.s
${Core_Src_Files}
${HAL_Src_Files}
)
set(CORE_INC_DIRS
Core/Inc
Drivers/CMSIS/Include
Drivers/CMSIS/Device/ST/STM32F4xx/Include
Drivers/STM32F4xx_HAL_Driver/Inc
Drivers/STM32F4xx_HAL_Driver/Inc/Legacy
)
##########################################################################
# Define project files
##########################################################################
# create an executable that can be flashed to the MCU with the name `main` (or a different name)
# pass it the source files that should be compiled
add_executable(main
${CORE_SRC_FILES}
)
# Add the include search directories to the `main` target
target_include_directories(main
PUBLIC
${CORE_INC_DIRS}
)
# Optional: Fiber provides some very useful and efficient hardware drivers.
# To make use of them pass the include directories to fiber as well.
target_include_directories(fiber PUBLIC ${CORE_INC_DIRS})
# link the actual application to the main target
target_link_libraries(main PUBLIC App)
# define the output name to be an `*.elf` file used for flashing.
set_target_properties(main PROPERTIES OUTPUT_NAME "main.elf")
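The two file(GLOB_RECURSE ...) calls above collect every .c file below Core/Src and Drivers, which may include sources you did not intend to build. As a rough sketch, the hypothetical helper below (collect_sources is not part of the tutorial's build) approximates in Python which files those patterns would pick up, so you can review the list before configuring. It assumes the directory layout shown earlier.

```python
from pathlib import Path


def collect_sources(root, patterns=(("Core/Src", "*.c"), ("Drivers", "*.c"))):
    """Approximate CMake's file(GLOB_RECURSE ...) for (directory, pattern) pairs.

    Returns paths relative to `root` with forward slashes, sorted for stable output.
    """
    root = Path(root)
    found = []
    for directory, pattern in patterns:
        base = root / directory
        if not base.is_dir():
            continue  # e.g. the Drivers/ folder has not been generated yet
        found.extend(
            p.relative_to(root).as_posix() for p in base.rglob(pattern) if p.is_file()
        )
    return sorted(found)


if __name__ == "__main__":
    # Run from the project root to list what the build would compile.
    for source in collect_sources("."):
        print(source)
```

If the list contains files you do not want in the build, listing the sources explicitly in CORE_SRC_FILES instead of globbing is one way to keep control.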
Then, create a separate toolchain file toolchain.cmake to define cross-compilation settings for the ARM Cortex-M MCU:
# Compile generic code that runs bare metal and not within an encapsulating operating system
# https://cmake.org/cmake/help/latest/variable/CMAKE_SYSTEM_NAME.html
set(CMAKE_SYSTEM_NAME Generic)
# Define the processor architecture; for the STM32F4 it is `arm`
# https://cmake.org/cmake/help/latest/variable/CMAKE_SYSTEM_PROCESSOR.html
set(CMAKE_SYSTEM_PROCESSOR arm)
# Set the compiler executables for cross compilation. (Optional: add the full path if they are not on the system's PATH)
set(CMAKE_C_COMPILER "arm-none-eabi-gcc.exe")
set(CMAKE_CXX_COMPILER "arm-none-eabi-g++.exe")
set(CMAKE_ASM_COMPILER "arm-none-eabi-gcc.exe")
set(CMAKE_OBJCOPY "arm-none-eabi-objcopy.exe")
set(CMAKE_SIZE "arm-none-eabi-size.exe")
# Let CMake's compiler checks (try_compile) build a static library instead of an executable, because linking a test executable for a bare-metal target typically fails without the linker script and system stubs.
set(CMAKE_TRY_COMPILE_TARGET_TYPE STATIC_LIBRARY)
###########################################
# Global Options for all targets
###########################################
# Set target architecture and cpu specific parameters
#
# Description of the compile options:
# https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html
#
# PM0214 STM32 Cortex®-M4 MCUs and MPUs programming manual 10.0
# https://www.st.com/resource/en/programming_manual/pm0214-stm32-cortexm4-mcus-and-mpus-programming-manual-stmicroelectronics.pdf
#
# RM0390 STM32F446xx advanced Arm®-based 32-bit MCUs 6.0
# https://www.st.com/resource/en/reference_manual/rm0390-stm32f446xx-advanced-armbased-32bit-mcus-stmicroelectronics.pdf
set(CPU_FLAGS
"-mcpu=cortex-m4" # use cortex m4 instructions
"-mthumb" # use the Thumb instruction set | Cortex-M cores only support Thumb, not the 32-bit ARM instruction set
"-mfpu=fpv4-sp-d16" # Specifies which floating-point hardware is available on the target
"-mfloat-abi=hard" # Generate hardware instructions for the floating-point hardware
)
add_compile_options(
${CPU_FLAGS}
# use small c library implementation
--specs=nano.specs
# Place functions and data in separate sections to allow linker garbage collection.
-ffunction-sections
-fdata-sections
# do not store the frame pointer and make room for another register. Enables optimisations for performance.
-fomit-frame-pointer
)
# Add link options to all targets
add_link_options(
# pass the CPU flags to the linker so it knows how to link correctly
${CPU_FLAGS}
# add the linker script that the IDE provided
-Wl,-T${CMAKE_SOURCE_DIR}/STM32F446RETX_FLASH.ld
# tell the linker to garbage collect unused sections (dead code and data)
-Wl,--gc-sections
)
# Add libraries that can be used by all files
link_libraries(
nosys # Minimal system stubs for functions like `write()`, `read()`, `_exit()`, `sbrk()`, etc.
gcc # Compiler runtime support for operations not directly supported by hardware.
c_nano # A reduced-size version of the standard C library
m # Provides functions like `sin()`, `cos()`, `sqrt()`, `pow()`, etc.
)
# Add CPU specific definitions - especially for the auto generated vendor files
add_compile_definitions(
USE_HAL_DRIVER
STM32F446xx
)
This sets up the proper compiler, target architecture, flags, and memory configuration for your STM32F4 MCU.
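The toolchain file refers to the compilers by name only, so they have to be found on the PATH. As a small sketch, the hypothetical helper below (find_missing_tools and the REQUIRED_TOOLS list are illustrative, not part of the tutorial's files) uses Python's shutil.which to report which of the tools named in this tutorial are missing. On Windows, shutil.which also resolves names without the .exe suffix.

```python
import shutil

# Tool names taken from the overview and the toolchain file of this tutorial.
REQUIRED_TOOLS = [
    "cmake",
    "ninja",
    "arm-none-eabi-gcc",
    "arm-none-eabi-g++",
    "arm-none-eabi-objcopy",
    "arm-none-eabi-size",
]


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the entries of `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required tools were found.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default is sufficient.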
Now set up your Application/CMakeLists.txt. Two common practices are to keep the application in a sub-folder (as this tutorial does) or to maintain it as a separate library in a separate repository.
# Include the package manager from: https://github.com/cpm-cmake/CPM.cmake
include(cmake/CPM.cmake)
# Optional: Enable system call stubs for freestanding/bare-metal for Fiber
set(FIBER_USE_SYS_STUBS ON CACHE BOOL "" FORCE)
# Add the Fiber library
CPMAddPackage("gh:TobiasWallner/Fiber#main")
# create the actual application.
# It will contain generic code that can then be used by many boards and mcus
add_library(App STATIC
app.cpp
)
# link fiber to the application
target_link_libraries(App PRIVATE fiber)
# optionally: if `FIBER_USE_SYS_STUBS ON`. !!! Has to link PUBLIC !!!
target_link_libraries(App PUBLIC fiber_sys_stubs)
Create a Makefile for easy CLI (Command Line Interface) builds
Instead of typing long CMake commands, create a Makefile to simplify your workflow:
If you are on Windows and do not have make available, you can get it as follows:
- Download MSYS2 from their homepage: https://www.msys2.org/
- Open the MSYS2 shell
- Update the package manager pacman (for example with pacman -Syu)
- Install make (for example with pacman -S make)
- Add the MSYS2 directory C:\msys64\usr\bin to the PATH so you can use make in your host system's terminals
For this particular Makefile I chose Ninja as the generator, which tells CMake to create Ninja build files. Note that the "Ninja Multi-Config" generator used below requires CMake 3.17 or newer. You can get Ninja from here: https://ninja-build.org/
For flashing, the Makefile uses the STM32_Programmer_CLI. You can download it from here: https://www.st.com/en/development-tools/stm32cubeprog.html or use the one provided by the vendor IDE.
Additionally, this Makefile uses a custom Python script (introduced later) that calculates the used FLASH and RAM.
#####################################
# Variables
#####################################
STM32_Programmer_CLI = STM32_Programmer_CLI.exe
gcc_size = arm-none-eabi-size
gcc_nm = arm-none-eabi-nm
build_dir = build
config = Release
target = main
test_target = test
linker_script = STM32F446RETX_FLASH.ld
#####################################
# Commands
#####################################
.PHONY: help
help:
	$(info Make Commands:)
	$(info --------------)
	$(info build .......... builds the 'target' with the 'config')
	$(info flash .......... builds the 'target' with the 'config' and then flashes the binary to the micro-controller)
	$(info test ........... convenience command that executes flash with the 'test_target')
	$(info )
	$(info Optional Arguments:)
	$(info ---------------)
	$(info build_dir= ...... sets the build directory. Default: $(build_dir))
	$(info config= ........ sets the build type: 'Release' or 'Debug'. Default: $(config))
	$(info target= ........ sets the target/CMake executable. Default: $(target))
	$(info test_target= ... sets the target/CMake executable flashed by 'test'. Default: $(test_target))

# CMake setup
# -----------
.PHONY: setup_cmake
setup_cmake:
	$(info )
	$(info ================================ [build_dir: ./$(build_dir)/ | target: $(target) | config: $(config)] ================================)
	$(info )
	cmake -S . -B $(build_dir) -G "Ninja Multi-Config" -DCMAKE_TOOLCHAIN_FILE=toolchain.cmake -DFIBER_COMPILE_TESTS=ON

# build commands
# --------------
# build run target
.PHONY: build
build: setup_cmake
	cmake --build $(build_dir) --target $(target) --config $(config)
	$(gcc_size) --format=berkeley "$(build_dir)/$(config)/$(target).elf" > "$(build_dir)/$(config)/$(target)_size.txt"
	python stats.py -l $(linker_script) -s "$(build_dir)/$(config)/$(target)_size.txt"
	$(gcc_nm) --size --print-size --radix=d "$(build_dir)/$(config)/$(target).elf" > "$(build_dir)/$(config)/$(target)_fsize.txt"

# flash run target
.PHONY: flash
flash: build
	${STM32_Programmer_CLI} -c port=SWD -d "$(build_dir)/$(config)/$(target).elf" -v -rst

# use $(MAKE) so that the same make executable and flags are used for the recursive call
.PHONY: test
test: build
	@$(MAKE) flash target=$(test_target)

# start the OpenOCD GDB server (builds first so that the .elf file is up to date)
.PHONY: openocd
openocd: build
	openocd -f openocd.cfg

.PHONY: gdb
gdb:
	arm-none-eabi-gdb "$(build_dir)/$(config)/$(target).elf"

.PHONY: clean
clean:
	cmake --build $(build_dir) --target clean --config Release
	cmake --build $(build_dir) --target clean --config Debug
This lets you run commands like:
- make build to build the project
- make flash to flash the binary
- make test to test binaries
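The build target also writes a per-symbol size listing to the file ending in _fsize.txt, produced by the arm-none-eabi-nm call in the Makefile. As a minimal sketch, the hypothetical function below (largest_symbols is not part of the tutorial's files) lists the largest symbols from that output. It assumes that every relevant line ends with the fields size, type, and name, with the size in decimal because of --radix=d; lines that do not fit this shape are skipped.

```python
def largest_symbols(nm_output, count=10):
    """Return the `count` largest (size, name) pairs from decimal nm output.

    Assumes each symbol line ends with '<size> <type> <name>'.
    """
    symbols = []
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) < 3 or not fields[-3].isdigit():
            continue  # not a symbol line in the expected shape
        symbols.append((int(fields[-3]), fields[-1]))
    return sorted(symbols, reverse=True)[:count]


if __name__ == "__main__":
    import sys

    # Usage: python largest_symbols.py build/Release/main_fsize.txt
    with open(sys.argv[1], "r") as listing:
        for size, name in largest_symbols(listing.read()):
            print(f"{size:>10} {name}")
```

This can help to find out which functions or buffers dominate FLASH or RAM when the totals reported by the build are larger than expected.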
You will probably also want the following Python script, stats.py, which calculates the FLASH and RAM usage of your application:
import argparse
import re


def parse_bytes(size_str):
    """Convert a size such as '512K', '1M' or '4096' into a number of bytes."""
    size_str = size_str.strip().upper()
    units = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}
    num = ""
    unit = ""
    for char in size_str:
        if char.isdigit() or char == ".":
            num += char
        else:
            unit += char
    if not num or unit not in units:
        raise ValueError(f"Invalid size format: {size_str}")
    return int(float(num) * units[unit])


def region_size_kib(linker_script_str, region):
    """Read the LENGTH of a MEMORY region (e.g. 'RAM') from the linker script, in KiB."""
    # Anchor the region name at the start of a line so that e.g. 'CCMRAM' does not match 'RAM'.
    regex = rf'^\s*{region}\b.*LENGTH\s*=\s*(\S+)'
    match = re.search(regex, linker_script_str, re.MULTILINE)
    if match is None:
        raise ValueError(f"No LENGTH found for memory region: {region}")
    return parse_bytes(match.group(1)) / 1024


parser = argparse.ArgumentParser(
    prog='stats',
    description='Provides statistics from binary files')
parser.add_argument('-l', '--linker_script', required=True, help='Path to the linker script')
parser.add_argument('-s', '--size_file', required=True, help='Path to the size file')
args = parser.parse_args()

with open(args.linker_script, "r") as linker_script:
    linker_script_str = linker_script.read()

ram_size = region_size_kib(linker_script_str, "RAM")
flash_size = region_size_kib(linker_script_str, "FLASH")

with open(args.size_file, "r") as size_file:
    lines = size_file.readlines()

# Berkeley format: a header line followed by "text data bss dec hex filename"
text_str, data_str, bss_str, dec_str, hex_str, filename = lines[1].split(maxsplit=5)
filename = filename.strip()
text_size = parse_bytes(text_str) / 1024
data_size = parse_bytes(data_str) / 1024
bss_size = parse_bytes(bss_str) / 1024

# .data lives in RAM at runtime, but its initial values are stored in FLASH.
ram_used = bss_size + data_size
flash_used = text_size + data_size

print("\n--------------------------- Memory Usage in kB: ---------------------------")
print(f"filename: {filename}")
print(f".text: {text_size:.1f}, .data: {data_size:.1f}, .bss: {bss_size:.1f}")
print(f"Ram: .bss + .data = {ram_used:.1f}\tof\t{ram_size:.0f}\t{ram_used*100/ram_size :.1f}%")
print(f"Flash: .text + .data = {flash_used:.1f}\tof\t{flash_size:.0f}\t{flash_used*100/flash_size :.1f}%\n")
print("------------------------------------ Legend -------------------------------")
print(".text (Code Segment): Contains the compiled machine code")
print(".data (Initialized Data Segment): Holds initialized global and static variables.")
print(".bss (Uninitialized Data Segment): Contains uninitialized global and static variables.\n")
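The arithmetic behind the script can be shown in isolation. The following sketch uses made-up section sizes (the byte counts are hypothetical and only serve as an illustration) together with 512 KiB of FLASH and 128 KiB of RAM, which is assumed here to be what the linker script of an STM32F446RE declares. The function name memory_usage is also just an illustration.

```python
def memory_usage(text, data, bss, flash_size, ram_size):
    """Compute FLASH and RAM usage from Berkeley-format section sizes (all in bytes).

    .data counts for both: its initial values are stored in FLASH
    and copied to RAM by the startup code.
    """
    flash_used = text + data
    ram_used = data + bss
    return {
        "flash_used": flash_used,
        "flash_percent": 100.0 * flash_used / flash_size,
        "ram_used": ram_used,
        "ram_percent": 100.0 * ram_used / ram_size,
    }


# Hypothetical section sizes, as `arm-none-eabi-size --format=berkeley` might report them.
usage = memory_usage(text=20480, data=1024, bss=4096,
                     flash_size=512 * 1024, ram_size=128 * 1024)
print(f"FLASH: {usage['flash_used']} B ({usage['flash_percent']:.1f}%)")
print(f"RAM:   {usage['ram_used']} B ({usage['ram_percent']:.1f}%)")
```

Keep in mind that these numbers only cover statically allocated memory. Stack and heap usage at runtime are not visible in the output of the size tool.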
Setup VSCode for Development + Debugging
Though the above Makefile allows for an easy build process from the terminal, VSCode offers some nice-to-have features, such as stepping through your code in the debugger.
To enable those features, we will set up a GDB server with OpenOCD and connect to the MCU through it. The launch configuration below uses the cortex-debug type, which is provided by the Cortex-Debug extension for VSCode.
Place the following config files in a .vscode/ folder:
Debug Configuration .vscode/launch.json
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "STM32 Debug",
            "type": "cortex-debug",
            "request": "attach",
            "servertype": "openocd",
            "configFiles": ["openocd.cfg"],
            "cwd": "${workspaceRoot}",
            "executable": "${workspaceRoot}/build/Debug/main.elf",
            "gdbTarget": "localhost:3333",
            "showDevDebugOutput": "raw",
            "runToEntryPoint": "main",
            "preLaunchTask": "flash-debug"
        }
    ]
}
Build Task for Debugging .vscode/tasks.json
{
    "version": "2.0.0",
    "tasks": [
        {
            "label": "flash-debug",
            "type": "shell",
            "command": "make",
            "args": ["flash", "config=Debug"], // Calls the `flash` target of the Makefile with the Debug configuration
            "group": {
                "kind": "build",
                "isDefault": true
            },
            "problemMatcher": ["$gcc"]
        }
    ]
}
Also create an OpenOCD configuration file named openocd.cfg in the project root (the name referenced by the Makefile and launch.json) to define your probe and target:
source [find "interface/stlink.cfg"]
source [find "target/stm32f4x.cfg"]
reset_config srst_only
init
reset init
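If the debugger in VSCode fails to connect, a first step is to check whether the GDB server is listening at all. The following sketch tries to open a TCP connection to localhost:3333, the address given as gdbTarget in launch.json. The function name gdb_server_reachable is hypothetical, and the check only tells you that something accepts connections on that port, not that it is OpenOCD.

```python
import socket


def gdb_server_reachable(host="localhost", port=3333, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    state = "reachable" if gdb_server_reachable() else "not reachable"
    print(f"GDB server on localhost:3333 is {state}")
```

Run it while make openocd is active in another terminal. Note that OpenOCD may treat the short test connection like a debugger attaching, so it is best used for troubleshooting rather than while a debug session is running.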
You now have a clean and modern embedded development setup that is portable, works across operating systems, avoids vendor lock-in, integrates fully with tooling like GDB, OpenOCD, and VSCode, and can easily be integrated into a CI/CD system on, for example, GitHub or GitLab.
This approach helps you scale your codebase, reuse your setup across projects, and work just like you would in a modern software engineering environment.
To implement CI/CD with automatic testing on every push to a GitHub repository:
- Create a folder called .github (has to be the exact name).
- Create a folder called .github/workflows (has to be the exact name).
- Create a workflow file, e.g. test.yml.
- Write a workflow that compiles your project and executes the tests. Since CMake already does most of the work, test.yml can be fairly small.
.github/workflows/test.yml
name: test

on:
  push:
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest

    env:
      BUILD_TYPE: Release

    steps:
      - uses: actions/checkout@main

      - name: Install dependencies
        run: |
          sudo apt update
          sudo apt install ninja-build cmake
          sudo apt install g++ # Just in case

      - name: Set up compiler
        run: |
          echo "CC=gcc" >> $GITHUB_ENV
          echo "CXX=g++" >> $GITHUB_ENV

      - name: Configure with CMake
        run: cmake -S . -B build -G "Ninja Multi-Config"

      - name: Build
        run: cmake --build build --config Release

      - name: Run CTest
        run: ctest --test-dir build -C Release -V