Stack overflows are probably the number 1 enemy of embedded applications: a call to a printf() monster will likely use too much stack space, resulting in overwritten memory and crashing applications. But stack memory is limited and expensive on these devices, so you don’t want to spend too much space on it. But for sure not too little either. Or bad things will happen.
The Eclipse based MCUXpresso IDE has a ‘Heap and Stack Usage’ view which can be used to monitor the stack usage and shows that a stack overflow happened:
Heap and Stack Usage
But this relies on the debugger: how to catch stack overflows at runtime without the need for a debugger? There is an option in the GNU gcc compiler to help with this kind of situation, even if it was originally intended for something different.
The problem is that the application call stack (function calls, pushing parameters and using local variables) grows in one direction. If the reserved stack space is not large enough, the call stack can grow into another memory area and corrupt data:
stack overflow
There are different ways to deal with this:
- Static Analysis. Making a good analysis of how much stack is needed. Recursion can be a problem.
- Using MPU (Hardware Memory Protection) to detect and protect the overflow
- Using hardware watch points to detect the overwrite
- Place sentinel values at the end of the stack space which are periodically checked
The last option is what can be turned on in FreeRTOS.
This article uses the NXP MCUXpresso IDE V11 which uses GNU tools. In this article I describe an approach with the GNU gcc in a bare-metal (no RTOS) environment, because FreeRTOS already includes an option to check for a stack overflow at runtime: the check is performed at task context switch, see “FreeRTOS – stacks and stack overflow” for more details.
FreeRTOS has two methods: the first just compares the current task stack pointer with a known stack limit value to see if it is outside the stack range. The second method includes the first and additionally places a pattern at the end of the stack and verifies that it has not been touched. The second method takes more time. And both methods are used at context switch time only, so a stack overflow might not be detected for a while.
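To make the difference between the two methods more concrete, here is a small host-side simulation in plain C. It is only a sketch and not FreeRTOS code: the SimTaskStack type, the SimStack_* functions, the fill pattern and the number of guard words are all hypothetical names and values chosen for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_STACK_WORDS   64u          /* hypothetical task stack size (words) */
#define SIM_GUARD_WORDS   4u           /* hypothetical number of pattern words checked */
#define SIM_FILL_PATTERN  0xA5A5A5A5u  /* hypothetical fill pattern */

typedef struct {
    uint32_t mem[SIM_STACK_WORDS]; /* mem[0] is the stack end (lowest address) */
    size_t   usedWords;            /* words in use, counted from the stack top */
} SimTaskStack;

/* Fill the whole stack with the pattern and mark it as empty. */
void SimStack_Init(SimTaskStack *s) {
    for (size_t i = 0; i < SIM_STACK_WORDS; i++) {
        s->mem[i] = SIM_FILL_PATTERN;
    }
    s->usedWords = 0u;
}

/* Simulate code using 'words' of stack; writes outside the array are dropped. */
void SimStack_Use(SimTaskStack *s, size_t words, uint32_t value) {
    for (size_t i = 0; i < words; i++) {
        s->usedWords++;
        if (s->usedWords <= SIM_STACK_WORDS) {
            s->mem[SIM_STACK_WORDS - s->usedWords] = value;
        }
    }
}

/* Simulate returning from functions: the stack pointer moves back up. */
void SimStack_Release(SimTaskStack *s, size_t words) {
    s->usedWords = (words > s->usedWords) ? 0u : (s->usedWords - words);
}

/* Method 1: is the stack pointer inside the reserved range right now? */
bool SimStack_PointerOk(const SimTaskStack *s) {
    return s->usedWords <= SIM_STACK_WORDS;
}

/* Method 2: method 1 plus a check that the pattern at the stack end is intact. */
bool SimStack_PatternOk(const SimTaskStack *s) {
    if (!SimStack_PointerOk(s)) {
        return false;
    }
    for (size_t i = 0; i < SIM_GUARD_WORDS; i++) {
        if (s->mem[i] != SIM_FILL_PATTERN) {
            return false;
        }
    }
    return true;
}
```

If the simulated task uses 62 of the 64 words and then releases most of them again before the check runs, the pointer check passes while the pattern check fails: the deep call has left its footprint in the guard words. That is the extra coverage the second method buys for the extra time.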
The MCUXpresso IDE V11 includes the ‘Image Info’ view which calculates the stack space needed:
Image Info
This is a good start, but it does not have the information from the libraries. To get that information, one would have to rebuild all the GNU libraries correctly, which can be a daunting task.
There is another problem especially when considering security: arbitrary code execution causing a stack overflow/corruption with the goal to take control over the system. These are called ‘stack overflow exploits’. See http://phrack.org/issues/49/14.html for a good tutorial on this concept (and if you want to get into the hacking business 😉 ).
To counter these exploits, compilers including gcc started to add ‘hardening’ options to detect them. One of them is the GNU gcc StackGuard (see ftp://gcc.gnu.org/pub/gcc/summit/2003/Stackguard.pdf). In that approach, the compiler places a ‘canary’ guard into each instrumented function stack frame:
StackGuard Canary (Source: https://access.redhat.com/blogs/766093/posts/3548631)
Similar to the canaries used in coal mines, a stack canary is a variable with a special value placed at the end of the stack memory. Assuming that an exploit with a stack buffer overflow will very likely overwrite that canary, it can be detected by the running program.
The gcc compiler provides a set of options to use canaries (see https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html).
-fstack-protector: Emit extra code to check for buffer overflows, such as stack smashing attacks. This is done by adding a guard variable to functions with vulnerable objects. This includes functions that call alloca, and functions with buffers larger than 8 bytes. The guards are initialized when a function is entered and then checked when the function exits. If a guard check fails, an error message is printed and the program exits.
-fstack-protector-all: Like -fstack-protector except that all functions are protected.
For example add that option to the compiler settings like this:
-fstack-protector-all
Below is a small function which prints a value. Possibly that printf() might cause a stack overflow:
    void printValue(int val) {
      printf("The value is: '%d'", val);
    }

If using the Stack Guard functionality of the GNU compiler, I have to provide two things:
- stack guard (32bit) value, ideally with a ‘random’ value, named __stack_chk_guard
- error callback function, named __stack_chk_fail
Below is a very simple implementation of this:
    unsigned long __stack_chk_guard = 0xDEADBEEF;

    void __stack_chk_fail(void) { /* will be called if guard/canary gets corrupted */
      /* Handle error, print error message, stop the target, ... */
      DisableInterrupts();
      __asm volatile("bkpt #0"); /* break target */
    }

💡 Check your library implementation! For example the NXP provided NewLib and NewLib nano libraries in MCUXpresso IDE V11.0.1 already include a default implementation of the guard variable and fail hook (as weak symbols). The RedLib library does not have it, so you have to add it anyway. For newlib and newlib nano, provide your own implementation.
The compiler generates the following code:
- At function entry, it stores the __stack_chk_guard value into the stack frame
- At function exit, the guard value on the stack is compared against the value in __stack_chk_guard
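The two steps can also be written out by hand in C. The following is only a sketch of the idea and not what the compiler emits: demo_guard and demo_chk_fail are stand-ins for __stack_chk_guard and __stack_chk_fail, and the DemoFrame struct is a hypothetical stand-in for a stack frame, used so that the canary is guaranteed to sit behind the buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

uintptr_t demo_guard = 0xDEADBEEFu; /* stand-in for __stack_chk_guard */
int demo_fail_count = 0;            /* counts calls of the fail hook */

/* stand-in for __stack_chk_fail: a real hook would stop the target */
void demo_chk_fail(void) {
    demo_fail_count++;
}

/* hypothetical model of a stack frame: local buffer followed by the canary */
typedef struct {
    char      buffer[8];
    uintptr_t canary;
} DemoFrame;

/* Copies len bytes into the frame without checking the buffer size
 * (deliberately careless). Returns 0 if the canary survived, -1 otherwise. */
int demo_function(const char *data, size_t len) {
    DemoFrame frame;

    frame.canary = demo_guard;          /* function entry: store the guard value */

    if (len > sizeof(frame)) {          /* keep the demo itself inside the struct */
        len = sizeof(frame);
    }
    memcpy(&frame, data, len);          /* more than 8 bytes run into the canary */

    if (frame.canary != demo_guard) {   /* function exit: compare with the guard */
        demo_chk_fail();
        return -1;
    }
    return 0;
}
```

Calling demo_function() with 8 bytes or less leaves the canary alone. With 12 bytes the copy runs over the end of the buffer, the comparison at the end fails and the fail hook gets called.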
To illustrate the generated code, here is the commented disassembly (ARM Cortex-M4F):

    00000000 <printValue>:
       0: b580       push  {r7, lr}        ; push used regs
       2: b084       sub   sp, #16         ; reserve space
       4: af00       add   r7, sp, #0      ; move SP to R7
       6: 6078       str   r0, [r7, #4]    ; store param
       8: 4b08       ldr   r3, [pc, #32]   ; load &__stack_chk_guard
       a: 681b       ldr   r3, [r3, #0]    ; load content of it
       c: 60fb       str   r3, [r7, #12]   ; store canary value
       e: 6879       ldr   r1, [r7, #4]    ; load function param
      10: 4807       ldr   r0, [pc, #28]   ; load address of format string (.rodata)
      12: f7ff fffe  bl    0 <_printf>     ; call printf
                     12: R_ARM_THM_CALL _printf
      16: bf00       nop
      18: 4b04       ldr   r3, [pc, #16]   ; load &__stack_chk_guard
      1a: 68fa       ldr   r2, [r7, #12]   ; load canary
      1c: 681b       ldr   r3, [r3, #0]    ; load __stack_chk_guard
      1e: 429a       cmp   r2, r3          ; compare it
      20: d001       beq.n 26              ; match?
      22: f7ff fffe  bl    0 <printValue>  ; no: call error handler
                     22: R_ARM_THM_CALL __stack_chk_fail
      26: 3710       adds  r7, #16         ; normal exit code
      28: 46bd       mov   sp, r7          ; restore sp
      2a: bd80       pop   {r7, pc}        ; restore pushed regs
      ...
                     2c: R_ARM_ABS32 __stack_chk_guard
                     30: R_ARM_ABS32 .rodata

💡 The point to make here is: the check is whether something has overwritten the stack space of the instrumented function (printValue() in this case). The original gcc implementation does not catch my case above where the allocated stack for the application overflows.
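With -fstack-protector-all I’m instrumenting all functions. Of course that instrumentation has a cost at runtime. The other thing is that the startup code might cause a false alarm if the canary variable has not been set up or initialized yet. For this, I can use the following attribute:

    __attribute__ ((no_instrument_function))

💡 Check this against your toolchain: the GCC documentation describes no_instrument_function as the opt-out for the -finstrument-functions profiling hooks, and GCC 11 and later provide a dedicated no_stack_protector attribute for the stack protector.

For example I have excluded the data initialization (zero-out and copy-down) in my startup code that way: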
disabled canary check with no_instrument_function attribute
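On toolchains which offer it, the dedicated no_stack_protector function attribute (GCC 11 and later) can express the same exclusion. Below is a minimal sketch of how this could look. The NO_STACK_PROTECTOR macro and the demo_zero_fill() function are hypothetical names for illustration and not taken from my startup code. If the attribute is not available, the macro expands to nothing and the file would have to be built with -fno-stack-protector instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Use the dedicated attribute if the compiler knows it. */
#if defined(__has_attribute)
#  if __has_attribute(no_stack_protector)
#    define NO_STACK_PROTECTOR __attribute__((no_stack_protector))
#  endif
#endif
#ifndef NO_STACK_PROTECTOR
   /* Attribute not available: compile this file with -fno-stack-protector. */
#  define NO_STACK_PROTECTOR
#endif

/* Hypothetical zero-out routine as it could be used during startup,
 * before the canary variable has been initialized. */
NO_STACK_PROTECTOR
void demo_zero_fill(uint32_t *start, size_t words) {
    for (size_t i = 0; i < words; i++) {
        start[i] = 0u;
    }
}
```

The macro keeps the source buildable with older compilers, at the price that the exclusion then has to be done with the compiler options for that file.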
As explained above, the gcc implementation is about exploit code which tries to overwrite the stack and return address to execute arbitrary code. My goal is to detect the problem that there is not enough stack space for the application.
So instead of checking if the canary in each function has been overwritten, I can check if the ‘global’ canary __stack_chk_guard is overwritten :-).
For this, I’m placing the global canary at the end of the stack, using the approach I have described in “Defining Variables at Absolute Addresses with gcc”:
    /* place the following canary variable at the end of the stack */
    __attribute__((section (".stack"))) unsigned long __stack_chk_guard = 0xDEADBEEF;

This is of course depending on the linker file, and this is how I have my stack space allocated:

    _StackSize = 0x100;
    /* Reserve space in memory for Stack */
    .heap2stackfill :
    {
        . += _StackSize;
    } > SRAM_UPPER
    /* Locate actual Stack in memory map */
    .stack ORIGIN(SRAM_UPPER) + LENGTH(SRAM_UPPER) - _StackSize - 0: ALIGN(4)
    {
        _vStackBase = .;
        . = ALIGN(4);
        _vStackTop = . + _StackSize;
    } > SRAM_UPPER

Verify in the map file that the canary is at the right place:
__stack_chk_guard
In the above case the stack bottom is at 0x2001’0000 and grows towards 0x2000’ff00.
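The expected canary address follows directly from these numbers: stack top minus stack size. As a small worked example in C, assuming the values from above (top at 0x2001’0000, _StackSize of 0x100); the Demo_* names are hypothetical helpers for illustration only:

```c
#include <stdbool.h>
#include <stdint.h>

/* Values from the text: stack top and _StackSize of the linker file */
#define DEMO_STACK_TOP   0x20010000u
#define DEMO_STACK_SIZE  0x100u

/* Lowest address of the stack area: this is where the canary should be. */
uint32_t Demo_StackBase(uint32_t top, uint32_t size) {
    return top - size;
}

/* Compare an address from the map file with the expected canary location. */
bool Demo_CanaryAtStackEnd(uint32_t addr) {
    return addr == Demo_StackBase(DEMO_STACK_TOP, DEMO_STACK_SIZE);
}
```

With these values Demo_StackBase() gives 0x2000’ff00, which is the address __stack_chk_guard should have in the map file.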
💡 Some linker files use an approach to let the heap and stack grow towards each other. This might be a smart idea to utilize memory, but personally I don’t like it. First I rather avoid a growing heap at all, and I rather want to have a controlled environment with a clearly defined stack space area.
Below is a debug session which caught such a stack overflow :-). If the local canary value no longer matches the global one, the error hook gets called:
Stack Overflow Detected
The gcc StackGuard can be used not only to detect stack overflow exploits; it is also useful to check for the application stack overflow case. Of course this is not a 100% check, because it relies on the fact that an overflow really changes the canary at the end of the stack. There are cases where stack space is allocated but not used. Still, it is a good check with little overhead for each function.
If using FreeRTOS, I use the FreeRTOS built-in task stack overflow protection. I can combine this with the gcc StackGuard feature, but then it would either only check the interrupt (MSP) stack, or I would use it to harden my code against buffer overflow exploits too. This will slow down each instrumented function, but there is no free security.
Update: FreeRTOS V11 comes with ‘heap free list hardening’, see FreeRTOS with Heap Protector.
Happy Canaring 🙂
Links
- GCC command line options: https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html
- GCC Stack Guard: ftp://gcc.gnu.org/pub/gcc/summit/2003/Stackguard.pdf
- Stack Canaries: https://en.wikipedia.org/wiki/Stack_buffer_overflow#Stack_canaries
- Placing variables: Defining Variables at Absolute Addresses with gcc
- Security Technologies: Stack Smashing Protection (StackGuard): https://access.redhat.com/blogs/766093/posts/3548631
- FreeRTOS with Heap Protector