A very simple LED blinky for the STM32-F030R8.

It's just the relevant bits of the standard code on the device.

Let's see what this looks like in various disassemblers/decompilers!
This is all it does!
STM32Cube (and any other toolchain, really) makes both an .elf and a .bin.

Let's see what the .elf looks like in IDA. We don't need to tell it that it's ARM etc. as this is in the ELF.
We could tell it something about the processor (Cortex M0, so ARMv6-M), but ELF shouldn't need this so much.
All the function names are present.
And the disassembly is clean and sensible.
Ghidra is much the same - .elf loads straight away.

The decompiled function is not bad at all!

The speed has been interpreted as a string, and the loops are... interesting. But very easy to infer what is going on.
Binary Ninja Cloud has really clean high level decompiled code. I don't think this could be more obvious IMHO
That's practically what I wrote, so really not bad!
Now, the problem is that we wouldn't have that .elf if we had pulled the firmware from a device. We have a .bin - which is going to be devoid of all the symbols and other information.
Strings from the .elf - lots of them.
The .bin?

No actual ASCII strings!

This is one of the issues with baremetal firmware - unless it logs or interacts with external systems, it might not have any strings at all!
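If you don't have the `strings` tool to hand, a rough equivalent is easy to sketch in Python. This is only an illustration of the idea (runs of printable ASCII of a minimum length), not what `strings` does internally. The function name `ascii_strings` and the 4-character minimum are my own choices.

```python
import re

def ascii_strings(data: bytes, min_len: int = 4):
    """Return runs of printable ASCII that are at least min_len bytes long."""
    # 0x20-0x7e is the printable ASCII range (space through tilde).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

# Typical use on a firmware dump (the filename here is hypothetical):
# with open("blinky.bin", "rb") as f:
#     print(ascii_strings(f.read()))
```

On a blinky like this one, expect the list to be empty or to contain only short accidental matches, because the image has no logging or text output.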
So we open the .bin in IDA and it doesn't know anything about it. We need to tell it.
It's ARM little-endian.
And it's ARMv6-M. That means no full ARM instructions, only Thumb. It makes the disassembler's work far easier.
We need to tell IDA where to load the file. Look to the datasheet, and we can see that flash is at 0x8000000 on these chips.
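The load address matters because every address inside the firmware is absolute. Here is a minimal sketch of the mapping between file offsets in the .bin and addresses in flash, assuming the image starts at the flash base of 0x8000000. The helper names are made up for illustration.

```python
FLASH_BASE = 0x08000000  # flash base address from the datasheet

def addr_to_offset(addr: int, image_size: int, base: int = FLASH_BASE) -> int:
    """Convert an absolute flash address into an offset within the .bin."""
    offset = addr - base
    if not 0 <= offset < image_size:
        raise ValueError(f"address {addr:#x} is outside the loaded image")
    return offset

def offset_to_addr(offset: int, base: int = FLASH_BASE) -> int:
    """Convert an offset within the .bin into an absolute flash address."""
    return base + offset
```

If the disassembler loads the file at 0x0 instead, every absolute pointer in the image appears to point outside the file, which is why this step can't be skipped.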
That's IDA setup. I'm not bothered about setting up RAM.
Oh. No IDA magic?

That's because there is no way for IDA to know where to start looking for code. We need to tell it!
ARM Cortex-M0 (and many other ARM Cortex cores) use something called a vector table, right at the start of flash.

The first word is the initial stack pointer. The next is the reset vector - where to run code from! This tells us where to start.
A few taps of the "D" key to change bytes into words, and we can see these vectors.
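The same two words can be pulled out of the .bin without a disassembler. This is a small sketch, assuming a little-endian image that starts with the vector table. The reset value is the one from this firmware, but the stack pointer value in the sample bytes is a made-up placeholder, and `read_vectors` is a name I've invented.

```python
import struct

def read_vectors(image: bytes):
    """Return (initial_sp, reset_vector) from the start of a Cortex-M image."""
    # Two little-endian 32-bit words: initial stack pointer, then reset vector.
    initial_sp, reset = struct.unpack_from("<II", image, 0)
    return initial_sp, reset

# Sample first 8 bytes. 0x20002000 is an ASSUMED stack pointer value;
# 0x080005ED is the reset vector seen in this firmware.
sample = struct.pack("<II", 0x20002000, 0x080005ED)
sp, reset = read_vectors(sample)
print(hex(sp), hex(reset))
```

A stack pointer somewhere in RAM and a reset vector somewhere in flash are a good sanity check that the architecture, endianness and load address guesses are plausible.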

Reset is 0x80005ED. Let's jump there by double clicking on it!
Ok, now we hit C to make it into juicy code!

Oh what now? Why did that not work?
Well, a strange quirk of ARM is that the least significant bit of a branch target address says whether the code there is Thumb (1) or ARM (0). We know we are in Thumb mode... as this core can only run in Thumb mode. The bit is not part of the real address, so we need to subtract 1 to get 0x80005EC.
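In code, the fix-up is just masking off the bottom bit. A minimal sketch, with function names of my own choosing:

```python
def is_thumb(vector: int) -> bool:
    """True if the least significant bit marks the target as Thumb code."""
    return bool(vector & 1)

def code_address(vector: int) -> int:
    """Clear the Thumb bit to get the address where the instructions live."""
    return vector & ~1

reset_vector = 0x080005ED  # reset vector read from the vector table
assert is_thumb(reset_vector)
print(hex(code_address(reset_vector)))
```

Masking rather than blindly subtracting 1 means an address that is already even is left alone.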

Bang!
Let's head to our LD2_Blink() function.

It still looks nice, a good structure. But now the variable names have gone, the functions it calls have gone. This is much harder work to reverse.
Let's try the same in Ghidra. It also can't just guess, so we tell it!
I think this is right!
Same story, tell it where to load.
Now, Ghidra sometimes sets up the ARM vector table automatically (when it's at 0x0). It hasn't done it here.
But Ghidra has found a lot of the functions itself.
Our LD2_Blink() function has been found and disassembles fine.
Now, I can't do this in Binary Ninja Cloud to show you what it does, but I can try the trial.

It's guessed that it's thumb2 - although the chip is only thumb, this should not present an issue.

I found these offsets very confusingly named compared to IDA and Ghidra.
And it's done an admirable job here, finding the function.
As people have pointed out, there are ways and means to make a lot of this automated. But I hope it illustrates how much harder it can be to work with bare metal firmware.

Sometimes we don't even know *what* the processor is, or have a memory map.
You can follow @cybergibbons.