Arduino vs. embedded C - AVR reversing
As I was teaching some embedded C basics, I was asked what are some benefits of embedded C over the classic Arduino language for an Arduino-based board. This article tries to see what we can do by reversing a really simple program compiled with both methods for the Arduino Uno.
Prerequisites
arduino-cli
: Command Line Interface for Arduino. You will need to install the AVR toolchain as well:arduino-cli core install arduino:avr
- AVR cross compiler:
sudo apt install gcc-avr
- Optional: an Arduino simulator such as SimulIDE
Sample program
We want to create the most simple program which goal is to light on the built-in LED, located at port 13 (or PORT PB5) on the Arduino Uno.
Arduino
void setup()
{
pinMode(LED_BUILTIN, OUTPUT);
}
void loop()
{
digitalWrite(LED_BUILTIN, HIGH);
}
Embedded C
#include <avr/io.h>
int main()
{
DDRB = 1 << PB5;
PORTB = 1 << PB5;
return 0;
}
Codes and compilation
$ ls -lR
./led-arduino:
total 4
-rw-r--r-- 1 pascal pascal 104 mai 1 12:58 led-arduino.ino
./led-embedded:
total 4
-rw-r--r-- 1 pascal pascal 81 mai 1 13:00 led-embedded.ino
$ arduino-cli compile --fqbn arduino:avr:uno --output-dir led-arduino led-arduino
$ arduino-cli compile --fqbn arduino:avr:uno --output-dir led-embedded led-embedded
Binary comparison
Arduino | Embedded C | |
---|---|---|
Storage use | 724 bytes | 144 bytes |
The embedded C code is 5 times smaller than the Arduino one which is a bit “weird” as both codes do the same thing! Let’s find why by reversing binaries and analyzing assembly codes.
arduino-cli
produces the *.elf
binary and the *.hex
file which is just a series of bytes to be loaded in the Arduino.
Reversing binaries
$ avr-objdump -S led-arduino/led-arduino.ino.elf > led-arduino.asm
$ avr-objdump -S led-embedded/led-embedded.ino.elf > led-embedded.asm
Arduino binary
void init()
{
// this needs to be called before setup() or some functions won't
// work there
sei();
174: 78 94 sei
// on the ATmega168, timer 0 is also used for fast hardware pwm
// (using phase-correct PWM would mean that timer 0 overflowed half as often
// resulting in different millis() behavior on the ATmega8 and ATmega168)
#if defined(TCCR0A) && defined(WGM01)
sbi(TCCR0A, WGM01);
176: 84 b5 in r24, 0x24 ; 36
178: 82 60 ori r24, 0x02 ; 2
17a: 84 bd out 0x24, r24 ; 36
sbi(TCCR0A, WGM00);
17c: 84 b5 in r24, 0x24 ; 36
17e: 81 60 ori r24, 0x01 ; 1
180: 84 bd out 0x24, r24 ; 36
// this combination is for the standard atmega8
[...]
Here is a part of the Arduino objdump. The assembly code is really long for such a program… It is easy to understand when we look at the source code of pinMode()
and digitalWrite()
https://github.com/arduino/ArduinoCore-avr/blob/master/cores/arduino/wiring_digital.c. It does not only write a value into a register…
Garretlab made an analysis of both functions:
- https://garretlab.web.fc2.com/en/arduino/inside/hardware/arduino/avr/cores/arduino/wiring_digital.c/digitalWrite.html
- https://garretlab.web.fc2.com/en/arduino/inside/hardware/arduino/avr/cores/arduino/wiring_digital.c/pinMode.html
Embedded C binary
48 00000080 <main>:
49 #include <avr/io.h>
50 int main()
51 {
52 DDRB = 1<<PB5;
53 80: 80 e2 ldi r24, 0x20 ; 32
54 82: 84 b9 out 0x04, r24 ; 4
55 PORTB = 1<<PB5;
56 84: 85 b9 out 0x05, r24 ; 5
57 return 0;
58 86: 90 e0 ldi r25, 0x00 ; 0
59 88: 80 e0 ldi r24, 0x00 ; 0
60 8a: 08 95 ret
At first sight, it’s clearly easier. Step-by-step instruction decoding:
ldi r24, 0x20: according to the AVR ISA [1], ldi
load the value 0x20
in r24
which is a General Purpose Working Register of the Arduino.
out 0x04, r24: this instruction write the value stored in r24
at the address 0x04
(the address of DDRB
according to [2]). 0x20=0b100000
: DDRB5
is set to 1 😉
out 0x05, r24: same thing, for PORTB
.
Conclusion
Embedded C code is usually quicker and smaller as long as we make some effort to study the datasheet. But it’s worth it when we’re working with embedded systems! 😉