Enter /home/dragon with Ghidra
In some binaries from the Chinese fabless semiconductor company Espressif, I have noticed the string /home/dragon. I see this as an invitation to enter the home of the dragon. In this article I will investigate how well my implementation of the Tensilica Xtensa Ghidra module performs when analyzing ESP32 ELF files.
Setting up Ghidra to understand Xtensa opcodes
Xtensa is a configurable processor, and it has not yet been included in Ghidra, although a pull request has been submitted (Aug 1, 2020). Until it is merged, you should follow the instructions here to make Ghidra Xtensa-aware. I don't claim to have a complete understanding of Ghidra, and the purpose of this article is to understand what Ghidra gets wrong with this naive extension.
Here is a link to the extension, https://github.com/Ebiroll/ghidra-xtensa
Learning Ghidra
I have found the following link useful for getting an understanding of Ghidra.
https://ghidra.re/online-courses/
Background
I understand that Espressif needs to protect its intellectual property, and I do not want to overstep any legal or moral boundaries. However, I also firmly believe that security by obscurity is not a good thing. It seems likely that malicious hackers already have this knowledge, so remember that this article is written with good intentions for the curious.
Baseline
To get a baseline of the capabilities of Ghidra, I wrote a separate article in which I perform a similar analysis of a binary on Linux.
The code
Header
#pragma once

typedef struct my_type
{
    char my_char;
    int my_int;
    char my_string[8];
} my_type;
hello_ghidra.c
void one_ptr_arg(my_type *t){
    t->my_int=t->my_int+10;
}

int one_arg(int my_arg){
    char buffer[8];
    int i;
    my_type silly;
    silly.my_int=4;

    for (i = 0; i < my_arg; ++i) {
        buffer[i]=0x32;
    }
    one_ptr_arg(&silly);
    printf("My_arg=%d,%s",my_arg,buffer);
    printf("silly.my_int=%d\n",silly.my_int);
    return (silly.my_int);
}

int three_args(int arg1,int arg2,int arg3) {
    return(arg1+arg2+arg3);
}

void two_args(int my_arg,my_type *t)
{
    int my_local = my_arg + 2;
    int i;
    printf("my_int=%d\n",t->my_int);
    for (i = 0; i < my_local; ++i) {
        printf("i = %d\n", i);
    }

    int ret=one_arg(my_arg);
    printf("ret=%d\n",ret);

    ret=three_args(my_local,2,3);
    printf("ret=%d\n",ret);
}
The complete source code can be found here: https://github.com/Ebiroll/esp32s2_kaluga/tree/master/examples/ghidra/main
App main contains some more code, but this will be our focus.
void app_main(void)
{
my_type test = {
.my_int = 4
};
two_args(2,&test);
int ret=two_args_one_ret(2,3);
printf("Restarting now. %d\n",ret);
}
Roundtrip with ghidra
After compiling the sources for the ESP32-S2 and importing the ELF file, we get the following results.
three_args
This was a sitting duck for Ghidra.
int three_args(int arg1,int arg2,int arg3) {
return(arg1+arg2+arg3);
}

0x400e2ae8 <three_args> entry a1, 32
<three_args+3> add.n a2, a2, a3
<three_args+5> add.n a2, a2, a4
0x400e2ad3 <three_args+7> retw.n
Windowed ABI
To make sense of the assembly you must understand the register usage for function calls on the esp32.
In the windowed ABI the registers of the current window are used as follows:
a0 = return address
a1 = stack pointer (alias sp)
a2 = first argument and result of call (in simple cases)
a3-a7 = second through sixth arguments of call (in simple cases).
Note that complex or large arguments are passed on the stack.
a8-a15 = available for use as temporaries. There are no callee-save registers. The windowed hardware automatically preserves the caller's registers a0-a3 on a call4, a0-a7 on a call8 and a0-a11 on a call12 by rotating the register window.
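The window rotation can be made concrete with a small sketch. The helper below is hypothetical (the name callee_reg_for is mine, not from any Xtensa tool) and only models the register numbering: after a callN the window moves by N registers, so the caller's aN becomes the callee's a0 and everything below aN is hidden from the callee.

```c
#include <assert.h>

/* Number of address registers visible in one window (a0..a15). */
#define XT_WINDOW_REGS 16

/*
 * Hypothetical helper: map a caller register number to the number under
 * which the callee sees the same physical register after a callN has
 * rotated the window by n registers (n = 4, 8 or 12).
 * Returns -1 when the register is hidden from the callee, which is the
 * set of registers the hardware preserves for the caller.
 */
int callee_reg_for(int caller_reg, int n)
{
    assert(n == 4 || n == 8 || n == 12);
    assert(caller_reg >= 0 && caller_reg < XT_WINDOW_REGS);

    if (caller_reg < n)
        return -1;          /* hidden: preserved for the caller */
    return caller_reg - n;  /* e.g. call8: caller a10 -> callee a2 */
}
```

With call8, the caller therefore places the first outgoing argument in a10, which the callee reads as a2, and the result travels back the same way.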
Note that the entry instruction also allocates stack space for local variables. The preceding callN instruction stores the window increment (1, 2 or 3 for call4, call8 or call12, the same value that is written to PS.CALLINC) in the two highest bits of the return address in a0. Normally call8 is used, so when performing a backtrace you must mask the highest bits to get the real address.
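A minimal sketch of that masking step, assuming the window increment sits in the top two bits of the return address and that the real top two bits match those of an address in the same 1 GB region, such as the current PC. The function names are hypothetical and the example values in the note below are made up for illustration.

```c
#include <stdint.h>

/*
 * Window increment encoded in the two highest bits of a windowed return
 * address: 1 for call4, 2 for call8, 3 for call12.
 */
static inline unsigned ra_call_increment(uint32_t ra)
{
    return ra >> 30;
}

/*
 * Rebuild a usable code address: keep the low 30 bits of the return
 * address and borrow the top two bits from an address known to lie in
 * the same 1 GB region (assumption: the caller's code and the current
 * PC share that region).
 */
static inline uint32_t ra_to_code_address(uint32_t ra, uint32_t pc_in_same_region)
{
    return (ra & 0x3fffffffu) | (pc_in_same_region & 0xc0000000u);
}
```

For a return address of 0x800e2b10 observed while executing at 0x400e2ae8, ra_call_increment gives 2 (a call8) and ra_to_code_address gives 0x400e2b10.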
Fake registers
As the windowed call mechanism is not so easy to implement, I introduced fake registers i2-i7 for input arguments and o2 for the return value. This is reflected in the xtensa.cspec Ghidra ABI call specification file. These registers are set by the callx8 and call8 instruction p-spec.
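The idea behind the fake registers can be sketched in C. This is my own simplified model, not the actual p-code from the extension: the struct, the function names and the exact copy directions are assumptions, based on the fact that a call8 passes arguments in the caller's a10-a15 and returns the result in the caller's a10.

```c
#include <stdint.h>

/* Hypothetical model of the register state seen by the decompiler. */
typedef struct {
    uint32_t a[16]; /* real address registers a0..a15 of the current window */
    uint32_t i[8];  /* fake input registers, only i2..i7 are used */
    uint32_t o2;    /* fake return-value register */
} xt_model;

/* Assumed effect at a call8: outgoing arguments a10..a15 become i2..i7. */
void model_call8_enter(xt_model *m)
{
    for (int k = 2; k <= 7; ++k)
        m->i[k] = m->a[k + 8];
}

/* Assumed effect after the call returns: o2 shows up in the caller's a10. */
void model_call8_leave(xt_model *m)
{
    m->a[10] = m->o2;
}
```

Modelling the call this way avoids describing the window rotation itself: the decompiler only has to track plain register copies around each call.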
one_arg()
This one is also good.
int one_arg(int my_arg)
{
int i;
char buffer [8];
my_type silly;
silly.my_int = 4;
i = 0;
while (i < my_arg) {
buffer[i] = '2';
i = i + 1;
}
one_ptr_arg(&silly);
printf(s_My_arg=%d,%s_ram_3f402cc0,my_arg,buffer);
printf(s_silly.my_int=%d_ram_3f402cd0,silly.my_int,buffer);
return silly.my_int;
}
two_args()
Some renaming of the local variables is required, because Ghidra reuses ret as the loop counter in place of i, but we can recognize the function.
void two_args(int my_arg,my_type *t)
{
int ret;
int my_local;
my_local = my_arg + 2;
printf(s_my_int=%d_ram_3f402ce4,t->my_int);
ret = 0;
while (ret < my_local) {
printf(s_i_=_%d_ram_3f402cf0,ret);
ret = ret + 1;
}
ret = one_arg(my_arg);
printf(s_ret=%d_ram_3f402cf8,ret);
ret = three_args(my_local,2,3);
printf(s_ret=%d_ram_3f402cf8,ret,3);
return;
}
app_main()
app_main is a task started by the start_main_task function or the main_task on the esp32-s2. This may vary depending on the version of esp-idf that you are using.
two_args(2,&test);
ret = two_args_one_ret(2,3);
printf(s_Restarting_now._%d_ram_3f402da4,ret,pcVar1,puVar2);
reent_p = __getreent();
fflush((FILE *)reent_p->_stdout);
esp_restart();
Conclusion
The implementation is useful but still has room for improvement. A lot of the information that Ghidra uses comes from the ELF debug sections, and in the future I will do an analysis of a flash dump of this same program.