A story about elfs, dwarfs and dragons

Olof Astrand
Jul 16, 2020

Despite the title, I will not be talking about three-headed dragon-like monsters, dwarfs and elfs. Instead we will be talking about Ghidra, the reverse engineering software tool (https://ghidra-sre.org/). Although it is primarily a reverse engineering tool similar to IDA, I consider it a very useful tool to keep in your development kit, along with static analysis tools like nm and objdump.

To learn and understand how Ghidra works, we will analyze a simple program that we have written ourselves. We will do this a few times with different compiler settings.

Use F1 frequently, and you will learn how Ghidra works.

You may also want to look at Compiler Explorer (https://godbolt.org/) and compare the different options. In this example I use the gcc 10.1 compiler. We explore the -fanalyzer, -g and -O2 options.

#include <stdio.h>
#include <stdlib.h>

/* --------- my_header.h-------*/
typedef struct my_type
{
    char my_char;
    int my_int;
    char my_string[8];
} my_type;

int one_arg(int my_arg);
void two_args(int my_arg, my_type *t);
/* --------- End of header -------*/

void test(int n)
{
    int buf[10];
    int *ptr;
    if (n < 10)
        ptr = buf;
    else
        ptr = (int *)malloc(sizeof (int) * n);
    /* oops; this free should be conditionalized. */
    free(ptr);
}

void two_args(int my_arg, my_type *t)
{
    int my_local = my_arg + 2;
    int i;
    printf("my_int=%d\n", t->my_int);
    for (i = 0; i < my_local; ++i)
        printf("i = %d\n", i);
}

int one_arg(int my_arg)
{
    char buffer[8];
    int i;
    my_type silly;
    silly.my_int = 4;
    /* Deliberate bug: writes past the end of buffer when my_arg > 8. */
    for (i = 0; i < my_arg; ++i) {
        buffer[i] = 0x32;
    }
    printf("My_arg=%d", my_arg);
    printf("silly.my_int=%d\n", silly.my_int);
    return (silly.my_int);
}

int main(int argc, char *argv[])
{
    my_type test = {
        .my_int = 4
    };
    one_arg(10);
    two_args(2, &test);

    return 0;
}

For the Linux example we compile the file with gcc -fanalyzer main.c

Decompiling with ghidra

After importing the a.out file into Ghidra, we see the following decompiled code.

void two_args(int param_1,long param_2)
{
    uint local_10;
    printf("my_int=%d\n",(ulong)*(uint *)(param_2 + 4));
    local_10 = 0;
    while ((int)local_10 < param_1 + 2) {
        printf("i = %d\n",(ulong)local_10);
        local_10 = local_10 + 1;
    }
    return;
}

This is not perfect, but remember that we do not have any debug information. You can compare this with the import of a program compiled with debug information (-g). If you see .debug* sections in the ELF file, it very likely contains debug information.

But in this example we assume that we only have the corresponding header files and not debug-compiled binaries. The DWARF information is located in the .debug* sections of the ELF file.

Parse the header

In ghidra we select [file->parse c-program] and parse the header file.

Use "save as new configuration" to save your settings.

Data type manager

In the data type manager we can expand the newly parsed type.

If you double-click on my_type, you can see its fields and their offsets.

Note how the struct is padded.

We now select "apply function data types".

Data type manager

Ghidra has now applied this knowledge during analysis.

void two_args(int my_arg,my_type *t)
{
    int i;
    printf("my_int=%d\n",(ulong)(uint)t->my_int);
    i = 0;
    while (i < my_arg + 2) {
        printf("i = %d\n",(ulong)(uint)i);
        i = i + 1;
    }
    return;
}

The other function we have to massage a little more.

int one_arg(int my_arg)
{
    long in_FS_OFFSET;
    int i;
    undefined auStack24 [8];
    long local_10;
    local_10 = *(long *)(in_FS_OFFSET + 0x28);
    i = 0;
    while (i < my_arg) {
        auStack24[i] = 0x32;
        i = i + 1;
    }
    printf("My_arg=%d",(ulong)(uint)my_arg);
    printf("silly.my_int=%d\n",4);
    if (local_10 != *(long *)(in_FS_OFFSET + 0x28)) {
        /* WARNING: Subroutine does not return */
        __stack_chk_fail();
    }
    return 4;
}

It is reasonable to retype auStack24 from undefined[8] to char[8] (Ctrl-L) and rename it (L) to buffer. As you can see, the compiler (gcc 10.1) also inserts a stack check.

If you run the program, you will get:

./a.out
My_arg=10silly.my_int=4
*** stack smashing detected ***: terminated
Aborted (core dumped)

Optimized code (-O2)

When we compile with -O2, the analyzer no longer finds the error. The optimizer has removed the unnecessary stack overwrite, having discovered that this part of the code does nothing useful, so we can now run the program.

My_arg=10silly.my_int=4
my_int=4
i = 0
i = 1
i = 2
i = 3

When looking at the generated code, you see that it has been optimized to this.

int one_arg(int my_arg)
{
    printf("My_arg=%d",my_arg);
    printf("silly.my_int=%d\n",4);
    return 4;
}

Analysis of the optimized stripped version

Strip a.out to remove all symbol information from the file.

Now no function names are available, so instead we start by looking at the entry function.

void entry(undefined8 param_1,undefined8 param_2,undefined8 param_3)
{
    undefined8 in_stack_00000000;
    undefined auStack8 [8];
    __libc_start_main(&PTR_DAT_00101070,in_stack_00000000,&stack0x00000008,&LAB_001012b0,&DAT_00101320
                      ,param_3,auStack8);
    do {
        /* WARNING: Do nothing block with infinite loop */
    } while( true );
}

To locate the functions, we look at the strings instead.

Defined strings

When we follow the reference we end up here, PTR_DAT_00101070. Mark the label after the RET, as it seems likely that the function starts here.

Press (F) and rename the argument. We should recognize this now.

int FUN_00101270(int my_arg)
{
    printf("My_arg=%d",my_arg);
    printf("silly.my_int=%d\n",4);
    return 4;
}

The other function can be located the same way.

Then we go back to the entry function. PTR_DAT_00101070 is the start function (main). If Ghidra incorrectly labeled it as data, press C (clear data) and mark it as a function (F).

int main(int argc,char *argv[])
{
    long lVar1;
    long in_FS_OFFSET;
    my_type test;
    lVar1 = *(long *)(in_FS_OFFSET + 0x28);
    test = (my_type)ZEXT816(0x400000000);
    FUN_00101270(10);
    FUN_00101230(2,(long)&test);
    if (lVar1 == *(long *)(in_FS_OFFSET + 0x28)) {
        return 0;
    }
    /* WARNING: Subroutine does not return */
    __stack_chk_fail();
}

For fun, you can open the function call graph.

Tensilica & the esp32

Although not complete yet, the following is an implementation for the esp32 processor. https://github.com/Ebiroll/ghidra-xtensa

Here you can find the source code for this example. https://github.com/Ebiroll/esp32s2_kaluga/tree/master/examples/ghidra
