
You Can’t See Me is a fun CTF on Hack The Box that requires you to reverse engineer a simple C application. It’s generally rated as an “Easy” challenge, and is a good introduction to reversing software and performing malware analysis. As with the other CTF guides, answers will be blurred out. Also for brevity, I won’t be including all output of every command.

You can find the link to You Can’t See Me here.

All you need to do is download the ZIP and extract it. There should only be one file inside, named auth, with this MD5:


Static Analysis


Let’s first examine what type of file we’re dealing with here with the file command:

$ file ./auth
auth: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, stripped

So we know that this is an ELF executable, so it will run on Linux. It’s also 64-bit x86-64 code, so we should expect to see the 64-bit, r-prefixed registers when we disassemble it (rax, rdi, rip, etc.); LSB means the file is little-endian. Finally, it’s stripped, meaning that the symbol table and debugging symbols have been removed [1]. This will make the disassembly and reversing of this program much more difficult, as you will later see.
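As a rough illustration of where file gets the "64-bit LSB" part from, here is a minimal Python sketch that decodes the first identification bytes of an ELF header. The function name describe_elf_ident is made up for this guide, and the sketch only covers the class and byte-order fields, not everything file reports.

```python
ELF_MAGIC = b"\x7fELF"

# Byte 4 (EI_CLASS) and byte 5 (EI_DATA) of the ELF identification block.
EI_CLASS = {1: "32-bit", 2: "64-bit"}
EI_DATA = {1: "LSB", 2: "MSB"}


def describe_elf_ident(header: bytes) -> str:
    """Describe an ELF file from its first few header bytes (hypothetical helper)."""
    if len(header) < 6 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    elf_class = EI_CLASS.get(header[4], "unknown-class")
    byte_order = EI_DATA.get(header[5], "unknown-order")
    return f"ELF {elf_class} {byte_order}"
```

To try it on the challenge file, you could pass it the first 16 bytes of ./auth read in binary mode.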


We can use the strings command to pull out some static data from the file:

strings ./auth > auth.strings
cat auth.strings
I said, you can't c me!
GCC: (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008

From this output, we can gather that this project was compiled with GCC, and we can see some of the C library functions it uses, such as malloc, fgets, __libc_start_main, and so on. We can also see that the flag is filled in at runtime through the format string HTB{%s}. As for the program’s own text, we see some messages being printed out, with an interesting one being this_is_the_password.
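If you are curious what strings is doing under the hood, the idea can be sketched in a few lines of Python: scan the raw bytes for runs of printable ASCII characters. This is a simplification, not the real tool (for instance, it ignores tabs and other encodings); the helper name ascii_strings and the minimum length of 4 are assumptions chosen to mirror the tool's default.

```python
import re
from typing import List


def ascii_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of at least min_len printable ASCII bytes (0x20-0x7e)."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]
```

Running it over the contents of ./auth should surface the same kind of text we saw above, such as the format string and the error message.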


Finally, we can dump out all the disassembled code with objdump to get an idea of what we’re working with. I’m used to reading Intel’s assembly style, so that’s why I specified -M intel.

objdump -M intel -d ./auth > auth_dissembled.asm
cat auth_dissembled.asm
auth:     file format elf64-x86-64
Disassembly of section .init:
0000000000401000 <.init>:
Disassembly of section .plt:
0000000000401020 <printf@plt-0x10>:
0000000000401030 <printf@plt>:
0000000000401040 <fgets@plt>:
0000000000401050 <strcmp@plt>:
0000000000401060 <malloc@plt>:
Disassembly of section .text:
0000000000401070 <.text>:
Disassembly of section .fini:
0000000000401308 <.fini>:

We do see some of the functions that we noticed earlier in the strings output, but there are some notable ones missing. __libc_start_main has no entry in the .plt section. We also don’t see a main or _start label. Both live in .text, but because the binary is stripped, objdump can only show that section as one unlabeled block. main is where the application code is, so we’ll need to find its address to debug effectively with all this assembly code.

Since there’s no visible entry point function in this application, you won’t be able to set breakpoints on main or _start by name in your debugging tool. You can attempt it with GDB now, and you’ll encounter an error message:

gdb ./auth
Reading symbols from ./auth...
(No debugging symbols found in ./auth)

If I try to set a breakpoint…

break *main
Reading symbols from ./auth...
(No debugging symbols found in ./auth)

It doesn’t work, and this confirms our findings from the file analysis that the executable was stripped. This means that we’ll have to work entirely with addresses to figure out where the main function truly is.

File Headers

ELF executables contain headers, and one of these headers contains the entry point address! There are a couple of ways to read it [2], one of which is already an objdump option, the -f flag.

objdump -f ./auth
./auth:     file format elf64-x86-64
architecture: i386:x86-64, flags 0x00000112:
start address 0x0000000000401070

Alright, so we have the entry point at 0x401070. Now we can start dynamic analysis!
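For reference, the start address that objdump prints is the e_entry field of the ELF header. The sketch below shows one way to read it by hand in Python, assuming a 64-bit little-endian file like this one; the helper name elf64_entry_point is made up for this guide.

```python
import struct


def elf64_entry_point(header: bytes) -> int:
    """Return e_entry from the first bytes of a little-endian ELF64 header."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if header[4] != 2 or header[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    # In an ELF64 header, e_entry is an 8-byte field at offset 0x18.
    (entry,) = struct.unpack_from("<Q", header, 0x18)
    return entry
```

Given the first 64 bytes of ./auth, this should return the same value as objdump -f, which you can print with hex().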

Dynamic Analysis: GDB

Now, the application doesn’t actually start from main right away. Execution begins in a small _start procedure that sets up the process and hands control to the C runtime, which eventually calls main [3]. That _start procedure is where the entry point drops us off, so we need to look for the address of main from there.

Note: I was running GDB with Intel’s assembly style, which is easier on the eyes in my opinion. You can set this within GDB by running:

set disassembly-flavor intel

Locating the main function

Load the file with gdb, and set a breakpoint at the entry point, using break *0x401070. Then run to be placed at the start of the application. Look at the next couple of instructions to see where main will be called from.

disas 0x401070,+50
Dump of assembler code from 0x401070 to 0x4010a2:
=> 0x0000000000401070:  endbr64
   0x0000000000401074:  xor    ebp,ebp
   0x0000000000401076:  mov    r9,rdx
   0x0000000000401079:  pop    rsi
   ...
   0x000000000040108a:  mov    rcx,0x401290
   0x0000000000401091:  mov    rdi,0x401160
   0x0000000000401098:  call   QWORD PTR [rip+0x2f52]        # 0x403ff0
   0x000000000040109e:  hlt
   0x000000000040109f:  nop
   0x00000000004010a0:  endbr64
End of assembler dump.

The hlt instruction halts the CPU until the next interrupt [4]. It is a privileged instruction, so in a normal user-space program it only acts as a safety net: the call before it is not expected to return, and if it ever did, hlt would crash the process. For us, that means we can ignore all the instructions after it, and main must be reached through the call instruction just before hlt.

Note: The address that gets moved into the rdi register is in fact the address of the main function [5]. I pieced this together from multiple Stack Exchange answers to similar questions. The short reason is the calling convention: on x86-64 Linux, the first argument to a function is passed in rdi, and the first argument that __libc_start_main takes is a pointer to main. In the standard glibc startup code, the call right after these mov instructions is the call to __libc_start_main, so whatever is in rdi at that point is the address of main.

The rdi register is set to 0x401160, so we can now set a breakpoint for main with break *0x401160.
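As a side note on the call in the _start dump, the "# 0x403ff0" comment that GDB prints can be reproduced by hand. A RIP-relative operand is resolved against the address of the next instruction, not the current one. The small Python sketch below works through that arithmetic; rip_relative_target is a name made up for this guide, and the guess that the slot at 0x403ff0 holds the address of __libc_start_main is an assumption based on how standard startup code looks.

```python
def rip_relative_target(insn_addr: int, insn_len: int, disp: int) -> int:
    """Address referenced by a RIP-relative operand.

    RIP points at the next instruction when the operand is resolved,
    so the displacement is added to insn_addr + insn_len.
    """
    return insn_addr + insn_len + disp


# The call sits at 0x401098 and the next instruction (hlt) at 0x40109e,
# so the call instruction is 6 bytes long.
got_slot = rip_relative_target(0x401098, 0x40109E - 0x401098, 0x2F52)
print(hex(got_slot))  # 0x403ff0
```

In GDB, x/gx 0x403ff0 would show the pointer stored in that slot once the program is running.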

Examining main

Now, let’s look at the instructions for main. We know this function ends when the ret instruction is invoked, so everything between here and there contains the application code! I didn’t include all the output below, only the highlights.

disas 0x401160,+350
=> 0x0000000000401160:  push   rbp
   ...
   0x00000000004011f1:  cmp    DWORD PTR [rbp-0x8],0x14
   0x00000000004011f5:  jge    0x40121f
   0x00000000004011fb:  movsxd rax,DWORD PTR [rbp-0x8]
   0x00000000004011ff:  movsx  ecx,BYTE PTR [rbp+rax*1-0x40]
   0x0000000000401204:  add    ecx,0xa
   0x0000000000401207:  mov    dl,cl
   0x0000000000401209:  movsxd rax,DWORD PTR [rbp-0x8]
   0x000000000040120d:  mov    BYTE PTR [rbp+rax*1-0x20],dl
   0x0000000000401211:  mov    eax,DWORD PTR [rbp-0x8]
   0x0000000000401214:  add    eax,0x1
   0x0000000000401217:  mov    DWORD PTR [rbp-0x8],eax
   0x000000000040121a:  jmp    0x4011f1
   0x000000000040121f:  mov    esi,0x15
   0x0000000000401224:  mov    rdi,QWORD PTR [rbp-0x28]
   0x0000000000401228:  mov    rdx,QWORD PTR ds:0x404050
   0x0000000000401230:  call   0x401040 <fgets@plt>
   0x0000000000401235:  lea    rdi,[rbp-0x20]
   0x0000000000401239:  mov    rsi,QWORD PTR [rbp-0x28]
   0x000000000040123d:  mov    QWORD PTR [rbp-0x50],rax
   0x0000000000401241:  call   0x401050 <strcmp@plt>
   0x0000000000401246:  cmp    eax,0x0
   0x0000000000401249:  je     0x401268
   ...
   0x000000000040125b:  call   0x401030 <printf@plt>
   ...
   0x0000000000401263:  jmp    0x401280
   ...
   0x0000000000401278:  call   0x401030 <printf@plt>
   ...
   0x0000000000401288:  ret

This program appears to build a string, and then prompts the user for a response. Depending on the answer given, a different message will be printed out. If you play around with the app, you’ll see that the wrong answer always outputs I said, you can't c me!, as seen earlier in our strings analysis. Also, that text this_is_the_password is just a trick, and doesn’t even work here.

Looking at the code closely, you may notice a loop between 0x4011f1 and 0x40121a. It only exits when the jge at 0x4011f5 is taken, that is, when the counter stored in DWORD PTR [rbp-0x8] is greater than or equal to 0x14 [6]. In decimal, 0x14 is 20, so the loop runs 20 times and produces a 20-byte string.
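To make the loop easier to follow, here is a rough Python model of what the instructions between 0x4011f1 and 0x40121a appear to do: read a byte from one buffer, add 0xa, keep the low 8 bits, and store the result at the same index of a second buffer. The function name build_expected is made up, and the source bytes are not shown in this guide, so any input you feed it is a placeholder rather than the real data.

```python
def build_expected(source: bytes, count: int = 0x14, delta: int = 0x0A) -> bytes:
    """Model of the loop: out[i] = low byte of (source[i] + delta), for count bytes."""
    out = bytearray(count)
    i = 0                            # counter kept in DWORD PTR [rbp-0x8]
    while i < count:                 # cmp [rbp-0x8],0x14 / jge exits the loop
        value = source[i] + delta    # movsx ecx,BYTE PTR [rbp+rax-0x40]; add ecx,0xa
        out[i] = value & 0xFF        # mov dl,cl; mov BYTE PTR [rbp+rax-0x20],dl
        i += 1                       # add eax,0x1; mov [rbp-0x8],eax
    return bytes(out)
```

For example, feeding it 20 placeholder bytes of "A" would give 20 bytes of "K", since each byte is simply shifted up by ten.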

Getting the Flag

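The counter is loaded into rax and used as the index. On every iteration, a byte is read from the buffer at rbp-0x40, 0xa is added to it, and the low byte of the result (dl) is stored at the corresponding index of a second buffer: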

   0x000000000040120d:  mov    BYTE PTR [rbp+rax*1-0x20],dl

So at the end of 20 iterations, we should be able to access the complete set of 20 bytes, which is pointed to by rbp - 0x20. We can do this by setting a breakpoint just after the loop ends, at 0x40121f.

At this point, we can just print out the value stored at the address of the byte pointer (hidden):

call (void)puts($rbp-0x20)

We can continue on from here to where we input our answer, and this is in fact the correct one!
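For a picture of how the check itself behaves, the fgets and strcmp calls in main can be modelled roughly in Python. This is a sketch under some assumptions: the helper names are made up, the generated string is assumed to be followed by a NUL byte in memory, and strcmp is reduced to a plain equality test on data without NUL bytes.

```python
def fgets_model(stdin_data: bytes, size: int) -> bytes:
    """Mimic fgets: read at most size-1 bytes, stopping after a newline."""
    limit = size - 1
    newline_at = stdin_data.find(b"\n", 0, limit)
    end = limit if newline_at == -1 else newline_at + 1
    return stdin_data[:end]


def is_accepted(stdin_data: bytes, expected: bytes) -> bool:
    """True when the strcmp in main would return 0 (simplified model)."""
    # main passes 0x15 as the size, so fgets keeps at most 20 characters.
    return fgets_model(stdin_data, 0x15) == expected
```

One consequence of this model is that a shorter guess keeps its trailing newline in the buffer, so only an input whose first 20 characters match the generated string gets through.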


Final Thoughts

This was a cool reverse engineering challenge where I learned a lot about assembly, debugging, and some of the process of how programs are run on Linux. It did require a lot of googling when I got stuck, and I’ve included below some of the resources that helped me along the way. Happy hacking!

  1. die.net: strip(1) - Linux man page. https://linux.die.net/man/1/strip

  2. Reverse Engineering Stack Exchange: Reversing ELF 64-Bit LSB Executable x86-64 gdb. https://reverseengineering.stackexchange.com/a/3816

  3. Embedded Artistry: A General Overview of What Happens Before main(). https://embeddedartistry.com/blog/2019/04/08/a-general-overview-of-what-happens-before-main/#genview

  4. Wikipedia: HLT (x86 instruction). https://en.wikipedia.org/wiki/HLT_(x86_instruction)

  5. Reverse Engineering Stack Exchange: How to handle stripped binaries with GDB? https://reverseengineering.stackexchange.com/a/1936

  6. Wikibooks: x86 Assembly/Control Flow. https://en.wikibooks.org/wiki/X86_Assembly/Control_Flow#Jump_if_Greater_or_Equal