Using self modifying code under Linux

by Karsten Scheibler
thanks to Stefan Esser and Maciej Hrebien

Remark (a):
Save this page as smc.html and use the following command to extract the source code with a sample Makefile:
sh -c "( mkdir smc && cd smc && awk '/^<!--eof/{d=0}{if(d)print>f}/^<!--sof/{f=\$2;d=1}' ) < smc.html"
Remark (b):
Some browsers try to be super smart and reformat the html code so you may have trouble to extract the files. This is the right time to learn more about wget, a tool to download files via http or ftp ;-)
Remark (c):
Some versions of nasm are known to generate broken code. I have used nasm-0.98 for this example which worked for me.

I know that this is not the cleanest way of programming, but sometimes this programming style is faster. Furthermore this method is used by JIT (just in time) compilers. Transmeta also uses some sort of self modifying code to implement a x86 software emulation. They call it Codemorphing ;-) This example code will show how this can be done under Linux.

The idea behind: There is a syscall sys_mprotect which allows a program to change the flags for (nearly) every page. A page is the smallest memory unit available for virtual memory management. On the x86 architecture the page size is 4KB. But as i noticed we didn't need this call to make a page in the section .bss executable, because on x86 a readable page is also a executable page, and the section .bss is read/write. But this behavior may be change with the appearance of the NX-flag in modern processors, so the sys_mprotect becomes mandatory.

The first two examples copy codesnippets to section .bss and execute them. Because we have read/write/execute permissions in this memory area the code is allowed to modify itself. The first example copies a snippet of code which writes text to screen (via sys_write), but before calling it we modify some values in the code itself (the start and the length of the text). The second one does some real self modification (see code2_start). The rep stosb instruction overwrites the first four inc ebx with nop, so that the line put on screen contains a 04h instead of the expected 08h. The third example modifies code in section .text. It looks like an endless loop, but it isn't. Find out yourself why ;-).

Note 1:

The call instruction in this code must be done indirect, because if the address is given directly it is a relative address (signed 32 Bit) after copying and starting the code you will get a SEGFAULT. If the address is given indirectly it is an absolute value, which should work position independent.

Note 2:

If you see a 08h instead of a 04h in the output, you get to know another strange effect of self modifying code. If i remember right there was something like a prefetch queue. This prefetch queue holds the next bytes after the currently processed instruction in the processor (on my old 386SX it was 13 Bytes long, examined with a DOS Assemblyprogram long time ago ;-). If you overwrite this queued instructions it depends on the processor if the prefetch queue is reloaded or not. I expected it also on my K6-2, but as Stefan Esser mentioned: "any Clone of the Pentium or above should behave friendly cause with the Pentium Intel build some Prefetch Queue Modification Detection into the CPU. If you write to the code that normaly would be inside the prefetch range the pentium automaticly discards its queque and reloads...". If you want to make sure that the reload happens use a jmp instruction (try a jmp after the rep stosb).

 ;**************************************************************************** ;**************************************************************************** ;* ;* USING SELF MODIFYING CODE UNDER LINUX ;* ;* written by Karsten Scheibler, 2004-AUG-09 ;* ;**************************************************************************** ;**************************************************************************** global smc_start ;**************************************************************************** ;* some assign's ;**************************************************************************** %assign SYS_WRITE 4 %assign SYS_MPROTECT 125 %assign PROT_READ 1 %assign PROT_WRITE 2 %assign PROT_EXEC 4 ;**************************************************************************** ;* data ;**************************************************************************** section .bss alignb 4 modified_code: resb 0x2000 ;**************************************************************************** ;* smc_start ;**************************************************************************** section .text smc_start: ;calculate the address in section .bss, it must lie on a page ;boundary (x86: 4KB = 0x1000) ;NOTE: In this example obsolete because each segment is page ; aligned and we use it only once, so we know that it is ; aligned to a page boundary, but if you have more than ; one section .bss in your code (or link several objects ; together) you can't be sure about that mov dword ebp, (modified_code + 0x1000) and dword ebp, 0xfffff000 ;change flags of this page to read + write + executable, ;NOTE: On x86 Architecture this call is obsolete, because for ; section .bss PROT_READ and PROT_WRITE are already set. ; PROT_EXEC is on x86 also set if PROT_READ is set, this ; results in rwx for this segment, but this behavior may ; change with appearance of the NX-flag in modern processors mov dword eax, SYS_MPROTECT mov dword ebx, ebp mov dword ecx, 0x1000 mov dword edx, (PROT_READ | PROT_WRITE | PROT_EXEC) int byte 0x80 test dword eax, eax js near smc_error ;execute unmodified code first code1_start: mov dword eax, SYS_WRITE mov dword ebx, 1 mov dword ecx, hello_world_1 code1_mark_1: mov dword edx, (hello_world_2 - hello_world_1) code1_mark_2: int byte 0x80 code1_end: ;copy code snippet from above to our page (address is still in ebp) mov dword ecx, (code1_end - code1_start) mov dword esi, code1_start mov dword edi, ebp cld rep movsb ;append 'ret' opcode to it, so that we can do a call to it mov byte al, [return] stosb ;change some values in the copied code: start address of the text ;and its length mov dword eax, hello_world_2 mov dword ebx, (code1_mark_1 - code1_start) mov dword [ebx + ebp - 4], eax mov dword eax, (hello_world_3 - hello_world_2) mov dword ebx, (code1_mark_2 - code1_start) mov dword [ebx + ebp - 4], eax ;finally call it call dword ebp ;copy second example mov dword ecx, (code2_end - code2_start) mov dword esi, code2_start mov dword edi, ebp rep movsb ;do something real nasty: edi points right after the 'rep stosb' ;instruction, so this will really modify itself mov dword edi, ebp add dword edi, (code2_mark - code2_start) call dword ebp ;modify code in section .text itself endless: ;allow us to write to section .text mov dword eax, SYS_MPROTECT mov dword ebx, smc_start and dword ebx, 0xfffff000 mov dword ecx, 0x2000 mov dword edx, (PROT_READ | PROT_WRITE | PROT_EXEC) int byte 0x80 test dword eax, eax js near smc_error ;write message to screen mov dword eax, SYS_WRITE mov dword ebx, 1 mov dword ecx, endless_loop mov dword edx, (hello_world_1 - endless_loop) int byte 0x80 ;here comes the magic, which prevents endless execution mov dword ecx, (smc_end_1 - smc_end) mov dword esi, smc_end mov dword edi, endless rep movsb ;do it again jmp short endless ;**************************************************************************** ;* code2 ;**************************************************************************** ;this is the ret opcode we copy above ;and the nop opcode needed by code2 return: ret no_operation: nop ;here some real selfmodifying code, if copied ;to .bss and edi correctly loaded ebx should contain 0x4 instead ;of 0x8 code2_start: mov byte al, [no_operation] xor dword ebx, ebx mov dword ecx, 0x04 rep stosb code2_mark: inc dword ebx inc dword ebx inc dword ebx inc dword ebx inc dword ebx inc dword ebx inc dword ebx inc dword ebx call dword [function_pointer] ret code2_end: align 4 function_pointer: dd write_hex ;**************************************************************************** ;* write_hex ;**************************************************************************** write_hex: mov byte bh, bl shr byte bl, 4 add byte bl, 0x30 cmp byte bl, 0x3a jb short .number_1 add byte bl, 0x07 .number_1: mov byte [hex_number], bl and byte bh, 0x0f add byte bh, 0x30 cmp byte bh, 0x3a jb short .number_2 add byte bh, 0x07 .number_2: mov byte [hex_number + 1], bh mov dword eax, SYS_WRITE mov dword ebx, 1 mov dword ecx, hex_text mov dword edx, 9 int byte 0x80 ret section .data hex_text: db "ebx: " hex_number: db "00h", 10 ;**************************************************************************** ;* some text ;**************************************************************************** endless_loop: db "No endless loop here!", 10 hello_world_1: db "Hello World!", 10 hello_world_2: db "This code was modified!", 10 hello_world_3: ;**************************************************************************** ;* smc_error ;**************************************************************************** section .text smc_error: xor dword eax, eax inc dword eax mov dword ebx, eax int byte 0x80 ;**************************************************************************** ;* smc_end ;**************************************************************************** section .text smc_end: xor dword eax, eax xor dword ebx, ebx inc dword eax int byte 0x80 smc_end_1: ;*********************************************** linuxassembly@unusedino.de * 
close