
The Basics of Shellcoding


Angelo Rosiello 10/01/2004



Introduction
Shellcode is a small group of machine instructions that can be executed inside another program while that program is running. Today, most examples of shellcode in action come from exploits for software vulnerabilities. To take advantage of a vulnerability it is usually indispensable to inject shellcode, because that is how we take control of the running application. The goal of this article is not to explain all the injection techniques developed in recent years, but to analyze shellcode and understand its essence.

Registers
Before analyzing the assembly code and then its binary form, it is necessary to give an overview of the CPU's registers in order to understand their role in assembly language. The architecture we are going to discuss is Intel x86. The general-purpose registers of this platform are 32 bits wide, and the four shown below can also be addressed as 16-bit and 8-bit sub-registers, which allows them to be used flexibly.

32 bits    16 bits    8 bits (high)    8 bits (low)
EAX        AX         AH               AL
EBX        BX         BH               BL
ECX        CX         CH               CL
EDX        DX         DH               DL


EAX, AX, AH, AL: These registers are called accumulators and can be used for arithmetic and input/output operations or to execute interrupt calls. We will see that they are indispensable for system calls: on Linux the system call number is placed in EAX before the interrupt, and the return value comes back in EAX (e.g. when we call open(), the file descriptor is returned in EAX).
EBX, BX, BH, BL: These registers are the base registers and they are used as base pointers to access memory. We will use EBX to pass the first argument of a system call.
ECX, CX, CH, CL: These registers are called counter registers. We will use ECX to pass the second argument of a system call.
EDX, DX, DH, DL: These registers are the data registers and they can be used for arithmetic operations, interrupt calls and some input/output operations. We will use EDX to pass the third argument of a system call.

Introducing the Assembly language
The assembly we are going to use is written as "Inline Assembly" and it adopts the AT&T syntax. In this syntax the source operand comes first and the destination second. The name of a register is preceded by the symbol "%", thus if we have to use the register eax we must type "%eax". A numerical constant must be preceded by the symbol "$". The following list shows the most frequently used instructions.

MOV - Copies a value into a register.
mov $0x4, %al - moves 0x4 into al
mov %eax, %ebx - moves the contents of eax into ebx

PUSH - Puts a value on the stack.

POP - Takes a value off the stack and stores it in a register or in a variable.

INT - Raises a software interrupt.
int $0x80 - hands control to the kernel.

Codification phase
The program we are going to implement in assembly language, and then in binary code (written in hexadecimal), prints the string "WWW.ROSIELLO.ORG" on the screen.

In C, the solution is the following piece of code:

#include <stdlib.h>   /* exit() */
#include <unistd.h>   /* write() */

int main(void)
{
    write(0, "WWW.ROSIELLO.ORG", 16);
    exit(0);
}

To perform the write() and the exit() we have to execute their system calls. On Linux, the header file "unistd.h" lists the numbers of all the system calls that one can use.

angelo@rosiello.org$ cat /usr/include/asm-i386/unistd.h
/*
* This file contains the system call numbers.
*/

#define __NR_exit 1 <- This is our exit()
#define __NR_fork 2
#define __NR_read 3
#define __NR_write 4 <- This is our write()
#define __NR_open 5

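write(0, "WWW.ROSIELLO.ORG", 16);

The first argument, 0, is the file descriptor on which we print the string, which appears as the second argument. The last argument, 16, indicates the length of the string. Strictly speaking, descriptor 0 is standard input and standard output is descriptor 1, but in an interactive terminal session both normally refer to the same terminal device, so the string still appears on the screen.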

Let's try to implement this instruction in assembly.

xor %eax, %eax      # clear the register %eax
xor %ebx, %ebx      # clear the register %ebx
xor %edx, %edx      # clear the register %edx
push %eax           # push a zero dword to terminate the string, so no garbage characters appear
push $0x47524f2e    # push GRO. onto the stack
push $0x4f4c4c45    # push OLLE onto the stack
push $0x49534f52    # push ISOR onto the stack
push $0x2e575757    # push .WWW onto the stack

The four push instructions with constants place the string "WWW.ROSIELLO.ORG" on the stack in its hexadecimal encoding. As one can notice, the string must be pushed in reverse, last four characters first and each group of four with its bytes reversed, because the stack grows toward lower addresses and x86 stores the least significant byte first. The file descriptor is passed in the %ebx register, which at the moment contains the value 0, so we do not have to set anything else for the first argument (write(0,..)).

mov %esp, %ecx # it moves %esp into %ecx

Now the string's address is in the register %esp (remember that each push decreases esp and each pop increases it, so after the last push esp points at the first character). We copy it into the register %ecx, thus the kernel will be able to find the exact position of the string on the stack (write(0, string, ..)).

mov $0x10,%dl #size 16 bytes

Exactly as in the C version, we indicate that the string size is 16 bytes (write(0, string, 16)).

mov $0x4,%al #syscall for write()
We put the number of the write() system call in the register eax (in its low part, al).

int $0x80 #execute the syscall
Now the kernel takes control and executes our write() system call.

The implementation of exit(0) is even easier.

exit(0):

xor %eax, %eax
xor %ebx, %ebx
The eax and ebx registers are now clear; ebx holds the exit status 0.

mov $0x1, %al #syscall for exit()
We put the number of the exit() system call into al.
int $0x80 #execute the syscall
We hand control to the kernel.

Compile and Execute
The last step is the translation into binary code. To do this we will use the GNU debugger (gdb).

angelo@rosiello.org:~shellcode$ gdb rosiello

(gdb) disas main
Dump of assembler code for function main:
0x80482f4 <main+0>: push %ebp
0x80482f5 <main+1>: mov %esp,%ebp
0x80482f7 <main+3>: sub $0x8,%esp
0x80482fa <main+6>: and $0xfffffff0,%esp
0x80482fd <main+9>: mov $0x0,%eax
0x8048302 <main+14>: sub %eax,%esp
0x8048304 <main+16>: xor %eax,%eax
0x8048306 <main+18>: xor %ebx,%ebx
0x8048308 <main+20>: xor %edx,%edx
0x804830a <main+22>: push %eax
0x804830b <main+23>: push $0x47524f2e
0x8048310 <main+28>: push $0x4f4c4c45
0x8048315 <main+33>: push $0x49534f52
0x804831a <main+38>: push $0x2e575757
0x804831f <main+43>: mov %esp,%ecx
0x8048321 <main+45>: mov $0x10,%dl
0x8048323 <main+47>: mov $0x4,%al
0x8048325 <main+49>: int $0x80
0x8048327 <main+51>: xor %eax,%eax
0x8048329 <main+53>: xor %ebx,%ebx
0x804832b <main+55>: mov $0x1,%al
0x804832d <main+57>: int $0x80
End of assembler dump.

Our code begins at the instruction at 0x8048304 (main+16) and terminates with the instruction at 0x804832d (main+57).

To obtain the opcodes, examine the bytes one at a time as follows.

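(gdb) x/bx main+16
0x8048304 <main+16>: 0x31 <- OPCODE
(gdb)
0x8048305 <main+17>: 0xc0 <- OPCODE
(gdb)
0x8048306 <main+18>: 0x31 <- OPCODE
....
and so on, up to the final byte of the last int $0x80 instruction, which starts at 0x804832d.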

Now we have to write all the bytes out in this pattern: "\x31\xc0\x31..".
"\x31\xc0\x31\xdb\x31\xd2\x50\x68\x2e\x4f"
"\x52\x47\x68\x45\x4c\x4c\x4f\x68\x52\x4f"
"\x53\x49\x68\x57\x57\x57\x2e\x89\xe1\xb2"
"\x10\xb0\x04\xcd\x80\x31\xc0\x31\xdb\xb0"
"\x01\xcd\x80"

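To compile and execute the shellcode you can wrap it in a C program like the following. Note that the bytes are 32-bit x86 code stored in a data array; on current Linux systems, where data memory is not executable by default, additional build options such as gcc's -m32 and -z execstack may be needed before it will run.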

angelo@rosiello.org:~shellcode$ cat shellcode.c

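#include <stdio.h>

char shellcode[] =
"\x31\xc0\x31\xdb\x31\xd2\x50\x68\x2e\x4f"
"\x52\x47\x68\x45\x4c\x4c\x4f\x68\x52\x4f"
"\x53\x49\x68\x57\x57\x57\x2e\x89\xe1\xb2"
"\x10\xb0\x04\xcd\x80\x31\xc0\x31\xdb\xb0"
"\x01\xcd\x80";

int main(void)
{
    void (*routine)(void);

    /* Treat the byte array as a function and jump to it. */
    routine = (void (*)(void)) shellcode;
    printf("Size: %zu bytes.\n", sizeof(shellcode));
    routine();
    return 0;   /* not reached: the shellcode calls exit(0) */
}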

angelo@rosiello.org:~shellcode$ gcc shellcode.c -o shellcode
angelo@rosiello.org:~shellcode$ ./shellcode
Size: 44 bytes.
WWW.ROSIELLO.ORG

Conclusions
Writing shellcode isn't difficult, but you will need patience and practice to become skilled at it. Shellcoding matters mainly in low-level work. For example, if you want to write an exploit you will need shellcode to make the exploited program execute the code you want. Personally, I think that anyone interested in computer security should know these basic concepts and theories, which support the search for new bugs and new exploitation techniques.


http://www.rosiello.org
contact: angelo@rosiello.org















Unless otherwise noted, all paper copyrights are owned by the author. The rest copyright 2003-2005 TechTarget
