Compiling Shellcode

Exploits are available for many kinds of software, but only a few black hats know how to write their own shellcode. Writing your own shellcode is not that difficult; it is simply assembly language that talks directly to the Windows subsystem. Shellcode is the code that gets executed as the result of a bug in a piece of software (the exploit). It usually loads the real software, so it acts much like a dropper.

Peter Kleissner, Software developer


Shellcode is written in assembly language and should be as small as possible. A typical exploit only leaves room for a small buffer (a few hundred bytes) of code to execute, and that code must download the real executable from the internet and run it. The location of the shellcode in memory is generally unknown, so the code must not use absolute addresses in branches or data references.

The code still has to use the Windows API, so it must resolve the addresses of the functions it calls by itself (for normal executables, the PE format and the Windows PE loader take care of this). Windows provides the Thread Information Block (TIB), which holds pointers to important structures and is always located at the base of the fs segment (address [fs:0]).

[fs:30h]   -> address of the Process Environment Block (PEB)
[PEB + 12] -> address of PEB_LDR_DATA, whose module lists link LDR_DATA_TABLE_ENTRY records:

typedef struct _LDR_DATA_TABLE_ENTRY {
    PVOID Reserved1[2];
    LIST_ENTRY InMemoryOrderLinks;
    PVOID Reserved2[2];
    PVOID DllBase;
    PVOID EntryPoint;
    PVOID Reserved3;
    UNICODE_STRING FullDllName;
    BYTE Reserved4[8];
    PVOID Reserved5[3];
    union {
        ULONG CheckSum;
        PVOID Reserved6;
    };
    ULONG TimeDateStamp;
} LDR_DATA_TABLE_ENTRY, *PLDR_DATA_TABLE_ENTRY;

Reserved2 is in reality InInitializationOrderLinks, which chains the entry into InInitializationOrderModuleList, a doubly linked list of all modules (DLLs etc.) in their initialization order. Under Windows 2000, XP, Server 2003, Vista and Server 2008 the first module is normally ntdll.dll and the second kernel32.dll. For Windows 7 this no longer holds (kernel32.dll is not the second module loaded).

With InInitializationOrderModuleList you can find the base addresses of the loaded modules, including kernel32.dll. You can then iterate over all exports of a module and look for the function you need. This is usually done by calculating a 32-bit hash of each export name and comparing it against the precomputed hash of the wanted function. The most common hash algorithms rotate the running hash by 7 or 13 bits for each character and XOR the character into it. I have a useful macro for generating the hash with the Netwide Assembler (NASM):

%macro HASH 1.nolist
; hash generation algorithm (hash = hash rotate left 7 ^ char)
  %assign i 1     ; i = 1
  %assign h 0     ; h = 0
  %strlen len %1      ; len = strlen(%1)
  %rep len
    %substr char %1 i ; fetch next character
    %assign h   (((h << 7) & 0xFFFFFF80) + ((h >> (32-7) ) & 0x7F)) ^ char
    %assign i i+1     ; increment i
  %endrep
  dd h            ; only hash
%endmacro

HASH 'CreateProcessA'   ; will be 46318AC7h
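
To make the hash easier to check outside the assembler, here is a minimal C++ sketch of the same rotate-left-7 and XOR scheme. The function names (rotl7, hash_name, find_by_hash, check_known_hash) are made up for this illustration, and the lookup runs over a plain array of sample strings rather than a real export table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Rotate a 32-bit value left by 7 bits.
inline std::uint32_t rotl7(std::uint32_t v)
{
    return (v << 7) | (v >> (32 - 7));
}

// hash = rotl(hash, 7) ^ char, applied to every character of the name.
inline std::uint32_t hash_name(const char* name)
{
    std::uint32_t h = 0;
    for (; *name != '\0'; ++name)
        h = rotl7(h) ^ static_cast<unsigned char>(*name);
    return h;
}

// Hypothetical lookup: index of the first name whose hash equals `wanted`,
// or -1 if none matches. `names` stands in for a list of export names.
inline int find_by_hash(const char* const* names, std::size_t count,
                        std::uint32_t wanted)
{
    for (std::size_t i = 0; i < count; ++i)
        if (hash_name(names[i]) == wanted)
            return static_cast<int>(i);
    return -1;
}

// Self-check against the value quoted for the NASM macro.
inline void check_known_hash()
{
    assert(hash_name("CreateProcessA") == 0x46318AC7u);
}
```

Comparing 32-bit hashes instead of strings keeps the lookup data at four bytes per function, which matters when the whole code has to fit into a few hundred bytes.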

I only want to discuss the technical principles here. For a real-world example, you can review the Conficker shellcode (which installs Conficker through exploits) on page 13 of The Conficker Report.

Testing the shellcode

When the shellcode is written in assembler and integrated as a binary, it needs to be tested. Once you have assembled the raw assembly code into a binary, you can use the Perl script to convert the binary into a C++ array (which you can then include in your C++ code).

perl Shellcode.bin 1 > Shellcode.cpp
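
The Perl script itself is not reproduced here. As a rough sketch of what such a conversion involves, the following C++ helper renders a byte buffer as the text of an array definition. The names to_cpp_array and check_two_bytes and the 12-bytes-per-line layout are assumptions made for this example; only the array name bin_data1 comes from the code in this article. Reading the input file is left out, and the buffer is assumed to be non-empty.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: renders raw bytes as the text of a C++ array
// definition. Assumes `bytes` is not empty.
inline std::string to_cpp_array(const std::vector<unsigned char>& bytes,
                                const std::string& name)
{
    std::string out = "unsigned char " + name + "[] = {";
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        // start a new indented line every 12 bytes
        out += (i % 12 == 0) ? "\n    " : " ";
        char cell[8];
        std::snprintf(cell, sizeof cell, "0x%02X,",
                      static_cast<unsigned>(bytes[i]));
        out += cell;
    }
    out += "\n};\n";
    return out;
}

// Small self-check with two arbitrary bytes.
inline void check_two_bytes()
{
    const std::vector<unsigned char> sample{0x90, 0xC3};
    assert(to_cpp_array(sample, "bin_data1") ==
           "unsigned char bin_data1[] = {\n    0x90, 0xC3,\n};\n");
}
```

The trailing comma after the last byte is valid in a C++ initializer list, so the generated text compiles without special handling of the final element.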

You can then include it in your C++ file and call it through a function pointer that points to the array:

#include "Shellcode.cpp"

    DWORD RValue;   // receives the previous protection flags

    void (*Shellcode)(void * Variable);
    Shellcode  = (void (*)(void * Variable))(PVOID)&bin_data1;

    VirtualProtect (&bin_data1, 4096, PAGE_EXECUTE_READWRITE, &RValue);

Be sure to set the necessary execution rights for the page before calling the shellcode; otherwise Data Execution Prevention (DEP), if enabled, will raise an exception.