Microsoft's Rich Header


I also want to talk about Microsoft's Rich Header. The Rich Header is a structure in PE files located between the DOS Header and the NT Header (that is, between the DOS stub and the PE Header). It contains version information for the linked libraries and for the linker.

The Rich Header

It is created by the Microsoft build tools and usually looks like the following:

00000080  54 62 EF 9B 10 03 81 C8 10 03 81 C8 10 03 81 C8  Tb..............
00000090  37 C5 EF C8 11 03 81 C8 37 C5 FC C8 12 03 81 C8  7.......7.......
000000A0  37 C5 FA C8 0B 03 81 C8 10 03 80 C8 C9 03 81 C8  7...............
000000B0  37 C5 EC C8 33 03 81 C8 37 C5 FD C8 11 03 81 C8  7...3...7.......
000000C0  37 C5 F9 C8 11 03 81 C8 52 69 63 68 10 03 81 C8  7.......Rich....

This is the Rich Header of notepad.exe. The start offset is 80h, directly after the DOS stub. The PE Header follows immediately after the Rich Header. You may notice a repeating pattern (10 03 81 C8), which comes from the fact that the header is xor-encrypted: wherever the plain value is zero, the key itself shows through.


The header is encrypted with a simple xor operation. The dword used as the xor key is stored right after the "Rich" keyword at the end of the header (here 10 03 81 C8). After xoring, the header looks like the following:

00000080  44 61 6e 53 00 00 00 00 00 00 00 00 00 00 00 00  DanS............
00000090  27 c6 6e 00 01 00 00 00 27 c6 7d 00 02 00 00 00  '.n.....'.}.....
000000A0  27 c6 7b 00 1b 00 00 00 00 00 01 00 d9 00 00 00  '.{.............
000000B0  27 c6 6d 00 23 00 00 00 27 c6 7c 00 01 00 00 00  '.m.#...'.|.....
000000C0  27 c6 78 00 01 00 00 00 52 69 63 68 10 03 81 c8  '.x.....Rich....
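
The decoding step can be reproduced in a few lines. The following is a minimal Python sketch, not a full PE parser: it assumes the buffer starts exactly at the beginning of the Rich Header (so the key stays dword-aligned), and the names RAW and xor_decode are made up for this illustration. The bytes are copied from the notepad.exe dump above.

```python
# Raw Rich Header bytes from the notepad.exe dump (file offsets 0x80-0xCF).
RAW = bytes.fromhex(
    "5462EF9B100381C8100381C8100381C8"
    "37C5EFC8110381C837C5FCC8120381C8"
    "37C5FAC80B0381C8100380C8C90381C8"
    "37C5ECC8330381C837C5FDC8110381C8"
    "37C5F9C8110381C852696368100381C8"
)

def xor_decode(raw: bytes) -> bytes:
    """Xor every byte before "Rich" with the 4-byte key stored after "Rich".

    Assumption: raw begins at the first dword of the Rich Header.
    """
    end = raw.index(b"Rich")        # the terminator is stored unencrypted
    key = raw[end + 4:end + 8]      # the dword right after "Rich"
    body = bytes(b ^ key[i % 4] for i, b in enumerate(raw[:end]))
    return body + raw[end:]         # "Rich" and the key stay as they are

decoded = xor_decode(RAW)
print(decoded[:4])                  # the identification marker
print(decoded[16:24].hex(" "))      # first compid/count pair
```

Because xor is its own inverse, running xor_decode on the decoded bytes gives back the original dump.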

We recognize the first dword as "DanS", which seems to refer to Dan Ruder, Mechanics of Dynamic Linking at Microsoft (in 1993).

The Format

The format of the Rich structure, as given in the documentation by "lifewire":

'DanS'^b, b, b, b      -- identification block / header

compid^b, r^b          -- from 0
..                     --      :
compid^b, r^b          -- to   n

'Rich', b              -- terminator


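All values are dwords. b is the xor value, which is also a checksum; ^ denotes the xor operation; compid is the compiler id, and r is the counter stored with it (how often that compid was used). A minimal Rich Header has a size of 8*4 = 32 bytes: four dwords for the identification block, two for at least one compiler id entry, and two for the terminator. The header can be verified by checking for both the "Rich" keyword and the "DanS" keyword.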
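
To make the layout concrete, here is a hedged Python sketch of a parser for it. It assumes the buffer starts at the xored 'DanS' dword and that all dwords are little-endian; the names parse_rich and DANS are invented for this example and do not come from any library.

```python
import struct

DANS = 0x536E6144  # the bytes "DanS" read as a little-endian dword

def parse_rich(raw: bytes):
    """Parse a buffer that starts at the xored 'DanS' dword.

    Returns (b, [(compid, r), ...]).
    """
    end = raw.find(b"Rich")                       # terminator is not xored
    if end < 16 or (end - 16) % 8 or len(raw) < end + 8:
        raise ValueError("malformed Rich Header")
    (b,) = struct.unpack_from("<I", raw, end + 4)  # xor value after "Rich"
    dwords = [d ^ b for d in struct.unpack_from(f"<{end // 4}I", raw)]
    if dwords[0] != DANS or any(dwords[1:4]):
        raise ValueError("identification block is not DanS, 0, 0, 0")
    # After the 4-dword identification block come (compid, r) pairs.
    return b, list(zip(dwords[4::2], dwords[5::2]))

# Build a minimal 8*4-byte header with one entry, then parse it again.
key = 0xC8810310
plain = [DANS, 0, 0, 0, 0x0078C627, 1]
raw = struct.pack("<6I", *(d ^ key for d in plain)) + b"Rich" + struct.pack("<I", key)
print(len(raw), parse_rich(raw))
```

The sketch checks both keywords, as suggested above, and rejects a buffer whose entry area is not a whole number of (compid, r) pairs.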

The compiler id is the linker version used for building the library. The Rich Header contains one compid value for every library version linked into the file. The last compid value is the version of the linker used for linking the file itself. Each value is stored as a double word: the low word contains the build number, and the high word contains the major number (low 4 bits) and the subversion number (high 4 bits).
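
The field layout described above can be sketched as a small Python helper. This follows the article's description only; reading "high 4 bits" as bits 4 to 7 of the high word is an assumption of this sketch, and split_compid is a hypothetical name.

```python
def split_compid(compid: int):
    """Split a compid dword into (build, major, sub) as described in the text.

    Assumption: major is bits 0-3 and the subversion is bits 4-7
    of the high word.
    """
    build = compid & 0xFFFF        # low word: build number
    high = (compid >> 16) & 0xFFFF # high word: version fields
    major = high & 0xF
    sub = (high >> 4) & 0xF
    return build, major, sub

print(split_compid(0x0078C627))    # (50727, 8, 7)
```

For 0x0078C627 the low word 0xC627 is 50727 in decimal, and the low 4 bits of the high word give 8.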

The checksum and xor value b is calculated in these steps (also documented by "lifewire"):

b=sizeof(dos_stub)              // (almost always 0x80)

for (int i = 0; i < sizeof(dos_stub); i++)
    b += dos_stub[i] ROL i;     // ROL is the x86 rotate-left operation.

for (int i = 0; i < n; i++)
    b += compid[i] ROL r[i];

The first loop computes a checksum over the complete DOS stub, the second over all library versions.
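
A runnable Python version of the pseudocode might look like the sketch below. It follows the two loops literally and assumes 32-bit arithmetic with the rotate count taken modulo 32, as the x86 ROL instruction does for dwords. The names rol32 and rich_checksum are made up here, and any special handling real files may need (for example of the e_lfanew field inside the DOS Header) is not covered.

```python
MASK = 0xFFFFFFFF

def rol32(value: int, count: int) -> int:
    """Rotate a 32-bit value left by count bits (count taken modulo 32)."""
    count &= 31
    value &= MASK
    return ((value << count) | (value >> (32 - count))) & MASK

def rich_checksum(dos_stub: bytes, entries) -> int:
    """Compute b from the DOS stub bytes and a list of (compid, r) pairs."""
    b = len(dos_stub)                     # b = sizeof(dos_stub)
    for i, byte in enumerate(dos_stub):   # first loop: the DOS stub
        b = (b + rol32(byte, i)) & MASK
    for compid, r in entries:             # second loop: library versions
        b = (b + rol32(compid, r)) & MASK
    return b

# Tiny synthetic input: 2 + (1 ROL 0) + (2 ROL 1) + (3 ROL 1) = 13
print(rich_checksum(bytes([1, 2]), [(3, 1)]))
```

To validate a real file, you would pass the bytes that precede the Rich Header and the decoded entry list, then compare the result with the dword stored after "Rich".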

As described previously, the compid value is the linker version. The linker takes the @comp.id values of all statically linked libraries (for example kernel32.lib) and stores them in a list. At the end of the list, the linker puts its own version. You can look directly into a lib file and search for that symbol name; the dword that follows it is the linker version (compiler id).
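
Looking into a lib file can be automated with a simple byte search. The Python sketch below is a rough illustration, not a COFF parser: it assumes the symbol name is the 8-byte string "@comp.id", that the value follows it directly as a little-endian dword, and find_comp_ids is a hypothetical helper name.

```python
import struct

def find_comp_ids(data: bytes):
    """Return the sorted, distinct dwords that directly follow "@comp.id"."""
    marker = b"@comp.id"
    found = set()
    pos = data.find(marker)
    while pos != -1:
        if pos + len(marker) + 4 <= len(data):   # room for the dword?
            (value,) = struct.unpack_from("<I", data, pos + len(marker))
            found.add(value)
        pos = data.find(marker, pos + 1)
    return sorted(found)

# Synthetic example; for a real file you would read its bytes instead,
# e.g. open("example.lib", "rb").read() (hypothetical path).
blob = b"\x00\x00@comp.id" + struct.pack("<I", 0x0078C627) + b"\xff\xff"
print([hex(v) for v in find_comp_ids(blob)])
```

A plain byte search can produce false hits if the same string appears elsewhere in the file, so treat the results as candidates to check by hand.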

Using the last compid value in the list, you can determine the compiler version, linker version and development environment used for building the application. For example, compiler id 0x0078C627 (the last entry in the notepad.exe dump above, stored little-endian as 27 c6 78 00) means "Version 8.00.50727, Microsoft Visual Studio 2005". Note that different compiler versions produce different, unique last compid values. The build number can be used to match the compiler (cl.exe) and linker (link.exe) versions that belong together: the two tools share the same build number, while their major/minor version numbers differ.

You can create and use a translation table (value to compiler/linker version) to use the Rich Header effectively.
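
Such a translation table can be as simple as a dictionary. The Python sketch below contains only the single mapping given in this article; COMPID_TABLE and describe_compid are hypothetical names, and further entries would have to be collected from known builds.

```python
# Translation table: compid value -> compiler/linker version.
# Only the example from the text is filled in.
COMPID_TABLE = {
    0x0078C627: "Version 8.00.50727, Microsoft Visual Studio 2005",
}

def describe_compid(compid: int) -> str:
    """Look up a compid; fall back to a hex dump of the value if unknown."""
    return COMPID_TABLE.get(compid, f"unknown compid 0x{compid:08X}")

print(describe_compid(0x0078C627))
print(describe_compid(0x00010000))
```

Returning a readable fallback instead of raising an error keeps the lookup usable on files built with tools that are not in the table yet.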