Identifying Source Code

Time for another analysis! This time it is about identifying source code: how to verify whether someone is using other people's source code. You do not necessarily have to add watermarks to your software, as I have discussed previously, or use "tracking" systems such as AV Tracker. The binary tells more than enough when other source code is used. Cheers from Vienna,

Peter Kleissner, Software Architect

Background story: The past months

Currently updating :)

Ikarus and Bochs debugger

Important: For the following analysis you just need a computer and internet access. No special knowledge is required (it is just opening a file in Notepad and googling for the text).

Long before I was at Ikarus I worked on my operating system, ToasterOS. For that reason I used, and still use, Bochs, an Intel architecture emulator, together with its command-line debugger. Before summer 2008 I applied at Ikarus, because they were looking for someone with "x86/x64 Assembler" skills.

Before I came to Ikarus I inspected their products. Here is what they say about them (it is important for the background story):

Pattern / checksums / heuristics / simulators / behavior-based analysis
[...]
Uses simulator and sandbox techniques in a supporting role to enable improved heuristic detection.

Part 1: Back in 2008, before I came to Ikarus, I took a short look at their binaries. Let's download an old copy of "Ikarus T3 Scan" from mid 2007, which should be somewhat close to the version I had in May 2008. Do not forget: if you want to use Ikarus (I do not use it) you must buy a valid license. Let's look at the files alphabetically. The first one in the self-extracting zip archive is "default.t3p". Just open it in notepad.exe (or you can also use Microsoft Word :). Scroll down and look for the ASCII strings.
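If scrolling through Notepad gets tedious, the same search can be scripted. The following is only a minimal sketch, not the tool used for this analysis: the function name ascii_strings, the minimum length of 4 and the sample bytes are assumptions made for illustration. It lists every run of printable ASCII characters together with its offset in the data.

```python
import re

def ascii_strings(data: bytes, min_len: int = 4):
    """Yield (offset, text) for each run of printable ASCII of at least min_len bytes."""
    # 0x20..0x7e is the printable ASCII range; min_len is an arbitrary cutoff.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")

# Hypothetical sample bytes standing in for the contents of a real binary.
sample = b"\x00\x01cmos\x00CMOS RTC\x00\xff\xfeab\x00Bios Info Port\x00"
found = list(ascii_strings(sample))
for offset, text in found:
    print(f"{offset:08X}: {text!r}")
```

To run it against a real file, read the file in binary mode (for example open("default.t3p", "rb").read()) and pass the bytes in place of the sample. The offsets are file offsets, so they will not necessarily equal the addresses a disassembler shows.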

Part 2: Long before Ikarus I used Bochs and also sent multiple patches to the developers, so I know the Bochs source very well. You can check it out, my SF user name is < removed >. Software is not something you can argue about: if you use other source code, proprietary or not, that source can be identified through string analysis and disassembly.

Here are the exact strings with their positions:

[...]
1107DCF8: 'cmos',0
1107DD00: 'CMOS RTC',0
1107DD0C: 'CMOS RAM',0
1107DD38: 'VGABios Info Port',0
1107DD4C: 'VGABios Debug Port',0
1107DD60: 'VGABios Panic Port 2',0
1107DD78: 'VGABios Panic Port 1',0
1107DD90: 'Bios Info Port (legacy)',0
1107DDA8: 'Bios Info Port',0
1107DDB8: 'Bios Debug Port',0
1107DDC8: 'Bios Panic Port 2',0
1107DDDC: 'Bios Panic Port 1',0
[...]

Now compare them to http://bochs.sourceforge.net/cgi-bin/lxr/source/iodev/biosdev.cc (open source!):

DEV_register_iowrite_handler(this, write_handler, 0x0400, "Bios Panic Port 1", 3);
DEV_register_iowrite_handler(this, write_handler, 0x0401, "Bios Panic Port 2", 3);
DEV_register_iowrite_handler(this, write_handler, 0x0402, "Bios Info Port", 1);
DEV_register_iowrite_handler(this, write_handler, 0x0403, "Bios Debug Port", 1);

DEV_register_iowrite_handler(this, write_handler, 0x0500, "VGABios Info Port", 1);
DEV_register_iowrite_handler(this, write_handler, 0x0501, "VGABios Panic Port 1", 3);
DEV_register_iowrite_handler(this, write_handler, 0x0502, "VGABios Panic Port 2", 3);
DEV_register_iowrite_handler(this, write_handler, 0x0503, "VGABios Debug Port", 1);
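The comparison itself can also be automated. This is a rough sketch under stated assumptions, not a definitive detection method: the string list is copied from the biosdev.cc excerpt above, while the function name match_known_strings and the sample blob are hypothetical. It reports which of the known literals occur in the data as a complete printable-ASCII run, so that "Bios Info Port" is not counted merely because it is contained in "VGABios Info Port".

```python
import re

# String literals copied from the biosdev.cc excerpt quoted above.
BOCHS_STRINGS = [
    "Bios Panic Port 1",
    "Bios Panic Port 2",
    "Bios Info Port",
    "Bios Debug Port",
    "VGABios Info Port",
    "VGABios Panic Port 1",
    "VGABios Panic Port 2",
    "VGABios Debug Port",
]

def match_known_strings(data: bytes, known):
    """Return {text: offset} for each known string found as a whole printable-ASCII run."""
    runs = {}
    for match in re.finditer(rb"[\x20-\x7e]+", data):
        # Keep only the first offset at which each run appears.
        runs.setdefault(match.group().decode("ascii"), match.start())
    return {text: runs[text] for text in known if text in runs}

# Hypothetical blob standing in for the contents of a real binary.
blob = (b"\x00VGABios Info Port\x00\x00Bios Info Port\x00\x00"
        b"Bios Panic Port 1\x00unrelated\x00")
hits = match_known_strings(blob, BOCHS_STRINGS)
print(f"{len(hits)} of {len(BOCHS_STRINGS)} known strings found")
```

A single hit proves little, since short strings can coincide by chance. It is a whole cluster of identical literals, stored next to each other, that points to a common source.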

I am sure every developer in the world now knows what I want to say.

References