The Unofficial Newsletter of Delphi Users - by Robert Vivrette
Self-Modifying Code With Delphi
by Marcus Mönnig - email@example.com
Many people who use high-level languages like Object Pascal do not know much about what happens to their code when they click Compile in Delphi. If you have a basic knowledge of assembler and of the exe file format, the comments in the source code should make everything pretty clear. For everyone else, I will try to explain what is done in the source code.
My own knowledge of assembler and the exe format is limited, and I learned most of it while looking for information about piracy protection and how to implement self-modifying code myself. I wrote this article because I found very little information on this subject, so I put together everything I found in order to share it. Also, English is not my native language, so please excuse any spelling and grammatical mistakes.
Self-modifying code with Delphi
What is it?
Normally, we modify our code at design time, inside Delphi, before the code is compiled. We all know this.
Sometimes compiled code gets modified as well, e.g. a patch might be applied to a (non-running) exe file to change the original exe. This is often used when applications are distributed widely and the users want to update to a newer version. To save download time and to spare the user from reinstalling the whole application, only the differences between two versions of an exe file are distributed in a patch file and applied to the old version of the exe. Another example of patch files are "cracks": little com or exe files that remove built-in limitations (evaluation time limits, etc.) from applications.
These two kinds of code modification are obviously done before the exe is executed. When an exe file is executed, the file gets loaded into memory. The only way to affect the behavior of the program after this point is to modify the memory where the exe now resides.
A program that modifies itself while it is running, by changing its own code in memory, uses "self-modifying code".
Why is it bad?
Self-modifying code makes debugging harder, since there is a difference between what is in memory and what the debugger thinks is in memory.
Self-modifying code also has a bad reputation, especially because its most prominent use is in viruses, which play all kinds of hide-and-seek tricks with it. This also means that if you use self-modifying code, a virus checker may complain about your application.
Why is it good?
Self-modifying code makes debugging harder. While this is bad if you want to debug your own code, it is good for preventing others from debugging your code, or at least for making it harder for them. This is why self-modifying code can be an effective part of a piracy protection scheme. It won't prevent an application from being cracked, but wise use of this technique can make cracking very hard.
What functions are needed?
In a Windows environment we can make use of the following API calls:
ReadProcessMemory
This function is used, well, to read the memory of a process. Since this article is about _self_-modifying code, we will always use this function on our own process only.
WriteProcessMemory
Used for writing data to a process's memory.
VirtualProtect
Used to change the access protection of a region in memory.
To learn more about these functions, refer to the Win32 help file that ships with Delphi and take a look at how they are used in the sample code.
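To give a rough idea of how the access protection comes into play, here is a minimal sketch of patching a single byte in our own process. It is not the ModifyCode procedure from the sample code: PatchByte is a hypothetical helper name, and for brevity it writes through a pointer instead of calling WriteProcessMemory. It assumes a Win32 Delphi project and uses VirtualProtect, FlushInstructionCache and GetCurrentProcess from the Windows unit.

```pascal
program PatchDemo;
{$APPTYPE CONSOLE}

uses
  Windows;

// Hypothetical helper (not part of the article's sample code): makes the
// memory at Address writable, stores NewValue there and restores the
// previous protection. Returns False if the protection cannot be changed.
function PatchByte(Address: PByte; NewValue: Byte): Boolean;
var
  OldProtect, Dummy: DWORD;
begin
  Result := False;
  // Code pages are normally not writable, so ask for write access first.
  if not VirtualProtect(Address, 1, PAGE_EXECUTE_READWRITE, OldProtect) then
    Exit;
  Address^ := NewValue;
  // Put the original protection back.
  VirtualProtect(Address, 1, OldProtect, Dummy);
  // Make sure the processor does not keep executing a stale copy.
  FlushInstructionCache(GetCurrentProcess, Address, 1);
  Result := True;
end;

var
  Sample: Byte = $75;  // stand-in for a byte of code, for demonstration only

begin
  if PatchByte(@Sample, $74) then
    WriteLn('Patched value: ', Sample)
  else
    WriteLn('VirtualProtect failed');
end.
```

Note that VirtualProtect always works on whole memory pages, so asking for one byte changes the protection of the entire page that contains it.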
What does the example code do?
The code that will be modified is inside the CallModifiedCode procedure:
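procedure TForm1.CallModifiedCode(Sender: TObject);
label 1;
var b: Boolean; c: TColor;
begin
  c := clGreen;
  b := True;
  if b then goto 1;
  asm nop; nop; nop; nop; nop; nop end;  // six nops used as a marker
  c := clRed;
1:
  Form1.Color := c;
end;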
After studying the code you might be puzzled about some things. Obviously, this code sets the color of Form1, but as it stands, the color will always be green: since b is always true, execution always jumps to label 1, and the line c := clRed; before it never gets executed.
However, there is another procedure in the program that will change the line if b then goto 1; to if NOT(b) then goto 1; while the program is running. After this modification in memory is done and CallModifiedCode is called again, the form will actually turn red. Note that we will not change the boolean value of b, but in effect insert a "NOT" into the if statement.
Surely you noticed the six nops. "nop" is an assembler instruction that means "no operation", so these six instructions do nothing at all. Six nops in a row are quite unusual in a compiled exe, so we will use them as a marker for the position of the if statement above inside the compiled exe.
To understand how we will modify the code, we need to take a look at what the compiler makes of our Pascal code. You can do this by running the project from Delphi, setting a breakpoint on the line with the if statement and, once you have called the CallModifiedCode procedure by clicking the button and the debugger has stopped execution, opening the CPU window from Delphi's debug menu. You will see something like this:
807DFB00 cmp byte ptr [ebp-$05],$00
750D     jnz TForm1.CallModifiedCode + $2A
90       nop
90       nop
90       nop
90       nop
90       nop
90       nop
Well, we can clearly see the 6 nops we placed in our code. The first two lines of the listing are the assembler code of the if statement. The first line compares a value (as we know from the Pascal code, this has to be the boolean value of b) with $00, the hexadecimal notation of 0, which in the case of a boolean variable means false.
The second line starts with jnz, which means "jump if not equal" (technically, "jump if not zero"), followed by the address to jump to if the values compared in line one are not equal. So, the first two lines mean: "Compare the value of variable b with 0 (false) and, if they are not equal, jump away."
Note the hexadecimal values to the left of the asm code above. Each assembler instruction is encoded as one or more bytes, shown here in hexadecimal. Obviously, $90 means "nop". $75 means "jnz", and it is followed by the distance to jump ($0D in this case), counted from the end of the jnz instruction. $80 starts the "cmp" here: $80 is shared by several byte operations that take a constant operand, and the bytes after it select "cmp" and specify what is compared and how. This hexadecimal representation of the assembler instructions is what makes up the exe. If you have a hex editor, load the compiled exe and search for "909090909090". You will quickly find it, and you will notice that the bytes just before it are identical to the values above.
So, coming back to our task: if we want to insert "NOT" into our if statement, we need to replace "jnz" with "jz". "jz" means "jump if zero" or "jump if equal". Replacing "jnz" with "jz" reverses the condition of the original if statement, so once this modification is made the jump will not be taken, the line c := clRed; will be executed and the form will turn red. As I said, "jnz" is represented by the hexadecimal value $75. The hexadecimal value for "jz" is $74.
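As a small worked example of the relative jump address, the following sketch recomputes the jump target shown in the CPU window. ShortJumpTarget is a hypothetical helper name, and the offset $1B for the jnz instruction is not shown in the CPU window; it is inferred here from the displayed target $2A and the distance $0D.

```pascal
program JumpTarget;
{$APPTYPE CONSOLE}

// A short conditional jump such as jnz ($75) is two bytes long: the opcode
// and a signed 8-bit distance. The distance is counted from the first byte
// after the jump instruction.
function ShortJumpTarget(InstrOffset: Integer; Rel8: ShortInt): Integer;
begin
  Result := InstrOffset + 2 + Rel8;
end;

begin
  // Assumption: jnz sits at TForm1.CallModifiedCode + $1B (inferred).
  // $1B + 2 + $0D = $2A, the target the CPU window displays.
  WriteLn(ShortJumpTarget($1B, $0D));  // prints 42, i.e. $2A
end.
```

The distance $0D (13 bytes) is what it takes to skip the six nops and the instruction that stores clRed in c.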
Let's summarize what we have to do to change "if b then goto 1;" to "if NOT(b) then goto 1;": Locate $909090909090 in memory. From this position, go back two bytes and replace $75 with $74. If we want to go back to the original code, we do the same, but replace $74 with $75.
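The steps just summarized can be sketched on a plain byte array, which makes the search-and-replace logic easy to follow without touching live memory. This is only an illustration and not the ModifyCode procedure from the sample code: FindMarker, ToggleJump and the Code array are hypothetical names, and the array holds the bytes from the CPU window plus one trailing byte.

```pascal
program ToggleDemo;
{$APPTYPE CONSOLE}

const
  MarkerLen = 6;   // six nops in a row
  OpNop = $90;
  OpJnz = $75;
  OpJz  = $74;

// Returns the index of the first run of six $90 bytes, or -1 if none exists.
function FindMarker(const Buf: array of Byte): Integer;
var
  i, j: Integer;
  Match: Boolean;
begin
  Result := -1;
  for i := 0 to High(Buf) - MarkerLen + 1 do
  begin
    Match := True;
    for j := 0 to MarkerLen - 1 do
      if Buf[i + j] <> OpNop then
      begin
        Match := False;
        Break;
      end;
    if Match then
    begin
      Result := i;
      Exit;
    end;
  end;
end;

// Goes back two bytes from the marker and swaps jnz with jz or vice versa.
// Returns False if the marker or the expected opcode is not found.
function ToggleJump(var Buf: array of Byte): Boolean;
var
  p: Integer;
begin
  Result := False;
  p := FindMarker(Buf);
  if p < 2 then
    Exit;
  case Buf[p - 2] of
    OpJnz: Buf[p - 2] := OpJz;
    OpJz:  Buf[p - 2] := OpJnz;
  else
    Exit;
  end;
  Result := True;
end;

var
  // cmp, jnz and the six nops from the listing, plus one following byte
  Code: array[0..12] of Byte =
    ($80, $7D, $FB, $00, $75, $0D, $90, $90, $90, $90, $90, $90, $C7);

begin
  if ToggleJump(Code) then
    WriteLn(Code[4]);  // prints 116, i.e. $74 (jz)
end.
```

Calling ToggleJump a second time restores $75, which corresponds to switching back to the original code. In the running program the same logic has to be applied to the memory of the process, after the access protection of that memory has been changed.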
This is what is done in procedure TForm1.ModifyCode. I won't go into further detail here, but the sample source code that accompanies this article has lots of comments. After calling ModifyCode by clicking one of the two buttons on the right, click the "Execute code" button again and open the CPU view in Delphi to see that $75 was actually replaced with $74 or vice versa.
There are easier ways to set the color of a form depending on which button was clicked ;-), but of course the purpose here is to demonstrate the concept of self-modifying code. Self-modifying code is a powerful technique, and the example code may be useful when implementing a piracy protection scheme.
Finally, a small warning: You should take care when using a series of assembler nops as a marker in real-world applications, as this kind of unused code section can be a nest for some viruses, e.g. the