If you’ve been through the [Worldmail exploit write-up]({% post_url 2020-05-09-worldmail-exploit %}) or spent any time developing shellcode, you’ve run into bad characters. Null bytes that kill your TCP connection. Characters that get mangled by string functions before they ever reach your buffer. Values that simply don’t survive the journey from your machine to the target.
Encoders are how you get around that.
The idea is straightforward: encode your shellcode before sending it, and prepend a small decoder stub that runs first on the target and decodes it back before handing off execution. The decoder runs, your shellcode is restored, and the bad characters never had to exist in the payload.
The simplest encoder uses XOR — and that’s what this post builds from scratch.
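To make the idea concrete, here is a minimal Python sketch of the attacker-side half: spot the bad characters, then XOR them away. The bad-character set, the sample bytes and the helper names (`find_bad_bytes`, `xor_bytes`) are assumptions invented for this illustration, not values from any real target.

```python
# Hypothetical bad characters for an imaginary target: null, LF, CR.
BAD_CHARS = bytes([0x00, 0x0A, 0x0D])

def find_bad_bytes(payload: bytes, bad: bytes = BAD_CHARS) -> list:
    """Return the offset of every byte in payload that is a bad character."""
    return [i for i, b in enumerate(payload) if b in bad]

def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)

# Made-up stand-in for shellcode; it contains a null byte and a line feed.
raw = bytes([0x31, 0xC0, 0x00, 0x50, 0x0A, 0x90])
encoded = xor_bytes(raw, 0x0F)

print(find_bad_bytes(raw))      # offsets of bad bytes before encoding
print(find_bad_bytes(encoded))  # offsets of bad bytes after encoding
```

One thing to watch: any payload byte equal to the key encodes to 0x00, so a key has to be checked against the payload it is encoding rather than picked blindly.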
Why XOR?
XOR has a useful property: if you XOR the same data with the same key twice, you get back to where you started. That means your encoder and your decoder are the same logic. One implementation, two purposes. For something that needs to be compact enough to fit in limited shellcode space, that matters.
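A quick way to convince yourself of that property is to check it exhaustively. The sketch below is plain Python with a hypothetical helper name (`xor_bytes`) and made-up sample bytes; it only demonstrates the arithmetic, not the decoder stub.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """One routine serves as both encoder and decoder."""
    return bytes(b ^ key for b in data)

original = bytes([0x90, 0x31, 0xC0, 0x50])  # arbitrary sample bytes
once = xor_bytes(original, 0x0F)            # "encode"
twice = xor_bytes(once, 0x0F)               # "decode" with the same call

# The round trip holds for every possible byte and every possible key.
round_trip_ok = all((b ^ k) ^ k == b for b in range(256) for k in range(256))
print(once != original, twice == original, round_trip_ok)
```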
Start with pseudocode
Before writing a single byte of assembly, plan it out in plain English:
clear the register
save the current address
get the length of the shellcode
xor the current byte of shellcode
increment the address
check if we've reached the end
jump back to the xor if not
Seven steps. That’s the whole encoder. Now let’s turn each one into assembly.
Step 1: Clear a register
ECX will hold the shellcode length and, from step 4 onward, the end address. Start by zeroing it with XOR ECX, ECX: XORing a register against itself always produces zero.
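There is a second reason shellcode favours this idiom over loading a literal zero: the encoding. The sketch below compares the two, using the standard 32-bit x86 encodings written out by hand (an assumption about the assembler output, not bytes taken from this post's listing), and models the self-XOR in Python.

```python
# Standard 32-bit x86 encodings, written out by hand for comparison.
XOR_ECX_ECX = bytes.fromhex("31c9")        # xor ecx, ecx
MOV_ECX_0 = bytes.fromhex("b900000000")    # mov ecx, 0

def xor_self(value: int) -> int:
    """Model XORing a 32-bit register with itself."""
    return (value ^ value) & 0xFFFFFFFF

# Whatever ECX held before, the result is zero.
results = [xor_self(ecx) for ecx in (0, 1, 0xDEADBEEF, 0xFFFFFFFF)]

print(len(XOR_ECX_ECX), XOR_ECX_ECX.count(0))  # size in bytes, number of null bytes
print(len(MOV_ECX_0), MOV_ECX_0.count(0))      # size in bytes, number of null bytes
```

The XOR form is shorter and carries no null bytes, which matters in a stub whose whole purpose is to dodge bad characters.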
Step 2: Get the current address (position-independent)
This is the clever bit. The encoder needs to know where it is in memory at runtime, but that address changes depending on where the shellcode lands. The solution is a self-referencing CALL followed by POP EAX.
CALL pushes the address of the next instruction onto the stack before jumping. POP EAX retrieves it. EAX now contains the current memory address — dynamically, regardless of where the shellcode loaded. This is the foundation of position-independent shellcode.
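The mechanism is easy to model with a list standing in for the stack. This is a hedged Python sketch: `call_pop` and `CALL_LEN` are hypothetical names, and the 5-byte length assumes a near CALL with a 32-bit relative operand.

```python
CALL_LEN = 5  # assumed: near CALL with a 32-bit relative operand

def call_pop(call_addr: int) -> int:
    """Model CALL (push return address) followed by POP EAX."""
    stack = []
    stack.append(call_addr + CALL_LEN)  # CALL pushes the address of the next instruction
    eax = stack.pop()                   # POP EAX takes it back off the stack
    return eax

# Wherever the code lands, EAX ends up at a fixed distance from the CALL.
bases = (0x00401000, 0x0012FF40, 0xBFFFF000)
deltas = [call_pop(base) - base for base in bases]
print(deltas)
```

One caveat worth knowing: the most direct form, a CALL whose displacement is zero, assembles to E8 00 00 00 00, which is full of null bytes. Shellcode that has to avoid nulls usually arranges the call differently to reach the same result; the sketch does not assume any particular arrangement.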
Step 3: Store the shellcode length
Two versions here, depending on shellcode size.
For shorter shellcode (up to 255 bytes), 8 bits of ECX (the CL register) is enough, so the length is moved into CL.
For larger shellcode (up to 65,535 bytes), use 16 bits (the CX register) and move the length into CX instead.
Because ECX was zeroed in step 1, loading into CL or CX leaves the upper bits clean. No masking needed.
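To see why the earlier zeroing matters, here is a small Python model of partial-register writes. The function names `mov_cl` and `mov_cx` are hypothetical; they only mimic how a write to CL or CX leaves the rest of ECX alone.

```python
def mov_cl(ecx: int, imm8: int) -> int:
    """Model MOV CL, imm8: replace only the low 8 bits of ECX."""
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("CL holds 0..255")
    return (ecx & 0xFFFFFF00) | imm8

def mov_cx(ecx: int, imm16: int) -> int:
    """Model MOV CX, imm16: replace only the low 16 bits of ECX."""
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("CX holds 0..65535")
    return (ecx & 0xFFFF0000) | imm16

clean_cl = mov_cl(0x00000000, 0x50)    # ECX zeroed first: ECX equals the length
dirty_cl = mov_cl(0xDEADBEEF, 0x50)    # stale upper bits survive the write
clean_cx = mov_cx(0x00000000, 0x0150)
dirty_cx = mov_cx(0xDEADBEEF, 0x0150)
print(hex(clean_cl), hex(dirty_cl), hex(clean_cx), hex(dirty_cx))
```

A likely reason for keeping the 8-bit form around at all: a length under 256 loaded through CX would put a 00 byte into the instruction's 16-bit immediate, reintroducing a null into the stub.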
Step 4: Calculate the end address
ECX holds the shellcode length. EAX holds the current address. Adding EAX to ECX (ADD ECX, EAX) turns ECX into the end address: the value EAX will have reached once every shellcode byte has been processed, which is what the loop needs to know when to stop.
Step 5: XOR the current byte
This is the encoding instruction itself. Rather than XORing a register, the encoder operates on the memory the register points to, because the shellcode lives in memory, not in a register. The instruction is a byte-sized XOR whose destination is a memory operand addressed through EAX.
This XORs the byte at memory address EAX+0Fh with a seed value of 0Fh. The +0Fh offset (15 bytes) accounts for the remaining length of the encoder stub, so the encoder doesn’t accidentally XOR its own instructions before they’ve run.
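A toy memory layout shows what the displacement buys you. In this Python sketch the stub is represented by placeholder bytes, the shellcode bytes are made up, and `xor_at` is a hypothetical helper modelling a single byte-sized XOR on memory; the +F displacement and 0F seed follow the text.

```python
STUB_TAIL = 0x0F   # displacement from the text: stub bytes between EAX and the shellcode
SEED = 0x0F        # the static seed used in the text

def xor_at(memory: bytearray, eax: int, disp: int = STUB_TAIL, seed: int = SEED) -> None:
    """Model a byte XOR on [EAX+disp]: modify one byte of memory in place."""
    memory[eax + disp] ^= seed

stub = bytes([0xAA] * STUB_TAIL)        # placeholder bytes standing in for the stub
shellcode = bytes([0x9F, 0x3E, 0xCF])   # made-up encoded bytes
memory = bytearray(stub + shellcode)

xor_at(memory, eax=0)                   # EAX = 0 marks the first placeholder byte here
print(memory[:STUB_TAIL] == stub)       # the stub bytes are untouched
print(hex(memory[STUB_TAIL]))           # the first shellcode byte has been XORed
```

Without the displacement, the first write would land on the stub itself and corrupt instructions that still have to execute.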
Step 6: Increment the counter
Move EAX forward one byte with INC EAX.
Step 7: Check if done, loop if not
Compare the current position against the end address calculated in step 4, using CMP EAX, ECX followed by a JNZ.
If EAX equals ECX, the loop ends — all bytes have been encoded. If not, JNZ jumps back to the XOR instruction and the process continues.
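Two details of that loop are worth working through: how far back the JNZ has to jump, and how many times the body runs. The Python sketch below assumes the usual 32-bit encodings and sizes for the four loop instructions (shown in the comments); treat those as assumptions about the assembler output rather than as this post's exact bytes.

```python
# Assumed instruction sizes for the loop body, in the order the steps describe.
LOOP_BODY = [
    ("xor byte ptr [eax+0x0f], 0x0f", 4),  # 80 70 0F 0F
    ("inc eax", 1),                        # 40
    ("cmp eax, ecx", 2),                   # 39 C8
    ("jnz <xor>", 2),                      # 75 rel8
]

def jnz_rel8(body) -> int:
    """Displacement byte for a JNZ at the end of body that jumps back to its start."""
    total = sum(size for _, size in body)  # distance from loop start to the byte after JNZ
    rel = -total                           # short jumps are relative to the next instruction
    if rel < -128:
        raise ValueError("loop too long for a short jump")
    return rel & 0xFF

def iterations(start: int, length: int) -> int:
    """Count loop passes: the body runs first, the comparison comes afterwards."""
    eax, ecx, count = start, start + length, 0
    while True:
        count += 1       # the XOR happens here
        eax += 1         # INC EAX
        if eax == ecx:   # CMP sets the zero flag, so JNZ falls through
            return count

print(hex(jnz_rel8(LOOP_BODY)), iterations(0x1000, 32))
```

Because the comparison comes after the body, the loop always runs at least once, so the model (like the stub it imitates) assumes a length of one or more.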
The complete encoder
Assembled, the complete encoder comes to ten instructions. Compact by design.
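If you want to watch the whole thing run without an assembler, the seven steps can be modelled at register level in Python. This is a sketch under stated assumptions: `encode` and `run_stub` are hypothetical names, the stub is represented by zero-filled placeholder bytes, the shellcode is made up, and EAX is passed in rather than obtained through a real CALL/POP.

```python
SEED = 0x0F      # static seed from the walkthrough
OFFSET = 0x0F    # assumed distance from the address in EAX to the first shellcode byte

def encode(shellcode: bytes, seed: int = SEED) -> bytes:
    """Attacker side: XOR every byte before sending."""
    return bytes(b ^ seed for b in shellcode)

def run_stub(memory: bytearray, eax: int, length: int, seed: int = SEED) -> None:
    """Target side: register-level model of steps 1-7, decoding in place."""
    ecx = 0                           # step 1: clear ECX
    # step 2: eax is passed in, standing in for the CALL/POP result
    ecx = (ecx & ~0xFF) | length      # step 3: load the length into CL (1..255)
    ecx = (ecx + eax) & 0xFFFFFFFF    # step 4: ECX becomes the end address
    while True:
        memory[eax + OFFSET] ^= seed  # step 5: XOR the current byte
        eax += 1                      # step 6: advance one byte
        if eax == ecx:                # step 7: stop when the end is reached
            break

shellcode = bytes([0x31, 0xC0, 0x00, 0x50, 0x0A, 0x90])  # made-up bytes
base = 0x40                                              # arbitrary load position
# Layout: unrelated memory, then placeholder stub bytes, then the encoded shellcode.
memory = bytearray(base) + bytearray(OFFSET) + bytearray(encode(shellcode))

run_stub(memory, eax=base, length=len(shellcode))
print(bytes(memory[base + OFFSET:]) == shellcode)
```

Changing `base` changes nothing about the outcome, which is the position-independence the CALL/POP step is there to provide.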
How this plugs into nullsploit
In the example above, the shellcode length and XOR seed are static values — useful for understanding the logic, but not practical for a real tool. In the [nullsploit exploitation engine]({% post_url 2019-05-03-nullsploit-engine %}), both values are generated at runtime. The length is calculated automatically from the payload being encoded, and the seed is dynamic — which means each generated payload looks different, even for the same shellcode. That’s the version worth using in practice.
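As a rough idea of what "generated at runtime" can look like, here is a Python sketch that measures the payload and picks a random seed that produces no bad characters. It is not nullsploit's code: `pick_seed`, `build`, the bad-character set and the sample bytes are all assumptions made for illustration.

```python
import random

def pick_seed(shellcode: bytes, bad_chars: bytes, rng: random.Random) -> int:
    """Pick a random single-byte seed that yields no bad characters."""
    candidates = [
        seed for seed in range(1, 256)
        if seed not in bad_chars
        and all((b ^ seed) not in bad_chars for b in shellcode)
    ]
    if not candidates:
        raise ValueError("no single-byte seed avoids every bad character")
    return rng.choice(candidates)

def build(shellcode: bytes, bad_chars: bytes, rng: random.Random):
    """Return (length, seed, encoded payload) for the given shellcode."""
    seed = pick_seed(shellcode, bad_chars, rng)
    return len(shellcode), seed, bytes(b ^ seed for b in shellcode)

shellcode = bytes([0x31, 0xC0, 0x00, 0x50, 0x0A, 0x90])  # made-up bytes
bad = bytes([0x00, 0x0A, 0x0D])                          # hypothetical bad characters
length, seed, encoded = build(shellcode, bad, random.Random())
print(length, hex(seed), encoded.hex())
```

Run it a few times and the encoded bytes differ on most runs, even though the underlying shellcode never changes. The seed itself is also kept out of the bad-character set, since it has to appear as an immediate inside the decoder stub.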
What to take away from this
A few things worth internalising:
Position-independent code matters. Any shellcode that hardcodes memory addresses will break when it lands somewhere unexpected. The CALL/POP EAX pattern for getting the current address at runtime is something you’ll see again and again in real-world shellcode.
Encoders are small by necessity. Every byte of encoder stub is a byte that isn’t shellcode. The constraint forces you to think carefully about efficiency in a way that most programming doesn’t.
And understanding how an encoder works at the assembly level means you understand what Metasploit’s shikata_ga_nai is actually doing when you select it — rather than just knowing that it handles bad characters. Know your tools from the inside out. That’s the standard worth holding yourself to.