Accessing and Modifying Upper Bits in x86 and x64 Registers
August 15, 2016
Through one’s journey with x86/x64 Assembly, there comes a time where one might want to access or modify the contents of the upper half of a register — that is, the upper 16 bits of 32-bit registers, or the upper 32 bits of 64-bit registers. In this post, I’m going to show you a number of ways to go about accessing and modifying your data. We’ll even get a bit crazy and take two nibbles from one register and replace two nibbles in another register with them.
Before I get started, if you’re not familiar with the Intel Developer’s Manuals, they are an invaluable resource. In them, every instruction is meticulously detailed. It’s even fun to just start from the table of contents and see which instructions appeal to you, then go read how to use them!
Light a fire, pour some wine, turn out the lights, and SPOON that manual, ’cause, baby, it will love you long time, lol. Onward!
If you’re comfortable converting and working with bin/hex/dec, then you can skip this section. For others, here’s a quick primer on what you’ll need to familiarize yourself with to feel confident about the bitwise instructions I’m going to be delving into. It may not make much sense initially, but just keep beating your brain with it (don’t forget the value of diffuse learning) and I promise it’ll click for you.
Getting right to it, a bit (or binary digit) consists of a value of 1 or 0.
Fun fact: A quantum bit (qubit/qbit) can be in a superposition of 1 and 0 at the same time!
Four bits (ex: 1101) equal one nibble (a single hex digit, 0-F; 1101 is D), and eight bits (ex: 11010010) equal one byte (ex: D2 in hex). To that end, a nibble is half of a byte, and a byte is composed of two nibbles.
To help you visualize this, let’s start with this 4-byte value: FA 5D B1 20. You could also call that an 8-nibble value, as well as a 32-bit value. The first byte, FA, is an 8-bit value that breaks down into two nibbles: F and A, each of which is composed of four bits. F’s bit value is 1111. A’s bit value is 1010. Thus, FA in binary is 11111010. FA is itself a hexadecimal value, while its decimal equivalent is 250.
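If you would like to check the FA breakdown yourself, here is a minimal Python sketch (Python rather than assembly, purely so it can be run anywhere). The variable names are my own and only serve the illustration.

```python
# Split the byte 0xFA into its two nibbles and show each representation.
value = 0xFA

high_nibble = (value >> 4) & 0xF  # upper four bits: F
low_nibble = value & 0xF          # lower four bits: A

print(format(high_nibble, "04b"), format(low_nibble, "04b"))  # 1111 1010
print(format(value, "08b"))                                   # 11111010
print(value)                                                  # 250
```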
Hexadecimal’s primary function is to provide us humans with an easy way to reference binary numbers. You will see hexadecimal numbers referenced a handful of ways, such as the number itself (45EDF14C), preceded with 0x (0x45EDF14C), post-fixed with ‘h’ (45EDF14Ch) or ‘H’ (45EDF14CH), or other significantly less common ways, all of which you can read about here.
You can simply use a calculator to convert values between binary, hexadecimal, and decimal. You can also learn what the bin/hex/dec values are for a nibble via a table like this (credit: this presentation by Xeno Kovah), or learn to easily convert values manually as detailed in this video.
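As a stand-in for a lookup table, this short Python sketch prints the bin/hex/dec value of every possible nibble. The helper name nibble_row is hypothetical, chosen just for this example.

```python
def nibble_row(n):
    """Return the (binary, hex, decimal) forms of a nibble value 0..15."""
    return (format(n, "04b"), format(n, "X"), n)

# Print all sixteen nibble values, one per line.
for n in range(16):
    binary, hexa, dec = nibble_row(n)
    print(binary, hexa, dec)
```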
The key here is that we’re going to be taking hexadecimal numbers and looking at their binary equivalents, so 1s and 0s, here we come!
There are two ways you can go about accessing data in the upper half of a register by using bitwise instructions (instructions that operate at the bit level of data): destructively (destroying irrelevant data) and non-destructively (preserving data). The one you choose will directly inform your approach, but luckily, the logical underpinnings of the instructions you’ll use are quite similar.
You may or may not be familiar with the standard bitwise fare of AND, OR, and XOR — all of which can be used to manipulate/toggle bits via masking, etc. — but what about instructions like BSWAP, BEXTR, BZHI, ROL/ROR, or SHLD/SHRD? (Seriously, though…Intel Developer’s Manuals, *cough, cough*.)
SHL and SHR Instructions:
First, let’s discuss SHL (Shift Left) and SHR (Shift Right), both destructive instructions. Because these instructions work at the bit level, you can do all kinds of arithmetic tricks with them (multiplying or dividing by powers of two, etc.), but we’re interested in using these instructions to move data to where we can easily access it. (They also affect the flags register, something you should always, always, always be mindful of!)
Consider the following 32-bit value: FE62A89C. In binary, that number is 11111110011000101010100010011100 (note there are 32 digits there, specifically 32 bits). When you SHL or SHR, you shift the bits of a number to the left or right by however many places you specify. Let’s assume register EAX contains our value of FE62A89C.
SHL EAX,4 ;Shift all bits in EAX left by 4 positions
EAX now equals E62A89C0, or 11100110001010101000100111000000 in binary. The bits were shifted left by 4, which means all bits moved to the left by 4 positions. Since a nibble equals 4 bits, shifting 4 bits to the left means the F nibble was shifted out of the register (making it go away because the leftmost bits fell off the value as if you’d pushed them off a ledge), and a nibble of 0 (which equals 0000, or a 4-bit value of 0) fills in on the end of the number. When you shift bits in either direction, 0 bits (that is, bits with a value of 0) occupy what would otherwise be empty spaces. Let’s now shift some bits to the right.
SHR EAX,0x1C ;Shift all bits in EAX right by 28 positions (1C hex is 28 dec)
EAX now equals 0000000E, or 00000000000000000000000000001110 in binary (which would actually be referenced as just 1110, or 1110b for reasons similar to the ‘h’ I discussed in the primer). The bits were shifted right by 28, which means all bits moved to the right by 28 positions.
What we’ve accomplished here is making AL (the lowest byte of EAX) equal 0E. So if the E was what we wanted to access from FE62A89C, then we’ve accomplished our goal by clearing the F with an SHL instruction, and then clearing the other unwanted bits with an SHR instruction. There are other ways to achieve the same goal, a couple of which I discuss in examples ahead.
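To double-check the shift results, here is a rough Python model of SHL and SHR on a 32-bit register. The helpers shl32 and shr32 are hypothetical names of my own; they assume a count between 0 and 31 and do not model the flags register at all.

```python
MASK32 = 0xFFFFFFFF  # keeps a Python int within 32 bits

def shl32(value, count):
    """Model SHL: bits pushed past bit 31 are discarded, zeros fill from the right."""
    return (value << count) & MASK32

def shr32(value, count):
    """Model SHR: bits pushed past bit 0 are discarded, zeros fill from the left."""
    return (value & MASK32) >> count

eax = 0xFE62A89C
eax = shl32(eax, 4)
print(f"{eax:08X}")  # E62A89C0
eax = shr32(eax, 28)
print(f"{eax:08X}")  # 0000000E

al = eax & 0xFF      # the lowest byte of EAX
print(f"{al:02X}")   # 0E
```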
ROL and ROR Instructions:
Now let’s discuss ROL (Rotate Left) and ROR (Rotate Right), both non-destructive instructions. This means that any bits displaced will still reside in the register for you to restore; they simply roll over into the bit positions that would have otherwise been filled with 0 bits (again, bits with a value of 0). Sticking with our original value in EAX from above (FE62A89C):
ROL EAX,4 ;Rotate all bits in EAX left by 4 positions
EAX now equals E62A89CF, or 11100110001010101000100111001111 in binary. The bits were rotated left by 4, which puts E where F was, 6 where E was, and so on until you get to the end. The difference here is that instead of F dropping off the front and a 0 being added to the end as with the SHL instruction above, F rolls over to the end of the value. Remember, we’re dealing with bits, so all bits shifted left 4 positions, but instead of four 0 bits being added to the end, the four 1 bits from the front were rolled over to the end.
We now have the F that was in the most significant nibble position of EAX, in the least significant position of EAX (the low nibble in AL)! Now we can access it via other instructions if we wanted to do something with it, like copy it to another register. If we wanted to keep only the F, then we could do something like this if EBX was the register we wanted to store our value in:
MOV BL,AL ;Copy the lowest byte of EAX (0xCF) into BL
AND BL,0x0F ;Clears all bits in the upper nibble of BL, leaving only our F
If you don’t understand the purpose of that AND instruction, take a look at this video when you get a chance.
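Here is a small Python sketch of the rotate-then-mask idea, in case seeing the numbers helps. The helper rol32 is a hypothetical name, and the bl variable only models the low byte of EBX, not the whole register.

```python
MASK32 = 0xFFFFFFFF

def rol32(value, count):
    """Model ROL on 32 bits: bits leaving the top re-enter at the bottom."""
    count %= 32
    value &= MASK32
    return ((value << count) | (value >> (32 - count))) & MASK32

eax = rol32(0xFE62A89C, 4)
print(f"{eax:08X}")  # E62A89CF

al = eax & 0xFF      # low byte of EAX: CF
bl = al & 0x0F       # AND with 0x0F keeps only the low nibble
print(f"{bl:02X}")   # 0F
```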
Now, to restore EAX back to what it was, we simply rotate the bits back, like so:
ROR EAX,0x24 ;Rotate all bits in EAX right by 36 positions (24 hex is 36 dec)
EAX now equals FE62A89C, our original value. I did a little something to trick some of you with that instruction, though. All we needed to do was rotate the bits right by 4 to end back up with our original value. I rotated them 4 plus an additional 32! The reason I did that was to show you that you can rotate bits ’til the cows come home; they’ll all stay in the register and roll to wherever you tell them to. Rotating a 32-bit value by 32 bits means your value won’t change. It would be like you picking up a cup from a desk, spinning around exactly 360 degrees, then placing the cup down in the exact same spot. So in that instruction above, rotating 4 bits right is all we needed. The 32 additional bits were pointless, and in practice the CPU doesn’t even perform them: for a 32-bit operand it masks the rotate count to its low 5 bits, so a count of 36 is executed as a count of 4. Either way, the result is the same.
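The "rotate by 36 is the same as rotate by 4" point can be checked with a quick Python sketch. The helper ror32 is a hypothetical name; it masks the count to 5 bits, which mirrors how the count is treated for 32-bit operands.

```python
MASK32 = 0xFFFFFFFF

def ror32(value, count):
    """Model ROR on 32 bits: bits leaving the bottom re-enter at the top."""
    count &= 31  # only the low 5 bits of the count are used
    value &= MASK32
    return ((value >> count) | (value << (32 - count))) & MASK32

rotated = 0xE62A89CF  # EAX after the earlier rotate left by 4

print(f"{ror32(rotated, 4):08X}")   # FE62A89C
print(f"{ror32(rotated, 36):08X}")  # FE62A89C
print(f"{ror32(rotated, 32):08X}")  # E62A89CF (a full turn changes nothing)
```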
Now that you’ve got a feel for moving bits around to put the data you want in places where you can access it, consider the following 32-bit scenario:
You want to base a code injection on the contents of, say, EBX; however, you need to modify the upper 16 bits while preserving the lower 16 bits, which could change in any given cycle, thus you can’t just write your own full 32-bit value to EBX in place of whatever instruction you’re basing your injection around. This is where the stack can really come in handy! With it, you could preserve/restore the lower half, allowing you to write an immediate to the upper half.
Before: EBX = D34DC0DE
PUSH BX ;Push 0xC0DE, the lower 16 bits of EBX, onto the stack
MOV EBX,0xB17E0000 ;Write an immediate 32-bit value of 0xB17E0000 to EBX
POP BX ;Pop 0xC0DE off the stack into the lower 16 bits of EBX
After: EBX = B17EC0DE
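For anyone who wants to trace that sequence by hand, here is a minimal Python model of it. The list named stack is just a stand-in for the real stack, and the final line shows that plain masking reaches the same value; both are my own illustration rather than a recommended technique.

```python
ebx = 0xD34DC0DE
stack = []

stack.append(ebx & 0xFFFF)              # PUSH BX: save the lower 16 bits (C0DE)
ebx = 0xB17E0000                        # MOV EBX,0xB17E0000
ebx = (ebx & 0xFFFF0000) | stack.pop()  # POP BX: only the lower 16 bits change

print(f"{ebx:08X}")  # B17EC0DE

# Cross-check: keep the low half of the original and OR in the new upper half.
masked = (0xD34DC0DE & 0x0000FFFF) | 0xB17E0000
print(f"{masked:08X}")  # B17EC0DE
```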
Things get a bit quirkier when you want to do this with a 64-bit register, since writing to the lower 32-bit half of a register (EAX, EBX, and so on) zeroes out the upper 32 bits. In other words, an instruction like mov eax,ebx in a 64-bit program behaves as though the value of EBX were zero-extended and written to all of RAX (meaning that after that instruction executed, RAX would contain the contents of EBX only, even if the upper 4 bytes of RAX had data in them prior to that instruction’s execution). Writes to the 8-bit and 16-bit names (AL, AX, and so on) don’t do this; they leave the rest of the register untouched.
With that in mind, we need to get more creative with our instructions. This is where perusing the Intel Developer’s Manual comes in handy! You might not find the most optimal instruction or solution, but as you run into scenarios like what we’re discussing in this article, you’ll begin familiarizing yourself with not only solutions you can implement, but also with what you’re seeing if you’re digging through disassembled code.
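The difference between 32-bit writes and narrower writes in 64-bit mode can be modeled in a few lines of Python. The helper names and the starting value of RAX below are hypothetical, picked only to make the effect visible.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def write32(reg64, value):
    """Model a write to EAX-style names: the upper 32 bits become zero."""
    return value & 0xFFFFFFFF

def write16(reg64, value):
    """Model a write to AX-style names: the upper 48 bits are preserved."""
    return (reg64 & ~0xFFFF & MASK64) | (value & 0xFFFF)

def write8(reg64, value):
    """Model a write to AL-style names: the upper 56 bits are preserved."""
    return (reg64 & ~0xFF & MASK64) | (value & 0xFF)

rax = 0x1111111122222222  # hypothetical starting value
rbx = 0xD34DC0DEF00DBEEF

print(f"{write32(rax, rbx):016X}")  # 00000000F00DBEEF  (like mov eax,ebx)
print(f"{write16(rax, rbx):016X}")  # 111111112222BEEF  (like mov ax,bx)
print(f"{write8(rax, rbx):016X}")   # 11111111222222EF  (like mov al,bl)
```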
Let’s now assume we have two 64-bit registers, RAX and RBX, each filled with values as specified below. Our first goal is to combine the lower half of RAX with the upper half of RBX, storing the result in RAX. Then, getting a bit crazier, our goal is (counting nibbles right to left, starting from 1) to replace nibbles 14 and 15 of RBX with nibbles 8 and 9 of RAX, storing the result in RBX.
Goal 1: Combine EAX (the lower half of RAX) with the upper 32 bits of RBX. Store the result in RAX.
RAX = 22222222FFFFFFFF
RBX = D34DC0DEF00DBEEF
ROR RAX,0x20 ;Rotate RAX 32 bits to the right (20 hex is 32 dec)
;RAX = FFFFFFFF22222222
SHR RBX,0x20 ;Shift RBX 32 bits to the right, clearing upper RBX
;RBX = 00000000D34DC0DE
SHRD RAX,RBX,0x20 ;Imagine bits being placed like RBX:RAX, like this:
;00000000D34DC0DE:FFFFFFFF22222222
;Now the instruction executes, which shifts bits 32
;places to the right; however, RBX bits are copied,
;not permanently shifted.
RAX = D34DC0DEFFFFFFFF
RBX = 00000000D34DC0DE
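Goal 1 can be traced step by step with a rough Python model. The helpers ror64, shr64, and shrd64 are hypothetical names of my own, shrd64 assumes a count strictly between 0 and 64, and none of them model the flags.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def ror64(value, count):
    """Model ROR on 64 bits."""
    count &= 63
    value &= MASK64
    return ((value >> count) | (value << (64 - count))) & MASK64

def shr64(value, count):
    """Model SHR on 64 bits."""
    return (value & MASK64) >> count

def shrd64(dest, src, count):
    """Model SHRD dest,src,count: dest shifts right, low bits of src fill the top."""
    return ((dest >> count) | (src << (64 - count))) & MASK64

rax = 0x22222222FFFFFFFF
rbx = 0xD34DC0DEF00DBEEF

rax = ror64(rax, 32)        # FFFFFFFF22222222
rbx = shr64(rbx, 32)        # 00000000D34DC0DE
rax = shrd64(rax, rbx, 32)  # src is only read, so rbx keeps its value

print(f"{rax:016X}")  # D34DC0DEFFFFFFFF
print(f"{rbx:016X}")  # 00000000D34DC0DE
```

As a cross-check, masking the original values directly, (0xD34DC0DEF00DBEEF & 0xFFFFFFFF00000000) | (0x22222222FFFFFFFF & 0xFFFFFFFF), gives the same D34DC0DEFFFFFFFF.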
Goal 2: Replace nibbles 14 and 15 in RBX with nibbles 8 and 9 from RAX. Store the result in RBX.
RAX = 22222221EFFFFFFF
RBX = D34DC0DEF00DBEEF
SHR RAX,0x1C ;Shift 28 bits right
ROL RBX,0xC ;Rotate 12 bits left
MOV BL,AL ;Copy AL to BL
ROR RBX,0xC ;Rotate 12 bits right
RAX = 000000022222221E
RBX = D1EDC0DEF00DBEEF
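The nibble swap in Goal 2 is easier to follow with the intermediate values printed, so here is a minimal Python sketch of the four steps. The helper names rol64 and ror64 are hypothetical, and the byte copy is modeled with plain masking.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def rol64(value, count):
    """Model ROL on 64 bits."""
    count &= 63
    value &= MASK64
    return ((value << count) | (value >> (64 - count))) & MASK64

def ror64(value, count):
    """Model ROR on 64 bits."""
    count &= 63
    value &= MASK64
    return ((value >> count) | (value << (64 - count))) & MASK64

rax = 0x22222221EFFFFFFF
rbx = 0xD34DC0DEF00DBEEF

rax = rax >> 28                              # SHR RAX,0x1C: nibbles 8 and 9 land in AL
print(f"{rax:016X}")                         # 000000022222221E
rbx = rol64(rbx, 12)                         # ROL RBX,0xC: nibbles 14 and 15 land in BL
print(f"{rbx:016X}")                         # DC0DEF00DBEEFD34
rbx = (rbx & ~0xFF & MASK64) | (rax & 0xFF)  # MOV BL,AL: only the low byte changes
print(f"{rbx:016X}")                         # DC0DEF00DBEEFD1E
rbx = ror64(rbx, 12)                         # ROR RBX,0xC: rotate everything back
print(f"{rbx:016X}")                         # D1EDC0DEF00DBEEF
```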
I’ve covered a lot of ground in this rather lengthy post, so if you’re not quite clear with certain concepts, do be sure to reference the external links I’ve peppered throughout and come back to sections herein once your understanding increases. With so many potential scenarios to provision for and instructions to utilize, there are likely more optimal solutions than those I’ve provided; however, you should now have a solid foundation to build and expand upon where bitwise operations/instructions are concerned.
It’s in the spirit of learning and growth that I highly encourage feedback in the form of additional scenarios and more optimal uses of instructions from those of you who feel so inclined, so please help to spread some knowledge or provide clarity where you feel there should be!
Thanks for reading.