Assembly + C - Part # 3+4

During my journey of learning Assembly, I was taking some notes, and now I'm sharing them openly with you <3

Let's first learn something interesting and funny 😄

3. Your First Instruction: No-Operation (nop)

Your First x86-64 instruction: NOP

  • NOP - No-Operation! No registers, no values, no nothin'!

  • Just there to pad/align bytes, or to delay time

  • Attackers use it to make simple exploits more reliable. But that's for another set of notes, to be shared soon.

Extra! Extra! Late-breaking NOP news!

  • Amaze those who know x86 by citing this interesting bit of trivia:

    • "The one-byte NOP instruction is an alias mnemonic for the XCHG (E)AX, (E)AX instruction"

      • Apparently I had never looked up NOP in the manual :)

    • The XCHG instruction (exchange the two operands) is not officially in these notes. But even if I hadn't just told you what it does, I bet you would have guessed right anyway.

4. The Stack

A stack is a Last-In-First-Out (LIFO) data structure where data is "pushed" onto the top of the stack and "popped" off the top.

  • The stack is a conceptual area of main memory (RAM) which is designated by the OS when a program is started.

    • Different OSes start it at different addresses by their own convention, and the address can also change from run to run if they're using address space layout randomization (ASLR).

  • By convention the stack grows toward lower memory addresses. Adding something to the stack means the top of the stack is now at a lower memory address.

RSP points to the top of the stack, the lowest address which is being used. While data will exist at addresses beyond the top of the stack, it is considered undefined.

What can you find on the stack ?

  • "Return addresses" so a called function can return back to the function that called it.

  • Local variables

  • Sometimes used to pass arguments between functions

  • Save space for registers, so functions can share registers without smashing each other's values.

  • Save space for registers when the compiler has to juggle too many in a function.

  • Dynamically allocated memory via alloca()

Simple Stack Diagram

We are going to use the following C code:

#include <stdio.h>

int bar(int y){
    int a = 3 * y;
    printf("bar returns %d", a);
    return a;
}

int foo(int x){
    int b = 5 * x; 
    printf("foo returns %d", b);
    return bar(b);
}

int main(){
    int c = foo(7);
    printf("main returns %d", c);
}

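The stack layout for the code above is as follows: the frame of main sits at the highest addresses, the frame of foo is below it, and the frame of bar is below that, because each call adds a new frame at a lower address.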

Push & Pop instructions

As seen in Hello.c (on macOS):

This is the same as the following on Ubuntu 20.04 using "objdump -d":

Even before knowing what push and pop do, we can see that they balance each other.

But if we review the disassembly of the same code from Visual Studio 2019, we will not find these two instructions.

Mystery Listery (will be answered in the following parts)

  • Why do the GCC/Clang HelloWorlds have balanced push/pop instructions but the Visual Studio one doesn't?

PUSH

  • Push quadword onto the stack

  • The push instruction automatically decrements the stack pointer, RSP, by 8 bytes.

  • In 64-bit execution mode, the operand can be:

    • The value in a 64-bit register

    • A 64-bit value from memory, as given in "r/mX" form talked about next.

"r/mX" Addressing Forms

  • "r/mX" is a term I made up to refer to anywhere you see "r/m8", "r/m16", "r/m32", or "r/m64" in the Intel manual.

In Intel syntax, most of the time square brackets [ ] mean to treat the value within as a memory address, and fetch the value at that address (like dereferencing a pointer).

  • An "r/mX" can take 4 forms:

    1. Register -> rbx

    2. Memory, base-only -> [rbx]

    3. Memory, base+index*scale -> [rbx+rcx*X]

      1. For X = 1,2,4 or 8

    4. Memory, base+index*scale+displacement -> [rbx+rcx*X+Y]

      1. For Y a 1-byte (8-bit) or 4-byte (32-bit) displacement, stored as a signed value and sign-extended

  • [base + index*scale + displacement]

    • Has natural applicability to multi-dimensional array indexing, arrays of structs, etc.

  • So when I say in the future that an instruction supports access to memory, I mean memory as encoded in an "r/mX" form.

  • And when I say "r/mX", I mean something that could be as simple as a single register, or as complicated as memory address calculation in that form.

Note about the ` address convention

  • When writing 64-bit numbers, it can be easy to lose track of whether you have the right number of digits.

  • WinDbg (which we don't use in these notes, but do in future architecture notes) allows you to write 64-bit numbers with a ` between the two 32-bit halves.

  • I think this is helpful for seeing whether a number is wider than 32 bits or not (because there will be some non-zero value on the left side of the `)

  • So in the following notes I'll often write 64-bit numbers like 0x12345678`12345678

  • But keep in mind that the only tool which probably supports you entering them like that is WinDbg

push RAX

POP

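  • Pop a quadword off the stack: the pop instruction automatically increments the stack pointer, RSP, by 8 bytes.

  • In 64-bit execution mode, the operand can be:

    • A 64-bit register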

    • A 64-bit memory address, as given in "r/mX" form

pop RAX

Push/Pop 32-bit Throwback

  • If you are executing in 32-bit mode, push/pop will add/remove values 32 bits at a time, rather than 64 bits, and thus they decrement/increment ESP by 4 rather than 8 at a time.

  • Likewise, if you're in 16-bit mode, push/pop work on 16-bit values and decrement/increment SP by 2 at a time.

Instructions we know so far

  • nop (6% of total)

  • push (12% of total)

  • pop (5% of total)
