Assembly + C - Part # 1

During my journey learning assembly, I followed the "Architecture 1001: x86-64 Assembly" course by OpenSecurityTraining2 and took notes along the way. Now I'm sharing them openly with you <3, hoping they help you find your own way to learn or take notes about assembly.

So, let's start.

Miss Alaineous

  • It's called x86 because of the progression of Intel chips: 8086, 80186, 80286, etc.

  • Originally a 16-bit architecture. It later evolved to 32 and 64 bits but kept backwards compatibility. The hardware actually starts up in 16-bit mode before software transitions it to 32- or 64-bit operation.

  • Intel originally wanted to break from x86 when moving to 64 bit. This was IA64 (Intel Architecture 64 bit), aka Itanium. However, AMD decided to extend x86 to 64 bits itself, leading to the AMD64 architecture. When Itanium had very slow adoption, Intel decided to bite the bullet and license the 64-bit extensions from AMD.

  • In the Intel manuals you will see the 64-bit extensions referred to as IA-32e, EM64T, or Intel 64 (but never IA64; again, that's Itanium, a completely different architecture).

  • You might sometimes see it called amd64 or x64 by MS or some Linux distributions.
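These different names also show up in the macros that compilers predefine for the target. The snippet below is a minimal sketch that maps the macros documented for GCC/Clang (`__x86_64__`, `__amd64__`, `__ia64__`, `__i386__`) and MSVC (`_M_X64`, `_M_AMD64`, `_M_IA64`, `_M_IX86`) back to the names used in these notes. The function name `target_arch_name` is made up for this example, and other compilers may use different macros.

```c
#include <stdio.h>

/* Hypothetical helper: report which architecture this file was compiled for,
 * based only on macros the compiler predefines. */
const char *target_arch_name(void)
{
#if defined(__x86_64__) || defined(__amd64__) || defined(_M_X64) || defined(_M_AMD64)
    return "x86-64 (also called AMD64, x64, Intel 64)";
#elif defined(__ia64__) || defined(_M_IA64)
    return "IA-64 (Itanium, not x86)";
#elif defined(__i386__) || defined(_M_IX86)
    return "32-bit x86";
#else
    return "not an x86 target";
#endif
}

/* Print the result, e.g. from main(). */
void print_target_arch(void)
{
    printf("Compiled for: %s\n", target_arch_name());
}
```

Calling `print_target_arch()` from `main` on a typical 64-bit PC build should report the x86-64 branch, whichever of the names your toolchain prefers.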

Where is x86-64 used?

  • More powerful (and thus more power-hungry) systems such as PCs, servers, and even supercomputers.

    • Minimal adoption on phones or embedded systems. Intel does have the entire Atom line of lower-power chips targeted towards embedded systems, though (and they're starting to focus more on performance per watt, which is where ARM has always been better).

What you're going to learn

Let's first see the translation between some very simple C code and the assembly that underlies it.

#include <stdio.h>
int main(){
	printf("Hello World!\n");
	return 0x1234;
}

Is the same as:

Windows, Visual Studio 2019 Community, with the /GS (buffer overflow protection) option turned off

Disassembled with Visual C++

Disassembled with Ghidra 9.2

Disassembled with IDA Freeware 7 (with some omissions to fit on screen)

Ubuntu 20.04, GCC 9.3.0

Disassembled with "objdump -d"

Take Heart!

  • By one measure, only 14 assembly instructions account for 90% of code!

  • You've already seen 10 common instructions, just in the hello world variation! (And 2 special security instructions we'll talk about later.)

  • I think that knowing about 20-30 (not counting variations) is good enough that you will have to check the manual very infrequently.
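If you want to get a feel for instruction frequency on your own binaries, here is a rough sketch in C that counts instructions in the text produced by "objdump -d". It assumes the usual objdump line layout (address, a tab, the raw bytes, another tab, then the mnemonic and operands). The function names are made up for this example, prefixes such as `rep` or `lock` are counted as if they were the mnemonic, and this is not the methodology used in the cited paper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Find the mnemonic on one line of `objdump -d` output.
 * Assumed layout: "  address:<TAB>raw bytes<TAB>mnemonic operands".
 * Returns a pointer to the mnemonic and stores its length in *len,
 * or returns NULL when the line carries no instruction (headers,
 * blank lines, byte-only continuation lines). */
static const char *find_mnemonic(const char *line, size_t line_len, size_t *len)
{
    const char *end = line + line_len;
    const char *p = memchr(line, '\t', line_len);        /* end of address field */
    if (p == NULL) return NULL;
    p = memchr(p + 1, '\t', (size_t)(end - (p + 1)));    /* end of raw-bytes field */
    if (p == NULL) return NULL;
    p++;                                                 /* first char of mnemonic */

    size_t n = 0;
    while (p + n < end && p[n] != ' ' && p[n] != '\t' && p[n] != '\r') n++;
    if (n == 0) return NULL;
    *len = n;
    return p;
}

/* Count instruction lines in a multi-line listing.
 * With mnemonic == NULL every instruction is counted; otherwise only
 * lines whose mnemonic matches exactly. */
size_t count_instructions(const char *listing, const char *mnemonic)
{
    assert(listing != NULL);
    size_t total = 0;
    while (*listing != '\0') {
        size_t line_len = strcspn(listing, "\n");
        size_t n = 0;
        const char *m = find_mnemonic(listing, line_len, &n);
        if (m != NULL &&
            (mnemonic == NULL ||
             (strlen(mnemonic) == n && memcmp(m, mnemonic, n) == 0))) {
            total++;
        }
        listing += line_len;
        if (*listing == '\n') listing++;                 /* step over the newline */
    }
    return total;
}
```

Dividing `count_instructions(text, "mov")` by `count_instructions(text, NULL)` gives the share of `mov` in a listing; repeating that for a handful of mnemonics shows how quickly a few instructions add up to most of the code.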

References

"By one measure, only 14 assembly instructions account for 90% of code!" citation: "Statistical Structures: Fingerprinting Malware for Classification and Analysis", Daniel Bilar, http://www.blackhat.com/presentations/bh-usa-06/BH-US-06-Bilar.pdf

The x86-64 instruction frequency pie charts are from: "An Analysis of x86-64 Instruction Set for Optimization of System Softwares", Ibrahim et al., https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.407.5071&rep=rep1&type=pdf
