DATA SIZES
=============
Bit: 0 or 1
Byte: 8 bits
Word: 16 bits
Dword (Intel)/Long (ATT): 32 bits
Qword (Intel)/Quad (ATT): 64 bits
GENERAL SYNTAX
=================
INTEL               ATT
mov al, 23          movb $23, %al
mov bl, 6           movb $6, %bl
add al, bl          addb %bl, %al
- Suffix: b = byte; w = word; l = long; q = quad
ATT syntax can feel more "intuitive" to read, and here is why.
In Intel syntax, the destination is usually the first operand. In ATT, the
destination is the last operand, so an instruction reads left to right.
Ex:
INTEL               ATT
mov eax, 100        movl $100, %eax
ATT can be read as "Move 100 to eax"
Intel can be read as "Move into eax the value 100"
- In ATT syntax, if a mnemonic has no suffix, the size is inferred from the
register operand. When there is no register operand (immediate to memory),
the size is ambiguous, so give the suffix explicitly.
x86 Register Set
================
AL = 8 bit register. Can hold signed & unsigned ints or single byte chars.
AH = 8 bit register, same as AL.
AX = 16 bit register made of AH (high byte) and AL (low byte). Changing
AH or AL changes AX, but AL and AH are independent of each other.
H & L represent High and Low.
AX (Accumulator register)
+------+------+
|  AH  |  AL  |
+------+------+
EAX = 32 bit register. Lower 16 bits is AX. There is no way to directly
access top 16 bits of EAX.
EAX (Extended AX)
+------+------+------+------+
|             |     AX      |
|             |  AH  |  AL  |
+------+------+------+------+
RAX = 64 bit register. Lower 32 bits is EAX.
RAX (64 bit)
+------+------+------+------+------+------+------+------+
|                           |            EAX            |
|                           |             |     AX      |
|                           |             |  AH  |  AL  |
+------+------+------+------+------+------+------+------+
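The overlap in the diagrams above can be sketched in C, assuming RAX is modeled as a plain 64-bit integer. The helper names (get_al, set_ah, set_eax, ...) are made up for illustration; they are not CPU or library APIs.

```c
#include <stdint.h>

/* Reading a sub-register is a mask (and, for AH, a shift) of RAX. */
uint32_t get_eax(uint64_t rax) { return (uint32_t)(rax & 0xFFFFFFFFu); }
uint16_t get_ax(uint64_t rax)  { return (uint16_t)(rax & 0xFFFFu); }
uint8_t  get_ah(uint64_t rax)  { return (uint8_t)((rax >> 8) & 0xFFu); }
uint8_t  get_al(uint64_t rax)  { return (uint8_t)(rax & 0xFFu); }

/* Writing AL replaces only bits 0-7; the rest of RAX is kept. */
uint64_t set_al(uint64_t rax, uint8_t v)
{
    return (rax & ~(uint64_t)0xFF) | v;
}

/* Writing AH replaces only bits 8-15. */
uint64_t set_ah(uint64_t rax, uint8_t v)
{
    return (rax & ~(uint64_t)0xFF00) | ((uint64_t)v << 8);
}

/* Writing EAX is the exception: on x86-64 it clears the upper 32 bits. */
uint64_t set_eax(uint64_t rax, uint32_t v)
{
    (void)rax;              /* old contents are discarded */
    return (uint64_t)v;
}
```

With rax = 0x1122334455667788, get_ah() yields 0x77 and get_al() yields 0x88, and set_al(rax, 0xAA) leaves every byte except the lowest untouched.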
- An x64 CPU has 16 such registers. They are called General Purpose
Registers (GPRs).
64       low 32   low 16   low 8    Notes
bits     bits     bits     bits
------------------------------------------------------------
RAX      EAX      AX       AH/AL    Accumulator
RBX      EBX      BX       BH/BL    Base
RCX      ECX      CX       CH/CL    Counter
RDX      EDX      DX       DH/DL    Data
RSP      ESP      SP       SPL      Stack Pointer
RBP      EBP      BP       BPL      Base Pointer
RSI      ESI      SI       SIL      Source Index
RDI      EDI      DI       DIL      Destination Index
R8       R8D      R8W      R8B      Only on 64 bit
R9       R9D      R9W      R9B      Only on 64 bit
R10      R10D     R10W     R10B     Only on 64 bit
R11      R11D     R11W     R11B     Only on 64 bit
R12      R12D     R12W     R12B     Only on 64 bit
R13      R13D     R13W     R13B     Only on 64 bit
R14      R14D     R14W     R14B     Only on 64 bit
R15      R15D     R15W     R15B     Only on 64 bit
------------------------------------------------------------
AH/BH/CH/DH are bits 8-15, not the low 8 bits. SPL, BPL, SIL and DIL are
accessible only in 64 bit mode.
- There are Special Registers:
RIP = Instruction Pointer. Address in memory of the next instruction
to execute.
RFLAGS = Status flags. Helps to check condition/results of previous
instruction.
ST(0)-ST(7) = Set of registers used by floating point unit (called x87
floating point unit). They are 80 bits wide and work like a stack.
SIMD = Many SIMD registers (MMX, XMM, YMM, ...), depending on CPU.
- Many registers specific to CPU such as: counting clock ticks,
measuring performance of branching, segment regs, etc.
- CS - Code Segment Register
DS - Data Segment
SS - Stack Segment
ES - Extra Segment
The above 4 registers still exist in x64, but in 64 bit mode their base is
treated as 0, so they are largely unused. FS and GS are still used (e.g.
for thread-local storage).
BASIC INSTRUCTIONS
====================
- mov, add, sub, inc, dec, neg, imul, ret
- C and C++ each have their own calling conventions.
- In x64 (System V AMD64 ABI, used on Linux, macOS and the BSDs), the
first 6 integer function parameters are passed via registers. Pretty
much anything that can be represented as an integer (ints, chars,
pointers) is passed this way. The Microsoft x64 convention differs: it
uses RCX, RDX, R8, R9 for the first 4.
Param   Register
-----   --------
1       RDI
2       RSI
3       RDX
4       RCX
5       R8
6       R9
>6      passed on Stack
Ex: void Func(int a, char b, unsigned long c)
int a: passed in EDI portion of RDI register
char b: passed in SIL portion of RSI register
unsigned long c: passed in RDX
- Integer values are returned in the RAX register.
Type Register
---- --------
char/bool AL
short AX
int EAX
long RAX
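A small C sketch of the register assignment, assuming the System V AMD64 ABI (integer arguments go in RDI, RSI, RDX, RCX, R8, R9, in that order). The function sum7 is hypothetical and exists only to carry the comments; the actual placement is done by the compiler.

```c
/* Hypothetical function: each comment names where the System V AMD64
 * ABI places that argument when the function is called. */
long sum7(long a,   /* RDI */
          long b,   /* RSI */
          long c,   /* RDX */
          long d,   /* RCX */
          long e,   /* R8  */
          long f,   /* R9  */
          long g)   /* 7th integer argument: passed on the stack */
{
    return a + b + c + d + e + f + g;   /* result comes back in RAX */
}
```

Compiling a caller with gcc -S and reading the generated .s file is one way to confirm the register order on a given platform.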
MEMORY ADDRESSING MODES (for x86)
(a) Indirect (R) Mem[Reg(R)]
Ex: movl (%ecx), %eax
(b) Displacement D(R) Mem[Reg(R)+D]
Ex: movl 8(%ebp), %edx
(c) Complete Memory Addressing Mode. The most general form:
D(Rb, Ri, S) = Mem[Reg(Rb) + S*Reg(Ri) + D]
Ex: leal instructions use this
leal (%eax, %ebx, 4), %ecx
(d) PC Relative Addressing.
0x100   cmp r2, r3
0x102   je 0x70     # PC relative: target = 0x102 + 0x70 = 0x172
0x104   ...
...     ...
0x172   add r3, r4
PC relative branches are relocatable: the code works wherever it is
loaded. Absolute branches are not.
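The general form D(Rb, Ri, S) from (c) is plain arithmetic. A minimal sketch in C, assuming 32-bit registers; effective_address is an illustrative name, not a real API.

```c
#include <assert.h>
#include <stdint.h>

/* Address named by the operand D(Rb, Ri, S):
 *   Reg(Rb) + S*Reg(Ri) + D
 * x86 only allows a scale S of 1, 2, 4 or 8. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           uint32_t scale, int32_t disp)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    /* Unsigned arithmetic wraps modulo 2^32, like the hardware. */
    return base + scale * index + (uint32_t)disp;
}
```

For leal (%eax, %ebx, 4), %ecx with eax = 0x1000 and ebx = 3, effective_address(0x1000, 3, 4, 0) gives 0x100c, which is the value leal places in ecx. A mov with the same operand would instead load the 4 bytes stored at that address.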
More on Addressing Modes:
-------
+ Direct Addressing (or Absolute Addressing)
+ Immediate Addressing
+ Register (Direct) Addressing
+ Register (Indirect) Addressing
DIRECT (or ABSOLUTE) ADDRESSING
----
The operand is in memory. Address of the operand is held in instruction.
instruction
---
operation, <register>, <memory-location>
|
|
+-----> memory-location
+---------+
| Operand |
+---------+
Ex:
ld, r2, (100) ; r2 = contents of memory. Address is 100
instruction
---
+---------------+
| ld | r2 | 100 |
+---------------+ Memory
| | +-------+
| | | |
| | +-------+
| +----> 100 | |-----------+
| +-------+ |
| | | |
| +-------+ |
| |
| Registers |
| +----------+ |
| r1 | | |
| +----------+ |
+---------> r2 | |<-------+
+----------+
r3 | |
+----------+
...
Another example of Direct (or Absolute) Addressing
Ex:
st, r2, (100) ; memory at address 100 = contents of r2
instruction
---
+---------------+
| st | r2 | 100 |
+---------------+ Memory
| | +-------+
| | | |
| | +-------+
| +----> 100 | |<----------+
| +-------+ |
| | | |
| +-------+ |
| |
| |
| Registers |
| +----------+ |
| r1 | | |
| +----------+ |
+---------> r2 | |--------+
+----------+
r3 | |
+----------+
...
IMMEDIATE ADDRESSING
----
The operand is held in the instruction.
instruction
---
operation, <register>, <operand>
Ex: mov r2, 123 ; r2=123
instruction
---
+---------------+
|mov | r2 | 123 |---------------------------+
+---------------+ |
| |
| Registers |
| +----------+ |
| r1 | | |
| +----------+ |
+---------> r2 | |<-------+
+----------+
r3 | |
+----------+
...
+ Useful for constants: int x = 123;
C code Assembly code
------ -------
int x = 123; mov r1, 123
a = b+34; add r3, r4, 34 ; r3 is a, r4 is b
REGISTER DIRECT ADDRESSING
----
The operand is held in a register, which is specified in instruction.
instruction
---
operation, <register>, <register-no>
|
|
v
register-no
+---------+
| Operand |
+---------+
Ex: mov r3, r2 ; r3=r2
instruction
---
+---------------+
|mov | r3 | r2 |
+---------------+
Registers
+----------+
r1 | |
+----------+
r2 | | |
+----|-----+
r3 | v |
+----------+
...
+ Useful for integer variables.
C Code          Assembly
------          --------
x = y;          mov r1, r2
a = b+c;        add r3, r4, r5
REGISTER INDIRECT ADDRESSING
----
The operand is held in memory. The address of operand location is held in a
register, that is specified in instruction.
instruction
---
operation, <register>, <register-no>
|
|
v
register-no
+---------+
| Memory |
| Address |
+---------+
|
|
v
Memory Address
+-----------+
| Operand |
+-----------+
Ex: ld, r3, (r2) ; r3=contents of memory, address in r2
+---------------+
| ld | r3 |(r2) |
+---------------+ Memory
| | +-------+
| | | |
| | +-------+
| | | * |100 <------+
| | +---|---+ |
| | | | | |
| | +---|---+ |
| | | |
| | +----------+ |
| | | |
| | Registers | |
| | +----------+ | |
| | r1 | | | |
| | +----------+ | |
| +----> r2 | 100 |---|----+
| +----------+ |
+---------> r3 | |<--+
+----------+
...
C Code          Assembly Code
------          -------------
int *b;
int a;          // say address of a is 100
b = &a;         mov r1, 100
*b = 123;       st (r1), 123    ; store 123 at the address held in r1
REGISTER INDIRECT ADDRESSING PLUS OFFSET
----
Similar to above except an offset held in the instruction is added to register
contents to form the effective address.
instruction
---
operation, <register>, <register-no>, offset
| |
| |
v |
register-no |
+---------+ |
| Memory |-----+
| Address | |
+---------+ |
|
|
v
Memory Address
+-----------+
| Operand |
+-----------+
Ex: ld, r3, 100(r2)
REGISTER INDIRECT ADDRESSING VARIATION
(Index Register Addressing)
-----
+ Register Indirect Addressing in which register is seen as an "index" in to a
list (1D array).
+ Register holds number of locations from starting point. Starting point given
in instruction.
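- Some Arithmetic Ops
Instr   S, D    Result
--------------------------
addl    S, D    D = D+S
subl    S, D    D = D-S
imull   S, D    D = D*S
sall    S, D    D = D<<S
sarl    S, D    D = D>>S    # Arithmetic shift
shrl    S, D    D = D>>S    # Logical shift
xorl    S, D    D = D^S
andl    S, D    D = D&S
orl     S, D    D = D|S
+ Registers and memory carry no signed/unsigned type in Assembly. add,
sub and the bitwise ops work the same for both. Where the result
differs (sar vs shr, imul vs mul, signed vs unsigned jumps), the
instruction chosen decides the interpretation.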
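The difference between sarl and shrl shows up only when the top bit is set. A sketch in C using unsigned arithmetic throughout, because >> on a negative signed value is implementation-defined in C; shr32 and sar32 are illustrative names.

```c
#include <assert.h>
#include <stdint.h>

/* shrl: logical right shift, vacated high bits are filled with 0. */
uint32_t shr32(uint32_t d, unsigned s)
{
    assert(s < 32);
    return d >> s;
}

/* sarl: arithmetic right shift, vacated high bits copy the sign bit. */
uint32_t sar32(uint32_t d, unsigned s)
{
    assert(s < 32);
    uint32_t shifted = d >> s;
    if (d & 0x80000000u)
        shifted |= ~(0xFFFFFFFFu >> s);   /* set the top s bits */
    return shifted;
}
```

For the bit pattern 0xFFFFFFF0 (-16 as a signed int), sar32 by 2 gives 0xFFFFFFFC (-4), while shr32 by 2 gives 0x3FFFFFFC, a large positive number.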
- Jump instructions
jX      Condition         Description
---------------------------------------------
jmp     1                 Unconditional
je      ZF                Equal/Zero
jne     ~ZF               Not Equal/Not Zero
js      SF                Negative
jns     ~SF               Nonnegative
jg      ~(SF^OF) & ~ZF    Greater (Signed)
jge     ~(SF^OF)          Greater or Equal (Signed)
jl      (SF^OF)           Less (Signed)
jle     (SF^OF) | ZF      Less or Equal (Signed)
ja      ~CF & ~ZF         Above (Unsigned)
jb      CF                Below (Unsigned)
- Status flags (set implicitly by most arithmetic and logic instructions):
CF = Carry Flag, meaningful for unsigned arithmetic
ZF = Zero Flag
SF = Sign Flag, meaningful for signed arithmetic
OF = Overflow Flag, signed overflow
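To see how the flags drive the signed and unsigned jumps, here is a sketch in C that models the flags left by a 32-bit compare (which computes a - b and discards the result). The type and function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool cf;  /* unsigned borrow out of the subtraction */
    bool zf;  /* result is zero */
    bool sf;  /* top bit of the result */
    bool of;  /* signed overflow */
} flags;

/* Flags after comparing a with b, i.e. after computing a - b. */
flags cmp32(uint32_t a, uint32_t b)
{
    uint32_t r = a - b;
    flags f;
    f.cf = a < b;
    f.zf = (r == 0);
    f.sf = (r >> 31) != 0;
    /* Signed overflow: a and b have different signs and the sign of
     * the result differs from the sign of a. */
    f.of = (((a ^ b) & (a ^ r)) >> 31) != 0;
    return f;
}

/* Conditions copied from the jump table. */
bool jg(flags f) { return !(f.sf ^ f.of) && !f.zf; }
bool jl(flags f) { return (f.sf ^ f.of) != 0; }
bool ja(flags f) { return !f.cf && !f.zf; }
bool jb(flags f) { return f.cf; }
```

Comparing 0xFFFFFFFF with 1 makes jl true (as signed, -1 < 1) and ja true (as unsigned, 4294967295 > 1): same bits, same flags, and the jump chosen decides the interpretation.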
Conditionals: x86-64 (Performance Improvement using `cmovle` instruction)
int absdiff(int x, int y)   |   # function prologue
{                           |   movl   %edi, %eax   # eax = x
    int res;                |   movl   %esi, %edx   # edx = y
    if (x > y)              |   subl   %esi, %eax   # eax = x-y
        res = x-y;          |   subl   %edi, %edx   # edx = y-x
    else                    |   cmpl   %esi, %edi   # x:y
        res = y-x;          |   cmovle %edx, %eax   # eax = edx, if x <= y
    return res;             |   ret
}
+ Jump Tables:
- Represent switch statements in C. Indirect addressing is used.
- The table itself is data, so it doesn't show up in the disassembled
code.
- Can inspect using GDB
gdb asm-cntl
- Preferred when the case values are dense (span a small range).
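A jump table can be imitated in C with an array of function pointers. This is only a sketch of the idea, not what a compiler literally emits; the handler names and the dispatch function are hypothetical.

```c
#include <stddef.h>

typedef int (*handler)(int);

/* One hypothetical handler per case label 0..2, plus a default. */
int case0(int x) { return x + 1; }
int case1(int x) { return x * 2; }
int case2(int x) { return x - 3; }
int case_default(int x) { return x; }

/* Behaves like:
 *   switch (n) { case 0: ...; case 1: ...; case 2: ...; default: ...; } */
int dispatch(unsigned n, int x)
{
    static const handler table[] = { case0, case1, case2 };
    size_t count = sizeof table / sizeof table[0];

    if (n >= count)            /* range check before indexing the table */
        return case_default(x);
    return table[n](x);        /* indirect call through the table */
}
```

The lookup costs one bounds check and one indexed load regardless of how many cases there are, which is why the technique pays off when the case values are dense.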
+ Program Memory Layout [from 5]
R = Readable; O = Read-Only; W = Writeable; X = Executable
     +-----------------------------------+
RW   | STACK                             |  |
     +-----------------------------------+  | grows down
     |                                   |  |
     |                                   |  v
     |                                   |
     |                                   |
     +-----------------------------------+
RW   | HEAP (Dynamic Data)               |  malloc()
     +-----------------------------------+
RW   | Static Data & Global Vars         |
     +-----------------------------------+
O    | String Literals                   |
     +-----------------------------------+
OX   | Program Instructions              |
     +-----------------------------------+
+ Stack Overview
+-----------+ <--+
|           |    |
|           |    |
|           |  Caller
+-----------+  Frame
|   Args    |    |
+-----------+    |
| Ret Addr  |    |
+-----------+ <--+
| old %ebp  |
+-----------+
|           |
| saved regs|
|     +     |
|   local   |
| variables |
+-----------+
|   Args    |
|   build   |   Stack Pointer
+-----------+ <---%esp
+ Register saving conventions (IA32/Linux):
- Caller saves the following registers if they are needed after a call
to another routine: %eax, %edx, %ecx
- Callee saves the following registers if it wants to use them: %ebx,
%esi, %edi
- %eax is used to return integer value.
- %esp, %ebp: special form of callee save; restored to original
values upon exit from procedure.
+ x86-64 Procedure Call Highlights
- Arguments (up to first 6) are in registers.
- Local vars also in registers if there is room.
- callq instruction stores 64-bit return address on stack.
- Frame pointer is optional:
- All references to stack frame can be made relative to %rsp. This
eliminates the need to update %rbp, which is then available for
general purpose use.
- Functions can access memory up to 128 bytes below %rsp: the "red zone"
(System V ABI)
- Can store some temps on stack without altering %rsp
- Registers still designated "caller saved" or "callee saved"
- Ideally, x86-64 functions need no stack frame at all.
- Just a return address is pushed on to the stack.
- A func does need a stack frame when it:
- Has too many local vars
- Has local vars that are arrays or structs
- Uses & operator to compute address of local var
- Calls another function that takes more than 6 args
- Needs to save the state of callee-save regs before modifying them.
Cache Performance Metrics
=========================
Miss Rate:
- Fraction of memory references NOT found in cache
(misses/accesses) = (1 - hit rate)
- Typical numbers = 3% to 10% for L1 cache
Hit Time:
- Time to deliver a line in the cache to the processor. Includes
time to determine if line is in the cache.
- Typical hit times: 1-2 clock cycles for L1 cache.
Miss Penalty:
- Additional time required because of a miss.
- Typically 50 - 200 cycles.
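The three metrics combine into the average memory access time, AMAT = hit time + miss rate * miss penalty. That formula is a standard textbook result rather than something stated in these notes; the function name below is illustrative.

```c
/* Average memory access time, in cycles:
 *   AMAT = hit_time + miss_rate * miss_penalty
 * miss_rate is a fraction between 0 and 1. */
double average_access_time(double hit_time, double miss_rate,
                           double miss_penalty)
{
    return hit_time + miss_rate * miss_penalty;
}
```

With a 1 cycle hit time and a 100 cycle miss penalty, a 3% miss rate gives about 4 cycles and a 1% miss rate gives about 2 cycles. Going from a 97% to a 99% hit rate halves the average access time, which is why miss rate is usually quoted instead of hit rate.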
Memory Hierarchy
================
registers
|
on-chip L1
cache (SRAM)
|
off/on-chip L2
cache (SRAM)
|
main memory
(DRAM)
|
local secondary
storage
|
remote disks
Example: Intel's Core i7 Cache Hierarchy
        Core 0                        Core 1
        ------                        ------
       registers                     registers
           |                             |
     +-----+-----+                 +-----+-----+
     |           |                 |           |
     v           v                 v           v
    L1          L1                L1          L1
  d-cache     i-cache           d-cache     i-cache
     |           |                 |           |
   L2 unified cache              L2 unified cache
           |                             |
           |                             |
       L3 unified cache, shared by all cores
                        |
                        |
                   Main Memory
+ L1 i-cache and d-cache:
32KB, 8-way set-associative, Access: 4 cycles
+ L2 unified cache:
256KB, 8-way set-associative, Access: 11 cycles
+ L3 unified cache:
8MB, 16-way, Access: 30-40 cycles
+ Block Size: 64 Bytes for all caches.
Cache Organization
==================
+ Where will data go in cache?
(a) Direct Mapped Caching: Data in memory is directly mapped to
location in cache.
(b) Fully Associative Caching: Data in memory can go anywhere in
cache.
(c) Set-Associative Caching: Cache is divided in to fixed number
of sets. Data can go anywhere in those sets.
+ Layout of Set-Associative Caching:
+-------------+------------+----+
|             |            |    |
+-------------+------------+----+
    (Tag)        (Index)   (Offset)
Offset - Location of the byte inside a cache block (line).
Index - Indicates which set.
Tag - Identifies which memory block is stored in the line (the remaining
high bits of the address).
+ K-way associativity means, in EACH set in the cache, there are K
entries (lines) to fill. So, number of sets in a Cache = (M/K) where M is
the total number of lines in the Cache.
+ General Cache Organization:
- There are S sets in a cache (2^s, where small 's' indicates
# of bits representing sets).
- Each set has E blocks (or cache lines) = 2^e where small 'e'
indicates # of bits representing blocks or cache lines.
- Each block or cache line has: V (a valid bit), Tag (the high address
bits identifying which memory block is cached), B bytes of data (2^b,
where b = # of offset bits).
- Size of Cache = S * E * B bytes
+ When processor gets an address, it does following:
- Locate Set.
- Check if any line in the Set has matching tag.
- Yes + Valid bit set = Hit
- Else miss.
- Locate data starting at offset.
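The lookup steps above start by cutting the address into tag, set index and block offset. A sketch in C, assuming s index bits and b offset bits as in the general organization; cache_addr, split_address and num_sets are illustrative names.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t tag;     /* compared against the tag stored in each line */
    uint64_t index;   /* selects the set */
    uint64_t offset;  /* selects the byte inside the block */
} cache_addr;

/* Number of sets, from Size = S * E * B. */
uint64_t num_sets(uint64_t cache_bytes, uint64_t ways, uint64_t block_bytes)
{
    return cache_bytes / (ways * block_bytes);
}

/* s = number of set-index bits, b = number of block-offset bits. */
cache_addr split_address(uint64_t addr, unsigned s, unsigned b)
{
    assert(s + b < 64);
    cache_addr c;
    c.offset = addr & ((UINT64_C(1) << b) - 1);
    c.index  = (addr >> b) & ((UINT64_C(1) << s) - 1);
    c.tag    = addr >> (s + b);
    return c;
}
```

For the Core i7 L1 cache listed earlier (32KB, 8-way, 64 byte blocks), num_sets gives 64 sets, so s = 6 and b = 6. Address 0x12345 then splits into tag 0x12, set index 13 and offset 5.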
+ How are writes handled in Cache?
On a write hit:
- Write-Through: write immediately to main memory.
- Write-Back: Defer write to main memory until line is evicted.
Need a dirty bit to indicate if line is different than
main memory.
On a write miss:
- Write-Allocate: Load data in to Cache and update line in cache.
Good if more writes to the location follow.
- No-Write-Allocate: Write straight to memory without loading the line
in to Cache.
+ Types of Cache Misses: CCC
- Compulsory
- Conflict
- Capacity
+ Exceptions: Synchronous and Asynchronous.
- Asynchronous: Occurs due to events outside processor control.
Interrupts are an example.
- Synchronous:
- Traps: Intentional transfer of control to the OS to perform
some function. Ex: Sys calls, breakpoints.
- Faults: Unintentional but possibly recoverable. Ex: Page
Faults (recoverable), Segmentation Faults (unrecov),
Divide by Zero (unrecov).
- Aborts: Unintentional and unrecoverable.
+ MMU (Memory Management Unit) uses PTBR (Page Table Base Register) to
locate the start of Page Table. MMU uses Page Table to look up and convert
Virtual Addresses to Physical Addresses.
+ Page Tables are stored in memory (and may be cached). Walking them on
every access slows down address translation. Hence the TLB (Translation
Lookaside Buffer), which caches recent translations to speed up Page
Table lookups. On Intel Core2 Duo, there can be 128 or 256 entries in the
TLB.
Dynamic Memory Allocation
=========================
+ sbrk() syscall is used by the malloc/calloc family of routines to grow
or shrink the heap.
+ What info is needed by memory allocator to keep track of blocks in heap?
+ When calling free(void *p), how does free know the size to free?
- Keep size info in previous word. Standard practice.
+ How to keep track of free blocks?
Method 1: Implicit Lists: Linked list of blocks (first word being
length) - links ALL blocks, even the used blocks.
Method 2: Explicit Lists: Linked list of free blocks.
Method 3: Segregated free lists: Different free lists for
different size classes.
Method 4: Blocks sorted by size: Can use a balanced tree (R-B
Tree) with pointers within each free block, and the length used as
key.
+ How to keep track if block is allocated or not?
- Standard trick to save memory. If blocks are aligned, the first
word is size (typically, multiples of 8 bytes).
- The last bit of these sizes are always 0. Use it to determine
allocated/free flag.
- When reading size, must remember to mask out this bit.
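The size/flag trick can be sketched in C as below, assuming block sizes are multiples of 8 so the low bits of the size word are free. The function names are made up for illustration and do not belong to any real allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Combine a block size and the allocated flag into one header word.
 * Bit 0 holds the flag; the size must be a multiple of 8. */
size_t pack_header(size_t size, bool allocated)
{
    assert(size % 8 == 0);
    return size | (allocated ? 1u : 0u);
}

/* Mask out the three spare low bits to recover the size. */
size_t header_size(size_t header)
{
    return header & ~(size_t)7;
}

bool header_allocated(size_t header)
{
    return (header & 1) != 0;
}
```

A 24 byte allocated block gets header value 25; header_size(25) returns 24 and header_allocated(25) returns true.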
+ How to find a free block (using Implicit Lists)?
- First fit: Search the list from beginning and return first
match.
- Drawbacks: Linear time, 'Splinters' at the beginning.
- Next fit: Start from where you stopped last time. Faster than
'First Fit' as it avoids scanning unhelpful blocks.
- Drawback: Some research suggests that fragmentation is worse.
- Best fit: Choose best (closest match in size) free block. This
keeps fragmentation small.
- Drawback: Slower.
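A sketch of first fit over an implicit list, in C. It assumes the heap is an array of 8 byte words, each block starts with a header word holding (size in bytes | allocated bit), sizes include the header, and a header with size 0 marks the end. The layout, first_fit and example_heap are assumptions for illustration, not a real allocator.

```c
#include <stddef.h>
#include <stdint.h>

#define WORD 8u   /* bytes per heap word; block sizes are multiples of it */

/* Example heap: [16 alloc][8 free][24 free][32 alloc][end marker]. */
const uint64_t example_heap[] = {
    16 | 1, 0,          /* words 0-1: allocated, 16 bytes */
    8,                  /* word  2  : free, 8 bytes       */
    24, 0, 0,           /* words 3-5: free, 24 bytes      */
    32 | 1, 0, 0, 0,    /* words 6-9: allocated, 32 bytes */
    0                   /* word 10  : end marker          */
};

/* Returns the word index of the first free block of at least `need`
 * bytes (header included), or -1 if no block fits. */
long first_fit(const uint64_t *heap, size_t need)
{
    size_t i = 0;
    for (;;) {
        uint64_t size = heap[i] & ~(uint64_t)7;   /* mask out flag bits */
        int allocated = (int)(heap[i] & 1);
        if (size == 0)
            return -1;                            /* reached end marker */
        if (!allocated && size >= need)
            return (long)i;
        i += size / WORD;                         /* hop to next header */
    }
}
```

On example_heap, a request for 16 bytes skips the allocated block at word 0 and the too-small free block at word 2, and returns 3. The walk visits every block, used or free, which is the linear-time drawback noted above.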
+ Common C related memory pitfalls and perils:
1 Dereferencing bad pointers.
2 Reading uninitialized memory.
3 Overwriting memory.
- bad memory assignments
- Off-by-one error.
- Not checking max string size (classic buffer overflow).
- Misunderstanding pointer arithmetic.
- Referencing a pointer instead of object it points to.
4 Referencing non-existent variables.
- Local variables disappear after returning from
functions.
5 Freeing blocks multiple times.
6 Referencing freed blocks.
7 Failing to Free Blocks (Memory Leaks)
- Returning without freeing.
- Freeing only part of data structure.
3 Overwriting memory
Ex:
--
int **p;
p = (int **) malloc (N * sizeof(int));
// Incorrect. It should be sizeof(int *). sizeof(int) is typically
// 4 bytes, so only N*4 bytes are allocated.
for (i=0; i<N; i++)
    p[i] = (int *) malloc (M * sizeof(int));
// On a 64-bit m/c each p[i] is an 8 byte pointer, so the loop
// writes N*8 bytes into the N*4 byte block: an overwrite of memory.
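One way to write the allocation so the element size cannot get out of step with the pointer type is to take sizeof of the thing pointed to. A sketch; alloc_matrix and free_matrix are illustrative names.

```c
#include <stdlib.h>

/* Allocates n row pointers, each pointing at m ints.
 * Returns NULL if any allocation fails. */
int **alloc_matrix(size_t n, size_t m)
{
    int **p = malloc(n * sizeof *p);        /* sizeof(int *), not sizeof(int) */
    if (p == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        p[i] = malloc(m * sizeof *p[i]);    /* sizeof(int) */
        if (p[i] == NULL) {                 /* undo the partial allocation */
            while (i > 0)
                free(p[--i]);
            free(p);
            return NULL;
        }
    }
    return p;
}

/* Frees the rows first, then the row-pointer array. */
void free_matrix(int **p, size_t n)
{
    if (p == NULL)
        return;
    for (size_t i = 0; i < n; i++)
        free(p[i]);
    free(p);
}
```

Freeing the rows before the pointer array also avoids pitfall 7 (freeing only part of a data structure).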
+ Data Representation in Java:
- Chars are 2 bytes (unlike 1 byte in C).
- Strings are NOT null terminated.
- Beginning of the string has a length field.
- Same as above for Arrays. First field is length of Array.
+ Data structures (Objects) in Java
C Java
--- ----
struct rec { class Rec {
int i; int i;
int a[3]; int[] a = new int[3];
struct rec *p; Rec p;
}; };
Memory Layout C Memory Layout Java
--------------- ------------------
+---+---------------+----+ +----+----+----+
| i | a | p | | i | a | p -|--->
+---+---------------+----+ +----+----+----+
|
|
v
+----+----------------+
| 3 | int [3] |
+----+----------------+
+ References in Java (pointers equivalent) can only point to an object.
+ And it can point only to first element (not to middle of it).
Reference:
[1]
http://www.youtube.com/watch?v=zRqLU_AxNdU&feature=share&list=PLKK11Ligqiti8g3gWRtMjMgf1KoKDOvME&index=3
[2]
http://www.youtube.com/watch?v=nDj35pMLBQE&list=PLKK11Ligqiti8g3gWRtMjMgf1KoKDOvME&index=3
[3] Complete code set for IA32
http://www.jegerlehner.ch/intel/IntelCodeTable.pdf
[4] Very good description of instructions
http://web.itu.edu.tr/kesgin/mul06/intel/index.html
[5] Notes from Hardware Software Interface course on Coursera.
[6] http://webpages.uncc.edu/abw/ITCS3182F09/slides3.pdf
@@@@@@@@@@@@@@@@@@@@@ ENDIANNESS EXPLAINED @@@@@@@@@@@@@@@@@@@@@@@@@
Consider four byte number: 0x0a0b0c0d; left being smallest memory address
and right being highest.
Number: 0x0a0b0c0d. Here LSB is 0x0d and MSB is 0x0a
Memory Address: 0x100 0x101 0x102 0x103
+--------+--------+--------+--------+
| | | | |
+--------+--------+--------+--------+
Little-Endian: In LE, LSB is in lowest memory address and MSB in highest
memory address. So, the bytes are stored in the order 0d 0c 0b 0a.
Ex: Intel x86
Number: 0x0a0b0c0d. Here LSB is 0x0d and MSB is 0x0a
Memory Address: 0x100 0x101 0x102 0x103
+--------+--------+--------+--------+
| 0x0d | 0x0c | 0x0b | 0x0a |
+--------+--------+--------+--------+
code:
int i = 0x0a0b0c0d;
char *c = (char *) &i;
printf ("0x%x\n", *c);
// Output is 0xd = little-endian.
Big-Endian (aka Network Byte Order): In BE, MSB is in lowest memory address
and LSB in highest. It looks the way numbers are written down on paper.
So, the bytes are stored in the order 0a 0b 0c 0d.
Ex: Motorola 68k. Data transfer on networks uses NBO.
Number: 0x0a0b0c0d.
Memory Address: 0x100 0x101 0x102 0x103
+--------+--------+--------+--------+
| 0x0a | 0x0b | 0x0c | 0x0d |
+--------+--------+--------+--------+
code:
int i = 0x0a0b0c0d;
char *c = (char *) &i;
printf ("0x%x\n", *c);
// Output is 0xa = big-endian.
Host-Byte Order: Ordering on the host machine. If the processor is x86,
it's little-endian; if it's Motorola's 68k, it's big-endian.
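The pointer-cast check used above can be packaged as a runtime test, together with a byte swap for converting between host and network order. A sketch in C; the function names are illustrative, and real code would normally call htonl()/ntohl() instead.

```c
#include <stdint.h>
#include <string.h>

/* Returns the byte stored at the lowest address of a 32-bit value. */
unsigned char lowest_address_byte(uint32_t v)
{
    unsigned char first;
    memcpy(&first, &v, 1);
    return first;
}

/* 1 on a little-endian host, 0 otherwise. */
int is_little_endian(void)
{
    return lowest_address_byte(0x0a0b0c0du) == 0x0d;
}

/* Reverses the byte order of a 32-bit value. */
uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Host order to big-endian (network) order. */
uint32_t to_network_order(uint32_t v)
{
    return is_little_endian() ? swap32(v) : v;
}
```

swap32(0x0a0b0c0d) is 0x0d0c0b0a. After to_network_order(), the byte at the lowest address is 0x0a on both little-endian and big-endian hosts.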
Mixed-Endian (or Middle-Endian): Ordering of bytes within 16-bit word may differ
from ordering of 16-bit words within 32-bit word.
Bi-Endian: Architectures allowing switchable endianness in data segment, code
segment or both. 'Bi-Endian' refers to how the processor accesses data.
Instruction access (fetching instruction words) is usually fixed endian.
Intel's Itanium CPU allows bi-endian data and instruction access.
Ex: ARM version 3 and above, PowerPC
Reference:
[1] http://en.wikipedia.org/wiki/Endianness
@@@@@@@@@@@@@@@@ C TO ASSEMBLY CODE GENERATION & OTHER TOOLS @@@@@@@@@@@@@@@@@@
$ gcc -c -S test.c
/* generates a test.s assembly file */
$ gcc -g -c test.c
/* generates test.o object file, with debug symbols present (-g flag) */
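$ objdump -d <object or exec file>
/* disassembles the executable sections */
$ readelf -a <elf-file>
/* displays ELF headers, sections and symbols */
$ nm <object or exec file>
/* lists symbols */
$ strings <file>
/* prints the printable character sequences found in the file */
$ size <object or exec file>
/* lists section sizes (text, data, bss) */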
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ readelf commands @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
/* header info */
$ readelf --file-header <elf-file>
Ex:
$ readelf --file-header vim.core
ELF Header:
Magic: 7f 45 4c 46 02 01 01 09 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - FreeBSD
ABI Version: 0
Type: CORE (Core file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x0
Start of program headers: 64 (bytes into file)
Start of section headers: 0 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 79
Size of section headers: 64 (bytes)
Number of section headers: 0
Section header string table index: 0
/* notes sections of elf-files have more info about the file (like .core files) */
$ readelf --notes <elf-file>
Ex:
$ readelf --notes vim.core
Notes at offset 0x00001188 with length 0x0000ad18:
Owner Data size Description
FreeBSD 0x00000078 NT_PRPSINFO (prpsinfo structure)
FreeBSD 0x000000e0 NT_PRSTATUS (prstatus structure)
FreeBSD 0x00000200 NT_FPREGSET (floating point registers)
FreeBSD 0x00000018 NT_THRMISC (thrmisc structure)
FreeBSD 0x000000e0 NT_PRSTATUS (prstatus structure)
FreeBSD 0x00000200 NT_FPREGSET (floating point registers)
FreeBSD 0x00000018 NT_THRMISC (thrmisc structure)
FreeBSD 0x00000884 NT_PROCSTAT_PROC (proc data)
FreeBSD 0x0000156c NT_PROCSTAT_FILES (files data)
FreeBSD 0x00008574 NT_PROCSTAT_VMMAP (vmmap data)
FreeBSD 0x00000008 NT_PROCSTAT_GROUPS (groups data)
FreeBSD 0x00000006 NT_PROCSTAT_UMASK (umask data)
FreeBSD 0x000000d4 NT_PROCSTAT_RLIMIT (rlimit data)
FreeBSD 0x00000008 NT_PROCSTAT_OSREL (osreldate data)
FreeBSD 0x0000000c NT_PROCSTAT_PSSTRINGS (ps_strings data)
FreeBSD 0x00000114 NT_PROCSTAT_AUXV (auxv data)
/* segments info (program headers). Each segment describes a chunk of the
 * file that is mapped into the memory of the process/binary. */
$ readelf --segments <elf-file>
@@@@@@@@@@@@@@@@@@@@@@@ UDEMY's ASSEMBLY ADVENTURES @@@@@@@@@@@@@
FASM (Flat Assembler) Instructions
---
First Instructions
mov dst, src ; move contents of source to destination
add dst, src ; dst = dst + src
sub dst, src ; dst = dst - src = dst + (-src), 2's complement
; in 'add' and 'sub', dst can wrap around. Wrap around is done
; as per arg size. So, if there's a carry over, it's dropped.
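The wrap around can be reproduced in C with fixed-width unsigned types. A sketch for 8-bit operands; add8, sub8 and sub8_via_add are illustrative names.

```c
#include <stdint.h>

/* 8-bit add/sub as the CPU does them: the result is kept modulo 2^8 and
 * any carry or borrow out of bit 7 is dropped. */
uint8_t add8(uint8_t dst, uint8_t src) { return (uint8_t)(dst + src); }
uint8_t sub8(uint8_t dst, uint8_t src) { return (uint8_t)(dst - src); }

/* sub is the same as adding the 2's complement (invert, then add 1). */
uint8_t sub8_via_add(uint8_t dst, uint8_t src)
{
    return add8(dst, (uint8_t)(~src + 1));
}
```

add8(250, 10) is 4, because 260 does not fit in 8 bits and the carry is dropped. sub8(5, 10) and sub8_via_add(5, 10) both give 251, which is -5 when read as a signed byte.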
Arithmetic Instructions
inc dst             ; increment dst by 1. Wrap around can happen.
dec dst             ; decrement dst by 1. Wrap around can happen
                    ; (0 becomes all 1s).
mul arg             ; multiply numbers (unsigned numbers)
The implicit operand and the result depend on the size of arg:
ax = al * arg       ; if arg is of size 8 bits
dx:ax = ax * arg    ; if arg is of size 16 bits
edx:eax = eax * arg ; if arg is of size 32 bits
rdx:rax = rax * arg ; if arg is of size 64 bits
dx:ax means the concatenation of the bits in the dx and ax registers
Examples:
mul ecx ; multiply eax and ecx and store in edx:eax
mul si ; dx:ax = ax * si
mul al ; ax = al * al
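The 32-bit form of mul can be modeled in C by widening to 64 bits and splitting the product. A sketch; mul_result and mul32 are illustrative names.

```c
#include <stdint.h>

typedef struct {
    uint32_t edx;   /* high 32 bits of the product */
    uint32_t eax;   /* low 32 bits of the product */
} mul_result;

/* Models "mul arg" with a 32-bit operand: edx:eax = eax * arg (unsigned). */
mul_result mul32(uint32_t eax, uint32_t arg)
{
    uint64_t product = (uint64_t)eax * arg;   /* widen before multiplying */
    mul_result r;
    r.edx = (uint32_t)(product >> 32);
    r.eax = (uint32_t)product;
    return r;
}
```

mul32(0xFFFFFFFF, 2) yields edx = 1 and eax = 0xFFFFFFFE: the product 0x1FFFFFFFE needs 33 bits, and the part that does not fit in eax lands in edx.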