1
Practical Foundations of Debugging
Chapter 2
  • Number representations and pointers
2
Review – disassembling a C program
  • int a, b;


  • int _tmain(int argc, _TCHAR* argv[])
  • {
  • a = 1;           // mov [a], 1
  • b = 1;           // mov [b], 1


  • b = b + a;       // mov eax, [a]
  •                  // add [b], eax
  • ++a;             // inc eax
  •                  // mov [a], eax
  • b = a * b;       // imul [b]
  •                  // mov [b], eax


  • // results: [a] = 2 and [b] = 4


  • return 0;
  • }
3
WinDbg disassembly output – Debug executable
4
WinDbg disassembly output – Release executable
  • ArithmeticProjectC!main:
  • 00401000 c705c472400002000000 mov dword ptr [ArithmeticProjectC!a (004072c4)],0x2
  • 0040100a c705c072400004000000 mov dword ptr [ArithmeticProjectC!b (004072c0)],0x4



  • mov [a], 2   ; [a] := 2
  • mov [b], 4   ; [b] := 4
5
Numbers and their representations
  • A number of stones
  • A person who can count only up to 3



  • The last picture is a representation (notation) of the number of stones


6
Decimal representation (base 10)
  • 12 stones
  • $12_{dec} = 1 \cdot 10 + 2$   or   $1 \cdot 10^{1} + 2 \cdot 10^{0}$


  • 123 stones
  • $123_{dec} = 1 \cdot 100 + 2 \cdot 10 + 3$   or   $1 \cdot 10^{2} + 2 \cdot 10^{1} + 3 \cdot 10^{0}$


  • $N_{dec} = a_n \cdot 10^{n} + a_{n-1} \cdot 10^{n-1} + \dots + a_2 \cdot 10^{2} + a_1 \cdot 10^{1} + a_0 \cdot 10^{0}$, where $0 \le a_i \le 9$


  • $N_{dec} = \sum_{i=0}^{n} a_i \cdot 10^{i}$
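  • The sum above can be evaluated mechanically. The following is a minimal sketch, not part of the original project: the function name from_digits and its parameters are hypothetical, and the digits are assumed to be given most significant first.

```c
#include <assert.h>
#include <stddef.h>

/* Evaluate N = sum of a_i * base^i for digits given most significant first.
   Hypothetical helper for illustration only. */
unsigned long from_digits(const unsigned *digits, size_t count, unsigned base)
{
    unsigned long value = 0;

    assert(base >= 2);
    for (size_t i = 0; i < count; ++i) {
        assert(digits[i] < base);          /* each a_i satisfies 0 <= a_i < base */
        value = value * base + digits[i];  /* Horner's rule: shift left one place, add digit */
    }
    return value;
}
```

  • With the digits 1, 2, 3 and base 10 this yields 123; the same function works for the other bases on the following slides because only the base changes.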
7
Ternary representation (base 3)
  • 12 stones
  • 110 in ternary representation (notation)
  • $12_{dec} = 1 \cdot 3^{2} + 1 \cdot 3^{1} + 0 \cdot 3^{0}$


  • $N_{dec} = a_n \cdot 3^{n} + a_{n-1} \cdot 3^{n-1} + \dots + a_2 \cdot 3^{2} + a_1 \cdot 3^{1} + a_0 \cdot 3^{0}$, where $a_i$ = 0 or 1 or 2


  • $N_{dec} = \sum_{i=0}^{n} a_i \cdot 3^{i}$
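  • Going the other way, the digits a_i can be found by repeated division: the remainder gives a_0, the remainder of the quotient gives a_1, and so on. This is a sketch under assumptions: to_base and its parameters are hypothetical names, and bases 2 to 16 are supported.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Write `value` in the given base (2..16) into `out` as a NUL-terminated
   string. Returns the number of digits, or 0 if `out` is too small.
   Hypothetical helper for illustration only. */
size_t to_base(unsigned long value, unsigned base, char *out, size_t size)
{
    static const char symbols[] = "0123456789ABCDEF";
    char reversed[sizeof(unsigned long) * CHAR_BIT];  /* enough even for base 2 */
    size_t count = 0;

    assert(base >= 2 && base <= 16);
    do {
        reversed[count++] = symbols[value % base];  /* a_0 first, then a_1, ... */
        value /= base;
    } while (value != 0);

    if (count + 1 > size)
        return 0;
    for (size_t i = 0; i < count; ++i)
        out[i] = reversed[count - 1 - i];           /* most significant digit first */
    out[count] = '\0';
    return count;
}
```

  • For 12 stones and base 3 the remainders are 0, 1, 1, so the buffer receives 110, matching the slide.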



8
Binary representation (base 2)
  • 12 stones
  • 1100 in binary representation (notation)
  • $12_{dec} = 1 \cdot 2^{3} + 1 \cdot 2^{2} + 0 \cdot 2^{1} + 0 \cdot 2^{0}$


  • $N_{dec} = a_n \cdot 2^{n} + a_{n-1} \cdot 2^{n-1} + \dots + a_2 \cdot 2^{2} + a_1 \cdot 2^{1} + a_0 \cdot 2^{0}$, where $a_i$ = 0 or 1


  • $N_{dec} = \sum_{i=0}^{n} a_i \cdot 2^{i}$
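  • In base 2 each digit a_i is a single bit, so it can be read directly with a shift and a mask instead of a division. A minimal sketch, with to_binary as a hypothetical name and the caller assumed to supply a buffer of at least bits + 1 characters:

```c
#include <assert.h>
#include <limits.h>

/* Fill `out` with the lowest `bits` binary digits of `value`, most
   significant first, followed by NUL. `out` must hold bits + 1 chars.
   Hypothetical helper for illustration only. */
void to_binary(unsigned value, unsigned bits, char *out)
{
    assert(bits >= 1 && bits <= sizeof(unsigned) * CHAR_BIT);
    for (unsigned i = 0; i < bits; ++i) {
        unsigned position = bits - 1 - i;                  /* the exponent of 2 */
        out[i] = ((value >> position) & 1u) ? '1' : '0';   /* a_position */
    }
    out[bits] = '\0';
}
```

  • Asking for 4 bits of the value 12 produces 1100.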


9
Hexadecimal representation (base 16)
  • 12 stones
  • $12_{dec}$ = C in hexadecimal representation (notation)
  • 123 stones
  • $123_{dec}$ = 7B in hexadecimal representation (notation)


  • $123_{dec} = 7 \cdot 16^{1} + 11 \cdot 16^{0}$
  • Digits: …, 8, 9, A, B, C, D, E, F
  • $N_{dec} = \sum_{i=0}^{n} a_i \cdot 16^{i}$
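  • In C, the standard library already converts to and from hexadecimal notation. The wrappers below are only a sketch (to_hex and from_hex are hypothetical names); the calls they rely on, snprintf with the %X conversion and strtoul with base 16, are standard.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Format `value` in uppercase hexadecimal. Returns the number of
   characters snprintf reports, excluding the terminating NUL. */
int to_hex(unsigned value, char *out, size_t size)
{
    assert(out != NULL && size > 0);
    return snprintf(out, size, "%X", value);
}

/* Parse a hexadecimal string such as "7B" back into a number. */
unsigned long from_hex(const char *text)
{
    assert(text != NULL);
    return strtoul(text, NULL, 16);
}
```

  • Formatting 123 gives the two characters 7B, and parsing 7B gives 123 again.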
10
Why is hexadecimal used?
  • 110001010011  (binary notation)
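  • $3155_{dec} = 1 \cdot 2^{11} + 1 \cdot 2^{10} + 0 \cdot 2^{9} + 0 \cdot 2^{8} + 0 \cdot 2^{7} + 1 \cdot 2^{6} + 0 \cdot 2^{5} + 1 \cdot 2^{4} + 0 \cdot 2^{3} + 0 \cdot 2^{2} + 1 \cdot 2^{1} + 1 \cdot 2^{0}$
  • 1100 0101 0011 is C53 in hexadecimal, one hexadecimal digit per group of four bits:
  • 1100 = $12_{dec}$ = $C_{hex}$,   0101 = $5_{dec}$ = $5_{hex}$,   0011 = $3_{dec}$ = $3_{hex}$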
  • In WinDbg memory addresses are always displayed in hexadecimal notation
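  • The grouping into four-bit chunks can be sketched in C as follows. This is an illustration only: binary_to_hex is a hypothetical name, the input is assumed to be a non-empty string of '0' and '1' characters, and groups are counted from the right so that a short leading group is handled too.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Convert a string of binary digits into hexadecimal, one hex digit per
   group of four bits. `out` must hold (strlen(bits) + 3) / 4 + 1 chars.
   Hypothetical helper for illustration only. */
void binary_to_hex(const char *bits, char *out)
{
    static const char symbols[] = "0123456789ABCDEF";
    size_t length = strlen(bits);
    size_t written = 0;
    unsigned nibble = 0;

    assert(length > 0);
    for (size_t i = 0; i < length; ++i) {
        assert(bits[i] == '0' || bits[i] == '1');
        nibble = nibble * 2 + (unsigned)(bits[i] - '0');
        /* A group is complete when the number of bits still to the right
           is a multiple of four. */
        if ((length - 1 - i) % 4 == 0) {
            out[written++] = symbols[nibble];
            nibble = 0;
        }
    }
    out[written] = '\0';
}
```

  • For the input 110001010011 the groups are 1100, 0101 and 0011, and the output is C53.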
11
Binary <-> Decimal <-> Hexadecimal
  • Binary   Decimal   Hexadecimal
  • 0000     0         0
  • 0001     1         1
  • 0010     2         2
  • 0011     3         3
  • 0100     4         4
  • 0101     5         5
  • 0110     6         6
  • 0111     7         7
  • 1000     8         8
  • 1001     9         9
  • 1010     10        A
  • 1011     11        B
  • 1100     12        C
  • 1101     13        D
  • 1110     14        E
  • 1111     15        F
12
Pointers (Picture 1)
  • A pointer is a memory cell or a register that contains the address of another memory cell. A pointer stored in memory has its own address (like any memory cell)
  • Another name: indirect address (vs. direct address, the address of the memory cell itself). A pointer adds a level of indirection
  • Levels of indirection: pointer to a pointer
  •    (A memory cell or register that contains the address of another memory cell, which in turn contains the address of another memory cell)
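  • Two levels of indirection can be sketched in C as follows. The function name store_through and its parameter names are hypothetical; the point is only that each * follows one address.

```c
#include <assert.h>
#include <stddef.h>

/* Follow two levels of indirection and store `value` in the final cell.
   Hypothetical helper for illustration only. */
void store_through(int **pointer_to_pointer, int value)
{
    assert(pointer_to_pointer != NULL && *pointer_to_pointer != NULL);

    int *pointer = *pointer_to_pointer;  /* first step: read the address of the cell */
    *pointer = value;                    /* second step: write to the cell itself */
}
```

  • If cell is an int, p holds the address of cell, and pp holds the address of p, then store_through(pp, 7) changes cell to 7 while p and pp keep their contents.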
13
Picture 1
14
“Pointers” Project – Memory Layout and Registers (Picture 2)
  • Two memory addresses (locations):
  • “a” and “b”. We can think of “a” and “b” as names of addresses (locations)
  • Notation [a] means the contents at the memory address (location) “a”
  • Registers EAX and EBX act as pointers to “a” and “b”: they contain the addresses of “a” and “b”
  • Notation [EAX] means the contents of the memory cell whose address is in EAX
  • In C we declare pointer variables as:
  • int *a, *b;   // here “a” and “b” are themselves pointers
15
Picture 2
16
“Pointers” Project - Calculations
  • eax := address a
  • [eax] := 1                 ; [a] = 1
  • ebx := address b
  • [ebx] := 1                 ; [b] = 1
  • [ebx] := [ebx] + [eax]     ; [b] = 2
  • [ebx] := [ebx] * 2         ; [b] = 4
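  • The same calculation can be written in C with two pointer parameters standing in for eax and ebx. This is a sketch, not the project's source: the function name pointers_project and the parameter names pa and pb are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Mirror of the pseudo-code: pa plays the role of eax, pb of ebx.
   Hypothetical function for illustration only. */
void pointers_project(int *pa, int *pb)
{
    assert(pa != NULL && pb != NULL);

    *pa = 1;          /* [eax] := 1              ; [a] = 1 */
    *pb = 1;          /* [ebx] := 1              ; [b] = 1 */
    *pb = *pb + *pa;  /* [ebx] := [ebx] + [eax]  ; [b] = 2 */
    *pb = *pb * 2;    /* [ebx] := [ebx] * 2      ; [b] = 4 */
}
```

  • Called with the addresses of two int variables, it leaves 1 in the first and 4 in the second.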


17
Using pointers to assign numbers to memory cells
  • eax := address a
  • [eax] := 1
  • means: use the contents of eax as the address of a memory cell and assign a value to that memory cell


  • In C, this is called a “pointer”, and we write:
  • int *a;       // definition of a pointer (it must hold a valid address before it is used)
  • *a = 1;       // dereference the pointer to get the memory cell and assign a value to it


  • In Assembler we write:
  • lea    eax, a          ; load the address a into eax
  • mov [eax], 1        ; use eax as a pointer


  • In WinDbg disassembly output we see:
  • 00411a2e 8d0504854200     lea     eax,[PointersProject!a (00428504)]
  • 00411a34 c60001           mov     byte ptr [eax],0x1
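  • A compilable C counterpart needs the pointer to hold a real address before the store. The sketch below makes that step explicit; store_one and assign_demo are hypothetical names, and the local variable cell stands in for the memory cell “a”.

```c
#include <assert.h>
#include <stddef.h>

/* Store 1 into the cell that `pointer` refers to. */
void store_one(int *pointer)
{
    assert(pointer != NULL);  /* the pointer must hold a valid address */
    *pointer = 1;             /* counterpart of: mov [eax], 1 */
}

/* Load an address into a pointer, store through it, and return the
   contents of the cell afterwards. */
int assign_demo(void)
{
    int cell = 0;             /* stands in for the memory cell "a" */
    int *pointer = &cell;     /* counterpart of: lea eax, a */

    store_one(pointer);
    return cell;
}
```

  • assign_demo returns 1, because the store went through the pointer into cell.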
18
“Arithmetical” Project – Calculations (Picture 3)
  • eax := address a
  • [eax] := 1                 ; [a] = 1
  • ebx := address b
  • [ebx] := 1                 ; [b] = 1
  • [ebx] := [ebx] + [eax]     ; [b] = 2
  • [ebx] := [ebx] * 2         ; [b] = 4
19
Picture 3
20
Adding numbers using pointers
  • [ebx] := [ebx] + [eax]
  • [eax] and [ebx] mean the contents of the memory cells whose addresses (locations) are stored in eax and ebx
  • In C we write:
  • *b = *b + *a;
  • In Assembler we use the add instruction
  • We cannot use two memory operands in one step (instruction), so this is not allowed:
  • add [ebx], [eax]
  • We can only use add [ebx], register, so one operand is loaded into a register first:
  • register := [eax]
  • [ebx] := [ebx] + register
  • In Assembler we write:
  • mov eax, [eax]
  • add [ebx], eax
  • In WinDbg disassembly output we see:
  • 00411a40 8b00             mov     eax,[eax]
  • 00411a42 0103             add     [ebx],eax
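  • The load-then-add pattern can be mirrored in C with an explicit temporary. This is only a sketch: add_through_pointers is a hypothetical name, and the local variable reg models the register, since C itself has no such restriction on operands.

```c
#include <assert.h>
#include <stddef.h>

/* Add the cell that `pa` points to into the cell that `pb` points to,
   going through a temporary the way the assembly goes through eax. */
void add_through_pointers(int *pb, const int *pa)
{
    assert(pa != NULL && pb != NULL);

    int reg = *pa;  /* mov eax, [eax] : load one operand into a register */
    *pb += reg;     /* add [ebx], eax : memory plus register */
}
```

  • With both cells holding 1, the destination cell becomes 2 and the source cell is unchanged. Note that in the assembly version eax no longer holds the address of “a” after the load; it holds the value 1.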


21
“Arithmetical” Project – Calculations (Picture 4)
  • eax := address a
  • [eax] := 1                 ; [a] = 1
  • ebx := address b
  • [ebx] := 1                 ; [b] = 1
  • [ebx] := [ebx] + [eax]     ; [b] = 2
  • [ebx] := [ebx] * 2         ; [b] = 4
22
Picture 4
23
Multiplying numbers using pointers
  • [ebx] := [ebx] * 2
  • Means: multiply the contents of the memory cell whose address is stored in ebx by 2


  • In C we write:
  • *b = *b * 2; or *b *= 2;


  • In Assembler we use the imul instruction (integer multiply)
  • imul [ebx]
  • Means eax := [ebx] * eax, so we have to put 2 into eax first. We already have 1 in eax, so we use inc eax before imul to increment it by 1
  • The result of the multiplication is put into eax only, not into [ebx], so a separate mov stores it back. The operand here is a byte (see byte ptr in the disassembly), so the product fits in ax, the low half of eax; a 32-bit operand would produce a 64-bit product in edx:eax


  • In WinDbg disassembly output we see:
  • 00411a44 40               inc     eax
  • 00411a45 f62b             imul    byte ptr [ebx]
  • 00411a47 8903             mov     [ebx],eax
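  • The three instructions can be modelled in C to make the data flow visible. This is a rough sketch of the documented behaviour of the one-operand byte form of imul (ax := al * operand), not an emulator: flags are ignored, imul_byte and multiply_step are hypothetical names, and the conversion of values above 127 to int8_t is implementation-defined in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of "imul byte ptr [...]": the signed product of al and the byte
   operand replaces ax; the upper 16 bits of eax are left unchanged. */
uint32_t imul_byte(uint32_t eax, int8_t operand)
{
    int8_t al = (int8_t)(eax & 0xFFu);
    int16_t product = (int16_t)(al * operand);
    return (eax & 0xFFFF0000u) | (uint16_t)product;
}

/* Model of: inc eax ; imul byte ptr [ebx] ; mov [ebx], eax */
uint32_t multiply_step(uint32_t eax, uint32_t *cell)
{
    assert(cell != NULL);

    eax += 1;                                       /* inc eax */
    eax = imul_byte(eax, (int8_t)(*cell & 0xFFu));  /* reads only the low byte of the cell */
    *cell = eax;                                    /* mov [ebx], eax */
    return eax;
}
```

  • Starting with 1 in eax and 2 in the cell, the step leaves 4 in both, which matches [b] = 4 in the calculation.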
24
“Arithmetical” Project – Calculations (Picture 5)
  • eax := address a
  • [eax] := 1                 ; [a] = 1
  • ebx := address b
  • [ebx] := 1                 ; [b] = 1
  • [ebx] := [ebx] + [eax]     ; [b] = 2
  • [ebx] := [ebx] * 2         ; [b] = 4
25
Picture 5
26
What’s next?
  • More practice with binary and hexadecimal notations.
  • Bits, bytes, words and double words.
  • Pointers to bytes and double words.
  • Pointers as variables. Null pointer.
  • We will rewrite our “arithmetical” project using pointers as variables.