notes from PE workshop#4

introduction

i attended PE workshop#4. i started implementing a compiler of sorts there, but i couldn't finish it during the workshop.

after coming home, i finished it and confirmed that it works, so i'll introduce it here.

links

what i did

i made a compiler of sorts that generates a PE binary from code written in my own tiny language.

my own language

you can use only one variable.

setting the variable to 1.

let 1

adding 1 to the variable.

add 1

subtracting 1 from the variable.

sub 1

after running the generated binary, you can get the result (it is returned as the exit status) with "echo $?".
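as a rough sketch, the three operations above can be written as a tiny reference interpreter in C, which is handy for predicting what "echo $?" should print. the name eval_program is hypothetical and not part of the compiler. it also assumes that the variable starts at 0; the generated code in the examples does not initialize eax itself, so this is only an assumption.

```c
#include <stdio.h>
#include <string.h>

/* Evaluates a program given as text, one "op operand" pair per line.
 * Returns 0 and stores the final value in *result, or returns the
 * 1-based number of the first pair whose operation is unknown.
 * Parsing stops at the end of the input or at the first pair that
 * does not scan as "word number". */
int eval_program(const char *src, int *result)
{
    char op[16];
    int operand;
    int consumed;
    int line_num = 0;
    int value = 0;              /* assumption: the variable starts at 0 */

    while (sscanf(src, " %15s %d%n", op, &operand, &consumed) == 2) {
        line_num++;
        if (strcmp(op, "let") == 0) {
            value = operand;    /* let N: overwrite the variable */
        } else if (strcmp(op, "add") == 0) {
            value += operand;   /* add N */
        } else if (strcmp(op, "sub") == 0) {
            value -= operand;   /* sub N */
        } else {
            return line_num;    /* unknown operation */
        }
        src += consumed;        /* move on to the next pair */
    }

    *result = value;
    return 0;
}
```

calling eval_program with the text of code.txt gives the value that the compiled binary is expected to return, under the assumption above.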

example

% cat code.txt
add 1
add 2
add 3

% ./compile ./code.txt ./hoge

% ./hoge

% echo $?
6

% objdump.exe -d -M intel ./hoge

./hoge:     file format pei-i386


Disassembly of section .text:

00401000 <.text>:
  401000:	05 01 00 00 00       	add    eax,0x1
  401005:	05 02 00 00 00       	add    eax,0x2
  40100a:	05 03 00 00 00       	add    eax,0x3
  40100f:	c3                   	ret    

% cat code.txt
add 1
add 2
add 3
let 100
sub 10
add 5

% ./compile ./code.txt ./hoge

% ./hoge

% echo $?
95

% objdump.exe -d -M intel ./hoge

./hoge:     file format pei-i386


Disassembly of section .text:

00401000 <.text>:
  401000:	05 01 00 00 00       	add    eax,0x1
  401005:	05 02 00 00 00       	add    eax,0x2
  40100a:	05 03 00 00 00       	add    eax,0x3
  40100f:	b8 64 00 00 00       	mov    eax,0x64
  401014:	2d 0a 00 00 00       	sub    eax,0xa
  401019:	05 05 00 00 00       	add    eax,0x5
  40101e:	c3                   	ret    
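
the listings above show that every instruction is one opcode byte followed by a 32-bit little-endian immediate, and that the code ends with a single ret (0xc3). a minimal sketch of a decoder for exactly this byte layout could look like the following. the function names are hypothetical, and it only knows the three opcodes that appear in the listings.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Returns the mnemonic for one of the opcodes seen in the listings, or NULL. */
const char *mnemonic(uint8_t opcode)
{
    switch (opcode) {
    case 0x05: return "add";   /* add eax, imm32 */
    case 0xb8: return "mov";   /* mov eax, imm32 */
    case 0x2d: return "sub";   /* sub eax, imm32 */
    default:   return NULL;
    }
}

/* Reads a 32-bit little-endian value from p[0..3]. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Prints one line per instruction in code[0..len) and returns the number
 * of bytes decoded. Decoding stops after ret, at an unknown opcode, or
 * when fewer than 5 bytes are left for an instruction. */
size_t dump_code(const uint8_t *code, size_t len, FILE *out)
{
    size_t pos = 0;

    while (pos < len) {
        const char *name;

        if (code[pos] == 0xc3) {
            fprintf(out, "ret\n");
            return pos + 1;
        }
        name = mnemonic(code[pos]);
        if (name == NULL || len - pos < 5) {
            break;
        }
        fprintf(out, "%s eax,0x%x\n", name, (unsigned)read_le32(code + pos + 1));
        pos += 5;
    }
    return pos;
}
```

for the bytes b8 64 00 00 00 c3, dump_code prints "mov eax,0x64" and then "ret".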

source code

you can get the source code from Google Code.


the machine code is generated only in the part below. i got the opcodes by disassembling, as we did in the workshop.

  // output .text: each source line becomes one 5-byte instruction
  line_num = 0 ;

  while( fscanf( fp, "%s %d\n", op, &opr1 ) != EOF ) {

    line_num++ ;
    match = 0 ;

    // emit the one-byte opcode
    if( strcmp( op, "add" ) == 0 ) {
      putc( 0x05, out ) ;  // add eax, imm32
      match = 1 ;
    } else if( strcmp( op, "let" ) == 0 ) {
      putc( 0xb8, out ) ;  // mov eax, imm32
      match = 1 ;
    } else if( strcmp( op, "sub" ) == 0 ) {
      putc( 0x2d, out ) ;  // sub eax, imm32
      match = 1 ;
    }

    if( match ) {
      // emit the operand as a 32-bit little-endian immediate
      putc( opr1 & 0x000000ff, out ) ;
      putc( ( opr1 & 0x0000ff00 ) >> 8, out ) ;
      putc( ( opr1 & 0x00ff0000 ) >> 16, out ) ;
      putc( ( opr1 & 0xff000000 ) >> 24, out ) ;
    } else {
      // unknown operation: report the line and give up
      printf( "error! can't understand %s at %d\n", op, line_num ) ;
      fclose( out ) ;
      fclose( fp ) ;
      return -1 ;
    }

  }

  // ret: the value left in eax becomes the result
  putc( 0xC3, out ) ;

the rest of the code just fills in the header structures, writes them out and inserts padding.

i know it's very simple, but it took me into the world of compilers :)

note that i used magic numbers for the padding sizes and so on, so it probably won't work correctly if your code is large.
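
one way to avoid magic numbers for padding is to compute it from the alignment values. the following is a minimal sketch, not what the compiler does. the PE optional header has FileAlignment and SectionAlignment fields; the values 0x200 and 0x1000 used in the note below are common ones and are only an assumption here, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Rounds size up to the next multiple of alignment.
 * alignment must be a power of two (e.g. 0x200 or 0x1000). */
uint32_t align_up(uint32_t size, uint32_t alignment)
{
    return (size + alignment - 1) & ~(alignment - 1);
}

/* Number of padding bytes to write after size bytes so that the
 * next byte lands on an alignment boundary. */
uint32_t padding_for(uint32_t size, uint32_t alignment)
{
    return align_up(size, alignment) - size;
}
```

for the 31 bytes of code in the second example and an assumed FileAlignment of 0x200, padding_for(31, 0x200) gives 481, so the raw .text data would occupy 0x200 bytes in the file.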

trouble i had

i ran into trouble while implementing it. the compiler generated a PE file, but the file wouldn't run and failed with a message like "Bad number".

i couldn't find the cause by myself, so i dumped the headers of my PE file and of another PE file generated by gcc with "analysis", a tool i made before, and compared the two dumps with "diff".

like this:

% gcc -o fuga fuga.c
% ./analysis hoge > hoge.txt
% ./analysis fuga > fuga.txt
% diff hoge.txt fuga.txt

... snip ...

< MajorImageVersion         : 0005 <0000C4> 
< MinorImageVersion         : 0001 <0000C6> 
< MajorSubsystemVersion     : 0000 <0000C8> 
< MinorSubsystemVersion     : 0000 <0000CA> 
---
> MajorImageVersion         : 0000 <0000C4> 
> MinorImageVersion         : 0000 <0000C6> 
> MajorSubsystemVersion     : 0005 <0000C8> 
> MinorSubsystemVersion     : 0001 <0000CA> 

... snip ...

and i finally found that i had swapped the values of Major/MinorImageVersion and Major/MinorSubsystemVersion. the "analysis" tool helped me a lot.
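
to make this kind of mix-up harder, the four fields can be written through named offsets. this is a minimal sketch with hypothetical names. the offsets are those of the standard PE32 optional header; the file offsets in the dump above (0xC4 to 0xCA) match them if the PE signature is at file offset 0x80, which is an assumption inferred from the dump.

```c
#include <stdint.h>
#include <stddef.h>

/* Byte offsets from the start of the PE32 optional header. */
enum {
    OFF_MAJOR_IMAGE_VERSION     = 44,
    OFF_MINOR_IMAGE_VERSION     = 46,
    OFF_MAJOR_SUBSYSTEM_VERSION = 48,
    OFF_MINOR_SUBSYSTEM_VERSION = 50
};

/* File offset of the optional header: it follows the 4-byte "PE\0\0"
 * signature and the 20-byte file header. */
size_t optional_header_offset(size_t pe_signature_offset)
{
    return pe_signature_offset + 4 + 20;
}

/* Stores a 16-bit value in little-endian byte order. */
void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

/* Writes the image and subsystem versions into an optional header
 * buffer that is at least 52 bytes long. */
void write_versions(uint8_t *opt,
                    uint16_t image_major, uint16_t image_minor,
                    uint16_t subsystem_major, uint16_t subsystem_minor)
{
    put_le16(opt + OFF_MAJOR_IMAGE_VERSION, image_major);
    put_le16(opt + OFF_MINOR_IMAGE_VERSION, image_minor);
    put_le16(opt + OFF_MAJOR_SUBSYSTEM_VERSION, subsystem_major);
    put_le16(opt + OFF_MINOR_SUBSYSTEM_VERSION, subsystem_minor);
}
```

write_versions(opt, 0, 0, 5, 1) reproduces the values shown for the gcc-generated file in the diff.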

conclusion

the compiler only needs a .text section because it keeps the variable in eax. i'm going to extend it, for example to use .data and .idata.