the notes of ZIP workshop 1-3
introduction
this entry follows http://d.hatena.ne.jp/takahirox/20110628/1309267617
- the blog of the host. you can get handout.
- atnd
what i did
i implemented CRC 32 calculation logic into archive_zip i made at last entry.
the summary of CRC32
- it is the error detecting code
- not perfect logic
- the target is limited like inverted several bits due to network error
- it can't detect an intentional alteration
- fast
- not perfect logic
- the key point of the calculation
- handling data as numerical number
- the CRC32 value is the reminder of division by certain constant
- using modulo2 because division is heavy
- modulo2 ignores carry bits
- modulo2 calculation is the same as XOR. so modulo2 division is much lighter than original division
- the flow is
- first, pretreating a dividend(original data)
- second, doing modulo2 division
- finally, aftertreating remainder. and identifying it as CRC32 value
- we can gather the part of the calculating process due to the commutative low of XOR
- in most case, first of all calculating remainder of all 8bits patterns and save them into table
modulo2 division example 10011 --------- 1011) 10101001 1011 <-- key point -------- doing calculation 1100 if the highest bits are both 1 1011 even divisor is greater than dividend. ----- 1111 1011 ---- 100 <-- remainder
you can see the detail at the host blog. http://d.hatena.ne.jp/n7shi/20110529
source code
you can see the source code implemented into archive_zip.c at google code.
uint32_t table[ 256 ] ; void init_table( ) { int i, j ; for( i = 0; i < 256; i++ ) { uint32_t a = i ; for( j = 0; j <= 7; j++ ) { uint32_t old_a = a ; a >>= 1 ; if( old_a & 1 ) a ^= 0xedb88320 ; } table[i] = a ; } } uint32_t crc32( FILE *fp ) { uint32_t a = ~0 ; int c ; fseek( fp, 0, SEEK_SET ) ; while( ( c = getc( fp ) ) != EOF ) { a ^= ( uint8_t ) c ; a = ( a >> 8 ) ^ table[ a & 0xff ] ; } return ~a ; } ... snip ... zip_headers[ i ].crc32 = crc32( fp ) ;
it is the almost same as the sample code :)
i just edited to use File *fp on crc32( ).
others
it was so boring, that's because i just copied it from beautiful sample code :), that i translated it into python.
class Homu: table = [0] * 256 def init_table( self ): for i in xrange( 256 ): a = i for j in xrange( 8 ): old_a = a a >>= 1 if old_a & 1: a ^= 0xedb88320 self.table[i] = a return def get_crc32( self, str ): a = 0xffffffff for s in str.encode( 'utf-8' ): a = a ^ ord( s ) & 0xffffffff a = ( a >> 8 ) ^ self.table[ a & 0xff ] return a ^ 0xffffffff
and i made the bot @crc32hoge replies CRC32 value with python and GAE.
i'm going to introduce it another entry in detail.