• Fast 64-bit division by a constant on 32-bit ARMs

    Modern versions of gcc and clang do not optimize 64-bit division by a constant into a multiplication by its reciprocal. On 32-bit ARM processors, a library call is generated instead (e.g. __aeabi_uldivmod or __aeabi_ldivmod), which takes more than a hundred clock cycles to execute. Reducing a 64-bit division by a constant to a multiplication requires multiplying two 64-bit numbers to produce a 128-bit result, of which the lower half is discarded. Such a long multiplication is not available on 32-bit ARMs (the same is true for other 32-bit processors). However, all modern ARMs (i.e. ARMv7 and later) include a relatively fast 32x32 to 64-bit multiplication instruction (UMULL). The 64x64 multiplication can therefore be written using four 32x32 multiplications and a few additions.
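
    The decomposition described above can be sketched in portable C. This is a minimal illustration, not necessarily the code from the full post: the names mulhi_u64 and div10_u64 are hypothetical, and the divisor 10 with its reciprocal constant 0xCCCCCCCCCCCCCCCD and shift of 3 is just one well-known example. Each 32x32 product below is expected to compile to a single UMULL (or UMLAL) on a 32-bit ARM, although that depends on the compiler and optimization level.

```c
#include <stdint.h>

/* Upper 64 bits of the 128-bit product a * b, computed with four
   32x32 -> 64-bit multiplications and a few additions. */
static uint64_t mulhi_u64(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

    uint64_t lo_lo = (uint64_t)a_lo * b_lo;
    uint64_t hi_lo = (uint64_t)a_hi * b_lo;
    uint64_t lo_hi = (uint64_t)a_lo * b_hi;
    uint64_t hi_hi = (uint64_t)a_hi * b_hi;

    /* Sum the middle column; only its carry into the upper half matters.
       Three 32-bit terms cannot overflow a 64-bit accumulator. */
    uint64_t mid = (lo_lo >> 32) + (uint32_t)hi_lo + (uint32_t)lo_hi;

    return hi_hi + (hi_lo >> 32) + (lo_hi >> 32) + (mid >> 32);
}

/* Example: unsigned 64-bit division by the constant 10.
   0xCCCCCCCCCCCCCCCD is ceil(2^67 / 10), so n / 10 equals the upper
   half of n * 0xCCCCCCCCCCCCCCCD shifted right by 3. */
static uint64_t div10_u64(uint64_t n)
{
    return mulhi_u64(n, 0xCCCCCCCCCCCCCCCDULL) >> 3;
}
```

    The lower 64 bits of the product are never formed; only the carry out of the middle column is kept, which is why four multiplications and a handful of additions are enough.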

    ...
  • Defeating Kernel Modules Version Check

    For various reasons, it may be necessary to compile modules for an existing binary kernel. Your module may then be rejected by the kernel with a "disagrees about version of symbol" error like:

    [root@localhost /mnt]# insmod dm9000.ko
    dm9000: disagrees about version of symbol platform_driver_register
    dm9000: Unknown symbol platform_driver_register (err -22)
    
    ...
  • Classic!

    It’s time to start learning C HTML, no warranties of any kind!

    ...