The IDEA cipher uses multiplication modulo 2^16 + 1. Is there an algorithm to perform this operation without general modulo operator (only modulo 2^16 (truncation))? In the context of IDEA, zero is interpreted as 2^16 (it means zero isn't an argument of our multiplication and it cannot be the result, so we can save one bit and store value 2^16 as bit pattern 0000000000000000). I am wondering how to implement it efficiently (or whether it is possible at all) without using the standard modulo operator.
Enter the Modulo Converting everyday terms to math, an “even number” is one where it's “0 mod 2” — that is, it has a remainder of 0 when divided by 2. An odd number is “1 mod 2” (has remainder 1).
You can utilize the fact, that (N-1) % N == -1.
Thus, (65536 * a) % 65537 == -a % 65537.
Also, -a % 65537 == -a + 1 (mod 65536), when 0 is interpreted as 65536
uint16_t fastmod65537(uint16_t a, uint16_t b)
{
    uint32_t c;
    uint16_t hi, lo;
    if (a == 0)
        return -b + 1;
    if (b == 0)
        return -a + 1;
    c = (uint32_t)a * (uint32_t)b;
    hi = c >> 16;
    lo = c;
    if (lo > hi)
        return lo-hi;
    return lo-hi+1;
}
The only problem here is if hi == lo, the result would be 0. Luckily a test suite confirms, that it actually can't be...
int main()
{
    uint64_t a, b;
    for (a = 1; a <= 65536; a++)
       for (b = 1; b <= 65536; b++)
        { 
            uint64_t c = a*b;
            uint32_t d = (c % 65537) & 65535;
            uint32_t e = m(a & 65535, b & 65535);
            if (d != e)
                printf("a * b % 65537 != m(%d, %d) real=%d m()=%d\n",
                       (uint32_t)a, (uint32_t)b, d, e);
        }
    }
Output: none
First, the case where either a or b is zero. In that case, it is interpreted as having the value 2^16, therefore elementary modulo arithmetic tells us that:
result = -a - b + 1;
, because (in the context of IDEA) the multiplicative inverse of 2^16 is still 2^16, and its lowest 16 bits are all zeroes.
The general case is much easier than it seems, now that we took care of the "0" special case (2^16+1 is 0x10001):
/* This operation can overflow: */
unsigned result = (product & 0xFFFF) - (product >> 16);
/* ..so account for cases of overflow: */
result -= result >> 16;
Putting it together:
/* All types must be sufficiently wide unsigned, e.g. uint32_t: */
unsigned long long product = a * b;
if (product == 0) {
    return -a - b + 1;
} else {
    result = (product & 0xFFFF) - (product >> 16);
    result -= result >> 16;
    return result & 0xFFFF;
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With