I am looking for a bit-wise test equivalent to (num%2) == 0 || (num%3) == 0.
I can replace num%2 with num&1, but I'm still stuck with num%3 and with the logical-or.
This expression is also equivalent to (num%2)*(num%3) == 0, but I'm not sure how that helps.
Divisibility rule for 3 states that a number is completely divisible by 3 if the sum of its digits is divisible by 3. Consider a number, 308. To check whether 308 is divisible by 3 or not, take sum of the digits (i.e. 3+0+8= 11). Now check whether the sum is divisible by 3 or not.
A number is divisible by another number if it can be divided equally by that number; that is, if it yields a whole number when divided by that number. For example, 6 is divisible by 3 (we say "3 divides 6") because 6/3 = 2, and 2 is a whole number.
An integer n can be divided by 2: floor(log(n)/log(2)) times.
The divisibility rule for 2 states that any number with the last digit of 0, 2, 4, 6, or 8 will be divisible by 2. Simply put, any even number (numbers that end in 0, 2, 4, 6, or 8) is divisible by 2. If the number is not an even number, it is not divisible by two.
Yes, though it's not very pretty, you can do something analogous to the old "sum all the decimal digits until you have only one left" trick to test if a number is divisible by 9, except in binary and with divisibility by 3. You can use the same principle for other numbers as well, but many combinations of base/divisor introduce annoying scaling factors so you're not just summing digits anymore.
Anyway, 16n-1 is divisible by 3, so you can use radix 16, that is, sum the nibbles. Then you're left with one nibble (well, 5 bits really), and you can just look that up. So for example in C# (slightly tested) edit: brute-force tested, definitely works
static bool IsMultipleOf3(uint x)
{
    const uint lookuptable = 0x49249249;
    uint t = (x & 0x0F0F0F0F) + ((x & 0xF0F0F0F0) >> 4);
    t = (t & 0x00FF00FF) + ((t & 0xFF00FF00) >> 8);
    t = (t & 0x000000FF) + ((t & 0x00FF0000) >> 16);
    t = (t & 0xF) + ((t & 0xF0) >> 4);
    return ((lookuptable >> (int)t) & 1) != 0;
}
The trick from my comment, x * 0xaaaaaaab <= 0x55555555, works through a modular multiplicative inverse trick. 0xaaaaaaab * 3 = 1 mod 232, which means that 0xaaaaaaab * x = x / 3 if and only ifx % 3 = 0. "if" because 0xaaaaaaab * 3 * y = y (because 1 * y = y), so if x is of the form3 * y then it will map back to y. "only if" because no two inputs are mapped to the same output, so everything not divisible by 3 will map to something higher than the highest thing you can get by dividing anything by 3 (which is 0xFFFFFFFF / 3 = 0x55555555).
You can read more about this (including the more general form, which includes a rotation) in Division by Invariant Integers using Multiplication (T. Granlund and P. L. Montgomery).
You compiler may not know this trick. For example this:
uint32_t foo(uint32_t x)
{
    return x % 3 == 0;
}
Becomes, on Clang 3.4.1 for x64,
movl    %edi, %eax
movl    $2863311531, %ecx       # imm = 0xAAAAAAAB
imulq   %rax, %rcx
shrq    $33, %rcx
leal    (%rcx,%rcx,2), %eax
cmpl    %eax, %edi
sete    %al
movzbl  %al, %eax
ret
G++ 4.8:
mov eax, edi
mov edx, -1431655765
mul edx
shr edx
lea eax, [rdx+rdx*2]
cmp edi, eax
sete    al
movzx   eax, al
ret
What it should be:
imul eax, edi, 0xaaaaaaab
cmp eax, 0x55555555
setbe al
movzx eax, al
ret
I guess I'm a bit late to this party, but here's a slightly faster (and slightly prettier) solution than the one from harold:
bool is_multiple_of_3(std::uint32_t i)
{
    i = (i & 0x0000FFFF) + (i >> 16);
    i = (i & 0x00FF) + (i >> 8);
    i = (i & 0x0F) + (i >> 4);
    i = (i & 0x3) + (i >> 2);
    const std::uint32_t lookuptable = 0x49249249;
    return ((lookuptable >> i) & 1) != 0;
}
It's C++11, but that doesn't really matter for this piece of code. It's also brute-force tested for 32-bit unsigned ints. It saves you at least one bit-fiddling op for each of the first four steps. It also scales beautifully to 64 bits - only one additional step needed at the beginning.
The last two lines are obviously and shamelessly taken from harold's solution (nice one, I wouldn't have done that so elegantly).
Possible further optimizations:
& ops in the first two steps will be optimized away by just using the lower-half registers on architectures that have them (x86, for example).60, and from the fourth step it's 15 (when the function argument is 0xFFFFFFFF). Given that, we can eliminate the fourth step, use a 64-bit lookuptable and shift directly into that following the third step. This turns out to be a bad idea for Visual C++ 2013 in 32-bit mode, as the right shift turns into a non-inline call to code that does a lot of tests and jumps. However, it should be a good idea if 64-bit registers are natively available.75 and 21 respectively, which means we can no longer eliminate the last step.The first four steps are based on the fact that a 32-bit number can be written as
(high 16 bits) * 65536 + (low 16 bits) = 
(high 16 bits) * 65535 + (high 16 bits) + (low 16 bits) = 
(high 16 bits) * 21845 * 3 + ((high 16 bits) + (low 16 bits))
So the whole thing is divisible by 3 if and only if the right parenthesis is divisible by 3. And so on, as this holds for 256 = 85 * 3 + 1, 16 = 5 * 3 + 1, and 4 = 3 + 1. (Of course, this is generally true for even powers of two; odd powers are one less than the nearest multiple of 3.)
The numbers that are input into the following steps will be larger than 16-bit, 8-bit, and 4-bit respectively in some cases, but that's not a problem, as we're not dropping any high-order bits when shifting right.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With