i have a code that creates primes turns them in to Mersenne numbers and then prime checks it again to see if it is a Mersenne prime. it is working fine apart form that fact that it stops at 2^31 - 1... (2147483647)
i was using a function that produces infinite numbers:
def infinity():
i = 0
while True:
i += 1
yield i
and then changed it to a while True loop with a i += 1 at the end but it still does not work.
def isPrime(i):
isprime = True
for j in range(2, int(math.sqrt(i) + 1)):
if i % j == 0:
return False
if isprime and i!=1:
return True
i=1
while True:
isprime = True
for j in range(2, int(math.sqrt(i) + 1)):
if i % j == 0:
isprime = False
break
if isprime and i!=1:
test = (2**i)-1
result = isPrime(test)
if result:
print (test)
i+=1
Your algorithm looks good and working to me. I decided to make some improvements though, and also implemented faster algorithms of finding Mersenne Primes. Just for reference, all world-known Mersenne Primes are listed here.
In all next steps first goes description, then code, and after code you can see console output with running times.
Because in my answer I have a lot of code snippets and many are large, you can use nice Chrome extension (Clipboardy) (described how to install in this answer) which allows to easily copy to clipboard StackOverflow code snippets.
Step 1. Algorithm Version 0, Python-only.
First, I improved your algorithm a bit, also a bit beautified your code. One easy improvement is that when checking is_prime you can check only odd divisors, this will double the speed. This is what I got:
Try it online!
import math, time
def print_result(exp, *, tb = time.time()):
print(exp, 'at', round(time.time() - tb, 3), 'secs', flush = True, end = ', ')
def mersennes_v0():
def is_prime(n):
if n <= 2:
return n == 2
if n & 1 == 0:
return False
for j in range(3, math.floor(math.sqrt(n + 0.1)) + 1, 2):
if n % j == 0:
return False
return True
for i in range(2, 63):
if is_prime(i) and is_prime((1 << i) - 1):
print_result(i)
mersennes_v0()
Output:
2 at 0.0 secs, 3 at 0.0 secs, 5 at 0.001 secs, 7 at 0.001 secs,
13 at 0.001 secs, 17 at 0.001 secs, 19 at 0.002 secs, 31 at 0.009 secs,
61 at 243.5412 secs,
Step 2. Algorithm Version 0, plus Numba.
Then without change of algorithm I made very simple improvement, I used Numba JIT code optimizer/compiler! To use it in your program you need to install PIP module one time through python -m pip install numba.
In code above I just did 3 little changes to use Numba - 1) did import numba 2) added function decorator @numba.njit 3) add line with numba.objmode():.
Using Numba improved AlgoV0 running time from 243 seconds to 15 seconds i.e. boosted speed 16x times !!!
Important note - unlike regular Python Numba doesn't support big integer arithmetic, it means only primes fitting 64-bit integer can be tested!
Try it online!
import math, time
import numba
def print_result(exp, *, tb = time.time()):
print(exp, 'at', round(time.time() - tb, 3), 'secs', flush = True, end = ', ')
@numba.njit
def mersennes_v0():
def is_prime(n):
if n <= 2:
return n == 2
if n & 1 == 0:
return False
for j in range(3, math.floor(math.sqrt(n + 0.1)) + 1, 2):
if n % j == 0:
return False
return True
for i in range(2, 63):
if is_prime(i) and is_prime((1 << i) - 1):
with numba.objmode():
print_result(i)
mersennes_v0()
Output:
2 at 1.257 secs, 3 at 1.257 secs, 5 at 1.257 secs, 7 at 1.257 secs,
13 at 1.257 secs, 17 at 1.257 secs, 19 at 1.258 secs, 31 at 1.258 secs,
61 at 15.052 secs,
Step 3. Algorithm Version 1, Python-only.
Then I decided to implement Lucas Lehmer Primality Test, which will solve same task much faster. As it is totally different algorithm I called it Version-1 algorithm.
Numba can't help us much here, because it doesn't have long arithmetics, so I didn't use it, I used built-in default Python long arithmetics. Python default long arithmetics implementation is good enough at speed to be used as it is. Also main running time of next code is within low level C-based functions of long arithmetics, so next code is very efficient variant of Lucas Lehmer Test.
Notice that Lucas Lehmer Test is valid for exponents starting from 3, so exponent 2 is not shown in console output.
So next code uses just raw Python with standard modules, and doesn't need installing any PIP modules.
Try it online!
import math, time
def print_result(exp, *, tb = time.time()):
print(exp, 'at', round(time.time() - tb, 3), 'secs', flush = True, end = ', ')
def mersennes_v1():
def is_prime(n):
if n <= 2:
return n == 2
if n & 1 == 0:
return False
for j in range(3, math.floor(math.sqrt(n + 0.1)) + 1, 2):
if n % j == 0:
return False
return True
def lucas_lehmer_test(p):
# Lucas Lehmer Test https://en.wikipedia.org/wiki/Lucas%E2%80%93Lehmer_primality_test
mask = (1 << p) - 1
def mer_rem(x, bits):
# Below is same as: return x % mask
while True:
r = 0
while x != 0:
r += x & mask
x >>= bits
if r == mask:
r = 0
if r < mask:
return r
x = r
s = 4
for k in range(2, p):
s = mer_rem(s * s - 2, p)
return s == 0
for p in range(3, 1 << 30):
if is_prime(p) and lucas_lehmer_test(p):
print_result(p)
mersennes_v1()
Output:
3 at 0.0 secs, 5 at 0.001 secs, 7 at 0.001 secs, 13 at 0.001 secs,
17 at 0.001 secs, 19 at 0.001 secs, 31 at 0.002 secs, 61 at 0.003 secs,
89 at 0.004 secs, 107 at 0.005 secs, 127 at 0.006 secs, 521 at 0.062 secs,
607 at 0.087 secs, 1279 at 0.504 secs, 2203 at 2.338 secs, 2281 at 2.634 secs,
3217 at 7.819 secs, 4253 at 20.578 secs, 4423 at 23.196 secs,
9689 at 386.877 secs, 9941 at 419.243 secs, 11213 at 590.477 secs,
Step 4. Algorithm Version 1, plus GMPY2.
Next possible improvement of algorithm above is to use GMPY2 Python library. It is highly-optimized library of long arithmetics, considerably faster than regular Python implementation of long arithmetics.
But GMPY2 can be quite challenging to install. Most recent precompiled versions for Windows are available by this link. Download .whl file for your version of Python and install in through pip, e.g. for my Windows 64-bit Python 3.7 I downloaded and installed pip install gmpy2-2.0.8-cp37-cp37m-win_amd64.whl. For Linux it is easiest to install through sudo apt install -y python3-gmpy2.
To use gmpy2 in code from previous Step, I added line import gmpy2 and used gmpy2.mpz(...) in several places.
Usage of gmpy2 compared to Python-only version improved speed by around 5.2x times for exponent 11213, and for larger exponents this speed gain will be even more, probably due to use of Fast Fourier Transform multiplication algorithms inside GMPY2 library.
Try it online!
import math, time
import gmpy2
def print_result(exp, *, tb = time.time()):
print(exp, 'at', round(time.time() - tb, 3), 'secs', flush = True, end = ', ')
def mersennes_v1():
def is_prime(n):
if n <= 2:
return n == 2
if n & 1 == 0:
return False
for j in range(3, math.floor(math.sqrt(n + 0.1)) + 1, 2):
if n % j == 0:
return False
return True
def lucas_lehmer_test(p):
# Lucas Lehmer Test https://en.wikipedia.org/wiki/Lucas%E2%80%93Lehmer_primality_test
mask = gmpy2.mpz((1 << p) - 1)
def mer_rem(x, bits):
# Below is same as: return x % mask
while True:
r = gmpy2.mpz(0)
while x != 0:
r += x & mask
x >>= bits
if r == mask:
r = gmpy2.mpz(0)
if r < mask:
return r
x = r
s = 4
s, p = gmpy2.mpz(s), gmpy2.mpz(p)
for k in range(2, p):
s = mer_rem(s * s - 2, p)
return s == 0
for p in range(3, 1 << 30):
if is_prime(p) and lucas_lehmer_test(p):
print_result(p)
mersennes_v1()
Output:
3 at 0.0 secs, 5 at 0.0 secs, 7 at 0.001 secs, 13 at 0.001 secs,
17 at 0.001 secs, 19 at 0.001 secs, 31 at 0.002 secs, 61 at 0.002 secs,
89 at 0.004 secs, 107 at 0.005 secs, 127 at 0.006 secs, 521 at 0.055 secs,
607 at 0.073 secs, 1279 at 0.315 secs, 2203 at 1.055 secs, 2281 at 1.16 secs,
3217 at 2.637 secs, 4253 at 5.674 secs, 4423 at 6.375 secs,
9689 at 72.535 secs, 9941 at 79.605 secs, 11213 at 114.375 secs,
19937 at 796.705 secs, 21701 at 1080.045 secs, 23209 at 1421.392 secs,
Step 5. Algorithm Version 1, and GMPY2 + Multi-Core
One more obvious improvement over previous step would be to use multi-core CPU capabilities, i.e. to parallelize Lucas-Lehmer Test on all available CPU cores.
For that we use standard Python's multiprocessing module.
On my 4-core CPU speed boost for exponent 23209 is 2.65x times compared to single-core version.
Try it online!
import math, time, multiprocessing
import gmpy2
def print_result(exp, *, tb = time.time()):
print(exp, 'at', round(time.time() - tb, 3), 'secs', flush = True, end = ', ')
def lucas_lehmer_test(p):
# Lucas Lehmer Test https://en.wikipedia.org/wiki/Lucas%E2%80%93Lehmer_primality_test
def num(n):
return gmpy2.mpz(n)
mask = num((1 << p) - 1)
def mer_rem(x, bits):
# Below is same as: return x % mask
while True:
r = num(0)
while x != 0:
r += x & mask
x >>= bits
if r == mask:
r = num(0)
if r < mask:
return r
x = r
s, p, two = num(4), num(p), num(2)
for k in range(2, p):
s = mer_rem(s * s - two, p)
return p, s == 0
def mersennes_v1():
def is_prime(n):
if n <= 2:
return n == 2
if n & 1 == 0:
return False
for j in range(3, math.floor(math.sqrt(n + 0.1)) + 1, 2):
if n % j == 0:
return False
return True
print('Num Cores Used:', multiprocessing.cpu_count())
with multiprocessing.Pool(multiprocessing.cpu_count()) as pool:
for p, _ in filter(lambda e: e[1], pool.imap(lucas_lehmer_test, filter(is_prime, range(3, 1 << 30)))):
print_result(p)
if __name__ == '__main__':
mersennes_v1()
Output:
3 at 0.746 secs, 5 at 0.747 secs, 7 at 0.747 secs, 13 at 0.747 secs,
17 at 0.747 secs, 19 at 0.748 secs, 31 at 0.748 secs, 61 at 0.748 secs,
89 at 0.749 secs, 107 at 0.749 secs, 127 at 0.75 secs, 521 at 0.782 secs,
607 at 0.79 secs, 1279 at 0.876 secs, 2203 at 1.09 secs, 2281 at 1.122 secs,
3217 at 1.529 secs, 4253 at 2.329 secs, 4423 at 2.529 secs,
9689 at 25.208 secs, 9941 at 29.454 secs, 11213 at 50.498 secs,
19937 at 337.926 secs, 21701 at 441.264 secs, 23209 at 537.84 secs,
Step 6. Advanced Code, with Trial Division Test
Providing next code just for reference, it is much more complex than previous step. Main difference is added Trial Division Test (to existing Lucas-Lehmer Test).
To use code you need to pip install numpy colorama, also gmpy2 needs to be installed as described in previous steps. Also unlike previous steps this one needs at least Python 3.8 to work (due to shared memory capability).
At the beginning of mersennes_v2() function you may setup some configurational variables. One important variable there is prime_bits, it should be set to as much as possible for your computer, but not more than 32, this variable controls size of primes table, 32 meaning all 32-bit primes are generated. This prime table is needed for trial division test. If you generate more then is higher chance of this test success hence more time saved instead of using heavy Lucas-Lehmer. If you have not much memory then set prime_bits to small value, but not less than 24, for 24 only 5MB of memory will be used, while for 32 will be used around 650MB.
Another added things are 1) generation of primes table using Sieve of Eratosthenes 2) usage of shared memory 3) fancy console output with statistics.
Timings provided at the end are not precise (don't compare them to previous steps) because my CPU today is overheated and gives twice less speed.
In console output tboost signifies amount of boost given by extra trial division test, i.e. how much on average less time is spent with using both TrialDivision+LucasLehmer compared to just LucasLehmer. If tboost value is bigger than 1.0x then it means that extra TrialDivision was benefitial, otherwise (if less than 1.0) it means slowdown, which can happen on some systems then code needs extra tweaking. In console tbits:X signifies that trial division tests are done for all prime numbers below 2^X.
Try it online!
import math, time, multiprocessing, multiprocessing.pool, multiprocessing.shared_memory, sys, os, faulthandler, traceback, secrets
faulthandler.enable()
import numpy as np, gmpy2, colorama
colorama.init()
def is_mersenne(args, *, data = {'tf': 0}):
try:
def lucas_lehmer_test(exp):
# Lucas Lehmer Test https://en.wikipedia.org/wiki/Lucas%E2%80%93Lehmer_primality_test
def num(n):
return gmpy2.mpz(n)
mask = num((1 << exp) - 1)
def mer_rem(x, bits):
# Below is same as: return x % mask
while True:
r = num(0)
while x != 0:
r += x & mask
x >>= bits
if r == mask:
r = num(0)
if r < mask:
return r
x = r
tb = time.time()
s, exp, two = num(4), num(exp), num(2)
for k in range(2, exp):
s = mer_rem(s * s - two, exp)
#print('ll', '-+'[s==0], '_', int(exp), '_', round(time.time() - tb, 3), sep = '', end = ' ', flush = True)
return s == 0
def trial_div_test(exp):
import numpy as np
if 'ptaba' not in data:
data['shp'] = multiprocessing.shared_memory.SharedMemory(args['all_primes_shname'])
data['ptaba'] = np.ndarray((args['all_primes_size'] // 4,), dtype = np.uint32, buffer = data['shp'].buf)
if exp <= 32:
return True, {'bits': 0}
bits = max(16, min(-19 + math.floor(math.log(exp) / math.log(1.25)), math.ceil(args['prime_bits'])))
if ('ptabb', bits) not in data:
cnt = min(np.searchsorted(data['ptaba'], 1 << bits), len(data['ptaba']))
data[('ptabb', bits)] = np.ndarray((cnt,), dtype = np.uint32, buffer = data['shp'].buf)
sptab = data[('ptabb', bits)]
tb, bl, probably_prime = time.time(), 1 << 13, True
spows = np.empty((bl,), dtype = np.uint64)
for i in range(0, sptab.size, bl):
ptab = sptab[i : i + bl]
pows = spows if spows.size == ptab.size else spows[:ptab.size]
pows[...] = 2
for b in bin(exp)[3:]:
pows *= pows
if b == '1':
pows <<= 1
pows %= ptab
if np.count_nonzero(pows == 1) > 0:
probably_prime = False
break
#print('td', '-+'[probably_prime], '_', int(exp), '_', round(time.time() - tb, 3), sep = '', end = ' ', flush = True)
return probably_prime, {'bits': bits}
p, stats = args['p'], {'tt': time.time()}
r, tinfo = trial_div_test(p)
stats['tt'], stats['tr'], stats['tte'], stats['tinfo'] = time.time() - stats['tt'], r, time.time(), tinfo
if not r:
return p, r, stats
stats['lt'] = time.time()
r = lucas_lehmer_test(p)
stats['lt'], stats['lte'] = time.time() - stats['lt'], time.time()
return p, r, stats
except:
return None, True, '', traceback.format_exc()
def mersennes_v2():
prime_bits = 28
num_cores = multiprocessing.cpu_count()
console_width = os.get_terminal_size().columns
def gen_primes(stop, *, dtype = 'int64'):
# https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
import math, numpy as np
if stop < 2:
return np.zeros((0,), dtype = dtype)
primes = np.ones((stop >> 1,), dtype = np.uint8)
primes[0] = False
for p in range(3, math.floor(math.sqrt(stop)) + 1, 2):
if primes[p >> 1]:
primes[(p * p) >> 1 :: p] = False
return np.concatenate((np.array([2], dtype = dtype), (np.flatnonzero(primes).astype(dtype) << 1) + 1))
def RoundFix(x, n):
s = str(round(x, n))
return s + '0' * (n - len(s) + ('.' + s).rfind('.'))
prime_bits = max(24, min(prime_bits, math.log(math.sqrt((1 << 63) - 1)) / math.log(2)))
print(f'Generating {round(prime_bits, 2)}-bits primes...', end = ' ', flush = True)
tb = time.time()
all_primes = gen_primes(math.floor(2 ** prime_bits), dtype = 'uint32')
print(f'{round(time.time() - tb)} secs.', flush = True)
all_primes_size = len(all_primes.data) * all_primes.dtype.itemsize
all_primes_shname = secrets.token_hex(16).upper()
shp = multiprocessing.shared_memory.SharedMemory(
name = all_primes_shname, create = True, size = all_primes_size)
shp.buf[:all_primes_size] = all_primes.tobytes()
del all_primes
all_primes = np.ndarray((all_primes_size // 4,), dtype = np.uint32, buffer = shp.buf)
print('Using', num_cores, 'cores.', flush = True)
print('\n\n')
offx, tstart, tlast, ptimes = 0, time.time(), time.time(), []
tstats = {'tt': 0.0, 'tc': 0, 'tts': [], 'tta': 0.0, 'lt': 0.0, 'lc': 0, 'lts': [], 'lta': 0.0, 'st': [(0, 0, time.time())], 'tbits': []}
with multiprocessing.Pool(num_cores) as pool:
for e in pool.imap(is_mersenne, ({
'p': int(e), 'all_primes_size': all_primes_size, 'all_primes_shname': all_primes_shname, 'prime_bits': prime_bits,
} for e in all_primes)):
if e[0] is None:
pool.terminate()
print('!!!Exception!!!\n', e[3], sep = '', flush = True)
break
stats = e[2]
def fill(p):
tstats[f'{p}t'] += stats[f'{p}t']
tstats[f'{p}ts'] += [(stats[f'{p}t'], stats[f'{p}te'])]
while len(tstats[f'{p}ts']) > 20 and tstats[f'{p}ts'][-1][1] - tstats[f'{p}ts'][0][1] > 120:
tstats[f'{p}ts'] = tstats[f'{p}ts'][1:]
tstats[f'{p}c'] += 1
tstats[f'{p}ta'] = sum(e[0] for e in tstats[f'{p}ts']) / len(tstats[f'{p}ts'])
if p == 't':
tstats['st'] += [(stats['tt'] + stats.get('lt', 0), stats.get('lt', tstats['st'][-1][1]), stats.get('lte', stats['tte']))]
while len(tstats['st']) > 50 and tstats['st'][-1][2] - tstats['st'][0][2] > 300:
tstats['st'] = tstats['st'][1:]
tstats['sta'] = sum(e[1] for e in tstats['st']) / max(0.001, sum(e[0] for e in tstats['st']))
tstats['tbits'] = (tstats['tbits'] + [stats['tinfo']['bits']])[-20:]
tstats['tbitsa'] = sum(tstats['tbits']) / len(tstats['tbits'])
fill('t')
if 'lt' in stats:
fill('l')
if not e[1]:
s0 = f'{str(e[0]).rjust(6)}| trial:{RoundFix(stats["tt"], 3)}/lucas:' + (f'{RoundFix(stats["lt"], 3)}' if 'lt' in stats else '-' * len(RoundFix(tstats['lta'], 3))) + f' secs (avg t:{RoundFix(tstats["tta"], 3)}/l:{RoundFix(tstats["lta"], 3)}) '
s1 = f'{"".rjust(6)}| cnt t:{tstats["tc"]}({tstats["tc"] - tstats["lc"]})/l:{tstats["lc"]}, tboost:{RoundFix(tstats["sta"], 3)}x tbits:{RoundFix(tstats["tbitsa"], 2)} '
print('\033[2A' + s0[:console_width - 4] + '\n' + s1[:console_width - 4] + '\n', end = '', flush = True)
else:
s = str(e[0]).rjust(6) + ' at ' + str(round(time.time() - tstart)).rjust(6) + ' secs, '
if offx + len(s) <= console_width - 4:
print('\033[3A\033[' + str(offx) + 'C' + s + '\n\n\n', end = '', flush = True)
else:
print('\033[2A' + (' ' * (console_width - 4)) + '\n\033[1A' + s + '\n\n\n', end = '', flush = True)
offx = 0
offx += len(s)
tlast = time.time()
if __name__ == '__main__':
mersennes_v2()
Generating 28-bits primes... 4 secs.
Using 8 cores.
3 at 1 secs, 5 at 1 secs, 7 at 1 secs,
13 at 1 secs, 17 at 1 secs, 19 at 1 secs,
31 at 1 secs, 61 at 1 secs, 89 at 1 secs,
107 at 1 secs, 127 at 1 secs, 521 at 1 secs,
607 at 1 secs, 1279 at 1 secs, 2203 at 2 secs,
2281 at 2 secs, 3217 at 3 secs, 4253 at 4 secs,
4423 at 5 secs, 9689 at 57 secs, 9941 at 68 secs,
11213 at 106 secs, 19937 at 832 secs, 21701 at 1145 secs,
23209 at 1448 secs,
13327| trial:0.599/lucas:5.510 secs (avg t:0.290/l:3.592)
| cnt t:1582(490)/l:1092, tboost:1.348x tbits:23.00
Step 7. C++ Code.
Although the question is about Python, I implemented C++ solution.
Main benefit of C++ over Python is because same algorithm in pure C++ compared to pure Python is sometimes even 50x-200x faster, so much!
I used GMP library for implementing big integer arithmetic. Also made partial support of Boost Multiprecision library.
Try it online!
CODE HERE. Unfortunately my code together with post appears to be bigger than 30K symbols, which is more than StackOverflow limit of 30K per post. That's why I'm posting my code here as Github Gist (link below). Also, check out Try it online! link above, it also has full code, but, Note, it is with some limitations of exponents size and console verbosity.
Github Gist full code
Code output:
Pre-Computing Table Primes till 2^28.0... 4 sec
3 at 0 sec, 5 at 0 sec, 7 at 0 sec,
13 at 0 sec, 17 at 0 sec, 19 at 0 sec,
31 at 0 sec, 61 at 0 sec, 89 at 1 sec,
107 at 1 sec, 127 at 1 sec, 521 at 8 sec,
607 at 9 sec, 1279 at 21 sec, 2203 at 37 sec,
2281 at 38 sec, 3217 at 57 sec, 4253 at 79 sec,
4423 at 82 sec, 9689 at 259 sec, 9941 at 273 sec,
11213 at 343 sec, 19937 at 1351 sec, 21701 at 1926 sec,
23209 at 2499 sec,
Exp 23565, Time 2629, TrialCnt 11783, LLCnt 4980, TrialBoost 1.932x,
TrialTime% 18%, Threads 2, TrialPrimes 2^10.73
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With