Create a list of random numbers and filter the list to only have numbers larger than 50

Question

I am using a list comprehension to create a list of random numbers with numpy. Is there a way to check whether each generated number is larger than 50 and only then append it to the list?

I know I can simply use:

numbers = [np.random.randint(50,100) for x in range(100)]

and that would solve the issue, but I just want to know whether it is possible to check if the number generated by np.random.randint(1,100) is greater than 50.

Something like

numbers = [np.random.randint(1,100) for x in range(100) if {statement}]

Because calling np.random.randint again in the condition generates another number, which is not the same as the first one.

I just want to know if there is a possibility to filter generated numbers before adding them to a list.

Florian Bernard · Accepted Answer

You are using numpy, so we can leverage boolean mask indexing.

my_array = np.random.randint(1, 100, size=100)
mask = my_array > 50
print(my_array[mask])  # Contains only values greater than 50
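To make the masking step concrete, here is a minimal sketch on a small hand-picked array, so the result does not depend on the random draw. The values and the names `values` and `kept` are illustrative assumptions, not part of the original answer.

```python
import numpy as np

# Hand-picked values (an assumption for illustration) so the outcome is deterministic.
values = np.array([12, 75, 50, 51, 99, 3])

# The element-wise comparison yields a boolean array of the same shape.
mask = values > 50

# Boolean indexing keeps only the positions where the mask is True.
kept = values[mask].tolist()
print(kept)  # [75, 51, 99]
```

Note that 50 itself is dropped, because the comparison is strict.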

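But of course, the best way to do what you want is to draw the numbers from the right range directly. The upper bound of np.random.randint is exclusive, so this yields values from 51 to 99.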

results = np.random.randint(51,100, size=100)
# If you really need a list
results_list = results.tolist()

Please avoid looping over a numpy array in general.

Edit: replaced my_list with my_array based on @norok2's comment.

Edit 2: Speed considerations

Numpy

With mask

%%timeit
my_array = np.random.randint(1, 100, size=100)
mask = my_array > 50
my_array[mask]

5.31 µs ± 127 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

10,000,000 elements:

1 loop, best of 3: 198 ms per loop

With numpy where (@Severin Pappadeux's answer)

%%timeit
q = np.random.randint(1, 100, 1000)
m = np.where(q > 50)
q[m]

20.9 µs ± 663 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

10,000,000 elements:

1 loop, best of 3: 196 ms per loop
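As a small illustration of how the np.where variant relates to the mask variant: called with a single argument, np.where returns a tuple of index arrays, and indexing with that tuple selects the same elements as the boolean mask. The sketch below uses made-up values rather than random ones.

```python
import numpy as np

# Made-up values (an assumption for illustration).
q = np.array([12, 75, 50, 51, 99, 3])

# With one argument, np.where returns a tuple holding one index array per dimension.
idx = np.where(q > 50)
print(idx[0].tolist())  # [1, 3, 4]

# Indexing with the tuple picks the same elements as the boolean mask.
same = np.array_equal(q[idx], q[q > 50])
print(same)  # True
```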

Pure python

Part of @Alexander Cécile answer

%%timeit
rand_nums = (random.randint(0, 99) for _ in range(10))
arr = [val for val in rand_nums if val > 50]

19.4 µs ± 1.99 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

10,000,000 elements:

1 loop, best of 3: 11.4 s per loop

Mix numpy and list

@DrBwts answer

%%timeit
number = [x for x in np.random.randint(1, high=100, size=100) if x > 50]

28.9 µs ± 1.52 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

10,000,000 elements:

1 loop, best of 3: 2.76 s per loop

@makis answer

%%timeit 
numbers = [x for x in (np.random.randint(1,100) for iter in range(100)) if x > 50]

164 µs ± 19.4 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

10,000,000 elements:

1 loop, best of 3: 12.2 s per loop

@Romero Valentine answer

rand = filter(lambda x: x>50, np.random.randint(1,100,100))
rand_list = list(rand)

35.9 µs ± 1.97 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

10,000,000 elements:

1 loop, best of 3: 3.41 s per loop

Conclusions

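On small inputs (around 100 elements), all of these methods are fast enough.
On large arrays (10,000,000 elements), the pure numpy approaches take about 0.2 s, versus roughly 3 to 12 s for the approaches that loop in Python.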
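For readers who are not in IPython, the comparison can be approximated with the standard timeit module. This is only a sketch: the helper names `with_mask` and `with_comprehension` and the size `N` are assumptions, and the timings will differ from the figures above depending on the machine.

```python
import random
import timeit

import numpy as np

N = 100_000  # assumed size, far smaller than the 10,000,000 elements used above


def with_mask(n=N):
    """Generate n integers in [1, 100) with numpy and keep those above 50."""
    arr = np.random.randint(1, 100, size=n)
    return arr[arr > 50]


def with_comprehension(n=N):
    """Same task in pure Python; random.randint includes both endpoints."""
    return [v for v in (random.randint(1, 99) for _ in range(n)) if v > 50]


# Total time for three calls of each variant.
t_mask = timeit.timeit(with_mask, number=3)
t_comp = timeit.timeit(with_comprehension, number=3)
print(f"mask: {t_mask:.4f} s, comprehension: {t_comp:.4f} s")
```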

Answers that provide more information

@norok2
@Alexander Cécile

norok2 · Answer

In your question you have some code:

numbers = [np.random.randint(50, 100) for x in range(100)]

which should probably be

numbers = np.random.randint(50, 100, 100)

Then you are asking if there is a way to replicate this using a list comprehension combined with filtering, starting from something that reads like:

numbers = [np.random.randint(1, 100) for x in range(100) if ...]

The answer to this is in general NO.

The reason is that the condition in a comprehension acts as a filter on the generated numbers: it is applied after each number has been generated and tentatively yielded, so a rejected number is dropped rather than redrawn.

Therefore, while the code [np.random.randint(50,100) for x in range(100)] will generate 100 items, any method based on comprehension filtering, or even on explicit use of filter(), will produce an unknown number of elements (typically fewer than 100, depending on how many items meet the specified condition).

To get that level of control, you may use a while-based generator (which cannot be written as a comprehension), e.g.:

import numpy as np


def my_brand_new_generator(n, a=1, b=100):
    i = 0
    while i < n:
        x = np.random.randint(a, b)
        if x > 50:
            yield x
            i += 1


numbers = list(my_brand_new_generator(100))
print(numbers[:10])
# prints the first 10 values, all greater than 50 (they vary from run to run)

print(len(numbers))
# 100

By contrast:

numbers = [x for x in (np.random.randint(1,100) for iter in range(100)) if x > 50]
print(len(numbers))
# 54

x = np.random.randint(1, 100, 100)
numbers = x[x > 50]
print(len(numbers))
# 43
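The filter-in-a-comprehension form from the question can also be written with an assignment expression, which tests and keeps the same draw. This is a minimal sketch assuming Python 3.8 or later; it is not taken from the original answers, and like the examples above it yields fewer than 100 items in general.

```python
import numpy as np

np.random.seed(0)  # seeded only to make the sketch repeatable

# The assignment expression binds each draw to x exactly once,
# so the value tested in the condition is the same value that is kept.
numbers = [x for _ in range(100) if (x := np.random.randint(1, 100)) > 50]

print(len(numbers) <= 100)           # True
print(all(n > 50 for n in numbers))  # True
```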

As side notes:

The my_brand_new_generator() function should be considered only a toy example to illustrate the aforementioned idea.
In general, NumPy offers better alternatives to explicit iteration, so please use them when available.
For generating a single random integer, you can use random.randint() from the Python standard library.
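One way to keep the exact-count guarantee of the generator while avoiding per-element iteration is to draw in batches and filter each batch with a mask. The sketch below is an assumption-laden illustration, not the answer's method: the function name `sample_above` and its parameters are hypothetical, and it uses NumPy's newer `default_rng` Generator API instead of `np.random.randint`.

```python
import numpy as np


def sample_above(n, low=1, high=100, threshold=50, seed=None):
    """Return exactly n random integers from [low, high) that exceed threshold."""
    if threshold >= high - 1:
        raise ValueError("no value in [low, high) exceeds threshold")
    rng = np.random.default_rng(seed)
    kept = np.empty(0, dtype=np.int64)
    while kept.size < n:
        # Draw more than is still missing, since part of each batch is rejected.
        batch = rng.integers(low, high, size=max(2 * (n - kept.size), 16))
        kept = np.concatenate([kept, batch[batch > threshold]])
    # Trim any surplus from the last batch.
    return kept[:n]


numbers = sample_above(100, seed=0).tolist()
print(len(numbers))  # 100
```

If the target range is known up front, drawing directly from it, as in np.random.randint(51, 100, 100), remains simpler.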

Tags: python, random, numpy

Jonas Palačionis

2 Answers

