
Numpy.where on an array of strings using regex

Tags:

python

numpy

I have an array like:

array = np.array(['A0','A1','A2','A3','A4','B0','B1','C0'])

and want to obtain an array that is true for values consisting of an A followed by a digit from 0 to 2.

So far, this is the way I do it:

selection = np.where((array == 'A0') | (array == 'A1') | (array == 'A2'), 1, 0)

But is there a more elegant way to do this, e.g. by using a regular expression like:

selection = np.where(array == 'A[0-2]', 1, 0)
asked Sep 19 '25 04:09 by acb


2 Answers

If using pandas is an option:

import numpy as np
import pandas as pd

a = np.array(['A0','A1','A2','A3','A4','B0','B1','C0'])
pd.Series(a).str.match(r'A[0-2]')
# 0     True
# 1     True
# 2     True
# 3    False
# 4    False
# 5    False
# 6    False
# 7    False
# dtype: bool
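
One caveat worth sketching: `str.match` anchors the pattern only at the start of each string, so a longer value such as 'A10' would also be selected. The snippet below is a minimal sketch, assuming a pandas version that provides `Series.str.fullmatch`; the extra element 'A10' is hypothetical and added only to show the difference. It also converts the result back to the 1/0 array the question asks for.

```python
import numpy as np
import pandas as pd

# 'A10' is a hypothetical extra value, not part of the original data
a = np.array(['A0', 'A1', 'A2', 'A3', 'A4', 'B0', 'B1', 'C0', 'A10'])

# str.match anchors only at the start, so 'A10' also matches 'A[0-2]'
prefix_mask = pd.Series(a).str.match(r'A[0-2]').to_numpy()

# str.fullmatch requires the whole string to match the pattern
full_mask = pd.Series(a).str.fullmatch(r'A[0-2]').to_numpy()

# Same 1/0 encoding as the np.where call in the question
selection = np.where(full_mask, 1, 0)
```

If the values are known to be exactly two characters long, as in the question, both masks agree and `str.match` is sufficient.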
answered Sep 20 '25 19:09 by Nils Werner


I don't think NumPy is your best solution here. You can accomplish this with built-in Python tools such as map.

import re
import numpy as np  # only needed for the array conversion further down

array = ['A0','A1','A2','A3','A4','B0','B1','C0']
p = r'A[0-2]'

list(map(lambda x: bool(re.match(p, x)), array))
# returns:
# [True, True, True, False, False, False, False, False]

# to get an array:
np.array(list(map(lambda x: bool(re.match(p, x)), array)))
# returns:
# array([ True,  True,  True, False, False, False, False, False])
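
For comparison, here is a hedged sketch of the same idea that starts from a NumPy array and ends with the 1/0 encoding from the question. It assumes a precompiled pattern and uses `re.fullmatch` rather than `re.match` so that the whole string has to match; the names `pattern`, `mask`, and `exact` are illustrative only. When the allowed values are a short fixed list, `np.isin` may be enough without any regex.

```python
import re
import numpy as np

array = np.array(['A0', 'A1', 'A2', 'A3', 'A4', 'B0', 'B1', 'C0'])
pattern = re.compile(r'A[0-2]')

# Build the boolean mask directly, without an intermediate list
mask = np.fromiter(
    (pattern.fullmatch(s) is not None for s in array),
    dtype=bool,
    count=len(array),
)

# 1/0 encoding, equivalent to np.where(mask, 1, 0)
selection = mask.astype(int)

# Regex-free alternative when the accepted values can be listed explicitly
exact = np.isin(array, ['A0', 'A1', 'A2'])
```

The loop still runs in Python, so this is about readability rather than speed; for very large arrays the pandas string methods or `np.isin` are likely the better fit.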
answered Sep 20 '25 20:09 by James