How to map values in place?

I have a dataframe like this:

df = pd.DataFrame({'c1': list('aba'), 'c2': list('aaa'), 'ignore_me': list('bbb'), 'c3': list('baa')})

  c1 c2 ignore_me c3
0  a  a         b  b
1  b  a         b  a
2  a  a         b  a

and a dictionary that looks like this:

d = {'a': "foo", 'b': 'bar'}

I now want to map the values of d to columns that match the regex ^c\d+$.

I can do:

df.filter(regex='^c\d+$').apply(lambda x: x.map(d))

    c1   c2   c3
0  foo  foo  bar
1  bar  foo  foo
2  foo  foo  foo

However, the columns that don't match the regex are then missing from the result.

So instead I can do:

tempdf = df.filter(regex='^c\d+$')

df.loc[:, tempdf.columns] = tempdf.apply(lambda x: x.map(d))

which gives the desired output:

    c1   c2 ignore_me   c3
0  foo  foo         b  bar
1  bar  foo         b  foo
2  foo  foo         b  foo

Is there a smarter solution that avoids the temporary dataframe?

Cleb asked Oct 26 '25 08:10



1 Answer

There absolutely is: use str.contains on the column index to build a boolean mask.

df.columns.str.contains(r'^c\d+$') # use raw strings, it's good hygiene
# array([ True,  True, False,  True])

Pass the mask to loc:

df.loc[:, df.columns.str.contains(r'^c\d+$')] = df.apply(lambda x: x.map(d))

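The line above maps every column, including ignore_me, and only the masked columns are written back. If you want to be as efficient as possible, map only the masked columns: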

m = df.columns.str.contains(r'^c\d+$')
df.loc[:, m] = df.loc[:, m].apply(lambda x: x.map(d))

df

    c1   c2 ignore_me   c3
0  foo  foo         b  bar
1  bar  foo         b  foo
2  foo  foo         b  foo
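
For convenience, the pieces above can be put together into one self-contained script. This is only a sketch that reuses the question's df and d; the names mask and col are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data and mapping taken from the question.
df = pd.DataFrame({
    'c1': list('aba'),
    'c2': list('aaa'),
    'ignore_me': list('bbb'),
    'c3': list('baa'),
})
d = {'a': 'foo', 'b': 'bar'}

# Boolean mask over the column labels; the raw string keeps \d intact.
mask = df.columns.str.contains(r'^c\d+$')

# Map only the selected columns and assign the result back to the same columns.
df.loc[:, mask] = df.loc[:, mask].apply(lambda col: col.map(d))

print(df)
```

Columns whose names do not match the pattern, such as ignore_me, are never touched, so no intermediate dataframe has to be stored under its own name.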
cs95 answered Oct 28 '25 21:10
