Pandas use variable for column names [duplicate]

Given the following data frame:

import pandas as pd
import numpy as np
df = pd.DataFrame({'A':[1,2,3],
                   'B':[4,5,6],
                   'C':[7,8,9]})

df

    A   B   C
0   1   4   7
1   2   5   8
2   3   6   9

How can I access columns via a variable?

I tried this:

cols='A','B'
df[cols]

...which resulted in this:

KeyError: ('A', 'B')

Bonus Question: What if my data frame were like this?:

import pandas as pd
import numpy as np
df = pd.DataFrame({'A':[1,2,3],
                   'B':[4,5,6],
                   'C':[7,8,9],
                   'D':[1,3,5],
                   'E':[5,3,6],
                   'F':[7,4,3]})

df

    A   B   C   D   E   F
0   1   4   7   1   5   7
1   2   5   8   3   3   4
2   3   6   9   5   6   3

and I wanted to do this?:

cols=['A','B']
cols2=['C','D']
df[cols,'F',cols2]

Thanks in advance!

asked Oct 25 '25 03:10 by Dance Party


1 Answer

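Writing cols='A','B' creates a tuple, and pandas looks a tuple up as a single column label, hence the KeyError. Subset with a list of column names instead: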

cols=['A','B']
print(df[cols])
   A  B
0  1  4
1  2  5
2  3  6

It is the same as:

print(df[['A','B']])
   A  B
0  1  4
1  2  5
2  3  6
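
To see the difference between the tuple and the list side by side, here is a minimal sketch using the three-column frame from the question. The name `subset` is only illustrative, and the exact wording of the error message may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6],
                   'C': [7, 8, 9]})

cols = 'A', 'B'  # no brackets, so this is the tuple ('A', 'B')

try:
    df[cols]  # the tuple is looked up as ONE column label
except KeyError as err:
    print('KeyError:', err)

# Converting the tuple to a list selects the two columns separately.
subset = df[list(cols)]
print(subset)
```

If the variable already exists as a tuple elsewhere in your code, `list(cols)` is probably the smallest change that makes the selection work.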

Bonus answer:

cols=['A','B']
cols2=['C','D']

allcols = cols + ['F'] + cols2
print(df[allcols])
   A  B  F  C  D
0  1  4  7  7  1
1  2  5  4  8  3
2  3  6  3  9  5
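
If you mix single labels and lists often, the concatenation above could be wrapped in a small helper. This is only a sketch: `flatten_cols` is a hypothetical name, not a pandas function, and it assumes every argument is either a single label or a list/tuple of labels.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6],
                   'C': [7, 8, 9],
                   'D': [1, 3, 5],
                   'E': [5, 3, 6],
                   'F': [7, 4, 3]})


def flatten_cols(*parts):
    """Hypothetical helper: merge labels and lists of labels into one flat list."""
    flat = []
    for part in parts:
        if isinstance(part, (list, tuple)):
            flat.extend(part)   # a group of labels
        else:
            flat.append(part)   # a single label
    return flat


cols = ['A', 'B']
cols2 = ['C', 'D']

allcols = flatten_cols(cols, 'F', cols2)
result = df[allcols]
print(result)
```

For a one-off selection, unpacking does the same thing without a helper: `df[[*cols, 'F', *cols2]]`.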

answered Oct 26 '25 18:10 by jezrael


