
type of return value in itertuples and print column names of itertuples in pandas

I have a DataFrame as follows:

          a         b         c         d
0  0.140603  0.622511  0.936006  0.384274
1  0.246792  0.961605  0.866785  0.544677
2  0.710089  0.057486  0.531215  0.243285

I want to iterate over the df with itertuples() and print the values and column names of each row. Currently I know the following method:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(3, 4), columns=['a', 'b', 'c', 'd'])
for item in df.itertuples():
    print(item)

And the output is:

Pandas(Index=0, a=0.55464273035498401, b=0.50784779485386233, c=0.55866384351761911, d=0.35969591433338755)
Pandas(Index=1, a=0.60682158587529356, b=0.37571390304543184, c=0.13566419305411737, d=0.55807909125502775)
Pandas(Index=2, a=0.73260693374584385, b=0.59246381839030349, c=0.92102184020347211, d=0.029942550647279687)

Questions:

1) I thought each iteration would return a tuple (as the function name suggests), so why is each item displayed as Pandas(...)?

2) What is the best way to extract the values of columns 'a', 'b', 'c', 'd', together with the column names, as I loop through each row?

asked Oct 14 '25 23:10 by user7786493


1 Answer

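It's a named tuple: each row is an instance of a namedtuple class (a subclass of tuple) that is called Pandas by default, which is why the rows print as Pandas(...).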
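
As a quick check of the claim above, the following sketch inspects a single row. It assumes the same df as in the question; the variable names first, renamed and plain, and the class name 'Row', are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(3, 4), columns=['a', 'b', 'c', 'd'])

# Take the first row only; every row has the same type.
first = next(df.itertuples())

print(isinstance(first, tuple))  # whether the row is a tuple (subclass)
print(type(first).__name__)      # name of the namedtuple class
print(first._fields)             # field names: the index first, then the columns

# The `name` parameter sets the class name; name=None yields plain tuples.
renamed = next(df.itertuples(name='Row'))
plain = next(df.itertuples(name=None))
print(type(renamed).__name__, type(plain).__name__)
```

The _fields attribute is also one way to get at the column names while looping, since it lists them in order after the index.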

Access the values of the named tuple either by label:

for item in df.itertuples():
    print(item.a, item.b)

or by position (item[0] is the index, so the columns start at item[1]):

for item in df.itertuples():
    print(item[1], item[2])
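
If the column names should be printed alongside the values, one possible approach is the namedtuple's _asdict() method. This is only a sketch, assuming the df from the question; the rows list is a hypothetical helper that is kept purely so the result can be inspected afterwards.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(3, 4), columns=['a', 'b', 'c', 'd'])

rows = []  # illustrative container: (index, {column: value}) per row
for item in df.itertuples():
    # _asdict() maps each field name (including 'Index') to its value.
    row = item._asdict()
    index = row.pop('Index')
    rows.append((index, row))
    for column, value in row.items():
        print(f"row {index}: {column} = {value}")
```

Note that pandas renames columns to positional names when they are not valid Python identifiers, so this works best with simple column names like the ones here.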

When the DataFrame has more than 254 columns, the return type is a plain tuple and the only available access is by position (the pandas documentation attributes this limit to Python versions before 3.7). To still access values by label, restrict df to just the columns you need:

for item in df.loc[:, ['a', 'b']].itertuples():
    print(item.a, item.b)
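
To illustrate the column restriction on a wide frame, here is a minimal sketch. The 300-column DataFrame wide and its column names c0 ... c299 are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical wide frame: 300 columns named c0 ... c299.
columns = [f'c{i}' for i in range(300)]
wide = pd.DataFrame(np.random.rand(2, 300), columns=columns)

# Select only the needed columns before iterating, so each row
# is a small namedtuple with label access.
needed = ['c0', 'c299']
for item in wide[needed].itertuples():
    print(item.Index, item.c0, item.c299)

fields = next(wide[needed].itertuples())._fields
print(fields)
```

Selecting the columns first also keeps each row tuple small, which may help when only a few of many columns are used in the loop.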