KeyError: "The following 'id_vars' are not present in the DataFrame

Question

I'm new to pandas and am trying to count the number of products bought by each user, using two spreadsheet files:

1. Customer IDs (the selected customer IDs to be used)

2. Transaction records (all the transactions), like:

| cusid | products |
|-------|----------|
| 1     | 12,13,14 |
| 1     | 05,12,12 |

Part of the code:


import pandas as pd
import numpy as np
import time
import turicreate as tc
from sklearn.model_selection import train_test_split

import sys
sys.path.append("..")

customers=pd.read_csv('testdata/data/recommend_1.csv')
transactions=pd.read_csv('testdata/data/trx_data.csv')
print(list(customers))
print (list(transactions))
print(customers.shape)
customers.head()
print(transactions.shape)
transactions.head()

data=pd.melt(transactions.set_index('cusid')['products'].apply(pd.Series).reset_index(drop=True),id_vars=['cusid'],value_name='products')\
        .dropna().drop(['variable'],axis=1)\
        .groupby(['cusid','products'])\
        .agg({'products':'count'})\
        .rename(columns={'products':'purchase_count'})\
        .reset_index(drop=True)\
        .rename(columns={'products':'productId'})

data['productId']=data['productId'].astype(np.int64)        
print(list(data))
print(data.shape)
data.head()

The results are as follows:

['cusid']
['cusid', 'products']
(1000, 1)
(62483, 2)
Traceback (most recent call last):
  File "recomm.py", line 20, in <module>
    data=pd.melt(transactions.set_index('cusid')['products'].apply(pd.Series).reset_index(drop=True),id_vars=['cusid'],value_name='products')\
  File "/Users/bijing/anaconda2/envs/turi/lib/python2.7/site-packages/pandas/core/reshape/melt.py", line 48, in melt
    "".format(missing=list(missing)))
KeyError: "The following 'id_vars' are not present in the DataFrame: ['cusid']"

Satria Hidayat · Accepted Answer

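Call reset_index() before you run the melt, for example data = data.reset_index(), so that cusid becomes an ordinary column again. In the question's code, reset_index(drop=True) discards the cusid index instead of restoring it, so melt cannot find cusid among the columns listed in id_vars.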
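
As a rough illustration of that fix, here is a minimal sketch on made-up data shaped like the table in the question. It makes a few assumptions that are not in the original post: the products column holds comma-separated strings, so str.split(",", expand=True) is swapped in for apply(pd.Series) to get one column per product, and groupby(...).size() is used for the counting step in place of the agg/rename chain. It also assumes a reasonably recent pandas on Python 3.

```python
import pandas as pd

# Hypothetical sample data mirroring the table in the question.
transactions = pd.DataFrame({
    "cusid": [1, 1],
    "products": ["12,13,14", "05,12,12"],
})

# One column per product; cusid is still the index at this point.
wide = transactions.set_index("cusid")["products"].str.split(",", expand=True)

# reset_index() WITHOUT drop=True turns the cusid index back into a column,
# which is what melt needs in order to find it in id_vars.
long_df = pd.melt(wide.reset_index(), id_vars=["cusid"], value_name="productId")

data = (
    long_df.dropna(subset=["productId"])    # rows with fewer products leave NaNs
           .drop(columns="variable")        # melt's helper column is not needed
           .astype({"productId": "int64"})  # "05" becomes 5
           .groupby(["cusid", "productId"])
           .size()                          # number of purchases per pair
           .reset_index(name="purchase_count")
)

print(data)
```

With the sample rows above, product 12 should come out with a purchase_count of 3 for customer 1, and the other products with 1 each.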

Tags: python, pandas

Poppy Bee
