I have been scratching my head trying to figure out why my attempts to create a column of tuple from a column of lists doesn't work as I expect it to. So, Here is sommething that does work:
def convert(list):
return tuple(list)
list = ['aaaa', 'aaaa', 'd33f3f', 'sdjd3i']
print(convert(list))
('aaaa', 'aaaa', 'd33f3f', 'sdjd3i')
Now, in my case my dataframe looks like this:
CLOUD_STORE_ID CalculationId Date
0 772 ff80818178f93bd9017bf5094e602b05 2021-09-17
1 772 ff80818178f93bd9017bf509b4f52b0a 2021-09-17
2 772 ff80818178f93bd9017bf50ccb2c2b20 2021-09-17
0 773 ff80818178f93bd9017bf510f2a12b40 2021-09-17
1 773 ff80818178f93bd9017bf514e2582b55 2021-09-17
2 773 ff80818178f93e26017bf510b2b30c0e 2021-09-17
0 900 ff80818178f93bd9017bf510791f2b39 2021-09-17
1 900 ff80818178f93bd9017bf5109db22b3b 2021-09-17
2 900 ff80818178f93bd9017bf5139e9a2b4d 2021-09-17
0 777 ff80818178f93e26017bf50a780e0bec 2021-09-17
but I need list of strinngs:
CalculationID_df = CalculationID_df.groupby('CLOUD_STORE_ID').agg({ 'CalculationId': lambda x: '{}'.format(list(x))}).reset_index()
which gives:
CLOUD_STORE_ID CalculationId
0 486 ['ff80818178f93bd9017bf521bbfd2b6b', 'ff808181...
1 522 ['ff80818178f93e26017bf4efb6fb0bb6']
2 665 ['ff80818178f93bd9017bf4f7dda12acb', 'ff808181...
3 671 ['ff80818178f93bd9017bf51890802b5d', 'ff808181...
4 772 ['ff80818178f93bd9017bf5094e602b05', 'ff808181...
I actually need tuple to pass to a SQL-query, so I need to transform these lists into tuples. I thought that the above method would do the trick:
def convert(list):
return tuple(list)
CalculationID_df['Calcs'] = CalculationID_df.CalculationId.apply(convert)
But it returns:
CLOUD_STORE_ID CalculationId \
0 486 ['ff80818178f93bd9017bf521bbfd2b6b', 'ff808181...
1 522 ['ff80818178f93e26017bf4efb6fb0bb6']
2 665 ['ff80818178f93bd9017bf4f7dda12acb', 'ff808181...
Calcs
0 ([, ', f, f, 8, 0, 8, 1, 8, 1, 7, 8, f, 9, 3, ...
1 ([, ', f, f, 8, 0, 8, 1, 8, 1, 7, 8, f, 9, 3, ...
2 ([, ', f, f, 8, 0, 8, 1, 8, 1, 7, 8, f, 9, 3, ..
which is clearly not what I expected.
What am I doing wrong?
The problem is that your list is a string "['dasdas','dasd']" you need to convert it to a list before your convert it to tuples.
After this row:
CalculationID_df = CalculationID_df.groupby('CLOUD_STORE_ID').agg({ 'CalculationId': lambda x: '{}'.format(list(x))}).reset_index()
Add this code:
import ast
CalculationID_df.CalculationId = CalculationID_df.CalculationId.apply(lambda x: ast.literal_eval(x))
And then convert it to tuples as you did before.
Result:

IIUC
>>> df.groupby('CLOUD_STORE_ID')['CalculationId'] \
.apply(tuple).astype(str).reset_index()
CLOUD_STORE_ID CalculationId
0 772 ('ff80818178f93bd9017bf5094e602b05', 'ff808181...
1 773 ('ff80818178f93bd9017bf510f2a12b40', 'ff808181...
2 777 ('ff80818178f93e26017bf50a780e0bec',)
3 900 ('ff80818178f93bd9017bf510791f2b39', 'ff808181...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With