
Return column names to a list if value in column is true

I have a dataframe that contains 16 columns. My goal is to add a 17th column that holds, for each row, the names of all columns whose cell contains a certain value, as a list or tuple. The purpose is to store data from a multi-select survey question efficiently, so that pandas' .explode or SQL's UNNEST can be used to count the items in the 17th column.

A sample dataset:

| Q1    |  Q2   |  Q3   |
|-------|-------|-------|
| True  | True  | False |
| False | True  | True  |
| True  | True  | False |

What I'd like to return:

| Q1    |  Q2   |  Q3   |   List   |
|-------|-------|-------|----------|
| True  | True  | False | [Q1, Q2] |
| False | True  | True  | [Q2, Q3] |
| True  | True  | False | [Q1, Q2] |
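
To make the end goal concrete, here is a minimal sketch of the counting step, assuming the List column already exists. The dataframe is typed in by hand from the table above, and the names `df` and `counts` are only illustrative.

```python
import pandas as pd

# Hand-built copy of the desired result (the "List" column is typed in, not computed).
df = pd.DataFrame({
    "Q1": [True, False, True],
    "Q2": [True, True, True],
    "Q3": [False, True, False],
    "List": [["Q1", "Q2"], ["Q2", "Q3"], ["Q1", "Q2"]],
})

# explode() gives every list element its own row; value_counts() then tallies them.
counts = df["List"].explode().value_counts()
print(counts.to_dict())
```

Each question name is counted once per respondent who selected it, which is the tally I'm after.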

I'm open to other solutions if I'm not quite thinking about this issue the right way.

Essem 186F asked Sep 14 '25 18:09


1 Answer

While this solution works for the specific question, I think it only works when the dict and its lists form an N×N shape, that is, when the number of keys equals the length of every list. For example, add a 'Q4' key with a list of length 3, or drop a value from each list, and it will break. Personally, I found the following to be more robust, even if it is not the most Pythonic:

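import itertools

# Survey answers stored as strings, one list per question (column).
data = {'Q1': ['True', 'False', 'True'], 'Q2': ['True', 'True', 'True'], 'Q3': ['False', 'True', 'False']}

# For each column, keep the column name where the value is 'True', else None.
output = []
for k, v in data.items():
    z = []
    for i in v:
        if i == 'True':
            z.append(k)
        else:
            z.append(None)
    output.append(z)
print(output)
#[['Q1', None, 'Q1'], ['Q2', 'Q2', 'Q2'], [None, 'Q3', None]]

# Transpose columns into rows; zip_longest pads shorter columns with None.
output1 = list(map(list, itertools.zip_longest(*output, fillvalue=None)))
print(output1)
#[['Q1', 'Q2', None], [None, 'Q2', 'Q3'], ['Q1', 'Q2', None]]

# Drop the None placeholders from each row (builds new lists, so output1 is left untouched).
output2 = [[name for name in row if name is not None] for row in output1]
print(output2)
#[['Q1', 'Q2'], ['Q2', 'Q3'], ['Q1', 'Q2']]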
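
If the data is already in a pandas dataframe, as in the question, the same result can probably be reached without the intermediate lists. This is only a sketch: it assumes the columns hold real booleans rather than the 'True'/'False' strings used above, and `question_cols` is a name I made up for the list of survey columns.

```python
import pandas as pd

df = pd.DataFrame({
    "Q1": [True, False, True],
    "Q2": [True, True, True],
    "Q3": [False, True, False],
})

# Hypothetical helper name: the columns that belong to the multi-select question.
question_cols = ["Q1", "Q2", "Q3"]

# Walk the rows as plain arrays and keep the names of the columns that are True.
df["List"] = [
    [col for col, flag in zip(question_cols, flags) if flag]
    for flags in df[question_cols].to_numpy()
]

print(df["List"].tolist())
```

If the cells hold strings, compare them first (for example `df[question_cols] == 'True'`) so that the flags are real booleans; otherwise every non-empty string counts as true.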
Stickleback answered Sep 17 '25 09:09