
Lists from a column in a dataframe

I have a dataframe that consists of lists of students from different semesters (along with grades, etc.).

I want to create a dictionary of lists of students who started in a given semester. I have filtered the dataframe to consist of only 1A students, and now I want to use this to make the lists. So far I have created a dictionary of dataframes, but I cannot figure out how to systematically make lists instead.

Please help?

I have tried appending .tolist() but this does not work. It does work if I do it to each dataframe but, you know, DRY.

Here is what I have so far, basically:

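import pandas as pd
import numpy as np

data = {'UW ID': [1, 2, 3, 4, 5, 6, 7, 8, 9],
        'Term': [201001, 201101, 201201, 201201, 201001, 201001,
                 201101, 201201, 201201]}
df = pd.DataFrame(data)

# Dictionary mapping each term to its group of 'UW ID' values
dfs = dict(list(df.groupby(by='Term')['UW ID']))
# Works for one term at a time, but has to be repeated for every key
dfs[201001].tolist()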

Ideally, each of the items in dfs would just be a list.

Meg Ward asked Sep 03 '25 03:09


1 Answer

Use groupby.agg with list, then convert the result with to_dict:

out = df.groupby('Term')['UW ID'].agg(list).to_dict()

Or a dictionary comprehension:

out = {k: g.tolist() for k, g in df.groupby('Term')['UW ID']}

Output:

{201001: [1, 5, 6], 201101: [2, 7], 201201: [3, 4, 8, 9]}
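
As a quick sanity check, here is a minimal self-contained sketch that rebuilds the sample dataframe from the question and confirms that both approaches give the same dictionary. The variable names grouped, by_agg, and by_comp are only illustrative and are not part of the original question or answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'UW ID': [1, 2, 3, 4, 5, 6, 7, 8, 9],
    'Term': [201001, 201101, 201201, 201201, 201001, 201001,
             201101, 201201, 201201],
})

# Group the 'UW ID' column by 'Term' once and reuse the grouping
grouped = df.groupby('Term')['UW ID']

# Approach 1: aggregate each group into a list, then convert to a dict
by_agg = grouped.agg(list).to_dict()

# Approach 2: iterate over (key, group) pairs in a dict comprehension
by_comp = {term: ids.tolist() for term, ids in grouped}

# Both dictionaries map each term to the same list of IDs
assert by_agg == by_comp
print(by_agg)
```

Either form avoids calling .tolist() separately for each term; the agg version stays inside pandas until the final conversion, while the comprehension is easier to adapt if each group needs extra processing.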
mozway answered Sep 05 '25 02:09