
Group list-of-tuples by second element, take average of first element

I have a list of tuples (x,y) like:

l = [(2,1), (4,6), (3,1), (2,7), (7,10)]

Now I want to make a new list:

l = [(2.5,1), (4,6), (2,7), (7,10)]

where, whenever more than one tuple shares the same second value (y), the first values (x) of those tuples are replaced by their average.

Here, (2,1) and (3,1) share the second element y=1, so the new list contains the average of x=2 and x=3, i.e. 2.5, paired with y=1. No other y value occurs more than once, so the remaining tuples are unchanged.

ubuntu_noob asked Oct 15 '25 14:10


1 Answer

Since you tagged pandas:

import pandas as pd

l = [(2,1), (4,6), (3,1), (2,7), (7,10)]
df = pd.DataFrame(l)

Then df is a data frame with two columns:

    0   1
0   2   1
1   4   6
2   3   1
3   2   7
4   7   10

Now you want to compute the average of the numbers in column 0 with the same value in column 1:

(df.groupby(1).mean()     # compute mean on each group
   .reset_index()[[0,1]]  # restore the column order
   .values                # return the underlying numpy array
 )

Output:

array([[ 2.5,  1. ],
       [ 4. ,  6. ],
       [ 2. ,  7. ],
       [ 7. , 10. ]])
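
Note that this gives a NumPy array of floats rather than the list of tuples shown in the question. If pandas is not needed for anything else, the same grouping can be sketched with the standard library alone. This is only an illustration under that assumption, not part of the approach above; the function name `average_by_second` is made up for the example.

```python
from collections import defaultdict
from statistics import mean


def average_by_second(pairs):
    """Group (x, y) tuples by y and average the x values in each group.

    Hypothetical helper for illustration; not taken from the original answer.
    """
    groups = defaultdict(list)
    for x, y in pairs:
        groups[y].append(x)
    # Dicts keep insertion order, so groups come out in order of the
    # first occurrence of each y (pandas groupby sorts by key instead).
    return [(mean(xs), y) for y, xs in groups.items()]


l = [(2, 1), (4, 6), (3, 1), (2, 7), (7, 10)]
result = average_by_second(l)
print(result)
```

For this input the y values already appear in ascending order, so both approaches list the groups in the same order. To turn the pandas result into tuples instead, `list(map(tuple, arr))` on the returned array would work, although every value would then be a NumPy float.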
Quang Hoang answered Oct 18 '25 07:10