How to extract the last and first rows of a NumPy array having specific values

I have a list of NumPy arrays and want to extract some rows from each array. My list is:

all_data=[np.array([[1., 1., 2.], [2., 1., 4.], [3., 1., 5.], [1., 2., 4.]]),\
          np.array([[3., 1., 3.], [4., 1., 4.], [3., 2., 7.], [4., 2., 1.]]),\
          np.array([[0., 0., 0.], [1., 1., 1.], [3., 1., 0.], [4., 2., 4.]]),\
          np.array([[2., 2., 2.], [3., 2., 1.], [4., 2., 5.], [5., 2., 4.]])]

My all_data has four arrays here (in reality it has many more). For each array, I first want to know which values in the second column are unique and which are repeated. Then, for arrays at even positions in the list (counting from zero), I want to extract the last row of each group of rows that share the same value in the second column. If a row has a unique value, I extract just that row. For arrays at odd positions, I want to extract the first row of each such group, and again, for rows with unique values, I extract that single row. Finally, I want the following list of arrays as my output:

[np.array([[3., 1., 5.], [1., 2., 4.]]),\
 np.array([[3., 1., 3.], [3., 2., 7.]]),\
 np.array([[0., 0., 0.], [3., 1., 0.], [4., 2., 4.]]),\
 np.array([[2., 2., 2.]])]

I tried the following code but it was not successful at all.

extracted=[]
for i,h in enumerate (all_data):
    nums, counts=np.unique(h[:,1],return_counts=True) # to find the frequency of values in second column
    if i%2 ==0:
        a=np.where(np.isin(h[:,1],nums).any(-1))[-1]
    else:
        a=np.where(np.isin(h[:,1],nums).any(-1))[0]
    extracted.append (a)

I do appreciate any help in advance.

Link_tester asked Nov 16 '25 21:11

1 Answer

To get the wanted rows from arr (the array at index i in all_data), define the following function:

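import numpy as np

def getRows(i, arr):
    # inv[k] is the position of arr[k, 1] among the sorted unique values
    vals, inv, cnts = np.unique(arr[:, 1], return_inverse=True, return_counts=True)
    # Even i: last row of each group (argmax on the reversed mask); odd i: first row
    rows = [inv.size - np.argmax(inv[::-1] == iCnt) - 1 if i % 2 == 0 else
            np.argmax(inv == iCnt) for iCnt in range(len(cnts))]
    return arr[rows]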
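
To see how the function locates the rows, here is a small sketch of the intermediate values for the second column of the first sample array. The names col, first_of_group0 and last_of_group0 are illustrative only and do not appear in the function above.

```python
import numpy as np

# Second column of the first sample array: 1., 1., 1., 2.
col = np.array([1., 1., 1., 2.])

vals, inv, cnts = np.unique(col, return_inverse=True, return_counts=True)
# vals holds the sorted unique values, inv the group index of every row,
# and cnts the number of rows in each group.

# np.argmax on a boolean mask returns the position of the first True,
# so this is the first row belonging to group 0 (value 1.).
first_of_group0 = int(np.argmax(inv == 0))

# Searching the reversed mask finds the last row instead; the result is
# converted back to an index into the original (unreversed) array.
last_of_group0 = int(inv.size - np.argmax(inv[::-1] == 0) - 1)

print(vals, inv, cnts, first_of_group0, last_of_group0)
```

The counts in cnts also answer the first part of the question: a count of 1 marks a unique value, and anything larger marks a repeated one.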

Then, to collect the extracted rows as one separate array per source array, run the following list comprehension:

extracted = [getRows(i, arr) for i, arr in enumerate(all_data)]

For your sample data, the result is:

[array([[3., 1., 5.], [1., 2., 4.]]),
 array([[3., 1., 3.], [3., 2., 7.]]),
 array([[0., 0., 0.], [3., 1., 0.], [4., 2., 4.]]),
 array([[2., 2., 2.]])]
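
As a possible alternative (not the method used above), the same selection can be sketched without the Python-level loop by relying on the return_index option of np.unique, which reports the first occurrence of each unique value. The function name get_rows_by_index is hypothetical, and the sketch assumes that it is acceptable for the output rows to be ordered by the sorted unique values of the second column.

```python
import numpy as np

def get_rows_by_index(i, arr):
    """Hypothetical variant: pick one row per unique value in column 1."""
    col = arr[:, 1]
    if i % 2 == 0:
        # return_index gives first occurrences, so search the reversed
        # column and map the positions back to the original row order.
        _, idx = np.unique(col[::-1], return_index=True)
        idx = col.size - 1 - idx
    else:
        _, idx = np.unique(col, return_index=True)
    return arr[idx]

# Sample data copied from the question.
all_data = [
    np.array([[1., 1., 2.], [2., 1., 4.], [3., 1., 5.], [1., 2., 4.]]),
    np.array([[3., 1., 3.], [4., 1., 4.], [3., 2., 7.], [4., 2., 1.]]),
    np.array([[0., 0., 0.], [1., 1., 1.], [3., 1., 0.], [4., 2., 4.]]),
    np.array([[2., 2., 2.], [3., 2., 1.], [4., 2., 5.], [5., 2., 4.]]),
]

extracted = [get_rows_by_index(i, arr) for i, arr in enumerate(all_data)]
```

For the sample data this should select the same rows as the list shown above.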

Valdi_Bo answered Nov 19 '25 11:11