
Python: Combine array rows based on difference between prior row's last element and posterior row's first element


As the title says, suppose I am given an (n, 2) NumPy array recording the start and end indices of a series of segments, for example with n=6:

import numpy as np
# x records the (start, end) index pairs corresponding to six segments
x = np.array(([0,4],    # the 1st seg ranges from index 0 ~ 4
              [5,9],    # the 2nd seg ranges from index 5 ~ 9, etc.
              [10,13],
              [15,20],
              [23,30],
              [31,40]))

Now I want to combine segments that are separated by only a small gap. For example, merge consecutive segments if the gap between one segment's end and the next segment's start is no larger than 1, so the desired output would be:

y = np.array(([0,13],    # The 1st seg's end is close to the 2nd's start,
                         # and the 2nd seg's end is close to the 3rd's start, so they are combined.
              [15,20],   # The 4th seg is far from both of its neighbours,
                         # so it remains untouched.
              [23,40]))  # The 5th and 6th segs are close, so they are combined.

so that the output segments would turn out to be just three instead of six. Any suggestion would be appreciated!

Francis asked Dec 09 '25 03:12



2 Answers

If we can assume the segments are sorted and none is wholly contained within a neighbor, then you could do this by identifying where the gap between the end of one range and the start of the next exceeds your threshold:

start = x[1:, 0]  # select columns, ignoring the beginning of the first range
end = x[:-1, 1]  # and the end of the final range
mask = start>end+1  # identify where consecutive rows have too great a gap

Then stitching these pieces back together:

np.array([np.insert(start[mask], 0, x[0, 0]), np.append(end[mask], x[-1, -1])]).T
Out[96]: 
array([[ 0, 13],
       [15, 20],
       [23, 40]])
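
To make the gap threshold adjustable, the same masking idea can be wrapped in a function. This is only a sketch: the name merge_close_segments and the max_gap parameter are made up for illustration, and it keeps the assumption above that the rows are sorted and none is contained in a neighbor.

```python
import numpy as np

def merge_close_segments(x, max_gap=1):
    """Merge consecutive (start, end) rows whose gap is at most max_gap.

    Hypothetical helper: assumes rows are sorted by start and that no row
    is wholly contained in a neighbouring row.
    """
    x = np.asarray(x)
    if len(x) == 0:
        return x.reshape(0, 2)
    # True where the next segment starts too far after the previous one ends
    split = x[1:, 0] - x[:-1, 1] > max_gap
    # A merged segment starts at the very first row or right after a split ...
    starts = np.concatenate(([x[0, 0]], x[1:, 0][split]))
    # ... and ends right before a split or at the very last row
    ends = np.concatenate((x[:-1, 1][split], [x[-1, 1]]))
    return np.column_stack((starts, ends))

x = np.array([[0, 4], [5, 9], [10, 13], [15, 20], [23, 30], [31, 40]])
merged_1 = merge_close_segments(x)             # gap tolerance from the question
merged_2 = merge_close_segments(x, max_gap=2)  # looser tolerance
print(merged_1)
print(merged_2)
```

With max_gap=2 the gap between 13 and 15 is also bridged, so the first four segments collapse into one.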
EFT answered Dec 11 '25 16:12



Here's a NumPy vectorized solution -

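def merge_boundaries(x):
    # True where the gap to the next segment is larger than 1 (a break point)
    mask = (x[1:,0] - x[:-1,1]) > 1
    idx = np.flatnonzero(mask)
    # each merged group starts at row 0 or at the row right after a break ...
    start = np.r_[0,idx+1]
    # ... and ends at a break row or at the last row
    stop = np.r_[idx, x.shape[0]-1]
    return np.c_[x[start,0], x[stop,1]]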

Sample run -

In [230]: x
Out[230]: 
array([[ 0,  4],
       [ 5,  9],
       [10, 13],
       [15, 20],
       [23, 30],
       [31, 40]])

In [231]: merge_boundaries(x)
Out[231]: 
array([[ 0, 13],
       [15, 20],
       [23, 40]])
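
Both approaches rely on the rows being sorted with no segment nested inside another. If that can't be guaranteed, one possible workaround is to sort by start and compare each start against the running maximum of the previous ends. The sketch below is not part of either answer; the function name merge_segments_unsorted and the max_gap parameter are hypothetical, and it assumes at least one row, end >= start in every row, and max_gap >= 0.

```python
import numpy as np

def merge_segments_unsorted(x, max_gap=1):
    """Merge segments that may be unsorted, overlapping, or nested.

    Hypothetical helper: assumes a non-empty (n, 2) array with
    end >= start in every row and max_gap >= 0.
    """
    x = np.asarray(x)
    # Sort rows by their start value
    x = x[np.argsort(x[:, 0], kind="stable")]
    # Running maximum of the ends, so a long earlier segment still
    # covers shorter segments that start inside it
    reach = np.maximum.accumulate(x[:, 1])
    # True where a segment starts too far beyond everything seen so far
    split = x[1:, 0] - reach[:-1] > max_gap
    starts = np.concatenate(([x[0, 0]], x[1:, 0][split]))
    ends = np.concatenate((reach[:-1][split], [reach[-1]]))
    return np.column_stack((starts, ends))

# Made-up test data: unsorted, with [5, 9] nested inside [0, 20]
messy = np.array([[23, 30], [0, 20], [5, 9], [31, 40], [21, 21]])
merged_messy = merge_segments_unsorted(messy)
print(merged_messy)
```

The sort makes this O(n log n) rather than O(n), so for input that is already sorted and non-nested the simpler versions above are enough.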
Divakar answered Dec 11 '25 16:12



