
Check if two intervals overlap in R

Tags: range, r, intervals

Given values in four columns (FromUp, ToUp, FromDown, ToDown), each pair of columns defines a range (FromUp, ToUp and FromDown, ToDown). How can I test whether the two ranges in each row overlap? Note that the endpoints are not sorted, so the "From" value can be higher than the "To" value and vice versa.

Some Example data:

FromUp<-c(5,32,1,5,15,1,6,1,5)
ToUp<-c(5,31,3,5,25,3,6,19,1)

FromDown<-c(1,2,8,1,22,2,1,2,6)
ToDown<-c(4,5,10,6,24,4,1,16,2)

ranges<-data.frame(FromUp,ToUp,FromDown,ToDown)

So that the result would look like:

FromUp ToUp FromDown ToDown   Overlap
      5    5        1      4    FALSE
     32   31        2      5    FALSE
      1    3        8     10    FALSE
      5    5        1      6    TRUE
     15   25       22     24    TRUE
      1    3        2      4    TRUE
      6    6        1      1    FALSE
      1   19        2     16    TRUE
      5    1        6      2    TRUE

I tried a few things but could not get it to work. In particular, the fact that the intervals are not "sorted" makes it too difficult for my R skills to figure out a solution. I thought about finding the min and max values of each pair of columns (e.g. FromUp, ToUp) and then comparing them?

Any help would be appreciated.

Kitumijasi asked Oct 27 '25 06:10


2 Answers

Sort the endpoints of each range

rng = cbind(pmin(ranges[,1], ranges[,2]), pmax(ranges[,1], ranges[,2]),
            pmin(ranges[,3], ranges[,4]), pmax(ranges[,3], ranges[,4]))

and write the condition

olap = (rng[,1] <= rng[,4]) & (rng[,2] >= rng[,3])

In one step this might be

(pmin(ranges[,1], ranges[,2]) <= pmax(ranges[,3], ranges[,4])) &
    (pmax(ranges[,1], ranges[,2]) >= pmin(ranges[,3], ranges[,4]))
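As a minimal sketch of how this condition could be attached to the question's data, the following wraps the one-step test in a small helper and adds an Overlap column. The helper name ranges_overlap() is hypothetical (it is not from any package), and it refers to columns by name rather than by position, which is an assumption about how you might prefer to write it.

```r
# Example data from the question
ranges <- data.frame(
  FromUp   = c(5, 32, 1, 5, 15, 1, 6, 1, 5),
  ToUp     = c(5, 31, 3, 5, 25, 3, 6, 19, 1),
  FromDown = c(1, 2, 8, 1, 22, 2, 1, 2, 6),
  ToDown   = c(4, 5, 10, 6, 24, 4, 1, 16, 2)
)

# ranges_overlap() is a hypothetical helper name, not part of any package.
# Endpoints may arrive in either order; pmin()/pmax() sort each pair element-wise.
ranges_overlap <- function(from1, to1, from2, to2) {
  lo1 <- pmin(from1, to1); hi1 <- pmax(from1, to1)
  lo2 <- pmin(from2, to2); hi2 <- pmax(from2, to2)
  # Closed intervals overlap when each one starts no later than the other ends
  (lo1 <= hi2) & (hi1 >= lo2)
}

ranges$Overlap <- with(ranges, ranges_overlap(FromUp, ToUp, FromDown, ToDown))
print(ranges)
```

On the question's data this should reproduce the Overlap column the asker listed.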

The foverlaps() function from data.table mentioned by others (or IRanges::findOverlaps()) would be appropriate if you were looking for overlaps between any pair of ranges, but you're looking for 'parallel' (within-row?) overlaps.

The logic of the solution here is the same as the answer of @Julius, but is 'vectorized' (e.g., 1 call to pmin(), rather than nrow(ranges) calls to sort()) and should be much faster (though using more memory) for longer vectors of possible ranges.
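To check the claim that both approaches implement the same logic, here is a rough sketch that runs the vectorized test and the row-wise apply() test on simulated data and compares the results. The simulated data frame sim, its size, and the value range are arbitrary assumptions chosen for illustration. Timings depend on the machine, so none are quoted.

```r
set.seed(1)
n <- 2000  # arbitrary size, chosen only for illustration

# Simulated, unsorted endpoints (hypothetical data, not from the question)
sim <- data.frame(
  FromUp   = sample(1:100, n, replace = TRUE),
  ToUp     = sample(1:100, n, replace = TRUE),
  FromDown = sample(1:100, n, replace = TRUE),
  ToDown   = sample(1:100, n, replace = TRUE)
)

# Vectorized: a fixed number of pmin()/pmax() calls, regardless of nrow(sim)
vectorized <- with(sim,
  (pmin(FromUp, ToUp) <= pmax(FromDown, ToDown)) &
    (pmax(FromUp, ToUp) >= pmin(FromDown, ToDown)))

# Row-wise: one function call (and two sort() calls) per row
row_wise <- apply(sim, 1, function(x) {
  y <- c(sort(x[1:2]), sort(x[3:4]))
  max(y[c(1, 3)]) <= min(y[c(2, 4)])
})

# Both approaches should flag exactly the same rows
same <- identical(unname(vectorized), unname(row_wise))
print(same)

# Optional: compare run times on your own machine
print(system.time(with(sim,
  (pmin(FromUp, ToUp) <= pmax(FromDown, ToDown)) &
    (pmax(FromUp, ToUp) >= pmin(FromDown, ToDown)))))
print(system.time(apply(sim, 1, function(x) {
  y <- c(sort(x[1:2]), sort(x[3:4]))
  max(y[c(1, 3)]) <= min(y[c(2, 4)])
})))
```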

Martin Morgan answered Oct 28 '25 19:10


In general:

apply(ranges, 1, function(x) {
  y <- c(sort(x[1:2]), sort(x[3:4]))
  max(y[c(1, 3)]) <= min(y[c(2, 4)])
})

or, in case intervals cannot overlap at just one point (e.g. because they are open):

!apply(ranges, 1, function(x) {
  y <- sort(x)[1:2]
  all(y == sort(x[1:2])) | all(y == sort(x[3:4]))
})
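For the open-interval case, a vectorized variant is also possible. The sketch below is an assumption on my part rather than something from either answer: it takes the pmin()/pmax() test and swaps the inclusive comparisons for strict ones, so intervals that only touch at an endpoint are not counted as overlapping. The name overlaps_open() is hypothetical.

```r
# Example data from the question
ranges <- data.frame(
  FromUp   = c(5, 32, 1, 5, 15, 1, 6, 1, 5),
  ToUp     = c(5, 31, 3, 5, 25, 3, 6, 19, 1),
  FromDown = c(1, 2, 8, 1, 22, 2, 1, 2, 6),
  ToDown   = c(4, 5, 10, 6, 24, 4, 1, 16, 2)
)

# overlaps_open() is a hypothetical helper name, not part of any package.
# Strict comparisons: sharing only an endpoint does not count as overlap.
overlaps_open <- function(from1, to1, from2, to2) {
  (pmin(from1, to1) < pmax(from2, to2)) &
    (pmax(from1, to1) > pmin(from2, to2))
}

open_overlap <- with(ranges, overlaps_open(FromUp, ToUp, FromDown, ToDown))
print(open_overlap)

# Intervals that merely touch at 3 are not reported as overlapping
print(overlaps_open(1, 3, 3, 5))
```

On the question's data this should agree with the row-wise apply() version for open intervals shown above.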
Julius Vainora answered Oct 28 '25 19:10