
How to create a custom filter function in R

Tags:

r

I need to filter coordinate data so that I keep only the points that fall either inside or outside a predefined area. I was hoping to write a custom function that would speed up that process, something that could be inserted inside a pipe like this:

df %>% 
  filter(group == "A",
         outside_area(x_coord,y_coord))

I don't know if that's technically legal, but the idea is to be able to call it somewhere in a dplyr pipe.

Here's the context to make things a little more clear.

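# data
library(dplyr)    # filter(), %>%, and the re-exported tibble()
library(ggplot2)  # plotting
set.seed(123)
groups <- c("A","B","C")  # named `groups` so base::list is not masked
df <- tibble(group = sample(groups, 500, replace=TRUE),
             x = runif(500,0,105),
             y = runif(500,0,68))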

# plot all the data points
df %>% ggplot(aes(x=x,y=y)) +
  geom_point()

# plot outside an area -- works
df %>% 
  filter(group == "A",
         x <= 88.5 | (x >= 88.5 & y >= 43.2) | (x >= 88.5 & y <= 24.8)) %>% 
  ggplot(aes (x=x, y=y)) +
  geom_point() +
  xlim(0,105) +
  ylim(0,69)

So the function would incorporate this condition:

x <= 88.5 | (x >= 88.5 & y >= 43.2) | (x >= 88.5 & y <= 24.8)

Thanks for your help

asked Dec 10 '25 11:12 by seansteele


1 Answer

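We could create a function that takes the data and the two column names as strings, and pass the piped data frame to it with the dot placeholder: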

outside_area <- function(dat, col1, col2) {
  dat[[col1]] <= 88.5 |
    (dat[[col1]] >= 88.5 & dat[[col2]] >= 43.2) |
    (dat[[col1]] >= 88.5 & dat[[col2]] <= 24.8)
}

df %>% 
    filter(group == "A", outside_area(., 'x', 'y'))

Output:

# A tibble: 164 x 3
#   group      x      y
#   <chr>  <dbl>  <dbl>
# 1 A      74.8  16.4  
# 2 A      98.2  47.0  
# 3 A      18.2  66.1  
# 4 A       9.06 44.1  
# 5 A      29.7  62.3  
# 6 A      44.1  14.7  
# 7 A      61.7  37.3  
# 8 A      77.0   0.169
# 9 A     100.   54.4  
#10 A      17.9  53.6  
# … with 154 more rows
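
If you would rather call the function with bare column names, as in the pipe sketched in the question, one option is to have it take the coordinate vectors themselves instead of the data frame. The following is only a minimal sketch of that idea, not the function above: the argument names x_min, y_low and y_high are hypothetical, their defaults are the thresholds from the question, and the condition is the question's expression with the repeated x >= 88.5 checks folded away. The small pts data frame is made up purely for illustration.

```r
# Hypothetical variant: works on coordinate vectors, so it can be called
# with bare column names inside dplyr::filter().
# x_min, y_low and y_high are illustrative argument names; the defaults
# are the thresholds used in the question.
outside_area <- function(x, y, x_min = 88.5, y_low = 24.8, y_high = 43.2) {
  # A point is outside the area when it is at or left of x_min,
  # at or below y_low, or at or above y_high.
  x <= x_min | y <= y_low | y >= y_high
}

# Made-up points for illustration: only the second one lies inside the area.
pts <- data.frame(x = c(50, 95, 95, 95, 88.5),
                  y = c(30, 30, 50, 10, 30))

# Base R subsetting with the logical vector returned by the function.
outside_pts <- pts[outside_area(pts$x, pts$y), ]
print(outside_pts)
```

Because the function returns one logical value per row, the same call should slot into the pipe as filter(group == "A", outside_area(x, y)), and negating it with ! would keep the points inside the area instead.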
answered Dec 12 '25 03:12 by akrun


