
cut() creates NA break category in R

Tags: r

I am processing my data in R, and as part of that I bin two columns of my data frame with cut():

data <- database %>%
  group_by(cat_a, cat_b) %>% 
  mutate(
    lengths = cut(length, breaks = seq(0, (max(length)+50), by = 50)),
    heights = cut(height, breaks = seq(0, (max(height)+1), by = 1), dig.lab=5)
  )

When I then check the values produced by cut(), I get:

unique(data$heights)
 [1] (0,1]   (1,2]   (2,3]   (3,4]   (4,5]   (5,6]   (6,7]   (7,8]   (8,9]   (9,10]  (10,11] (11,12] (12,13] <NA>   
Levels: (0,1] (1,2] (2,3] (3,4] (4,5] (5,6] (6,7] (7,8] (8,9] (9,10] (10,11] (11,12] (12,13]

For context, max(height) returns 13.3, yet the last level listed is (12,13]. I suspect this is why <NA> appears at the end of the first line of the output.

So I tried to fix this by extending the breaks in cut() by 1 (see max(height)+1). But not only do I not get a new category, I also still have the NA.

I should add that omitting the NAs is not a solution, since I believe they are the values that did not fall into any category, i.e. values like 13.3.

So my question is: how do I fix this? How can I tell cut() to create that one extra category? I know there is include.lowest = TRUE, so I am looking for the opposite: how do I include the highest value? Maybe my observation is wrong, so I welcome any ideas.

UPDATE
As suggested in the comments:
heights = cut(height, breaks = c(-Inf,seq(0, (max(height)+1), by = 1), Inf), dig.lab=5)

CroatiaHR asked Nov 29 '25 06:11



1 Answer

We can add -Inf and Inf to the breaks so that every non-missing value falls into some interval, which removes the NA:

cut(..., breaks = c(-Inf, seq(0, max(height), by = 1), Inf))
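
A minimal sketch of why padding the breaks helps, using made-up sample values rather than the asker's data. It assumes the NA comes from a value that sits exactly on the lowest break: cut() builds intervals of the form (a, b] by default, so 0 is not part of (0,1]. The vector `height` and the names `plain`, `closed` and `padded` are hypothetical. Note that a genuinely missing input stays NA whichever breaks are used.

```r
# Hypothetical sample data (not from the question): 0 sits exactly on the
# lowest break, 13.3 is the maximum, and NA is a genuinely missing value.
height <- c(0, 0.5, 7.2, 13.3, NA)

# Breaks built as in the question: 0, 1, ..., 14
# (seq() stops at 14 because the upper limit 14.3 is never reached).
breaks <- seq(0, max(height, na.rm = TRUE) + 1, by = 1)

# Default intervals are (a, b], so 0 is left without a bin and becomes NA.
plain <- cut(height, breaks = breaks)
height[is.na(plain)]   # the inputs that got no bin: 0 and NA

# Option 1: close the first interval on the left.
closed <- cut(height, breaks = breaks, include.lowest = TRUE)

# Option 2: pad the breaks with -Inf and Inf, as suggested above.
padded <- cut(height, breaks = c(-Inf, breaks, Inf))

# In both cases only the genuinely missing input is still NA.
sum(is.na(closed))
sum(is.na(padded))
```

To see which rows are affected in the real data, something like `data$height[is.na(data$heights)]` lists the inputs that were left without a category.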
akrun answered Dec 01 '25 21:12



