Add observation number by group in R [duplicate]

This is a silly question, but I am new to R and it would make my life so much easier if I could figure out how to do this! Here is some sample data:

data <- read.table(text = "Category Y
 A 5.1
 A 3.14
 A 1.79
 A 3.21
 A 5.57
 B 3.68
 B 4.56
 B 3.32
 B 4.98
 B 5.82
 ",header = TRUE)

I want to add a column that numbers the observations within each group, restarting at 1 for every Category. Here is what I want it to look like:

Category    Y    OBS
A          5.1    1
A          3.14   2
A          1.79   3
A          3.21   4
A          5.57   5
B          3.68   1
B          4.56   2
B          3.32   3
B          4.98   4
B          5.82   5

I have tried:

data <- data %>% group_by(Category) %>% mutate(count = c(1:length(Category)))

which just creates another column numbered from 1 to 10, and

data <- data %>% group_by(Category) %>% add_tally()

which just creates another column of all 5s.

yaynikkiprograms asked Oct 29 '25 15:10



2 Answers

Base R:

data$OBS <- ave(seq_len(nrow(data)), data$Category, FUN = seq_along)
data
#    Category    Y OBS
# 1         A 5.10   1
# 2         A 3.14   2
# 3         A 1.79   3
# 4         A 3.21   4
# 5         A 5.57   5
# 6         B 3.68   1
# 7         B 4.56   2
# 8         B 3.32   3
# 9         B 4.98   4
# 10        B 5.82   5
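
The sample data happens to be sorted by Category, but ave does not rely on that. As a minimal sketch, using a small made-up frame (the values below are illustrative and not from the question) with interleaved groups, the numbering follows the row order within each group:

```r
# Hypothetical frame with interleaved groups (illustrative values only)
df <- data.frame(
  Category = c("B", "A", "B", "A", "A"),
  Y = c(1.5, 2.5, 3.5, 4.5, 5.5)
)

# Same call as above: number the rows within each Category
df$OBS <- ave(seq_len(nrow(df)), df$Category, FUN = seq_along)

df$OBS
# [1] 1 1 2 2 3
```

The rows stay in their original positions; only the counter restarts per group.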

BTW: any of the frame's columns can be used as the first argument, including ave(data$Category, data$Category, FUN = seq_along), but ave returns a vector of the same class as its first argument, so passing a character column returns strings:

ave(data$Category, data$Category, FUN = seq_along)
#  [1] "1" "2" "3" "4" "5" "1" "2" "3" "4" "5"

While not heinous, it should be an intentional choice. Since it appears that you wanted an integer in that column, I chose the simplest integer-in, integer-out approach. The first argument could also have been rep(1L, nrow(data)) or any other integer vector with one element per row of the frame, since seq_along (the function I chose) only uses the length of each group, not its values.
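
As a quick check of the class behaviour described above, here is a minimal sketch that rebuilds the question's sample data with data.frame() (an assumption made so the snippet is self-contained) and compares the two choices of first argument. The names obs_int and obs_chr are introduced here only for illustration:

```r
# Sample data from the question, rebuilt so the snippet runs on its own
data <- data.frame(
  Category = rep(c("A", "B"), each = 5),
  Y = c(5.1, 3.14, 1.79, 3.21, 5.57, 3.68, 4.56, 3.32, 4.98, 5.82),
  stringsAsFactors = FALSE
)

# Integer first argument: the result stays integer
obs_int <- ave(seq_len(nrow(data)), data$Category, FUN = seq_along)

# Character first argument: the result is coerced to character
obs_chr <- ave(data$Category, data$Category, FUN = seq_along)

class(obs_int)
# [1] "integer"
class(obs_chr)
# [1] "character"

# If the character version was used, it can be converted afterwards
identical(as.integer(obs_chr), obs_int)
# [1] TRUE
```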

r2evans answered Oct 31 '25 07:10



library(dplyr) 
data %>% group_by(Category) %>% mutate(Obs = row_number()) 

# A tibble: 10 x 3
# Groups:   Category [2]
   Category     Y   Obs
   <chr>    <dbl> <int>
 1 A         5.1      1
 2 A         3.14     2
 3 A         1.79     3
 4 A         3.21     4
 5 A         5.57     5
 6 B         3.68     1
 7 B         4.56     2
 8 B         3.32     3
 9 B         4.98     4
10 B         5.82     5
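
For reference, a self-contained sketch of the same idea, assuming dplyr is installed. Inside a grouped mutate() the expressions are evaluated once per group, so row_number() and seq_len(n()) both restart at 1 for each Category. The column name OBS_alt and the object name numbered are hypothetical, and mutate is written as dplyr::mutate only so that a same-named function from another attached package cannot be picked up by accident:

```r
library(dplyr)

# Sample data from the question, rebuilt so the snippet runs on its own
data <- data.frame(
  Category = rep(c("A", "B"), each = 5),
  Y = c(5.1, 3.14, 1.79, 3.21, 5.57, 3.68, 4.56, 3.32, 4.98, 5.82)
)

numbered <- data %>%
  group_by(Category) %>%
  dplyr::mutate(
    OBS = row_number(),      # position of the row within its group
    OBS_alt = seq_len(n())   # equivalent: 1..group size
  ) %>%
  ungroup()                  # drop the grouping so later steps are not grouped

identical(numbered$OBS, numbered$OBS_alt)
# [1] TRUE
```

Calling ungroup() at the end is optional, but it avoids surprises if the result is passed to further dplyr verbs.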

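OR, in base R (note that OBS comes back as a character column here, because ave returns the class of its first argument and Category is a character column):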

data$OBS <- ave(data$Category, data$Category, FUN = seq_along)

data
   Category    Y OBS
1         A 5.10   1
2         A 3.14   2
3         A 1.79   3
4         A 3.21   4
5         A 5.57   5
6         B 3.68   1
7         B 4.56   2
8         B 3.32   3
9         B 4.98   4
10        B 5.82   5
AnilGoyal answered Oct 31 '25 05:10
