Remove row below conditionally in dataframe and add values together in R

Question

I have a large dataset with 3 columns: Name, Country, and Sales.

I'd like to sum the Sales column over runs of consecutive rows that share the same Name. Then I'd like to keep only the first row of each run, replacing its Sales value with the sum for that run.

For example:

Name,Country,Sales
A,V,100
A,W,100
B,X,100
B,Y,100
A,Z,100

Would be reduced to:

Name,Country,Sales
A,V,200
B,X,200
A,Z,100

Anyone got any idea how to do this?

CPak · Accepted Answer

Sample data (a three-row variant of yours)

df <- structure(list(Name = c("A", "A", "B"), Country = c("X", "Y", 
"Z"), Sales = c(100L, 100L, 100L)), .Names = c("Name", "Country", 
"Sales"), row.names = c(NA, -3L), class = c("data.table", "data.frame"
))

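dplyr solution (rleid() comes from data.table)

library(dplyr)
library(data.table)  # provides rleid()
ans <- df %>%
         group_by(rleid(Name)) %>%    # one group per run of consecutive identical Names
         summarise(Name = unique(Name), Sales=sum(Sales)) %>%
         select(-1)                   # drop the run-id column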

Output

   Name Sales
  <chr> <int>
1     A   200
2     B   100
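
The summary above drops Country, whereas the desired output in the question keeps the Country of the first row in each run. Here is a minimal sketch of one way to keep it, assuming the question's five-row example as input. The names sales, run and result are made up for illustration and are not part of the original answer.

```r
library(dplyr)
library(data.table)  # only needed for rleid()

# The five-row example from the question
sales <- data.frame(
  Name    = c("A", "A", "B", "B", "A"),
  Country = c("V", "W", "X", "Y", "Z"),
  Sales   = c(100, 100, 100, 100, 100),
  stringsAsFactors = FALSE
)

result <- sales %>%
  # rleid(Name) is 1 1 2 2 3: one id per block of consecutive equal Names
  group_by(run = rleid(Name)) %>%
  summarise(
    Name    = Name[1],     # every row in a run has the same Name
    Country = Country[1],  # Country of the first row of the run
    Sales   = sum(Sales)   # total for the run
  ) %>%
  select(-run)             # drop the helper run-id column

print(result)
```

On this input the result should have three rows: A/V/200, B/X/200 and A/Z/100. Indexing with [1] rather than first() sidesteps the fact that both dplyr and data.table export a function called first.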

Alternative example

newdf <- rbind(df, data.frame(Name=c("A","A","B","B"),
                              Country=c("A","B","C","D"),
                              Sales=c(100,100,100,100)))
ans <- newdf %>%
         group_by(rleid(Name)) %>%
         summarise(Name = unique(Name), Sales=sum(Sales)) %>%
         select(-1)

Output

    Name Sales
  <fctr> <dbl>
1      A   200
2      B   100
3      A   200
4      B   200
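
If you would rather not load any packages, the same run-based grouping can probably be done in base R with rle(). This is a sketch under the assumption that the input looks like the question's example; sales, run_id and result are illustrative names only.

```r
# The five-row example from the question
sales <- data.frame(
  Name    = c("A", "A", "B", "B", "A"),
  Country = c("V", "W", "X", "Y", "Z"),
  Sales   = c(100, 100, 100, 100, 100),
  stringsAsFactors = FALSE
)

# rle() finds runs of identical consecutive values
runs <- rle(sales$Name)

# Expand the run lengths into one run id per row: 1 1 2 2 3
run_id <- rep(seq_along(runs$lengths), runs$lengths)

# Keep the first row of each run ...
result <- sales[!duplicated(run_id), ]

# ... and overwrite Sales with the per-run totals (returned in run order)
result$Sales <- as.vector(tapply(sales$Sales, run_id, sum))

rownames(result) <- NULL
print(result)
```

Here the first row of each run is kept whole, so Country survives without any extra work.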

Tags:

r

no nein
