
Using R to calculate total events per day from a data frame that contains all events and their timestamps

Tags: r

I have a data frame df that contains 'messages'. Each row is a message. Each message has a timestamp, df$message.date, stored as POSIXct and displayed in the format %Y-%m-%d %H:%M:%S. Example:

> head(df)
messageid   user.id    message.date         
123         999       2011-07-17 17:54:27
456         888       2011-07-19 16:56:50

(Here is the dput()'ed version of the above):

df <- structure(list(messageid = c(123L, 456L), user.id = c(999L, 888L), 
      message.date = structure(c(1310950467, 1311119810), class = c("POSIXct", 
      "POSIXt"), tzone = "")), .Names = c("messageid", "user.id", 
      "message.date"), row.names = c(NA, -2L), class = "data.frame")

How do I create a data frame with the total number of messages per day? Example:

day                   message.count 
2011-07-17             1
2011-07-18             0
2011-07-19             1

Dates with no messages should not be dropped: I want message.count set to zero for those days.

What I have done so far: I have extracted the calendar day part of message.date by doing:

df$calendar.day<-as.POSIXct(strptime(substr(df$message.date,1,10),"%Y-%m-%d",tz="CST6CDT"))
> head(df$calendar.day)
[1] "2011-07-17 CDT" "2011-07-18 CDT" "2011-07-19 CDT"

And from there I can generate a list of every single calendar date in the date range:

daterange <- seq(min(df$calendar.day), max(df$calendar.day), by="day")

amh asked Jan 21 '26 00:01


1 Answer

Here's a fairly straightforward solution that uses sapply() to count the number of messages on each date spanned by your log.

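countMessages <- function(timeStamps) {
    # Reduce each timestamp to its calendar date (strftime() uses the local time zone here)
    Dates <- as.Date(strftime(timeStamps, "%Y-%m-%d"))
    # Every date from the first message to the last, including days with no messages
    allDates <- seq(from = min(Dates), to = max(Dates), by = "day")
    # Count the messages falling on each date; empty days come out as 0
    message.count <- sapply(allDates, FUN = function(X) sum(Dates == X))
    data.frame(day = allDates, message.count = message.count)
}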

countMessages(df$message.date)
#          day message.count
# 1 2011-07-17             1
# 2 2011-07-18             0
# 3 2011-07-19             1
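
For longer logs, comparing every date against every message with sapply() can get slow. A possible alternative, sketched here under some assumptions, is to turn the dates into a factor whose levels are all the days in the range, so that table() reports zero counts for the empty days as well. The helper name countMessagesTable and its tz argument are made up for this illustration, and the sample data is rebuilt from the question's printed values with an explicit UTC time zone so the result does not depend on the session's locale.

```r
# Sample data rebuilt from the question's printed values. The explicit
# tz = "UTC" is an assumption made only to keep the example reproducible.
df <- data.frame(
    messageid = c(123L, 456L),
    user.id = c(999L, 888L),
    message.date = as.POSIXct(c("2011-07-17 17:54:27", "2011-07-19 16:56:50"),
                              tz = "UTC")
)

# Hypothetical helper (not part of the original answer): count messages
# per calendar day using table() instead of sapply().
countMessagesTable <- function(timeStamps, tz = "UTC") {
    # Calendar date of each timestamp, evaluated in the given time zone
    Dates <- as.Date(timeStamps, tz = tz)
    # Every date from the first message to the last
    allDates <- seq(from = min(Dates), to = max(Dates), by = "day")
    # Listing every date as a factor level makes table() include days with 0 messages
    counts <- table(factor(as.character(Dates), levels = as.character(allDates)))
    data.frame(day = allDates, message.count = as.integer(counts))
}

result <- countMessagesTable(df$message.date)
print(result)
```

With the two sample rows this should give the same three-day table as above. Note that the tz argument decides which calendar day a late-evening message belongs to, so it should be set to whatever time zone the log is meant to be read in.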
Josh O'Brien answered Jan 23 '26 20:01


