I have a data frame in the following form (it's too big to post here in its entirety):
      listing_id    date    city    type    host_id availability
1   703451  25/03/2013  amsterdam   Entire home/apt 3542621 245
2   703451  20/04/2013  amsterdam   Entire home/apt 3542621 245
3   703451  28/05/2013  amsterdam   Entire home/apt 3542621 245
4   703451  15/07/2013  amsterdam   Entire home/apt 3542621 245
5   703451  30/07/2013  amsterdam   Entire home/apt 3542621 245
6   703451  19/08/2013  amsterdam   Entire home/apt 3542621 245
and so on...
I would like three new data frames: one counting the number of observations per year (2013, 2012, 2011 and so on), another per month (07/2013, 06/2013 and so on), and another per day (28/05/2013, 29/05/2013 and so on). I just want to count how many occurrences there are per unit of time.
How would I do that?
dplyr's count() lets you quickly count the unique values of one or more variables: df %>% count(a, b) is roughly equivalent to df %>% group_by(a, b) %>% summarise(n = n()).
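As a rough sketch of how count() could be applied to this question, assuming the dplyr package is installed and using only the six sample dates from the question (the names per_year, per_month and per_day are made up for illustration):

```r
library(dplyr)

# Sample dates copied from the question, parsed as day/month/year.
df <- data.frame(
  date = as.Date(c("25/03/2013", "20/04/2013", "28/05/2013",
                   "15/07/2013", "30/07/2013", "19/08/2013"),
                 format = "%d/%m/%Y")
)

# count() accepts named expressions, so the grouping key can be built inline.
per_year  <- df %>% count(year = format(date, "%Y"))
per_month <- df %>% count(month = format(date, "%Y-%m"))
per_day   <- df %>% count(date)
```

Each result is a data frame with the grouping column plus a column n holding the number of rows in that group.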
The function str() shows you the structure of your data set. For a data frame it tells you: the total number of observations (e.g. 32 car types), the total number of variables (e.g. 11 car features), and a full list of the variable names (e.g. mpg, cyl, ...).
Using data.table, this is pretty straightforward: 
library(data.table)
dt <- fread("listing_id    date    city    type    host_id availability
703451  25/03/2013  amsterdam   Entire_home/apt 3542621 245
703451  20/04/2013  amsterdam   Entire_home/apt 3542621 245
703451  28/05/2013  amsterdam   Entire_home/apt 3542621 245
703451  15/07/2013  amsterdam   Entire_home/apt 3542621 245
703451  30/07/2013  amsterdam   Entire_home/apt 3542621 245
703451  19/08/2013  amsterdam   Entire_home/apt 3542621 245")
# parse the day/month/year strings into Date objects, by reference
dt[, date := as.Date(date, format = "%d/%m/%Y")]
dt[, .N, by=year(date)] 
#    year N
# 1: 2013 6
dt[, .N, by=.(year(date), month(date))] 
#    year month N
# 1: 2013     3 1
# 2: 2013     4 1
# 3: 2013     5 1
# 4: 2013     7 2
# 5: 2013     8 1
dt[, .N, by=date] # or: dt[, .N, by=.(year(date), month(date), mday(date))]
#          date N
# 1: 2013-03-25 1
# 2: 2013-04-20 1
# 3: 2013-05-28 1
# 4: 2013-07-15 1
# 5: 2013-07-30 1
# 6: 2013-08-19 1
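If you would rather avoid extra packages, the same counts can be sketched in base R with format() and table(). This is only one possible approach, not part of the data.table answer above; count_by is a hypothetical helper name, and the sample data is limited to the six rows shown in the question.

```r
# Sample rows copied from the question; only the date column matters here.
df <- data.frame(
  listing_id = 703451,
  date = c("25/03/2013", "20/04/2013", "28/05/2013",
           "15/07/2013", "30/07/2013", "19/08/2013"),
  stringsAsFactors = FALSE
)
df$date <- as.Date(df$date, format = "%d/%m/%Y")

# Hypothetical helper: format each date as a period key, then tabulate the keys.
count_by <- function(dates, fmt) {
  counts <- table(format(dates, fmt))
  data.frame(period = names(counts), n = as.integer(counts),
             stringsAsFactors = FALSE)
}

per_year  <- count_by(df$date, "%Y")        # one row per year
per_month <- count_by(df$date, "%Y-%m")     # one row per year-month
per_day   <- count_by(df$date, "%Y-%m-%d")  # one row per calendar day
```

The keys are written year-first (for example 2013-07 rather than 07/2013) because table() sorts them as text, and year-first keys then come out in chronological order. Swap the format string for "%m/%Y" if the exact labels from the question matter more than the ordering.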