I'm using statsd (the latest version from the git master branch) with graphite (0.9.10) as the backend.
In my (Django) code I call statsd.incr("signups") when a user signs up.  In graphite's web interface, I now see a beautiful graph showing the number of signups per second under Graphite/stats/signups.  When I look at the graph under Graphite/stats_counts/signups, I expect to see the total number of signups, but it looks like it's the number of signups per 10s interval (that's statsd's refresh interval, I guess).
I did configure storage-aggregation.conf, but perhaps I got it wrong somehow?  I also stopped carbon (I killed the process rather than using stop, since apparently just stopping it doesn't make it reload the configuration), deleted the /opt/graphite/storage/whisper/stats_counts directory, and then restarted the carbon daemon.  I still get the number of signups per 10s interval.  :-(
Here's my configuration:
# /opt/graphite/conf/storage-aggregation.conf
[lower]
pattern = \.lower$
xFilesFactor = 0.1
aggregationMethod = min
[upper]
pattern = \.upper$
xFilesFactor = 0.1
aggregationMethod = max
[upper_90]
pattern = \.upper_90$
xFilesFactor = 0.1
aggregationMethod = max
[count]
pattern = \.count$
xFilesFactor = 0
aggregationMethod = sum
[count_ps]
pattern = \.count_ps$
xFilesFactor = 0
aggregationMethod = sum
[sum]
pattern = \.sum$
xFilesFactor = 0
aggregationMethod = sum
[sum_90]
pattern = \.sum_90$
xFilesFactor = 0
aggregationMethod = sum
[stats_counts]
pattern = ^stats_counts\.
xFilesFactor = 0
aggregationMethod = sum
[min]
pattern = \.min$
xFilesFactor = 0.1
aggregationMethod = min
[max]
pattern = \.max$
xFilesFactor = 0.1
aggregationMethod = max
[default_average]
pattern = .*
xFilesFactor = 0.5
aggregationMethod = average
And this:
# /opt/graphite/conf/storage-schemas.conf
[stats]
priority = 110
pattern = ^stats.*
retentions = 10s:6h,1m:7d,10m:1y
I'm starting to think that I did everything right and that Graphite is really doing what it's supposed to be doing. So the question is:
What's the proper way to configure statsd & graphite to draw the total number of signups since the beginning of time?
I guess I could change my Django code to count the total number of users, once in a while, and then use a gauge instead of an incr, but it feels like graphite should be able to just sum up whatever it receives, on the fly, not just when it aggregates data.
Edit:
Using Graphite's web interface, in the Graphite composer, I applied the integral function to the basic "signups per second" graph (in Graphite/stats/signups), and I got the desired graph (i.e. the total number of signups).  Is this the appropriate way to get a cumulative graph?  It's annoying because I need to select the full date range since the beginning of time. I cannot zoom into the graph, or else I just get the integral of the zoomed part.  :-(
Yes, the integral() function is the correct way to do this. StatsD is stateless in this regard: all collected data is reset after each flush to Graphite, so StatsD has no way to sum up everything it has received since a given point in time.
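To illustrate why the values arrive per flush interval, here is a minimal Python sketch of a StatsD-like counter. It is not StatsD's actual code: the class name FlushingCounter and the 10-second interval are assumptions made for illustration. Each flush reports the count accumulated since the previous flush (what ends up under stats_counts) and the corresponding per-second rate (what ends up under stats), then resets to zero.

```python
class FlushingCounter:
    """Toy model of a StatsD-style counter (hypothetical, for illustration only)."""

    def __init__(self, flush_interval_s=10):
        self.flush_interval_s = flush_interval_s
        self.count = 0

    def incr(self, n=1):
        self.count += n

    def flush(self):
        """Return (count, per-second rate) for this interval, then reset."""
        count = self.count
        rate = count / self.flush_interval_s
        self.count = 0  # state is discarded, so no running total survives
        return count, rate


counter = FlushingCounter()
for _ in range(3):
    counter.incr()
first = counter.flush()   # 3 signups in the first interval
counter.incr()
second = counter.flush()  # 1 signup in the second interval
print(first, second)
```

The second flush knows nothing about the first three signups, so any running total has to be computed on the Graphite side.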
From the Graphite documentation of the integral() function:
This will show the sum over time, sort of like a continuous addition function. Useful for finding totals or trends in metrics that are collected per minute.
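As a rough sketch of what integral() computes, the following Python mimics a running sum over a series of per-interval counts. The helper name running_total and the sample values are made up for illustration, and the handling of gaps (None stays None and does not reset the total) reflects my understanding of Graphite's behavior rather than a guarantee.

```python
def running_total(values):
    """Cumulative sum over a series; None (missing) points stay None."""
    total = 0.0
    result = []
    for value in values:
        if value is None:
            result.append(None)
        else:
            total += value
            result.append(total)
    return result


# Hypothetical signups per 10s flush interval, as stored under stats_counts.
signups_per_interval = [3, 1, None, 0, 2]
totals = running_total(signups_per_interval)
print(totals)
```

Because the sum starts from zero at the left edge of the requested time window, zooming in restarts the total, which matches the behavior described in the question. Also note that if integral() is a plain running sum like this, applying it to the per-interval counts (stats_counts) yields the total directly, whereas applying it to the per-second rates (stats) would yield the total divided by the flush interval.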