Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Spring Boot Metrics.counter("x").count() always return 0 for Influxdb

Having a spring-boot (2.7.10) application include:

    <dependency>
        <groupId>io.micrometer</groupId>
        <artifactId>micrometer-core</artifactId>
    </dependency>
    <dependency>
        <groupId>io.micrometer</groupId>
        <artifactId>micrometer-registry-influx</artifactId>
    </dependency>

Connecting successfully to InfluxDB.

Now, executing

Metrics.counter("registration.upsigned").increment();
LOG.info("Size of registration-counter now: " + Metrics.counter("registration.upsigned").count());

Create successfully a tick in grafana

enter image description here

But the logging-result remains 0:

19-Jun-2023 07:25:15.282 INFO [https-jsse-nio2-14xx26-443-exec-123] xxx.register Size of registration-counter now: 0.0

How to keep increment the counter and surviving reboots of the application?

like image 780
Grim Avatar asked Sep 18 '25 21:09

Grim


1 Answers

The behavior you described could be explained by the following.

As indicated in my original answer InfluxMeterRegistry under the hood uses StepCounters to implement counters.

To obtain the actual count StepCounter uses the following code:

@Override
public double count() {
    return value.poll();
}

where value turns out to be a convenient implementation for handling double values of the StepValue class.

Please, be aware that when describing poll, they indicate:

/**
 * @return The value for the last completed interval.
 */
public V poll() {
    rollCount(clock.wallTime());
    return previous;
}

i.e. it is reporting the value for the last completed interval, not the actual one.

Consider review this test, reproduced here for completeness, I think it describes the programmed behavior in a very clear way:

@Test
void poll() {
    final AtomicLong aLong = new AtomicLong(42);
    final long stepTime = 60;


    final StepValue<Long> stepValue = new StepValue<Long>(clock, stepTime) {
        @Override
        public Supplier<Long> valueSupplier() {
            return () -> aLong.getAndSet(0);
        }


        @Override
        public Long noValue() {
            return 0L;
        }
    };


    assertThat(stepValue.poll()).isEqualTo(0L);


    clock.add(Duration.ofMillis(1));
    assertThat(stepValue.poll()).isEqualTo(0L);


    clock.add(Duration.ofMillis(59));
    assertThat(stepValue.poll()).isEqualTo(42L);


    clock.add(Duration.ofMillis(60));
    assertThat(stepValue.poll()).isEqualTo(0L);


    clock.add(Duration.ofMillis(60));
    assertThat(stepValue.poll()).isEqualTo(0L);


    aLong.set(24);
    assertThat(stepValue.poll()).isEqualTo(0L);


    clock.add(Duration.ofMillis(60));
    assertThat(stepValue.poll()).isEqualTo(24L);


    clock.add(Duration.ofMillis(stepTime / 2));
    aLong.set(27);
    assertThat(stepValue.poll()).isEqualTo(24L);


    stepValue._closingRollover();
    assertThat(stepValue.poll()).isEqualTo(27L);
}

Note that for this purpose increment is doing something "equivalent" to the different aLong.set presented in the previous test code.

I think this other test could be of help as well for describing how the value reported is the one for the last completed interval, not the actual one:

@Test
void count() {
    Counter counter = registry.counter("my.counter");


    assertThat(counter).isInstanceOf(StepCounter.class);
    assertThat(counter.count()).isEqualTo(0);
    counter.increment();
    clock.add(config.step());
    assertThat(counter.count()).isEqualTo(1);
    counter.increment();
    counter.increment(5);
    clock.add(config.step());
    assertThat(counter.count()).isEqualTo(6);
    clock.add(config.step());
    assertThat(counter.count()).isEqualTo(0);
}

In conclusion, what I tried meaning is that the log in your code fragment:

Metrics.counter("registration.upsigned").increment();
LOG.info("Size of registration-counter now: " + Metrics.counter("registration.upsigned").count());

will always print out the value for the last completed interval, not for the current one, and it may be or not the value you expect depending on whether or not the interval in which you performed the increment elapsed.

That is not specific for InfluxDB but for any Metrics registry implementation based on the step concept.

In any case, the most valuable information you can obtain is the one that is actually published to your backend, InfluxDB or the one you defined - which causes among other things the step related values to be polled.

I don't know if it will shed some additional light or not, but you can read about this rate data aggregation step based mechanism in the Micrometer documentation when they describe client side rate aggregation - please, note that InfluxBD is one of this client-side rate aggregation monitoring system. From the docs:

Micrometer efficiently maintains rate data by means of a step value that accumulates data for the current publishing interval. When the step value is polled (when publishing, for example), if the step value detects that the current interval has elapsed, it moves current data to “previous” state. This previous state is what is reported until the next time current data overwrites it. The following image shows the interaction of current and previous state, along with polling:

Behavior of a step value

* From my understanding the arrows signal publish events.

As indicated, your log (which, as you can see in the source code presented above, polls when performing count) will always report the previous value presented in the image which may or not be the same you expect according to your increments invocations and step intervals (your log may behave as the first pollAsRate presented in the image, in which it will print out 0 despite the fact two increments were performed).

If you prefer, here is a simplified version of the process, more focused in your example:

uml for process

Please, take in mind that a step based counter:

  • Always maintains two values under the hood, the current one, internal, not visible outside the class, and a previous one, visible, the one that is taking into account when publishing data.
  • Every step is configured with a time interval for rate aggregation.
  • The previous value is updated with the current value when any of two things happen:
    • A poll is requested, either by invoking count manually or usually when publishing information to your metrics registry, and the step interval is elapsed.
    • Micrometer shutdowns and claims for a closing rollover, i.e., send the last acquired data to your metrics repository.

To illustrate it, please, consider the following test case, I think it could be of help:

import java.time.Duration;

import org.junit.jupiter.api.Test;

import io.micrometer.core.instrument.Clock;
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Tags;
import io.micrometer.core.instrument.simple.CountingMode;
import io.micrometer.core.instrument.simple.SimpleConfig;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

public class MicrometerTest {

  @Test
  public void testStepCounterBehavior() {
    Clock clock = Clock.SYSTEM;
    MeterRegistry registry = new SimpleMeterRegistry(new SimpleConfig() {
      @Override
      public String get(String key) {
        return null;
      }

      @Override
      public CountingMode mode() {
        return CountingMode.STEP;
      }

      @Override
      public Duration step() {
        return Duration.ofMillis(1000);
      }
    }, clock);

    double value = -1, previousValue = -1;
    Tags tags = Tags.empty();
    Counter counter = registry.counter("test.counter", tags);
    long start = clock.wallTime();
    counter.increment();

    int i = 0;
    while (clock.wallTime() < start + 5000) {
      value = counter.count();
      if (value != previousValue) {
        System.out.printf("Current value: %.0f. Elapsed (millis): %d.\n", value, clock.wallTime() - start);
        previousValue = value;
        if (i++ == 2) {
          counter.increment(3);
        }
      }
    }
  }
}

It will output, with some random bias, something similar to:

Current value: 0. Elapsed (millis): 0.
Current value: 1. Elapsed (millis): 601.
Current value: 0. Elapsed (millis): 1600.
Current value: 3. Elapsed (millis): 2600.
Current value: 0. Elapsed (millis): 3600.

Note that except for the first spurious output which reports a time less than the configured 1 second step interval, I think motivated by the different initialization stuff, the values and time printed are consistent with the explanation provided.

like image 162
jccampanero Avatar answered Sep 20 '25 10:09

jccampanero