I'm trying to determine the time that it takes for a machine to receive a packet, process it and give back an answer.
This machine, which I'll call the 'server', runs a very simple program that receives a packet (recv(2)) into a buffer, copies the received content (memcpy(3)) to another buffer, and sends the packet back (send(2)). The server runs NetBSD 5.1.2.
My client measures the round-trip time a number of times (pkt_count):
struct timespec start, end;
for (i = 0; i < pkt_count; ++i)
{
    printf("%d ", i+1);
    /* Time one send/recv round trip. */
    clock_gettime(CLOCK_MONOTONIC, &start);
    send(sock, send_buf, pkt_size, 0);
    recv(sock, recv_buf, pkt_size, 0);
    clock_gettime(CLOCK_MONOTONIC, &end);
    /* Optional pause between round trips (100 µs), disabled here: */
    //struct timespec nsleep = {.tv_sec = 0, .tv_nsec = 100000};
    //nanosleep(&nsleep, NULL);
    printf("%.3f ", timespec_diff_usec(&end, &start));
}
I removed error checks and other minor things for clarity. The client runs on 64-bit Ubuntu 12.04. Both programs run with real-time priority, although only the Ubuntu kernel is real time (-rt). The connection between the programs is TCP. This works fine and gives me an average of 750 microseconds.
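The timespec_diff_usec helper used in the loop is not shown in the question. A minimal sketch, assuming it returns the first argument minus the second in microseconds as a double (which the "%.3f" format suggests), might look like this:

```c
#include <time.h>

/* Hypothetical reconstruction of the helper called in the client loop.
 * Assumption: it returns (a - b) in microseconds as a double. */
static double timespec_diff_usec(const struct timespec *a,
                                 const struct timespec *b)
{
    /* Subtract seconds and nanoseconds separately; the nanosecond
     * difference may be negative, which the sum handles correctly. */
    double sec  = (double)(a->tv_sec - b->tv_sec);
    double nsec = (double)(a->tv_nsec - b->tv_nsec);
    return sec * 1e6 + nsec / 1e3;
}
```

Subtracting the two fields before converting to double keeps the intermediate values small, so the result does not lose precision even when CLOCK_MONOTONIC has been running for a long time.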
However, if I enable the commented-out nanosleep call (with a sleep of 100 µs), my measurements drop by 100 µs, giving an average of 650 µs. If I sleep for 200 µs, the measurements drop to 550 µs, and so on. This continues up to a sleep of 600 µs, which gives an average of 150 µs. Then, if I raise the sleep to 700 µs, my measurements jump to 800 µs on average. I confirmed my program's measurements with Wireshark.
I can't figure out what is happening. I already set the TCP_NODELAY socket option in both client and server, and it made no difference. I also tried UDP and saw the same behavior. So I guess this behavior is not due to the Nagle algorithm. What could it be?
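For reference, setting TCP_NODELAY is typically done with setsockopt(2). The sketch below is an illustration only; the function name disable_nagle is hypothetical and is not taken from the original programs:

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: disable Nagle's algorithm on a TCP socket.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
static int disable_nagle(int sock)
{
    int one = 1;
    return setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
}
```

The option has to be set on both ends, since each side applies Nagle's algorithm to its own outgoing segments.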
[UPDATE]
Here's a screenshot of the output of the client together with Wireshark. I also ran my server on another machine. I used the same OS with the same configuration (as it is a Live System on a pen drive), but the hardware is different. This behaviour didn't show up; everything worked as expected. But the question remains: why does it happen on the previous hardware?

[UPDATE 2: More info]
As I said before, I tested my pair of programs (client/server) on two different server computers. I plotted the two results obtained.

The first server (the weird one) is an RTD Single Board Computer with a 1 Gbps Ethernet interface. The second server (the normal one) is a Diamond Single Board Computer with a 100 Mbps Ethernet interface. Both of them run the SAME OS (NetBSD 5.1.2) from the SAME pen drive.
From these results, I do believe that this behaviour is due either to the driver or to the NIC itself, although I still can't imagine why it happens...
OK, I reached a conclusion.
I tried my program using Linux, instead of NetBSD, on the server. It ran as expected, i.e., no matter how long I [nano]sleep at that point of the code, the result is the same.
This tells me that the problem might lie in NetBSD's interface driver. To identify the driver, I read the dmesg output. This is the relevant part:
wm0 at pci0 dev 25 function 0: 82801I mobile (AMT) LAN Controller, rev. 3
wm0: interrupting at ioapic0 pin 20
wm0: PCI-Express bus
wm0: FLASH
wm0: Ethernet address [OMITTED]
ukphy0 at wm0 phy 2: Generic IEEE 802.3u media interface
ukphy0: OUI 0x000ac2, model 0x000b, rev. 1
ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
So, as you can see, my interface is called wm0. According to this (page 9) I should check which driver is loaded by consulting the file sys/dev/pci/files.pci, line 625 (here). It shows:
# Intel i8254x Gigabit Ethernet
device  wm: ether, ifnet, arp, mii, mii_bitbang
attach  wm at pci
file    dev/pci/if_wm.c         wm
Then, searching through the driver source code (dev/pci/if_wm.c, here), I found a snippet of code that might change the driver behavior:
/*
 * For N interrupts/sec, set this value to:
 * 1000000000 / (N * 256).  Note that we set the
 * absolute and packet timer values to this value
 * divided by 4 to get "simple timer" behavior.
 */
sc->sc_itr = 1500;              /* 2604 ints/sec */
CSR_WRITE(sc, WMREG_ITR, sc->sc_itr);
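The arithmetic in that comment can be checked with a small sketch. The function names below are hypothetical and not part of the driver; the only input taken from the source is the formula 1000000000 / (N * 256), and the reading of the result as a minimum spacing of itr * 256 ns between interrupts is an assumption that follows from it:

```c
#include <stdint.h>

/* Register value for a target of ints_per_sec interrupts per second,
 * following the formula in the driver comment. ints_per_sec must be > 0. */
static uint32_t itr_from_rate(uint32_t ints_per_sec)
{
    return (uint32_t)(1000000000ULL / ((uint64_t)ints_per_sec * 256));
}

/* Inverse of the formula: interrupts per second implied by a register
 * value. itr must be > 0. */
static double rate_from_itr(uint32_t itr)
{
    return 1e9 / ((double)itr * 256.0);
}

/* Assumed minimum spacing between interrupts, in microseconds
 * (itr * 256 ns). */
static double interval_usec_from_itr(uint32_t itr)
{
    return (double)itr * 256.0 / 1000.0;
}
```

With the default value of 1500, this gives about 2604 interrupts per second, which matches the comment in the driver, and a spacing of 384 µs between interrupts under the stated assumption.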
Then I changed this 1500 value to 1 (trying to increase the number of interrupts per second allowed) and to 0 (trying to eliminate the interrupt throttling altogether), but both of these values produced the same result:
This is at least better behaved than the previous situation.
Therefore, I concluded that the behavior is due to the interface driver of the server. I am not willing to investigate it further in order to find other culprits, as I am moving from NetBSD to Linux for the project involving this Single Board Computer.