about summary refs log tree commit homepage
path: root/ext
diff options
context:
space:
mode:
authorDale Hamel <dale.hamel@shopify.com>2023-02-24 13:35:44 -0500
committerEric Wong <bofh@yhbt.net>2023-02-24 19:02:15 +0000
commita5902f243cd245d615fc0a443c370559db9117f9 (patch)
treea3191e0747314f3acadf3063a2560c9984fc52a7 /ext
parent2fa3153681c8bdcaf4d97ec7080c3aad31d5afd7 (diff)
downloadraindrops-a5902f243cd245d615fc0a443c370559db9117f9.tar.gz
SO_REUSEPORT was introduced in 2013, see https://lwn.net/Articles/542629/.

However, the current logic for calculating socket queue backlog stats
is from 2011, predating the advent of SO_REUSEPORT.

The current strategy was thus written before the notion that the mapping of
INET_ADDR:socket could potentially be 1:N instead of always 1:1.

This causes raindrops to provide invalid socket backlog values when
SO_REUSEPORT is used, which Unicorn supports since 2013. The current behaviour
will set the queue metric to the queue depth of the last inet_diag_msg
it processes matching the INET_ADDR. In practice, this will result in
the backlog being off by a factor N, assuming relatively even
distribution of requests to sockets, which the kernel should guarantee
through consistent hashing.

The fix here is to accumulate the socket queue depth as we iterate
over the the socket diagnostics, rather than reset it each time.

I have a provided a test, but it only checks the queues, not the
accept metrics, as those are not affected by this bug, and it is
not possible to know which of the listeners will be dispatched the
request by the kernel, and thus which should call accept.
Diffstat (limited to 'ext')
-rw-r--r--ext/raindrops/linux_inet_diag.c4
1 files changed, 2 insertions, 2 deletions
diff --git a/ext/raindrops/linux_inet_diag.c b/ext/raindrops/linux_inet_diag.c
index cabd427..2a2360c 100644
--- a/ext/raindrops/linux_inet_diag.c
+++ b/ext/raindrops/linux_inet_diag.c
@@ -306,7 +306,7 @@ static void table_set_queued(st_table *table, struct inet_diag_msg *r)
 {
         struct listen_stats *stats = stats_for(table, r);
         stats->listener_p = 1;
-        stats->queued = r->idiag_rqueue;
+        stats->queued += r->idiag_rqueue;
 }
 
 /* inner loop of inet_diag, called for every socket returned by netlink */
@@ -328,7 +328,7 @@ static inline void r_acc(struct nogvl_args *args, struct inet_diag_msg *r)
                 if (args->table)
                         table_set_queued(args->table, r);
                 else
-                        args->stats.queued = r->idiag_rqueue;
+                        args->stats.queued += r->idiag_rqueue;
         }
         /*
          * we wont get anything else because of the idiag_states filter