* [PATCH] tcp_listener_stats: always eagerly close sockets @ 2023-09-23 9:56 Jean Boussier 2023-09-26 21:40 ` Eric Wong 0 siblings, 1 reply; 5+ messages in thread From: Jean Boussier @ 2023-09-23 9:56 UTC (permalink / raw) To: raindrops-public Hello, Once again apologies for not submitting the patch in an usable format. It can be downloaded with curl/wget from https://github.com/casperisfine/raindrops/commit/1c92b440ad7b11a1708a1d5ed75b0767f213b40a.patch I just debugged an issue with our system, I was witnessing the number of file descriptor in our process grow at an alarming rate which I mapped to our use of raindrops to report utilisation. For various reasons we don’t call raindrops from a Rack middleware but have one process that monitor the socket continuously, and share that data with the workers. Since we call tcp_listener_stats every seconds in a process that doesn't do much else, GC very rarely triggers if at all which cause `InetDiagSocket` instances to accumulate very quickly. Each of those instances holds a file descriptor. Looking at the raindrops implementation it seems to assume the GC will take care of regularly closing these sockets, but I think it’s a bit too bold of an assumption. Regards. --- ext/raindrops/linux_inet_diag.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/ext/raindrops/linux_inet_diag.c b/ext/raindrops/linux_inet_diag.c index 2a2360c..b3d9a51 100644 --- a/ext/raindrops/linux_inet_diag.c +++ b/ext/raindrops/linux_inet_diag.c @@ -634,7 +634,7 @@ static VALUE tcp_listener_stats(int argc, VALUE *argv, VALUE self) switch (TYPE(addrs)) { case T_STRING: rb_hash_aset(rv, addrs, tcp_stats(&args, addrs)); - return rv; + goto out; case T_ARRAY: { long i; long len = RARRAY_LEN(addrs); @@ -643,7 +643,7 @@ static VALUE tcp_listener_stats(int argc, VALUE *argv, VALUE self) VALUE cur = rb_ary_entry(addrs, 0); rb_hash_aset(rv, cur, tcp_stats(&args, cur)); - return rv; + goto out; } for (i = 0; i < len; i++) { union any_addr check; @@ -659,6 +659,7 @@ static VALUE tcp_listener_stats(int argc, VALUE *argv, VALUE self) gen_bytecode_all(&args.iov[2]); break; default: + rb_io_close(sock); rb_raise(rb_eArgError, "addr must be an array of strings, a string, or nil"); } @@ -671,6 +672,7 @@ static VALUE tcp_listener_stats(int argc, VALUE *argv, VALUE self) if (RHASH_SIZE(rv) > 1) rb_hash_foreach(rv, drop_placeholders, Qfalse); +out: /* let GC deal with corner cases */ if (argc < 2) rb_io_close(sock); return rv; ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] tcp_listener_stats: always eagerly close sockets 2023-09-23 9:56 [PATCH] tcp_listener_stats: always eagerly close sockets Jean Boussier @ 2023-09-26 21:40 ` Eric Wong 2023-09-30 23:13 ` [PATCH] middleware: reuse inet_diag netlink socket Eric Wong 0 siblings, 1 reply; 5+ messages in thread From: Eric Wong @ 2023-09-26 21:40 UTC (permalink / raw) To: Jean Boussier; +Cc: raindrops-public Jean Boussier <jean.boussier@gmail.com> wrote: > Hello, > > Once again apologies for not submitting the patch in an usable format. <snip> I was actually able to run `git am' on your raw mail just fine w/o needing an active connection. So whatever you did worked[1] More inline... <snip> > Looking at the raindrops implementation it seems to assume > the GC will take care of regularly closing these sockets, but > I think it’s a bit too bold of an assumption. Your code (and some of the Ruby code shipped w/ raindrops) should probably be reusing sockets given our API allows it. At the top of the tcp_listener_stats function, we have: rb_scan_args(argc, argv, "02", &addrs, &sock); and then: sock = NIL_P(sock) ? rb_funcall(cIDSock, id_new, 0) : rb_io_get_io(sock); So we only create sockets one isn't passed. > diff --git a/ext/raindrops/linux_inet_diag.c b/ext/raindrops/linux_inet_diag.c > index 2a2360c..b3d9a51 100644 > --- a/ext/raindrops/linux_inet_diag.c > +++ b/ext/raindrops/linux_inet_diag.c > @@ -634,7 +634,7 @@ static VALUE tcp_listener_stats(int argc, VALUE *argv, VALUE self) > switch (TYPE(addrs)) { > case T_STRING: > rb_hash_aset(rv, addrs, tcp_stats(&args, addrs)); > - return rv; > + goto out; OK > @@ -643,7 +643,7 @@ static VALUE tcp_listener_stats(int argc, VALUE *argv, VALUE self) > VALUE cur = rb_ary_entry(addrs, 0); > > rb_hash_aset(rv, cur, tcp_stats(&args, cur)); > - return rv; > + goto out; OK > } > for (i = 0; i < len; i++) { > union any_addr check; > @@ -659,6 +659,7 @@ static VALUE tcp_listener_stats(int argc, VALUE *argv, VALUE self) > gen_bytecode_all(&args.iov[2]); > break; > default: > + rb_io_close(sock); That needs the (argc < 2) guard like below in `out:'. We should never close sockets passed by the user. Indentation is also done with hard tabs for this project (and alignment with spaces, roughly git.git and Linux kernel style)[2]. I'll push out the patch below if it looks OK to you. > rb_raise(rb_eArgError, > "addr must be an array of strings, a string, or nil"); > } > @@ -671,6 +672,7 @@ static VALUE tcp_listener_stats(int argc, VALUE *argv, VALUE self) > if (RHASH_SIZE(rv) > 1) > rb_hash_foreach(rv, drop_placeholders, Qfalse); > > +out: OK. > /* let GC deal with corner cases */ > if (argc < 2) rb_io_close(sock); > return rv; That 'if (argc < 2)' is important for the exception above. This is what I'll push out: -----8<----- From: Jean Boussier <jean.boussier@gmail.com> Subject: [PATCH] tcp_listener_stats: always eagerly close sockets I just debugged an issue with our system, I was witnessing the number of file descriptor in our process grow at an alarming rate which I mapped to our use of raindrops to report utilisation. For various reasons we don’t call raindrops from a Rack middleware but have one process that monitor the socket continuously, and share that data with the workers. Since we call tcp_listener_stats every seconds in a process that doesn't do much else, GC very rarely triggers if at all which cause `InetDiagSocket` instances to accumulate very quickly. Each of those instances holds a file descriptor. Looking at the raindrops implementation it seems to assume the GC will take care of regularly closing these sockets, but I think it’s a bit too bold of an assumption. [ew: don't close user-passed sockets on exception] Acked-by: Eric Wong <e@80x24.org> --- Range-diff: 1: 6a93833 ! 1: 9b9909b tcp_listener_stats: always eagerly close sockets @@ Commit message the GC will take care of regularly closing these sockets, but I think it’s a bit too bold of an assumption. + [ew: don't close user-passed sockets on exception] + + Acked-by: Eric Wong <e@80x24.org> + ## ext/raindrops/linux_inet_diag.c ## @@ ext/raindrops/linux_inet_diag.c: static VALUE tcp_listener_stats(int argc, VALUE *argv, VALUE self) switch (TYPE(addrs)) { @@ ext/raindrops/linux_inet_diag.c: static VALUE tcp_listener_stats(int argc, VALUE gen_bytecode_all(&args.iov[2]); break; default: -+ rb_io_close(sock); ++ if (argc < 2) rb_io_close(sock); rb_raise(rb_eArgError, "addr must be an array of strings, a string, or nil"); } ext/raindrops/linux_inet_diag.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/ext/raindrops/linux_inet_diag.c b/ext/raindrops/linux_inet_diag.c index 2d4f503..e4050cb 100644 --- a/ext/raindrops/linux_inet_diag.c +++ b/ext/raindrops/linux_inet_diag.c @@ -636,7 +636,7 @@ static VALUE tcp_listener_stats(int argc, VALUE *argv, VALUE self) switch (TYPE(addrs)) { case T_STRING: rb_hash_aset(rv, addrs, tcp_stats(&args, addrs)); - return rv; + goto out; case T_ARRAY: { long i; long len = RARRAY_LEN(addrs); @@ -645,7 +645,7 @@ static VALUE tcp_listener_stats(int argc, VALUE *argv, VALUE self) VALUE cur = rb_ary_entry(addrs, 0); rb_hash_aset(rv, cur, tcp_stats(&args, cur)); - return rv; + goto out; } for (i = 0; i < len; i++) { union any_addr check; @@ -661,6 +661,7 @@ static VALUE tcp_listener_stats(int argc, VALUE *argv, VALUE self) gen_bytecode_all(&args.iov[2]); break; default: + if (argc < 2) rb_io_close(sock); rb_raise(rb_eArgError, "addr must be an array of strings, a string, or nil"); } @@ -673,6 +674,7 @@ static VALUE tcp_listener_stats(int argc, VALUE *argv, VALUE self) if (RHASH_SIZE(rv) > 1) rb_hash_foreach(rv, drop_placeholders, Qfalse); +out: /* let GC deal with corner cases */ rb_str_resize(buf, 0); if (argc < 2) rb_io_close(sock); [1] Fwiw, the mail submission port is open on yhbt.net and you can use `git send-email' with it: git send-email \ --smtp-domain=yhbt.net \ --smtp-debug=1 \ --smtp-encryption=tls \ --smtp-server-port=587 \ --smtp-server=yhbt.net \ --to raindrops-public@yhbt.net \ --suppress-cc=all /path/to/patches If you prefer pull requests, format messages with the "git request-pull" command so they're easy to search for[3]. Sorry, but using a proprietary+centralized hosting service owned by a convicted monopolist puts me in a bad mood, especially when I'm to blame for their success given my involvement in git.git. repo.or.cz and Sourcehut are 100% Free Software if you don't feel like self-hosting. [2] yes, tabs were roughly ~16% faster for `git grep' https://lore.kerne.org/git/20071018024553.GA5186@coredump.intra.peff.net/ [3] There are automated bots and search queries that can search for these in mail archives. While none are currently in use for this project, https://yhbt.net/raindrops.git will probably feature it in the nearish future. ^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH] middleware: reuse inet_diag netlink socket 2023-09-26 21:40 ` Eric Wong @ 2023-09-30 23:13 ` Eric Wong 2023-09-30 23:35 ` [squash] make reusing inet_diag sock fork+preload safe Eric Wong 0 siblings, 1 reply; 5+ messages in thread From: Eric Wong @ 2023-09-30 23:13 UTC (permalink / raw) To: Jean Boussier; +Cc: raindrops-public > Your code (and some of the Ruby code shipped w/ raindrops) > should probably be reusing sockets given our API allows it. IOW, something like this (don't think test cases are worth updating): ------8<------- Subject: [PATCH] middleware: reuse inet_diag netlink socket No point in constantly allocating and deallocating FDs (and Ruby IO objects) when reusing them is supported. --- lib/raindrops/middleware.rb | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/lib/raindrops/middleware.rb b/lib/raindrops/middleware.rb index d5e3927..e0781f2 100644 --- a/lib/raindrops/middleware.rb +++ b/lib/raindrops/middleware.rb @@ -97,7 +97,9 @@ def initialize(app, opts = {}) if tmp.nil? && defined?(Unicorn) && Unicorn.respond_to?(:listener_names) tmp = Unicorn.listener_names end - @tcp = @unix = nil + @nl_sock = @tcp = @unix = nil + defined?(Raindrops::Linux.tcp_listener_stats) and + @nl_sock = Raindrops::InetDiagSocket.new if tmp @tcp = tmp.grep(/\A.+:\d+\z/) @@ -129,7 +131,7 @@ def stats_response # :nodoc: "writing: #{@stats.writing}\n" if defined?(Raindrops::Linux.tcp_listener_stats) - Raindrops::Linux.tcp_listener_stats(@tcp).each do |addr,stats| + Raindrops::Linux.tcp_listener_stats(@tcp, @nl_sock).each do |addr,stats| body << "#{addr} active: #{stats.active}\n" \ "#{addr} queued: #{stats.queued}\n" end if @tcp ^ permalink raw reply related [flat|nested] 5+ messages in thread
* [squash] make reusing inet_diag sock fork+preload safe 2023-09-30 23:13 ` [PATCH] middleware: reuse inet_diag netlink socket Eric Wong @ 2023-09-30 23:35 ` Eric Wong 2023-12-29 17:44 ` [PATCH v3] middleware: reuse inet_diag netlink socket Eric Wong 0 siblings, 1 reply; 5+ messages in thread From: Eric Wong @ 2023-09-30 23:35 UTC (permalink / raw) To: Jean Boussier; +Cc: raindrops-public I'll squash this in for fork+preload safety. Fwiw, most of the stats stuff has never seen any real-world use to my knowledge :x Somebody else got me to work on it but never got around to using it, so you're probably the only guinea pig :> diff --git a/lib/raindrops/middleware.rb b/lib/raindrops/middleware.rb index e0781f2..a9e2ee3 100644 --- a/lib/raindrops/middleware.rb +++ b/lib/raindrops/middleware.rb @@ -98,8 +98,6 @@ def initialize(app, opts = {}) tmp = Unicorn.listener_names end @nl_sock = @tcp = @unix = nil - defined?(Raindrops::Linux.tcp_listener_stats) and - @nl_sock = Raindrops::InetDiagSocket.new if tmp @tcp = tmp.grep(/\A.+:\d+\z/) @@ -131,6 +129,7 @@ def stats_response # :nodoc: "writing: #{@stats.writing}\n" if defined?(Raindrops::Linux.tcp_listener_stats) + @nl_sock ||= Raindrops::InetDiagSocket.new Raindrops::Linux.tcp_listener_stats(@tcp, @nl_sock).each do |addr,stats| body << "#{addr} active: #{stats.active}\n" \ "#{addr} queued: #{stats.queued}\n" ^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH v3] middleware: reuse inet_diag netlink socket 2023-09-30 23:35 ` [squash] make reusing inet_diag sock fork+preload safe Eric Wong @ 2023-12-29 17:44 ` Eric Wong 0 siblings, 0 replies; 5+ messages in thread From: Eric Wong @ 2023-12-29 17:44 UTC (permalink / raw) To: Jean Boussier; +Cc: raindrops-public Eric Wong <bofh@yhbt.net> wrote: > I'll squash this in for fork+preload safety. I forget there are multithreaded servers using this :x ------8<----- Subject: [PATCH v3] middleware: reuse inet_diag netlink socket No point in constantly allocating and deallocating FDs (and Ruby IO objects) when reusing them is supported. However, we must guard it against parallel callers since linux_inet_diag.c::diag releases the GVL while calling sendmsg(2) and recvmsg(2), so it's possible two Ruby threads can end up crossing streams if the middlware is used with a multi-threaded server. AFAIK, Raindrops::Middleware isn't a heavily-trafficked endpoint, so saving FDs and avoiding thread specific data is preferable for memory-constrained systems I use. --- lib/raindrops/middleware.rb | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/lib/raindrops/middleware.rb b/lib/raindrops/middleware.rb index d5e3927..20e573c 100644 --- a/lib/raindrops/middleware.rb +++ b/lib/raindrops/middleware.rb @@ -1,5 +1,6 @@ # -*- encoding: binary -*- require 'raindrops' +require 'thread' # Raindrops::Middleware is Rack middleware that allows snapshotting # current activity from an HTTP request. For all operating systems, @@ -93,11 +94,12 @@ def initialize(app, opts = {}) @app = app @stats = opts[:stats] || Stats.new @path = opts[:path] || "/_raindrops" + @mtx = Mutex.new tmp = opts[:listeners] if tmp.nil? && defined?(Unicorn) && Unicorn.respond_to?(:listener_names) tmp = Unicorn.listener_names end - @tcp = @unix = nil + @nl_sock = @tcp = @unix = nil if tmp @tcp = tmp.grep(/\A.+:\d+\z/) @@ -129,9 +131,12 @@ def stats_response # :nodoc: "writing: #{@stats.writing}\n" if defined?(Raindrops::Linux.tcp_listener_stats) - Raindrops::Linux.tcp_listener_stats(@tcp).each do |addr,stats| - body << "#{addr} active: #{stats.active}\n" \ - "#{addr} queued: #{stats.queued}\n" + @mtx.synchronize do + @nl_sock ||= Raindrops::InetDiagSocket.new + Raindrops::Linux.tcp_listener_stats(@tcp, @nl_sock).each do |addr,stats| + body << "#{addr} active: #{stats.active}\n" \ + "#{addr} queued: #{stats.queued}\n" + end end if @tcp Raindrops::Linux.unix_listener_stats(@unix).each do |addr,stats| body << "#{addr} active: #{stats.active}\n" \ ^ permalink raw reply related [flat|nested] 5+ messages in thread
end of thread, other threads:[~2023-12-29 17:44 UTC | newest] Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2023-09-23 9:56 [PATCH] tcp_listener_stats: always eagerly close sockets Jean Boussier 2023-09-26 21:40 ` Eric Wong 2023-09-30 23:13 ` [PATCH] middleware: reuse inet_diag netlink socket Eric Wong 2023-09-30 23:35 ` [squash] make reusing inet_diag sock fork+preload safe Eric Wong 2023-12-29 17:44 ` [PATCH v3] middleware: reuse inet_diag netlink socket Eric Wong
Code repositories for project(s) associated with this public inbox https://yhbt.net/raindrops.git/ This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).