From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: X-Spam-Status: No, score=-3.0 required=3.0 tests=ALL_TRUSTED,AWL,BAYES_00, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF shortcircuit=no autolearn=ham autolearn_force=no version=3.4.6 Received: from localhost (dcvr.yhbt.net [127.0.0.1]) by dcvr.yhbt.net (Postfix) with ESMTP id C5F5F1F55F; Tue, 26 Sep 2023 21:40:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=yhbt.net; s=selector1; t=1695764400; bh=oXv4lqXk1n5jGqtfX0W/Z/8n2lfdxQxM9L3SegsKUGg=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=mSH1GBMQU03NrriwMRUjHW/BIIEAPLcvl2XKW/4kH+Xs94nYrtS37uQJD372Y4jDp m+v6WlFo+18xMLaHbqzzvPnFuGnUOGwvfZs/8X2r4wh6BQDGoG2nLTxTQ80B0B/zDo QUwc+W6In2ldeU77iieoz4qTr5zzzanORLgCQxgw= Date: Tue, 26 Sep 2023 21:40:00 +0000 From: Eric Wong To: Jean Boussier Cc: raindrops-public@yhbt.net Subject: Re: [PATCH] tcp_listener_stats: always eagerly close sockets Message-ID: <20230926214000.M564322@dcvr> References: <6E0E349D-A7CE-4B88-8F89-66438BB775A1@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <6E0E349D-A7CE-4B88-8F89-66438BB775A1@gmail.com> List-Id: Jean Boussier wrote: > Hello, > > Once again apologies for not submitting the patch in an usable format. I was actually able to run `git am' on your raw mail just fine w/o needing an active connection. So whatever you did worked[1] More inline... > Looking at the raindrops implementation it seems to assume > the GC will take care of regularly closing these sockets, but > I think it’s a bit too bold of an assumption. Your code (and some of the Ruby code shipped w/ raindrops) should probably be reusing sockets given our API allows it. At the top of the tcp_listener_stats function, we have: rb_scan_args(argc, argv, "02", &addrs, &sock); and then: sock = NIL_P(sock) ? rb_funcall(cIDSock, id_new, 0) : rb_io_get_io(sock); So we only create sockets one isn't passed. > diff --git a/ext/raindrops/linux_inet_diag.c b/ext/raindrops/linux_inet_diag.c > index 2a2360c..b3d9a51 100644 > --- a/ext/raindrops/linux_inet_diag.c > +++ b/ext/raindrops/linux_inet_diag.c > @@ -634,7 +634,7 @@ static VALUE tcp_listener_stats(int argc, VALUE *argv, VALUE self) > switch (TYPE(addrs)) { > case T_STRING: > rb_hash_aset(rv, addrs, tcp_stats(&args, addrs)); > - return rv; > + goto out; OK > @@ -643,7 +643,7 @@ static VALUE tcp_listener_stats(int argc, VALUE *argv, VALUE self) > VALUE cur = rb_ary_entry(addrs, 0); > > rb_hash_aset(rv, cur, tcp_stats(&args, cur)); > - return rv; > + goto out; OK > } > for (i = 0; i < len; i++) { > union any_addr check; > @@ -659,6 +659,7 @@ static VALUE tcp_listener_stats(int argc, VALUE *argv, VALUE self) > gen_bytecode_all(&args.iov[2]); > break; > default: > + rb_io_close(sock); That needs the (argc < 2) guard like below in `out:'. We should never close sockets passed by the user. Indentation is also done with hard tabs for this project (and alignment with spaces, roughly git.git and Linux kernel style)[2]. I'll push out the patch below if it looks OK to you. > rb_raise(rb_eArgError, > "addr must be an array of strings, a string, or nil"); > } > @@ -671,6 +672,7 @@ static VALUE tcp_listener_stats(int argc, VALUE *argv, VALUE self) > if (RHASH_SIZE(rv) > 1) > rb_hash_foreach(rv, drop_placeholders, Qfalse); > > +out: OK. > /* let GC deal with corner cases */ > if (argc < 2) rb_io_close(sock); > return rv; That 'if (argc < 2)' is important for the exception above. This is what I'll push out: -----8<----- From: Jean Boussier Subject: [PATCH] tcp_listener_stats: always eagerly close sockets I just debugged an issue with our system, I was witnessing the number of file descriptor in our process grow at an alarming rate which I mapped to our use of raindrops to report utilisation. For various reasons we don’t call raindrops from a Rack middleware but have one process that monitor the socket continuously, and share that data with the workers. Since we call tcp_listener_stats every seconds in a process that doesn't do much else, GC very rarely triggers if at all which cause `InetDiagSocket` instances to accumulate very quickly. Each of those instances holds a file descriptor. Looking at the raindrops implementation it seems to assume the GC will take care of regularly closing these sockets, but I think it’s a bit too bold of an assumption. [ew: don't close user-passed sockets on exception] Acked-by: Eric Wong --- Range-diff: 1: 6a93833 ! 1: 9b9909b tcp_listener_stats: always eagerly close sockets @@ Commit message the GC will take care of regularly closing these sockets, but I think it’s a bit too bold of an assumption. + [ew: don't close user-passed sockets on exception] + + Acked-by: Eric Wong + ## ext/raindrops/linux_inet_diag.c ## @@ ext/raindrops/linux_inet_diag.c: static VALUE tcp_listener_stats(int argc, VALUE *argv, VALUE self) switch (TYPE(addrs)) { @@ ext/raindrops/linux_inet_diag.c: static VALUE tcp_listener_stats(int argc, VALUE gen_bytecode_all(&args.iov[2]); break; default: -+ rb_io_close(sock); ++ if (argc < 2) rb_io_close(sock); rb_raise(rb_eArgError, "addr must be an array of strings, a string, or nil"); } ext/raindrops/linux_inet_diag.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/ext/raindrops/linux_inet_diag.c b/ext/raindrops/linux_inet_diag.c index 2d4f503..e4050cb 100644 --- a/ext/raindrops/linux_inet_diag.c +++ b/ext/raindrops/linux_inet_diag.c @@ -636,7 +636,7 @@ static VALUE tcp_listener_stats(int argc, VALUE *argv, VALUE self) switch (TYPE(addrs)) { case T_STRING: rb_hash_aset(rv, addrs, tcp_stats(&args, addrs)); - return rv; + goto out; case T_ARRAY: { long i; long len = RARRAY_LEN(addrs); @@ -645,7 +645,7 @@ static VALUE tcp_listener_stats(int argc, VALUE *argv, VALUE self) VALUE cur = rb_ary_entry(addrs, 0); rb_hash_aset(rv, cur, tcp_stats(&args, cur)); - return rv; + goto out; } for (i = 0; i < len; i++) { union any_addr check; @@ -661,6 +661,7 @@ static VALUE tcp_listener_stats(int argc, VALUE *argv, VALUE self) gen_bytecode_all(&args.iov[2]); break; default: + if (argc < 2) rb_io_close(sock); rb_raise(rb_eArgError, "addr must be an array of strings, a string, or nil"); } @@ -673,6 +674,7 @@ static VALUE tcp_listener_stats(int argc, VALUE *argv, VALUE self) if (RHASH_SIZE(rv) > 1) rb_hash_foreach(rv, drop_placeholders, Qfalse); +out: /* let GC deal with corner cases */ rb_str_resize(buf, 0); if (argc < 2) rb_io_close(sock); [1] Fwiw, the mail submission port is open on yhbt.net and you can use `git send-email' with it: git send-email \ --smtp-domain=yhbt.net \ --smtp-debug=1 \ --smtp-encryption=tls \ --smtp-server-port=587 \ --smtp-server=yhbt.net \ --to raindrops-public@yhbt.net \ --suppress-cc=all /path/to/patches If you prefer pull requests, format messages with the "git request-pull" command so they're easy to search for[3]. Sorry, but using a proprietary+centralized hosting service owned by a convicted monopolist puts me in a bad mood, especially when I'm to blame for their success given my involvement in git.git. repo.or.cz and Sourcehut are 100% Free Software if you don't feel like self-hosting. [2] yes, tabs were roughly ~16% faster for `git grep' https://lore.kerne.org/git/20071018024553.GA5186@coredump.intra.peff.net/ [3] There are automated bots and search queries that can search for these in mail archives. While none are currently in use for this project, https://yhbt.net/raindrops.git will probably feature it in the nearish future.