unicorn Ruby/Rack server user+dev discussion/patches/pulls/bugs/help
From: Eric Wong <e@80x24.org>
To: Simon Eskildsen <simon.eskildsen@shopify.com>
Cc: unicorn-public@bogomips.org
Subject: Re: [PATCH] check_client_connection: use tcp state on linux
Date: Sat, 25 Feb 2017 23:12:44 +0000
Message-ID: <20170225231243.GA6224@dcvr.yhbt.net>
In-Reply-To: <CAO3HKM76Pu-YB8h7caxhqNDvcOXbBT_eZYoMhy7gLkb7-O2bNA@mail.gmail.com>

Simon Eskildsen <simon.eskildsen@shopify.com> wrote:
 
> On Sat, Feb 25, 2017 at 9:03 AM, Simon Eskildsen
> <simon.eskildsen@shopify.com> wrote:
> > The implementation of the check_client_connection causing an early write is
> > ineffective when not performed on loopback. In my testing, when on non-loopback,
> > such as another host, we only see a 10-20% rejection rate with TCP_NODELAY of
> > clients that are closed. This means 80-90% of responses in this case are
> > rendered in vain.
> >
> > This patch uses getsockopt(2) with TCP_INFO on Linux to read the socket state.
> > If the socket is in a state where it doesn't take a response, we'll abort the
> > request with a similar error as to what write(2) would give us on a closed
> > socket in case of an error.
> >
> > In my testing, this has a 100% rejection rate. This testing was conducted with
> > the following script:

Good to know!

> > diff --git a/lib/unicorn/http_request.rb b/lib/unicorn/http_request.rb
> > index 0c1f9bb..b4c95fc 100644
> > --- a/lib/unicorn/http_request.rb
> > +++ b/lib/unicorn/http_request.rb
> > @@ -31,6 +31,9 @@ class Unicorn::HttpParser
> >    @@input_class = Unicorn::TeeInput
> >    @@check_client_connection = false
> >
> > +  # TCP_TIME_WAIT: 6, TCP_CLOSE: 7, TCP_CLOSE_WAIT: 8, TCP_LAST_ACK: 9
> > +  IGNORED_CHECK_CLIENT_SOCKET_STATES = (6..9)

Thanks for documenting the numbers.

I prefer we use a hash or case statement.  Both allow more
optimization in the YARV VM of CRuby (opt_aref and
opt_case_dispatch in insns.def).  case _might_ be a little
faster if there's no constant lookup overhead, but
a microbenchmark or a bytecode dump will be necessary
to be sure :)

A hash or a case can also help portability-wise in case
we hit a system where these numbers are non-sequential;
or if we forgot something.
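
To make the two alternatives concrete, here is a minimal sketch.  The
constant and method names are made up for illustration, the state
numbers are the Linux values from the patch comment above, and neither
variant has been benchmarked here:

```ruby
# Linux TCP state numbers taken from the comment in the quoted patch:
# TCP_TIME_WAIT: 6, TCP_CLOSE: 7, TCP_CLOSE_WAIT: 8, TCP_LAST_ACK: 9
# The constant and method names below are hypothetical.
CLOSED_TCP_STATES = { 6 => true, 7 => true, 8 => true, 9 => true }.freeze

# Hash variant: one Hash#[] call keyed on the state number
def closed_state_hash?(state)
  CLOSED_TCP_STATES[state] == true
end

# case variant: integer literals only, so no constant lookup is needed
def closed_state_case?(state)
  case state
  when 6, 7, 8, 9 then true
  else false
  end
end
```

On CRuby, something like
`puts RubyVM::InstructionSequence.of(method(:closed_state_case?)).disasm`
should show which instructions each variant actually compiles to.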

> > -    # detect if the socket is valid by writing a partial response:
> > -    if @@check_client_connection && headers?
> > -      self.response_start_sent = true
> > -      HTTP_RESPONSE_START.each { |c| socket.write(c) }
> > +    # detect if the socket is valid by checking client connection.
> > +    if @@check_client_connection

We can probably split everything inside this `if' into a new
method; this avoids penalizing people who don't use this feature.
Using check_client_connection will already have a high cost,
since it requires at least one extra syscall.

> > +      if defined?(Raindrops::TCP_Info)

...because defined? only needs to be checked once for the
lifetime of the process; we can define different methods
at load time to avoid the check for each and every request.
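
A rough sketch of what defining the method at load time could look
like.  The module name, the STRATEGY constant and the RESPONSE_START
stand-in are assumptions for illustration and not unicorn's actual
layout; only Raindrops::TCP_Info.new(socket).state comes from the
patch itself:

```ruby
module ClientCheck # hypothetical module name
  CLOSED_STATES = { 6 => true, 7 => true, 8 => true, 9 => true }.freeze
  RESPONSE_START = ['HTTP', '/1.1 '].freeze # stand-in for HTTP_RESPONSE_START

  # defined? is evaluated once, while this file is being loaded;
  # only one of the two method bodies ever exists in the process.
  if defined?(Raindrops::TCP_Info)
    STRATEGY = :tcp_info

    def self.check(socket)
      state = Raindrops::TCP_Info.new(socket).state
      raise Errno::EPIPE if CLOSED_STATES[state]
    end
  else
    STRATEGY = :early_write

    # fallback: detect a dead client by writing a partial response
    def self.check(socket)
      RESPONSE_START.each { |c| socket.write(c) }
    end
  end
end
```

Callers would then invoke ClientCheck.check(socket) without any
per-request branching on whether raindrops is available.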

> > +        tcp_info = Raindrops::TCP_Info.new(socket)
> > +        if IGNORED_CHECK_CLIENT_SOCKET_STATES.cover?(tcp_info.state)
> > +          socket.close

Right, no need for socket.close here; handle_error does it.

> > +          raise Errno::EPIPE

Since we don't print out the backtrace in handle_error, we
can raise without a backtrace to avoid excess garbage.
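
For reference, Kernel#raise accepts a backtrace array as an optional
third argument, so passing an empty array skips backtrace generation.
A minimal sketch; the constant name, method name and message are made
up for this example:

```ruby
EMPTY_BACKTRACE = [].freeze # hypothetical constant name

# Raise EPIPE with a caller-supplied (empty) backtrace instead of
# letting Ruby build an array of frame strings for it.
def abort_closed_client
  raise Errno::EPIPE, 'client closed connection', EMPTY_BACKTRACE
end

begin
  abort_closed_client
rescue Errno::EPIPE => e
  warn "rescued #{e.class} with #{e.backtrace.size} backtrace frames"
end
```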

> > +        end
> > +      elsif headers?
> > +        self.response_start_sent = true
> > +        HTTP_RESPONSE_START.each { |c| socket.write(c) }
> > +      end
> >      end

> I left out a TCPSocket check, we'll need a `socket.is_a?(TCPSocket)`
> in there. I'll wait with sending an updated patch in case you have
> other comments. I'm also not entirely sure we need the `socket.close`.
> What do you think?

Yep, we need to account for the UNIX socket case.  I forget if
kgio even makes them different...
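
With plain stdlib sockets, at least, the is_a? check does separate
the two cases.  A small sketch (the helper name is hypothetical, and
this says nothing about how kgio's own classes behave):

```ruby
require 'socket'

# True only for TCP sockets; UNIX domain sockets have no TCP state to
# inspect, so they would need the old early-write check instead.
def tcp_socket?(socket)
  socket.is_a?(TCPSocket)
end

server = TCPServer.new('127.0.0.1', 0)      # port 0: any free port
client = TCPSocket.new('127.0.0.1', server.addr[1])
accepted = server.accept
unix_a, unix_b = UNIXSocket.pair

begin
  warn "accepted TCP client: #{tcp_socket?(accepted)}"
  warn "UNIX socket:         #{tcp_socket?(unix_a)}"
ensure
  [server, client, accepted, unix_a, unix_b].each(&:close)
end
```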

Anyways I won't be online much for a few days, and it's the
weekend, so no rush :)

Thanks.

Thread overview: 14+ messages
2017-02-25 14:03 [PATCH] check_client_connection: use tcp state on linux Simon Eskildsen
2017-02-25 16:19 ` Simon Eskildsen
2017-02-25 23:12   ` Eric Wong [this message]
2017-02-27 11:44     ` Simon Eskildsen
2017-02-28 21:12       ` Eric Wong
2017-03-01  3:18         ` Eric Wong
2017-03-06 21:32           ` Simon Eskildsen
2017-03-07 22:50             ` Eric Wong
2017-03-08  0:26               ` Eric Wong
2017-03-08 12:06                 ` Simon Eskildsen
2017-03-13 20:16                   ` Simon Eskildsen
2017-03-13 20:37                     ` Eric Wong
2017-03-14 16:14                       ` Simon Eskildsen
2017-03-14 16:41                         ` Eric Wong