unicorn Ruby/Rack server user+dev discussion/patches/pulls/bugs/help
From: Simon Eskildsen <simon.eskildsen@shopify.com>
To: Eric Wong <e@80x24.org>
Cc: unicorn-public@bogomips.org, Jeremy Evans <code@jeremyevans.net>
Subject: Re: after_worker_exit on murder
Date: Wed, 5 Apr 2017 06:55:27 -0400
Message-ID: <CAO3HKM5sLtUGFpJ9RC7i6KYCRzGr8gLtUrhJZueToh2rH8PBfw@mail.gmail.com>
In-Reply-To: <20170405011932.GA24739@starla>

I agree with you in principle, absolutely. However, when you have a
code-base the size of ours (100Ks of lines of Ruby) with 100s of
developers, and new ones coming on every month with no prior Ruby or
Rails experience, we can't rely on everyone doing the right thing all
the time. With a surface area of that size, there will be things that
are missed, especially when you run gems like Liquid that have
billions of different ways of composing templates—some of these paths,
unfortunately, are going to be slow. We definitely chase down all the
worst offenders, but when new ones creep up, we do our best to chase
them down when time allows. Using this hook allows us to monitor when
that happens, and how often it happens, and for which endpoints. With
100Ks of lines of code, 100s of developers, and 10s of thousands of
requests per second—once in a million happens every couple of seconds.
Multiply that with the size of the code-base, and Unicorn timeouts due
to the conditions below will happen somewhat often.

It becomes difficult, because sometimes you have legitimate requests
that take 10-20s, because the merchant's data set is so large that it
exposes anomalies. Again, with the size of our code-base, we need this
wiggle room in the global timeout to not just error on users. You can
have endpoints that do 4 HTTP requests, 5 RPC requests, 4 MySQL
queries, and 30 calls to Memcached. In that case, your worst case is
the sum of the timeouts of all those actions, which easily exceeds the
Unicorn timeout. We've debated having "budgets" and "shitlisting"
(http://sirupsen.com/shitlists/) paths that obviously take longer than
the budget for a single resource. The probability of more than one
resource being very slow at once is quite low (and if that does happen,
again, we rely on the Unicorn timeout).
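
To make the worst-case arithmetic concrete, here is a minimal sketch in
Ruby. The call counts are the ones from the example above; the per-call
timeout values and the global Unicorn timeout are assumptions chosen
purely for illustration, not real production settings.

```ruby
# Call counts come from the example in the message; the timeout values
# (in seconds) are hypothetical and only illustrate the arithmetic.
CALLS = {
  http:      { count: 4,  timeout: 5.0 },
  rpc:       { count: 5,  timeout: 2.0 },
  mysql:     { count: 4,  timeout: 2.0 },
  memcached: { count: 30, timeout: 0.25 },
}.freeze

# Assumed global Unicorn timeout in seconds (not a value from the message).
UNICORN_TIMEOUT = 30.0

# Worst case: every call runs right up to its own timeout, one after another.
def worst_case_seconds(calls)
  calls.sum { |_name, c| c[:count] * c[:timeout] }
end

worst = worst_case_seconds(CALLS)
puts format("worst case: %.1fs, unicorn timeout: %.1fs", worst, UNICORN_TIMEOUT)
if worst > UNICORN_TIMEOUT
  puts "per-call timeouts alone do not keep the request under the global timeout"
end
```

With these assumed numbers the per-call timeouts add up to well over the
global timeout, which is the gap the global timeout is there to cover.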

In other words, the Unicorn timeout is not a crutch for timeouts in
the application, but a global timeout as a last line of defense
against many timeouts, or some bug we didn't foresee. This seems
unavoidable in my eyes, unless you have very aggressive timeouts and
meticulously keep track of the budgets in a testing environment and
raise if the budget is exceeded. When we hit many timeouts, we use
Semian (http://github.com/shopify/semian) to trigger circuit
breakers—so the reliance on Unicorn should be brief.
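
The "track budgets in a testing environment and raise if the budget is
exceeded" idea could look roughly like the sketch below. Everything in
it (the RequestBudget class, the BudgetExceeded error, the limits) is
hypothetical and invented for illustration; it is not how Semian or any
existing gem works.

```ruby
# Hypothetical test-environment helper: tracks time spent per resource
# within one request and raises once a resource exceeds its budget.
class BudgetExceeded < StandardError; end

class RequestBudget
  attr_reader :spent

  # limits: seconds allowed per resource, e.g. { mysql: 0.5 } (assumed values).
  def initialize(limits)
    @limits = limits
    @spent = Hash.new(0.0)
  end

  # Times the block, charges the elapsed time to `resource`, and returns
  # the block's value. If the block itself raises, nothing is charged.
  def charge(resource)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    result = yield
    record(resource, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started)
    result
  end

  # Adds `seconds` to the resource's running total and enforces its limit.
  # Resources without a configured limit are tracked but never raise.
  def record(resource, seconds)
    @spent[resource] += seconds
    limit = @limits[resource]
    return unless limit && @spent[resource] > limit

    raise BudgetExceeded,
          format("%s used %.3fs of a %.3fs budget", resource, @spent[resource], limit)
  end
end

budget = RequestBudget.new(mysql: 0.5, http: 2.0)
rows = budget.charge(:mysql) { [] } # stand-in for a real query
puts "mysql time so far: #{budget.spent[:mysql].round(6)}s, rows: #{rows.size}"
```

Raising only in test keeps production requests alive while still
surfacing code paths that would eat into the global timeout.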

Some of these bugs are even deep in Ruby. Jean B, one of my co-workers,
submitted a bug about there being no write_timeout in Net::HTTP (you
even replied!): https://bugs.ruby-lang.org/issues/13396

BTW we deployed 5.3.0 and replaced our `before_murder` hook with
`after_worker_exit`. Everything works perfectly and we finally are not
using a forked version of Unicorn anymore. Thanks for the release!
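
For readers who want to try the same thing, a unicorn config using
`after_worker_exit` might look like the sketch below. It is not the
actual Shopify configuration. The hook receives the server, the worker,
and a Process::Status; the helper name `killed_by_sigkill?` and the log
wording are made up for this example, and the `respond_to?` guard is
only there so the file also loads outside of unicorn.

```ruby
# True when the exit status shows the worker died from SIGKILL, which is
# the signal the unicorn master sends to a worker that exceeds `timeout`.
# A SIGKILL from another source (e.g. the kernel OOM killer) looks the same.
def killed_by_sigkill?(status)
  status.signaled? && status.termsig == Signal.list.fetch("KILL")
end

# Inside a unicorn config file, `self` is the Unicorn::Configurator, so
# after_worker_exit is available; elsewhere this block is skipped.
if respond_to?(:after_worker_exit)
  after_worker_exit do |server, worker, status|
    if killed_by_sigkill?(status)
      server.logger.error(
        "worker=#{worker.nr} killed by SIGKILL (likely timeout): #{status.inspect}"
      )
      # A hypothetical metrics call would go here, e.g. incrementing a counter.
    else
      server.logger.info("reaped #{status.inspect} worker=#{worker.nr}")
    end
  end
end
```

Setting this hook replaces unicorn's default reaping log line, which is
why the sketch logs in the non-SIGKILL branch as well.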

Thread overview: 6+ messages
2017-04-04 14:08 Simon Eskildsen
2017-04-04 14:32 ` Jeremy Evans
2017-04-04 14:36   ` Simon Eskildsen
2017-04-05  1:19 ` Eric Wong
2017-04-05 10:55   ` Simon Eskildsen [this message]
2017-04-05 18:33     ` Eric Wong
