From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00
    shortcircuit=no autolearn=unavailable autolearn_force=no version=3.4.0
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
    by dcvr.yhbt.net (Postfix) with ESMTP id CC9331FF70;
    Tue, 7 Jun 2016 13:41:07 +0000 (UTC)
Date: Tue, 7 Jun 2016 13:41:07 +0000
From: Eric Wong
To: Jesper =?utf-8?Q?R=C3=B8nn-Jensen?=
Cc: unicorn-public@bogomips.org
Subject: Re: [PATCH] `unicorn upgrade` script resilience against exceptions
Message-ID: <20160607134107.GA4500@dcvr.yhbt.net>
References:
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To:
List-Id:

Jesper Rønn-Jensen wrote:
> This is actually a change proposal "by accident". But first, some
> background:

Please don't send HTML mail. I guess the bounce message didn't
include the reason for the bounce, but that should be fixed now,
and the bounce will tell you not to send HTML.

Anyways, if you stick to old-school Usenet/mailing list posting
conventions, I'm more than happy to help you :)

> My server uses a standard version of this `init.sh` script. I found that
> executing `unicorn upgrade` often gives me problems.

Hm... I haven't looked at that script in a while given all the
new-fangled init replacements...

> For example, it breaks my Capistrano deploy script to use `unicorn:upgrade`
> to deploy. This is a shame, since it's the correct
> command for Unicorn to pick up any code changes.
> But unicorn upgrade in the script fails with messages like:
>
> ```
> sudo /etc/init.d/unicorn upgrade
> Couldn't upgrade, starting 'export HOME=/home/deploy ; cd
> /var/www/my_app/current && /home/deploy/.rvm/wrappers/my_app/bundle exec
> unicorn -D -c /var/www/my_app/shared/config/unicorn.rb -E production'
> instead
> master failed to start, check stderr log for details
> ```
>
> And the stderr log:
> ```
> E, [2016-06-07T09:03:27.517267 #27583] ERROR -- : reaped #<Process::Status: pid 27621 exit 1> exec()-ed
> /var/www/my_app/shared/vendor/bundle/ruby/2.3.0/gems/unicorn-5.1.0/lib/unicorn/http_server.rb:195:in
> `pid=': Already running on PID:27583 (or
> pid=/var/www/my_app/shared/tmp/pids/unicorn.pid is stale)
> (ArgumentError)
> ```
>
> I am proposing this change because I misread the documentation for the
> SIGNALS!
> I thought that Unicorn itself sends a QUIT after all subprocesses have ended
> via SIG USR2.
>
> The change is to remove the QUIT signal to the PID.oldbin process, and that
> apparently stops all the failures I ran into. So I made this change on a
> server, and it works!
>
> I just don't understand why, since reading the documentation again it says:
> https://github.com/defunkt/unicorn/blob/d23d4713dc9ab9732c574f5aa34a4b6740b43164/SIGNALS#L35
>
> > * USR2 - reexecute the running binary. A separate QUIT
> > should be sent to the original process once the child is verified to
> > be up and running.
>
> So clearly, I should send the QUIT signal.

Yes.

> My goal is to have a resilient, robust deployment, where unicorn picks up
> any code changes.

Unfortunately, PID files have always been racy with USR2.

Nowadays systemd is fairly standardized and seems to work pretty
well for managing sockets and services. I'm still not a fan of
some systemd things, but I think the socket activation part is
very nice.
Example @.service and .socket files are distributed nowadays:

  https://unicorn.bogomips.org/examples/unicorn@.service
  https://unicorn.bogomips.org/examples/unicorn.socket

> Can you help me and enlighten me as to why this proposed change works?

Anyways, things are racy in that script and increasing the
"sleep 2" to a higher number will help if your system is really
overloaded.

> ---
>  examples/init.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/examples/init.sh b/examples/init.sh
> index 1f0e035..cbadc11 100644
> --- a/examples/init.sh
> +++ b/examples/init.sh
> @@ -45,7 +45,7 @@ restart|reload)
>      $CMD
>      ;;
>  upgrade)
> -    if sig USR2 && sleep 2 && sig 0 && oldsig QUIT
> +    if sig USR2 && sleep 2 && sig 0
>      then
>          n=$TIMEOUT
>          while test -s $old_pid && test $n -ge 0
> --

I actually wrote a better init script for a similar server,
patch coming in a bit (or later, close to falling asleep).
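
For illustration only (this is not the patch mentioned above): one way to
reduce reliance on the fixed "sleep 2" would be to poll until the
re-executed master has written its PID file, and only then send QUIT to
the old master. The helper name `wait_for_new_master` and its three
arguments are hypothetical; the sketch assumes the usual USR2 behaviour
where the old PID file is renamed to PID.oldbin. A live PID only shows
that the new master process exists, not that its workers are serving
requests, so treat this as a rough sketch.

```sh
#!/bin/sh
# Sketch only: poll for the new master instead of a fixed "sleep 2".
# wait_for_new_master is a hypothetical helper, not part of examples/init.sh.
#
# usage: wait_for_new_master PID_FILE OLD_PID_FILE TIMEOUT_SECONDS
wait_for_new_master() {
    _pid_file=$1
    _old_pid_file=$2
    _n=$3
    while test "$_n" -ge 0
    do
        # After USR2 the old master's PID file is renamed to *.oldbin and
        # the re-executed master writes a fresh PID file. Succeed once
        # both files are non-empty and the new PID answers signal 0.
        if test -s "$_old_pid_file" && test -s "$_pid_file" &&
            kill -0 "$(cat "$_pid_file")" 2>/dev/null
        then
            return 0
        fi
        sleep 1
        _n=$((_n - 1))
    done
    # Timed out without seeing a live new master.
    return 1
}
```

In the `upgrade)` branch this could stand in for `sleep 2 && sig 0`, for
example `sig USR2 && wait_for_new_master "$PID" "$old_pid" "$TIMEOUT" && oldsig QUIT`,
where `$PID` is assumed to hold the path of the current PID file.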