unicorn Ruby/Rack server user+dev discussion/patches/pulls/bugs/help
 help / color / mirror / code / Atom feed
From: Eric Wong <normalperson@yhbt.net>
To: unicorn list <mongrel-unicorn@rubyforge.org>
Cc: "J. Austin Hughey" <jhughey@engineyard.com>
Subject: Re: Sending ABRT to timeout-errant process before KILL
Date: Thu, 8 Sep 2011 19:13:52 +0000	[thread overview]
Message-ID: <20110908191352.GA25251@dcvr.yhbt.net> (raw)
In-Reply-To: <6310128B-282E-40C0-801B-15DBBE9E6AC3@engineyard.com>

"J. Austin Hughey" <jhughey@engineyard.com> wrote:
> The general idea is that I'd like to have some way to "warn" the
> application when it's about to be killed.  I've patched
> murder_lazy_workers to send an abort signal via kill_worker, sleep for
> 5 seconds, then look and see if the process still exists by using
> Process.getpgid.  If so, it sends the original kill command, and if
> not, a rescue block kicks in to prevent the raised error from
> Process.getpgid from making things explode.

The problem with anything other than SIGKILL (or SIGSTOP) is that it
assumes the Ruby VM is working and in a good state.

> I've created a simulation app, built on Rails 3.0, that uses a generic
> "posts" controller to simulate a long-running request.  Instead of
> just throwing a straight-up sleep 65 in there, I have it running
> through sleeping 1 second on a decrementing counter, and doing that 65
> times.  The reason is because, assuming I've read the code correctly,
> even with my "skip sleeping workers" commented line below, it'll skip
> over the process, thus rendering my simulation of a long-running
> process invalid.  However, clarification on this point is certainly
> welcome.  You can see the app here:
> https://github.com/jaustinhughey/unicorn_test/blob/master/app/controllers/posts_controller.rb

(purely for educational purposes, since I'll point you towards another
approach I believe is better)

              Signal.trap(:ABRT) do
                # Write some stuff to the Rails log
                logger.info "Caught Unicorn kill exception!"

If this is the logger that ships with Ruby, it locks a Mutex, so it'll
deadlock if another SIGABRT is received while logging the above
statement (a very small window, admittedly).

                # Do a controlled disconnect from ActiveRecord
                ActiveRecord::Base.connection.disconnect!

Likewise, if AR needs to lock internal structures before disconnecting,
it also must be reentrant.  Ruby's normal Mutex implementation is not
reentrant-safe.


> So it looks like Worker 1 is hitting a strange/false timeout of
> 1315467289 seconds, which isn't really possible as it wasn't even
> running 1315467289 seconds prior to that (which equates to roughly 41
> years ago if my math is right).

You're getting this because you removed the following line:

      0 == tick and next # skip workers that are sleeping

sleeping means they haven't accepted a client connection, yet.  Not
sleeping while processing a client request.   I'll clarify that in the
code.

> Needless to say, I'm a bit stumped at this point, and would sincerely
> appreciate another point of view on this.  Am I going about this all
> wrong?  Is there a better approach I should consider?  And if I'm on
> the right track, how can I get this to work regardless of how many
> Unicorn workers are running?

Since it's an application error, it can be done as middleware.  You can
try something like the Rainbows::ThreadTimeout middleware, it's
currently Rainbows! specific but can easily be made to work with
Unicorn.

	git clone git://bogomips.org/rainbows
	cat rainbows/lib/rainbows/thread_timeout.rb

This is conceptually similar to "timeout" in the Ruby standard library,
but does not allow nesting.

I'll try to clarify more later today if you have questions, in a bit of
a rush right now.

-- 
Eric Wong
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying


  reply	other threads:[~2011-09-08 20:07 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-09-08  9:04 Sending ABRT to timeout-errant process before KILL J. Austin Hughey
2011-09-08 19:13 ` Eric Wong [this message]
2011-09-09 23:01   ` Eric Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://yhbt.net/unicorn/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110908191352.GA25251@dcvr.yhbt.net \
    --to=normalperson@yhbt.net \
    --cc=jhughey@engineyard.com \
    --cc=mongrel-unicorn@rubyforge.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://yhbt.net/unicorn.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).