From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS33070 50.56.128.0/17 X-Spam-Status: No, score=-0.2 required=5.0 tests=AWL shortcircuit=no autolearn=unavailable version=3.3.2 X-Original-To: archivist@yhbt.net Delivered-To: archivist@dcvr.yhbt.net Received: from rubyforge.org (50-56-192-79.static.cloud-ips.com [50.56.192.79]) by dcvr.yhbt.net (Postfix) with ESMTP id 804513F69D for ; Mon, 25 Jun 2012 04:15:51 +0000 (UTC) Received: from localhost.localdomain (localhost [127.0.0.1]) by rubyforge.org (Postfix) with ESMTP id 64B172E062; Mon, 25 Jun 2012 04:15:46 +0000 (UTC) X-Original-To: mongrel-unicorn@rubyforge.org Delivered-To: mongrel-unicorn@rubyforge.org Received: from dcvr.yhbt.net (dcvr.yhbt.net [64.71.152.64]) by rubyforge.org (Postfix) with ESMTP id 4C8A32E061 for ; Mon, 25 Jun 2012 03:59:37 +0000 (UTC) Received: from localhost (dcvr.yhbt.net [127.0.0.1]) by dcvr.yhbt.net (Postfix) with ESMTP id 4BB821F436; Mon, 25 Jun 2012 03:59:37 +0000 (UTC) Date: Mon, 25 Jun 2012 03:59:37 +0000 From: Eric Wong To: unicorn list Subject: Re: [PATCH] `kill -SIGTRAP ` Message-ID: <20120625035937.GA22367@dcvr.yhbt.net> References: <1340467944-23621-1-git-send-email-cedric@maion.com> <1340467944-23621-2-git-send-email-cedric@maion.com> <20120623185534.GA17517@dcvr.yhbt.net> <4FE6F491.6070108@maion.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <4FE6F491.6070108@maion.com> User-Agent: Mutt/1.5.20 (2009-06-14) Cc: Cedric Maion X-BeenThere: mongrel-unicorn@rubyforge.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: mongrel-unicorn-bounces@rubyforge.org Errors-To: mongrel-unicorn-bounces@rubyforge.org Cedric Maion wrote: > Eric Wong wrote: > > SIGKILL timeout is only a last line of defense when the Ruby VM itself > > is completely broken. Handling SIGTRAP implies the worker can still > > respond (and /can/ be rescued), so your SIGTRAP handler is worthless if > > SIGKILL is required to kill a process. > Sure. But if the VM is responding, being able to get a backtrace is nice. > And if it's stuck, you won't get anything indeed, but that's still an > information (in that case, one may eventually want to get a gdb > backtrace too). No? Sure it's nice. But the point is you should've had something around to handle it in your app anyways if your worker was capable of responding to SIGTRAP at all. The SIGKILL logic only exists in the master because it must run outside of the worker. > > See http://unicorn.bogomips.org/Application_Timeouts.html > Yes, I'm well aware of this. However, when you still get rare unicorn > timeouts, debugging them is not obvious. > In my case, a server in a loadbalanced farm sometimes sees all it's > unicorn workers timeout in the same minute (approx once a day at what > seems a random time) -- other servers are fine. Couldn't correlate this > with any specific network/disk/misc system/user activity yet. I might even crank the unicorn timeout sky high and have something else (per-worker) handling timeouts + debugging/dumping in this case. I recall some mailing list threads on similar topics over the years, gmane has excellent archives and I'd start there (and not the Rubyforge archives): gmane.org/gmane.comp.lang.ruby.unicorn.general The Rainbows::ThreadTimeout could be used as a starting point for a Rack middleware to debug with. git clone git://bogomips.org/rainbows cat lib/rainbows/thread_timeout.rb _______________________________________________ Unicorn mailing list - mongrel-unicorn@rubyforge.org http://rubyforge.org/mailman/listinfo/mongrel-unicorn Do not quote signatures (like this one) or top post when replying