From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on dcvr.yhbt.net
X-Spam-Level: *
X-Spam-ASN: AS33070 50.56.128.0/17
X-Spam-Status: No, score=1.3 required=5.0 tests=RDNS_NONE shortcircuit=no
        autolearn=no version=3.3.2
X-Original-To: archivist@yhbt.net
Delivered-To: archivist@dcvr.yhbt.net
Received: from rubyforge.org (unknown [50.56.192.79])
        by dcvr.yhbt.net (Postfix) with ESMTP id A65A920E51
        for ; Wed, 16 Apr 2014 03:58:44 +0000 (UTC)
Received: from localhost.localdomain (localhost [127.0.0.1])
        by rubyforge.org (Postfix) with ESMTP id 25DC82E25F;
        Wed, 16 Apr 2014 03:58:43 +0000 (UTC)
X-Original-To: mongrel-unicorn@rubyforge.org
Delivered-To: mongrel-unicorn@rubyforge.org
X-Greylist: delayed 1385 seconds by postgrey-1.31 at rubyforge;
        Wed, 16 Apr 2014 03:58:37 UTC
Received: from mail-ig0-f172.google.com (mail-ig0-f172.google.com [209.85.213.172])
        by rubyforge.org (Postfix) with ESMTP id 0C1612E25E
        for ; Wed, 16 Apr 2014 03:58:36 +0000 (UTC)
Received: by mail-ig0-f172.google.com with SMTP id hn18so593902igb.5
        for ; Tue, 15 Apr 2014 20:58:36 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20130820;
        h=x-gm-message-state:sender:from:to:cc:subject:date:message-id
        :in-reply-to:references;
        bh=kmjUtm9JWMTVBlZifQIDSFZoVV7VP1XztgCTKVsF3ak=;
        b=f3Tk0UROgSuiA81GY1lhEp/9vfKnR5cK7xUB8yI8bXAiieexPftgYS2X7xN56q3Gbp
        qp/IIXOaZhz6U1DJotdBNEnHzdzH3OL+tJlF03gEIiXJX0ybkLVHrBYj+MtRXm0I1cyV
        scC/4jkIn+6cfFw3AyRG4Jd8unUtORaR6xlHWQQc7dwvU3c3AF2DVS21Pm5kh/9CZBvF
        NUfoVORqotWMfG2dJ2k0KnEV0z7utINzrmS/0aayRvXyYp7mcN8GWGV0zZnaZwUqRcTT
        fZArIoL1Qk9HltUdZu+MlNrh0VEPT9GrQ9SPX2VAx+nv8Lx3ONJaK5gcGg6ezs9gKBPL
        l/og==
X-Gm-Message-State: ALoCoQluvSX3twZ9XcS+8aXnohpYTNPIPQsBz6vFSfyDAkqVb3zeJdza59HbIJUT1O0/uBzHzmIr
X-Received: by 10.43.159.197 with SMTP id lz5mr1781450icc.55.1397619327290;
        Tue, 15 Apr 2014 20:35:27 -0700 (PDT)
Received: from vagrant.myshopify.io.localdomain ([184.175.41.220])
        by mx.google.com
        with ESMTPSA id w4sm24878363igl.7.2014.04.15.20.35.25
        for (version=TLSv1.1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
        Tue, 15 Apr 2014 20:35:26 -0700 (PDT)
X-Google-Original-From: Camilo Lopez
From: Camilo Lopez
To: mongrel-unicorn@rubyforge.org
Subject: [PATCH] Kill lazy workers with TERM before KILL
Date: Wed, 16 Apr 2014 03:35:18 +0000
Message-Id: <1397619318-28190-2-git-send-email-camilo@shopify.com>
X-Mailer: git-send-email 1.9.2
In-Reply-To: <1397619318-28190-1-git-send-email-camilo@shopify.com>
References: <1397619318-28190-1-git-send-email-camilo@shopify.com>
Cc: Camilo Lopez , admin@camilolopez.com
X-BeenThere: mongrel-unicorn@rubyforge.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: mongrel-unicorn-bounces@rubyforge.org
Errors-To: mongrel-unicorn-bounces@rubyforge.org

We want to know what was going on with a lazy worker, KILL is
un-trappable so we are sending TERM instead, then to make sure the
process really dies we are KILLing a second later.
---
 lib/unicorn/http_server.rb | 31 +++++++++++++++++++------------
 lib/unicorn/worker.rb      |  7 ++++++-
 2 files changed, 25 insertions(+), 13 deletions(-)

diff --git a/lib/unicorn/http_server.rb b/lib/unicorn/http_server.rb
index 21cb9a1..1b2a33b 100644
--- a/lib/unicorn/http_server.rb
+++ b/lib/unicorn/http_server.rb
@@ -474,22 +474,29 @@ class Unicorn::HttpServer
   end
 
   # forcibly terminate all workers that haven't checked in in timeout seconds. The timeout is implemented using an unlinked File
+  # Workers will be killed with TERM first, so the backtrace can be printed, on
+  # the next call to murder_lazy_workers the worker will receive KILL.
   def murder_lazy_workers
     next_sleep = @timeout - 1
     now = Time.now.to_i
     WORKERS.dup.each_pair do |wpid, worker|
-      tick = worker.tick
-      0 == tick and next # skip workers that haven't processed any clients
-      diff = now - tick
-      tmp = @timeout - diff
-      if tmp >= 0
-        next_sleep > tmp and next_sleep = tmp
-        next
+      if worker.needs_sig_kill?
+        kill_worker(:KILL, wpid)
+      else
+        tick = worker.tick
+        0 == tick and next # skip workers that haven't processed any clients
+        diff = now - tick
+        tmp = @timeout - diff
+        if tmp >= 0
+          next_sleep > tmp and next_sleep = tmp
+          next
+        end
+        next_sleep = 0
+        logger.error "worker=#{worker.nr} PID:#{wpid} timeout " \
+                     "(#{diff}s > #{@timeout}s), killing"
+        kill_worker(:TERM, wpid)
+        worker.needs_sig_kill = true
       end
-      next_sleep = 0
-      logger.error "worker=#{worker.nr} PID:#{wpid} timeout " \
-                   "(#{diff}s > #{@timeout}s), killing"
-      kill_worker(:KILL, wpid) # take no prisoners for timeout violations
     end
     next_sleep <= 0 ? 1 : next_sleep
   end
@@ -606,7 +613,7 @@ class Unicorn::HttpServer
   def init_worker_process(worker)
     worker.atfork_child
     # we'll re-trap :QUIT later for graceful shutdown iff we accept clients
-    EXIT_SIGS.each { |sig| trap(sig) { exit!(0) } }
+    EXIT_SIGS.each { |sig| trap(sig) { STDERR.puts(caller); exit!(0) } }
     exit!(0) if (SIG_QUEUE & EXIT_SIGS)[0]
     WORKER_QUEUE_SIGS.each { |sig| trap(sig, nil) }
     trap(:CHLD, 'DEFAULT')
diff --git a/lib/unicorn/worker.rb b/lib/unicorn/worker.rb
index e74a1c9..4214500 100644
--- a/lib/unicorn/worker.rb
+++ b/lib/unicorn/worker.rb
@@ -11,7 +11,7 @@ require "raindrops"
 class Unicorn::Worker
   # :stopdoc:
   attr_accessor :nr, :switched
-  attr_writer :tmp
+  attr_writer :tmp, :needs_sig_kill
   attr_reader :to_io # IO.select-compatible
 
   PER_DROP = Raindrops::PAGE_SIZE / Raindrops::SIZE
@@ -25,6 +25,7 @@ class Unicorn::Worker
     @nr = nr
     @tmp = @switched = false
     @to_io, @master = Unicorn.pipe
+    @needs_sig_kill = false
   end
 
   def atfork_child # :nodoc:
@@ -149,4 +150,8 @@ class Unicorn::Worker
     Process.euid != uid and Process::UID.change_privilege(uid)
     @switched = true
   end
+
+  def needs_sig_kill?
+    @needs_sig_kill
+  end
 end
-- 
1.9.2
_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying
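
For readers following the two-pass behaviour the patch gives
murder_lazy_workers, here is a minimal standalone sketch of the
escalation logic. It is not unicorn code: FakeWorker and
escalation_pass are hypothetical names made up for illustration, no
real signals are sent, and the current time is passed in so the result
is deterministic. It only models the decision of which signal each
worker would receive on a given pass.

```ruby
# Hypothetical stand-in for Unicorn::Worker; only the fields this sketch needs.
FakeWorker = Struct.new(:nr, :tick, :needs_sig_kill)

# Returns the [signal, pid] pairs that one pass would send.
# workers maps pid => FakeWorker, timeout is in seconds, and now is an
# integer epoch time (an assumption of this sketch, injected for testing).
def escalation_pass(workers, timeout, now)
  sent = []
  workers.each_pair do |pid, worker|
    if worker.needs_sig_kill
      # TERM was already sent on an earlier pass; escalate to KILL.
      sent << [:KILL, pid]
    elsif worker.tick != 0 && (now - worker.tick) > timeout
      # First offence: TERM is trappable, so the worker can log before exiting.
      sent << [:TERM, pid]
      worker.needs_sig_kill = true
    end
    # tick == 0 means the worker has not processed any clients; it is skipped.
  end
  sent
end

workers = {
  100 => FakeWorker.new(0, 970, false), # last check-in 30s ago: over the timeout
  101 => FakeWorker.new(1, 999, false), # checked in 1s ago: healthy
  102 => FakeWorker.new(2, 0, false)    # never served a client: skipped
}

first  = escalation_pass(workers, 15, 1000)
second = escalation_pass(workers, 15, 1000)
```

In this sketch the first pass yields TERM for pid 100 only, and the
second pass yields KILL for the same pid; the other two workers are
left alone on both passes.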