From mboxrd@z Thu Jan 1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Samuel Kadolph
Newsgroups: gmane.comp.lang.ruby.rainbows.general
Subject: Re: Unicorn is killing our rainbows workers
Date: Wed, 18 Jul 2012 19:06:07 -0400
References: <20120718215222.GA11539@dcvr.yhbt.net>
In-Reply-To: <20120718215222.GA11539-yBiyF41qdooeIZ0/mPfg9Q@public.gmane.org>
To: "Rainbows! list"
Cc: Cody Fauser, ops, Harry Brundage, Jonathan Rudenberg
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Xref: news.gmane.org gmane.comp.lang.ruby.rainbows.general:373

On Wed, Jul 18, 2012 at 5:52 PM, Eric Wong wrote:
> Samuel Kadolph wrote:
>> Hey rainbows-talk,
>>
>> We have 40 servers that each run rainbows with 2 workers with 100
>> threads using ThreadPool. We're having an issue where unicorn is
>> killing the worker process.
>> We use ThreadTimeout (set to 70 seconds)
>> and originally had the unicorn timeout set to 150 seconds and we're
>> seeing unicorn eventually killing each worker. So we bumped the
>> timeout to 300 seconds and it took about 5 minutes but we started
>> seeing unicorn starting to kill workers again. You can see our stderr
>> log file (timeout at 300s) at
>> https://gist.github.com/9ec96922e55a59753997. Any insight into why
>> unicorn is killing our ThreadPool workers would help us greatly. If
>> you require additional info I would be happy to provide it.
>
> Which Ruby version/patchlevel are you using? 1.8 and 1.9 have vastly
> different thread implementations and workarounds to deal with.
>
> What C extensions are you using?
>
> ThreadTimeout might also be conflicting with some libraries you use and
> causing deadlocks. Also, ThreadTimeout might not be a good idea with
> many common libraries which:
>
> 1) use the stdlib Timeout internally
> 2) rely on ensure clauses firing
>
> ThreadTimeout turns out to be difficult to use correctly with existing
> code, so it may not be appropriate for you. Your app should use
> localized timeouts as much as possible (using timeout mechanisms built
> into libraries you use).
>
> Also, please don't use private gist (especially when posting to public
> mailing list), it requires a github account to clone from and I'll
> never require (nor encourage :P) needing any website account
> for contributing to Rainbows!, just an email address.

We're running Ruby 1.9.3-p125 with the performance patches at
https://gist.github.com/1688857. I listed the gems we use, and which of
them have C extensions, at https://gist.github.com/3139226.

We'll try running without ThreadTimeout. We don't think we're having
deadlock issues because our stress tests do not time out, but they do
get 502s when the rainbows worker gets killed during a request.
_______________________________________________
Rainbows! mailing list - rainbows-talk-GrnCvJ7WPxnNLxjTenLetw@public.gmane.org
http://rubyforge.org/mailman/listinfo/rainbows-talk
Do not quote signatures (like this one) or top post when replying
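
For reference, a minimal sketch of what the "localized timeouts" suggested
in the quoted reply could look like, assuming the app makes outbound HTTP
calls through Ruby's stdlib Net::HTTP. The thread does not say which
libraries the app actually uses, so the helper name build_http, the host
name, and the timeout values below are all hypothetical:

```ruby
require "net/http"

# Hypothetical helper (not from the thread): build a Net::HTTP client that
# carries its own timeouts, rather than relying on a process-wide or
# middleware-level timeout to interrupt a stuck request.
def build_http(host, port, open_timeout = 5, read_timeout = 10)
  http = Net::HTTP.new(host, port)   # does not connect yet
  http.open_timeout = open_timeout   # seconds allowed to open the connection
  http.read_timeout = read_timeout   # seconds allowed for each blocking read
  http
end

# Placeholder host; values chosen only for illustration.
http = build_http("upstream.example.invalid", 80)
```

With timeouts set this way, a slow upstream surfaces as an exception raised
inside the request that made the call, where the app can rescue it.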