From: Samuel Kadolph
Newsgroups: gmane.comp.lang.ruby.rainbows.general
Subject: Re: Unicorn is killing our rainbows workers
Date: Tue, 31 Jul 2012 10:09:08 -0400
In-Reply-To: <20120727204040.GA2192@dcvr.yhbt.net>
To: "Rainbows! list"
Cc: Cody Fauser, ops, Harry Brundage, Jonathan Rudenberg

On Fri, Jul 27, 2012 at 4:40 PM, Eric Wong wrote:
> Samuel Kadolph wrote:
>> On Thu, Jul 26, 2012 at 8:11 PM, Eric Wong wrote:
>> >> Our ops guys have been busy so I don't have the output from lsof but
>> >> it didn't look like it was spawning any extra threads or opening any
>> >> unexplainable connections. But I think we should have been checking
>> >> the worker processes and not the master, right?
>> >
>> > Definitely check the master, too. It's the master that seems to
>> > believe it's suspended, so that makes me believe something is wrong
>> > with the master (and this is likely due to preload_app).
>> >
>> >> Haven't tried disabling preload_app yet but we have tried
>> >> I've got the output of lsof and ls at https://gist.github.com/3190171.
>
> Thanks, that's the output for the master? I don't see anything
> obviously wrong.
>
> I seem to recall the Ruby library responsible for the following log file
> also spawns its own background thread, but your "ls" only shows 2 tasks
> (instead of 3):
>
>> ruby 26564 root 9w REG 202,1 51221 529742 APP_PATH/shared/log/newrelic_agent.log
>
>> $ ls /proc/26564/task/
>> 26564 27052
>
> (While the Ruby code for the module responsible for that log file is
> technically "open", it's not Free, so I'm not comfortable looking at
> that code).

So 2 updates: yes that lsof output is from the master process and using
preload_app false solves the issue. No more killings and the
suspend/hibernation messages stopped as well.

We lost newrelic data so we're going to try putting preload_app back to
true and removing the newrelic gem.
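
The task count check quoted above (`ls /proc/26564/task/` showing 2 tasks where 3 were expected) can be scripted. What follows is a minimal Ruby sketch, not code from this thread: `task_ids`, `missing_threads?`, and the `proc_root` parameter are hypothetical names chosen for illustration, and it assumes a Linux /proc filesystem and Ruby 2.5 or later (for `Dir.children`).

```ruby
# List the kernel task (thread) IDs of a process by reading
# /proc/<pid>/task, the same directory inspected with `ls` in the
# quoted message. Linux only.
# `proc_root` is a hypothetical parameter so the helper can be pointed
# at a fixture directory instead of the real /proc.
def task_ids(pid, proc_root: "/proc")
  Dir.children(File.join(proc_root, pid.to_s, "task")).map(&:to_i).sort
rescue Errno::ENOENT
  [] # the process has exited, or /proc is not available here
end

# True when the process has fewer tasks than expected, for example when
# a library's background thread is not running.
def missing_threads?(pid, expected:, proc_root: "/proc")
  task_ids(pid, proc_root: proc_root).size < expected
end

if __FILE__ == $PROGRAM_NAME
  # Inspect the PID given on the command line, or this process itself.
  pid = Integer(ARGV.fetch(0, Process.pid))
  ids = task_ids(pid)
  puts "#{pid}: #{ids.size} task(s): #{ids.join(' ')}"
end
```

Running it against the master PID with preload_app set to true and then to false would give two counts to compare. The expected number of tasks depends on the Ruby version and on which libraries start their own threads, so treat any threshold passed to `missing_threads?` as an assumption to verify.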