unicorn Ruby/Rack server user+dev discussion/patches/pulls/bugs/help
* FreeBSD jail and unicorn
@ 2012-01-31 17:22 Philipp Bruell
  2012-01-31 18:39 ` Eric Wong
  0 siblings, 1 reply; 13+ messages in thread
From: Philipp Bruell @ 2012-01-31 17:22 UTC (permalink / raw)
  To: mongrel-unicorn@rubyforge.org; +Cc: Charles Hornberger

Hello,

I've been using unicorn for a while, but now I'm trying to run it inside a
FreeBSD jail for the first time.

The initial start of unicorn works fine and it serves all the requests.
But when I try to restart it using the USR2 signal, it (more or less)
slowly starts using more and more CPU cycles. There is no error message in
the logs and it is quite hard to reproduce that error. In 1 of 20 tries,
unicorn restarts correctly, but in the other cases I have to "kill -9" the
process. I haven't found anything that gives any indication of the cause.

I've tried unicorn version 4.1.1 and 4.2.0. The FreeBSD version is
8.2-STABLE amd64.

This is my config:
---
listen "/home/deploy/staging/unicorn.sock"
pid "/home/deploy/staging/unicorn.pid"

preload_app true

stderr_path "/home/deploy/staging/unicorn.stderr.log"
stdout_path "/home/deploy/staging/unicorn.stdout.log"

before_fork do |server, worker|
  old_pid = "#{server.config[:pid]}.oldbin"
  if old_pid != server.pid
    begin
      process_id = File.read(old_pid).to_i
      puts "sending QUIT to #{process_id}"
      Process.kill :QUIT, process_id
    rescue Errno::ENOENT, Errno::ESRCH
    end
  end
end
---


I've tried without the before_fork block, but I don't think that's the
critical part, since it never reaches the point where two master processes
exist. There is just the old master consuming all the CPU cycles.
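
For reference, the logic of the before_fork hook in the config can be
exercised outside unicorn. This is only a minimal sketch: the method name
quit_old_master and its two path arguments are hypothetical stand-ins for
"#{server.config[:pid]}.oldbin" and server.pid, not part of unicorn. It also
guards against an empty pid file, since "".to_i is 0 and Process.kill with
PID 0 signals the whole process group.

```ruby
# Hypothetical helper mirroring the before_fork hook; not a unicorn API.
# Returns the PID that was sent QUIT, or nil if nothing was signalled.
def quit_old_master(old_pid_path, current_pid_path)
  # Same check as the hook: do nothing if the two paths are identical.
  return nil if old_pid_path == current_pid_path

  process_id = File.read(old_pid_path).to_i
  # An empty or garbled pid file yields 0; kill(sig, 0) would signal the
  # caller's whole process group, so bail out instead.
  return nil unless process_id > 0

  Process.kill(:QUIT, process_id)
  process_id
rescue Errno::ENOENT, Errno::ESRCH
  # No old pid file, or the old master is already gone: nothing to do.
  nil
end
```

Calling it with the ".oldbin" path and the current pid path returns the PID
it signalled, which makes it easy to log whether the hook did anything.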

Has anyone run into the same problem? Does anyone have an idea?

Thanks in advance
Philipp

_______________________________________________
Unicorn mailing list - mongrel-unicorn@rubyforge.org
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying


* Re: FreeBSD jail and unicorn
  2012-01-31 17:22 FreeBSD jail and unicorn Philipp Bruell
@ 2012-01-31 18:39 ` Eric Wong
  2012-01-31 18:50   ` Charles Hornberger
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Wong @ 2012-01-31 18:39 UTC (permalink / raw)
  To: unicorn list; +Cc: Philipp Bruell, Charles Hornberger

Philipp Bruell <Philipp.Bruell@skrill.com> wrote:
> stderr_path "/home/deploy/staging/unicorn.stderr.log"

<snip>

> Has anyone run into the same problem? Does anyone have an idea?

Tatsuya Ono documented a workaround for jails here (see gist):
http://mid.gmane.org/CAHBuKRj09FdxAgzsefJWotexw-7JYZGJMtgUp_dhjPz9VbKD6Q@mail.gmail.com

(http://unicorn.bogomips.org/KNOWN_ISSUES.html refers to this link, too)

If that didn't work, maybe checking stderr.log will tell you something
more.

* Re: FreeBSD jail and unicorn
  2012-01-31 18:39 ` Eric Wong
@ 2012-01-31 18:50   ` Charles Hornberger
  2012-01-31 19:05     ` Eric Wong
  0 siblings, 1 reply; 13+ messages in thread
From: Charles Hornberger @ 2012-01-31 18:50 UTC (permalink / raw)
  To: Eric Wong, unicorn list; +Cc: Philipp Bruell

On 1/31/12 7:39 PM, normalperson@yhbt.net wrote:
>Philipp Bruell <Philipp.Bruell@skrill.com> wrote:
>>
>> Has anyone run into the same problem? Does anyone have an idea?
>
>Tatsuya Ono documented a workaround for jails here (see gist):
>http://mid.gmane.org/CAHBuKRj09FdxAgzsefJWotexw-7JYZGJMtgUp_dhjPz9VbKD6Q@mail.gmail.com
>
>(http://unicorn.bogomips.org/KNOWN_ISSUES.html refers to this link, too)

Philipp's gone afk for the evening, so I'll take the liberty of replying
with what I know ...

We tried the fix mentioned above, and it didn't work. We also tried
switching to unix sockets; no joy. (Actually it worked once, then refused
to work again.)


>If that didn't work, maybe checking stderr.log will tell you something
>more.

Nothing shows up in stderr.log. It's as if the master doesn't even get the
-USR2 signal. Or as if whatever it's sending to stderr is not actually
getting to the filesystem...

In any case, we don't see anything.

Any further ideas for how to debug would be much appreciated, and my
apologies in advance for mixing up any details; Philipp was doing the work
on this, not me.

-c


* Re: FreeBSD jail and unicorn
  2012-01-31 18:50   ` Charles Hornberger
@ 2012-01-31 19:05     ` Eric Wong
       [not found]       ` <CB4EBD2A.7DF%philipp.bruell@skrill.com>
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Wong @ 2012-01-31 19:05 UTC (permalink / raw)
  To: Charles Hornberger; +Cc: unicorn list, Philipp Bruell

Charles Hornberger <Charles.Hornberger@skrill.com> wrote:
> On 1/31/12 7:39 PM, normalperson@yhbt.net wrote:
> >If that didn't work, maybe checking stderr.log will tell you something
> >more.
> 
> Nothing shows up in stderr.log. It's as if the master doesn't even get the
> -USR2 signal. Or as if whatever it's sending to stderr is not actually
> getting to the filesystem...
> 
> In any case, we don't see anything.

Can you check if the signal is received in the master via truss/dtruss?

Do other signals (USR1, HUP, QUIT) work?

You might need to enable the equivalent of "-f" for strace (follow child
processes/threads) since Ruby 1.9 uses a dedicated thread for receiving
signals.
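
To answer the "do other signals work?" question without unicorn in the
picture, something like the following could be run inside the jail. It is a
minimal sketch, assuming plain Ruby 1.9 or later; check_signal_delivery is a
hypothetical helper name, and the choice of USR1/USR2/HUP and the 2-second
deadline are assumptions. It installs a handler for each signal, sends that
signal to its own process, and reports whether the handler ran in time.

```ruby
# Hypothetical helper (not part of unicorn): checks whether this Ruby
# process sees its own signals within a deadline.
SIGNALS = %w[USR1 USR2 HUP].freeze

def check_signal_delivery(signals = SIGNALS, timeout = 2.0)
  received = {}
  previous = {}

  # Install a trivial handler per signal, remembering the old one.
  signals.each do |sig|
    previous[sig] = trap(sig) { received[sig] = true }
  end

  # Send each signal to ourselves and poll until the handler has run
  # or the deadline passes.
  signals.each do |sig|
    Process.kill(sig, Process.pid)
    deadline = Time.now + timeout
    sleep 0.01 until received[sig] || Time.now > deadline
  end

  signals.each_with_object({}) { |sig, h| h[sig] = received[sig] == true }
ensure
  # Restore whatever handlers were in place before the check.
  previous.each { |sig, handler| trap(sig, handler || "DEFAULT") }
end

if __FILE__ == $PROGRAM_NAME
  check_signal_delivery.each do |sig, ok|
    puts "SIG#{sig}: #{ok ? 'delivered' : 'NOT delivered'}"
  end
end
```

This only covers a single process signalling itself, so a clean result here
does not rule out problems that need fork() to trigger.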

> Any further ideas for how to debug would be much appreciated, and my
> apologies in advance for mixing up any details; Philipp was doing the work
> on this, not me.

Also, which version of Ruby are you using?  I'm pretty familiar
with the 1.9.3 implementation; the earlier 1.9.x releases were
messier and noisier with respect to signal handling.

* Re: FreeBSD jail and unicorn
       [not found]       ` <CB4EBD2A.7DF%philipp.bruell@skrill.com>
@ 2012-02-01 18:14         ` Eric Wong
       [not found]           ` <CB5014D7.892%philipp.bruell@skrill.com>
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Wong @ 2012-02-01 18:14 UTC (permalink / raw)
  To: unicorn list; +Cc: Philipp Bruell, Charles Hornberger

Philipp Bruell <Philipp.Bruell@skrill.com> wrote:
> First of all, thank you for your fast reply.

No problem, depends on the time of day of course :>

> The behaviour details Charles described are correct and we are using ruby
> version 1.9.3.
> 
> It's good that you asked about the other signals. I've checked each of
> them and it seems to be a general signal handling problem. The
> process freaks out on each of them :-(
> 
> I've attached the output of truss -f for a QUIT signal. That signal took
> quite a long time to be processed (and used all CPU cycles), but it finally
> worked.

I only saw the output from the master process there, nothing from the
worker.  It seems like the master is OK, but trying to kill the worker
is not.  I wonder if it's related to
https://bugs.ruby-lang.org/issues/5240

With the following script, can you try sending SIGQUIT to the parent
and see what happens?

------------------------------- 8< -----------------------------
pid = fork do
  r, w = IO.pipe
  trap(:QUIT) do
    puts "SIGQUIT received in child, exiting"
    w.close
  end
  r.read
end

trap(:QUIT) do
  puts "SIGQUIT received in parent, killing child"
  Process.kill(:QUIT, pid)
  p Process.waitpid2(pid)
  exit
end
sleep 1 # wait for child to setup sig handler
puts "Child ready on #{pid}, parent on #$$"
sleep
----------------------------------------------------------------
If the above fails, try with different variables:

* without a jail on the same FreeBSD version/release/patchlevel
* Ruby 1.9.2 (which has a different signal handling implementation)
* different FreeBSD version
* different architecture[1]

Mixing either signal handling or fork()-ing in the presence of threads
is tricky.  Ruby 1.9 uses a dedicated thread internally for signal
handling, I wouldn't be surprised if there's a bug lingering somewhere
in FreeBSD or Ruby...
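
Since the failure mode is a parent spinning while it waits for the child, a
variant of the script above that polls with a deadline may be easier to run
repeatedly. This is a minimal sketch, not the script I posted: the method
name quit_child_with_deadline, the 5-second timeout, the readiness pipe and
the SIGKILL fallback are all assumptions added for illustration.

```ruby
# Hypothetical variant of the fork/QUIT test: signal the child, then poll
# for its exit with a deadline instead of blocking in waitpid forever.
def quit_child_with_deadline(timeout = 5.0)
  ready_r, ready_w = IO.pipe

  pid = fork do
    ready_r.close
    trap(:QUIT) { exit!(0) }  # exit immediately once QUIT is handled
    ready_w.write("x")        # tell the parent the handler is installed
    ready_w.close
    sleep
  end

  ready_w.close
  ready_r.read(1)             # block until the child reports it is ready
  ready_r.close

  Process.kill(:QUIT, pid)
  deadline = Time.now + timeout
  until Time.now > deadline
    # WNOHANG makes waitpid2 return nil while the child is still running.
    reaped, status = Process.waitpid2(pid, Process::WNOHANG)
    return [:exited, status] if reaped
    sleep 0.05
  end

  # Assumption: a child that has not exited by now is stuck, so force it.
  Process.kill(:KILL, pid)
  _, status = Process.waitpid2(pid)
  [:killed, status]
end

if __FILE__ == $PROGRAM_NAME
  outcome, status = quit_child_with_deadline
  puts "child #{outcome}: #{status.inspect}"
end
```

A result of :killed would point at the child never acting on QUIT, which
matches the symptom without tying up the CPU indefinitely.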

Have you checked the FreeBSD mailing lists/bug trackers?  I don't recall
seeing anything other than the aforementioned bug in ruby-core...

[1] - I expect there's ASM involved in signal/threading implementation
      details, so there's a chance it's x86_64-specific...

> The output for the USR2 signal is too long to attach to a mail, but at
> first sight, it repeats the following calls over and over again.

Don't send monster attachments, host it somewhere else so mail servers
won't reject it for wasting bandwidth.  The mailman limit on rubyforge
is apparently 256K (already huge IMHO).  Also, don't top post.

> 24864: thr_kill(0x18c32,0x1a,0x800a8edc0,0x18a86,0x7fffffbeaf80,0x80480c000) = 0 (0x0)
> 24864: select(4,{3},0x0,0x0,{0.100000 })         = 1 (0x1)
> 24864: read(3,"!",1024)                          = 1 (0x1)

OK, so the signal is received correctly by the Ruby VM in the master.
I just don't see anything in the worker, but the master does attempt
to forward SIGQUIT to the worker.

> It also seems to me that observing the processes with truss changes the
> behaviour a lot. During the observed USR2, the master process spawned a lot
> (about 30) of <defunct> processes. I have never seen this before.

Some processes react strangely to being traced.  Maybe there's something
better than truss nowadays?

* Re: FreeBSD jail and unicorn
       [not found]           ` <CB5014D7.892%philipp.bruell@skrill.com>
@ 2012-02-02 19:31             ` Eric Wong
  2012-02-02 22:27               ` Eric Wong
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Wong @ 2012-02-02 19:31 UTC (permalink / raw)
  To: unicorn list; +Cc: Philipp Bruell, Charles Hornberger

Philipp Bruell <Philipp.Bruell@skrill.com> wrote:
> On 01/02/2012 19:14, "Eric Wong" <normalperson@yhbt.net> wrote:
> >Philipp Bruell <Philipp.Bruell@skrill.com> wrote:

> The script behaves exactly like unicorn. The master receives the QUIT and
> passes it to the child. The child also receives it, but doesn't exit. While
> the master is waiting for the child to exit, it consumes all the CPU
> cycles.

Interesting, I suspect it's some bad interaction with fork() causing
signal handlers/pthreads to go bad.  I expect the following simple
script to work flawlessly since it doesn't fork:

----------------------------------------
trap(:QUIT) { exit(0) }
puts "Ready for SIGQUIT on #$$"
sleep
----------------------------------------

> I don't have the option to test without a jail, on a different FreeBSD
> version, or on a different architecture (with FreeBSD, that is; on Mac OS X
> everything works perfectly). But I tried Ruby version 1.9.2 and that works!
> So I guess it's a bug in 1.9.3 on FreeBSD.

Can you report this as a bug to the Ruby core folks on
https://bugs.ruby-lang.org/ and also to wherever the FreeBSD hackers
take bug reports?  Somebody from one of those camps should be able
to resolve the issue.

The good thing is my small sample script is enough to reproduce the
issue, so it should be easy for an experienced FreeBSD hacker to
track down.

> I've attached the truss -f output of the child process of the test script.
> But the observation with truss made the problem disappear again :-(

It could be a timing or race condition issue.  I've had strace on Linux
find/hide bugs because it slowed the program down enough.

* Re: FreeBSD jail and unicorn
  2012-02-02 19:31             ` Eric Wong
@ 2012-02-02 22:27               ` Eric Wong
  2012-02-02 22:41                 ` Eric Wong
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Wong @ 2012-02-02 22:27 UTC (permalink / raw)
  To: unicorn list; +Cc: Philipp Bruell, Charles Hornberger

Eric Wong <normalperson@yhbt.net> wrote:
> Can you report this as a bug to the Ruby core folks on
> https://bugs.ruby-lang.org/ and also to wherever the FreeBSD hackers
> take bug reports?  Somebody from one of those camps should be able
> to resolve the issue.

A total stab in the dark, but I posted this patch to ruby-core anyways
to find more testers/reviewers:

  http://mid.gmane.org/20120202221946.GA32004@dcvr.yhbt.net

Mind giving it a shot?

* Re: FreeBSD jail and unicorn
  2012-02-02 22:27               ` Eric Wong
@ 2012-02-02 22:41                 ` Eric Wong
  2012-02-02 23:41                   ` Eric Wong
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Wong @ 2012-02-02 22:41 UTC (permalink / raw)
  To: unicorn list; +Cc: Philipp Bruell, Charles Hornberger

Eric Wong <normalperson@yhbt.net> wrote:
> A total stab in the dark, but I posted this patch to ruby-core anyways
> to find more testers/reviewers:
> 
>   http://mid.gmane.org/20120202221946.GA32004@dcvr.yhbt.net

Oops, and I just posted a follow-up since the original was a no-op
due to ordering issues :x
http://mid.gmane.org/20120202223945.GA9233@dcvr.yhbt.net

* Re: FreeBSD jail and unicorn
  2012-02-02 22:41                 ` Eric Wong
@ 2012-02-02 23:41                   ` Eric Wong
  2012-02-03  2:58                     ` Eric Wong
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Wong @ 2012-02-02 23:41 UTC (permalink / raw)
  To: unicorn list; +Cc: Philipp Bruell, Charles Hornberger

Eric Wong <normalperson@yhbt.net> wrote:
> > A total stab in the dark, but I posted this patch to ruby-core anyways
> > to find more testers/reviewers:

Last attempt at a patch on this issue, I'm just shotgunning here :x

   http://bogomips.org/ruby.git/patch/?id=418827f4e41a618d91

* Re: FreeBSD jail and unicorn
  2012-02-02 23:41                   ` Eric Wong
@ 2012-02-03  2:58                     ` Eric Wong
  2012-02-07  5:21                       ` Eric Wong
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Wong @ 2012-02-03  2:58 UTC (permalink / raw)
  To: unicorn list; +Cc: Philipp Bruell, Charles Hornberger

ruby-core pointed me to the following issue:
  https://bugs.ruby-lang.org/issues/5757

So there may already be a fix in Ruby SVN, can you test?
http://svn.ruby-lang.org/repos/ruby/branches/ruby_1_9_3 r34425

* Re: FreeBSD jail and unicorn
  2012-02-03  2:58                     ` Eric Wong
@ 2012-02-07  5:21                       ` Eric Wong
  2012-02-07  7:36                         ` Philipp Bruell
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Wong @ 2012-02-07  5:21 UTC (permalink / raw)
  To: unicorn list; +Cc: Philipp Bruell, Charles Hornberger

Eric Wong <normalperson@yhbt.net> wrote:
> ruby-core pointed me to the following issue:
>   https://bugs.ruby-lang.org/issues/5757
> 
> So there may already be a fix in Ruby SVN, can you test?
> http://svn.ruby-lang.org/repos/ruby/branches/ruby_1_9_3 r34425

Btw, did anybody get a chance to try this?

While working on an unrelated project, I set up FreeBSD 8.2 and 9.0
KVM images over the weekend.  Since I had the KVM images handy, I also
tried to reproduce this issue under 1.9.3-p0 (without a jail) but was
unable to reproduce it under either 8.2 or 9.0.

I tried building a jail, but didn't have enough space for a full one.
If I get the chance, I'll see how building a partial jail goes, but I'm
not optimistic about being able to reproduce this issue under KVM
since it seems to be a race condition/timing issue.

I even had both CPU cores enabled under KVM and even installed the
virtio drivers for better performance.

I assume you guys are using SMP in the jail?

* Re: FreeBSD jail and unicorn
  2012-02-07  5:21                       ` Eric Wong
@ 2012-02-07  7:36                         ` Philipp Bruell
  2012-02-07  9:59                           ` Charles Hornberger
  0 siblings, 1 reply; 13+ messages in thread
From: Philipp Bruell @ 2012-02-07  7:36 UTC (permalink / raw)
  To: unicorn list

Hi,

Sorry for my late reply.

On 07/02/2012 07:21, "Eric Wong" <normalperson@yhbt.net> wrote:

>Eric Wong <normalperson@yhbt.net> wrote:
>> ruby-core pointed me to the following issue:
>>   https://bugs.ruby-lang.org/issues/5757
>> 
>> So there may already be a fix in Ruby SVN, can you test?
>> http://svn.ruby-lang.org/repos/ruby/branches/ruby_1_9_3 r34425
>
>Btw, did anybody get a chance to try this?

I haven't tried it yet. Currently, we are using RVM to install Ruby, but
as soon as I have some time, I'll set up a source build of Ruby and apply
some of these patches.

>
>While working on an unrelated project, I set up FreeBSD 8.2 and 9.0
>KVM images over the weekend.  Since I had the KVM images handy, I also
>tried to reproduce this issue under 1.9.3-p0 (without a jail) but was
>unable to reproduce it under either 8.2 or 9.0.

Yeah - I've also asked a colleague to test it under FreeBSD (without a jail)
and he can't reproduce it either. It seems to be a jail problem (I would run
in cycles too, if I were in jail ;-).

>I tried building a jail, but didn't have enough space for a full one.
>If I get the chance, I'll see how building a partial jail goes, but I'm
>not optimistic about being able to reproduce this issue under KVM
>since it seems to be a race condition/timing issue.
>
>I even had both CPU cores enabled under KVM and even installed the
>virtio drivers for better performance.
>
>I assume you guys are using SMP in the jail?

Yes - I'm pretty sure that SMP is involved.

Kind regards
Philipp


* Re: FreeBSD jail and unicorn
  2012-02-07  7:36                         ` Philipp Bruell
@ 2012-02-07  9:59                           ` Charles Hornberger
  0 siblings, 0 replies; 13+ messages in thread
From: Charles Hornberger @ 2012-02-07  9:59 UTC (permalink / raw)
  To: unicorn list

On 2/7/12 9:36 AM, Philipp.Bruell@skrill.com wrote:
>Yes - I'm pretty sure that SMP is involved.

The Jail is on a (big) SMP box, and we definitely have access to all the
CPUs. And since the process consumes 200% of cpu according to ps, it seems
clear that at least 2 CPUs are involved...

