Bug, probably related to Unicoen

unicorn Ruby/Rack server user+dev discussion/patches/pulls/bugs/help
 help / color / mirror / code / Atom feed

* Bug, probably related to Unicoen
@ 2017-09-14  8:25 Felix Yasnopolski
  2017-09-14  9:15 ` Eric Wong
  0 siblings, 1 reply; 3+ messages in thread
From: Felix Yasnopolski @ 2017-09-14  8:25 UTC (permalink / raw)
  To: unicorn-public

We use Ruby v2.2.3, Rails v4.2.4, Unicorn v5.1.0 in production.

ActiveRecord validation rules and callbacks for all models of app has
been disabled after unicorn restart (from log rotation script). Here
it is

/var/www/fitness_crm/shared/log/production.log
/var/www/fitness_crm/shared/log/unicorn.stderr.log
/var/www/fitness_crm/shared/log/unicorn.stdout.log {
        daily
        missingok
        rotate 30
        compress
        notifempty
        create 640 deploy deploy
        sharedscripts
        postrotate
                kill -s USR2 `cat
/var/www/fitness_crm/shared/tmp/pids/unicorn.pid`
        endscript
}

We've changed the script to send USR1 instead of USR2, but nothing has
changed. When USR1 is sent validations/callbacks simply keep being
disabled. Here is our unicorn configuration file

working_directory "/var/www/fitness_crm/current"
pid "/var/www/fitness_crm/shared/tmp/pids/unicorn.pid"
stdout_path "/var/www/fitness_crm/shared/log/unicorn.stdout.log"
stderr_path "/var/www/fitness_crm/shared/log/unicorn.stderr.log"

listen "/tmp/unicorn.fitness_crm_production.sock"

worker_processes 8
timeout 30

preload_app true

before_exec do |server|
  ENV["BUNDLE_GEMFILE"] = "/var/www/fitness_crm/current/Gemfile"
end

before_fork do |server, worker|
  # Disconnect since the database connection will not carry over
  if defined? ActiveRecord::Base
    ActiveRecord::Base.connection.disconnect!
  end

  # Quit the old unicorn process
  old_pid = "#{server.config[:pid]}.oldbin"
  if File.exists?(old_pid) && server.pid != old_pid
    begin
      Process.kill("QUIT", File.read(old_pid).to_i)
    rescue Errno::ENOENT, Errno::ESRCH
      # someone else did our job for us
    end
  end

  if defined?(Resque)
    Resque.redis.quit
  end

  sleep 1
end

after_fork do |server, worker|
  # Start up the database connection again in the worker
  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.establish_connection
  end

  if defined?(Resque)
    Resque.redis = 'localhost:6379'
  end
end

After this we kill unicorn and start it manually with command:

bundle exec unicorn -D -c
/var/www/fitness_crm/shared/config/unicorn.rb -E production

After this everything is good and validations and callbacks are
enabled again. Please, help to find out what is the reason of such a
behavior and how to fix it.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Bug, probably related to Unicoen
  2017-09-14  8:25 Bug, probably related to Unicoen Felix Yasnopolski
@ 2017-09-14  9:15 ` Eric Wong
  2017-09-27  7:05   ` Eric Wong
  0 siblings, 1 reply; 3+ messages in thread
From: Eric Wong @ 2017-09-14  9:15 UTC (permalink / raw)
  To: Felix Yasnopolski; +Cc: unicorn-public

Felix Yasnopolski <felix.yasnopolski@gmail.com> wrote:
> We use Ruby v2.2.3, Rails v4.2.4, Unicorn v5.1.0 in production.
> 
> ActiveRecord validation rules and callbacks for all models of app has
> been disabled after unicorn restart (from log rotation script). Here
> it is

Wow, that's a new one... I haven't used Rails in years, but it's
certainly not intentional behavior on the part of unicorn.

Hopefully somebody else with Rails knowledge can chime in.

Some questions:

Do you have anything which overrides USR1?

Are the logs getting rotated correctly?

(more below ...)

> /var/www/fitness_crm/shared/log/production.log
> /var/www/fitness_crm/shared/log/unicorn.stderr.log
> /var/www/fitness_crm/shared/log/unicorn.stdout.log {
>         daily
>         missingok
>         rotate 30
>         compress
>         notifempty
>         create 640 deploy deploy
>         sharedscripts
>         postrotate
>                 kill -s USR2 `cat
> /var/www/fitness_crm/shared/tmp/pids/unicorn.pid`
>         endscript
> }
> 
> We've changed the script to send USR1 instead of USR2, but nothing has
> changed. When USR1 is sent validations/callbacks simply keep being
> disabled. Here is our unicorn configuration file
> 
> working_directory "/var/www/fitness_crm/current"
> pid "/var/www/fitness_crm/shared/tmp/pids/unicorn.pid"
> stdout_path "/var/www/fitness_crm/shared/log/unicorn.stdout.log"
> stderr_path "/var/www/fitness_crm/shared/log/unicorn.stderr.log"

OK, that looks right.

Do your logs show the log rotations as they're happening, at least?

Anything else pop up in the logs which might help us figure out
what's wrong?  (for example, do you have dying workers)

> listen "/tmp/unicorn.fitness_crm_production.sock"
> 
> worker_processes 8
> timeout 30
> 
> preload_app true

Can you try omitting preload_app and see if you can reproduce
the problem?  The big reason preload_app is `false' by default
is it could break some code...

And probably use only 1 worker to avoid memory
overhead/excessive log spew during debugging.

> before_exec do |server|
>   ENV["BUNDLE_GEMFILE"] = "/var/www/fitness_crm/current/Gemfile"
> end
> 
> before_fork do |server, worker|
>   # Disconnect since the database connection will not carry over
>   if defined? ActiveRecord::Base
>     ActiveRecord::Base.connection.disconnect!
>   end
> 
>   # Quit the old unicorn process
>   old_pid = "#{server.config[:pid]}.oldbin"
>   if File.exists?(old_pid) && server.pid != old_pid
>     begin
>       Process.kill("QUIT", File.read(old_pid).to_i)
>     rescue Errno::ENOENT, Errno::ESRCH
>       # someone else did our job for us
>     end
>   end
> 
>   if defined?(Resque)
>     Resque.redis.quit
>   end
> 
>   sleep 1
> end
> 
> after_fork do |server, worker|
>   # Start up the database connection again in the worker
>   if defined?(ActiveRecord::Base)
>     ActiveRecord::Base.establish_connection
>   end
> 
>   if defined?(Resque)
>     Resque.redis = 'localhost:6379'
>   end
> end

If log rotations (via USR1) fails (e.g. due to permission
error), the workers die and get replaced, instead.  This allows
the log files to be released and rotated in "brute force"
fashion...

Can you reproduce the problem by sending a QUIT, TERM, or KILL
signal directly to a worker process?  (Also, note the time
interval between initial master startup and your test)

A later-spawned worker could behave differently/strangely
from an originally-spawned one...

Finally, can you reproduce your original problem by sending USR1
to the master immediately after a fresh start?

With logrotate (which only runs intermittently), the problem
could be manifesting from an app timeout or time skew of some
sort interacting poorly with preload_app and respawned workers.

> After this we kill unicorn and start it manually with command:
> 
> bundle exec unicorn -D -c
> /var/www/fitness_crm/shared/config/unicorn.rb -E production
> 
> After this everything is good and validations and callbacks are
> enabled again. Please, help to find out what is the reason of such a
> behavior and how to fix it.

This is why I gave the time skew/timeout hypothesis above.

Otherwise; I'm out of ideas, for now (and falling asleep).
Hoping somebody more awake with modern Rails knowledge can chime
in and help with things I'm missing.

Again; one of the first unicorn-specific steps for
troubleshooting is to disable preload_app and use a single
worker process which you can trace more easily.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Bug, probably related to Unicoen
  2017-09-14  9:15 ` Eric Wong
@ 2017-09-27  7:05   ` Eric Wong
  0 siblings, 0 replies; 3+ messages in thread
From: Eric Wong @ 2017-09-27  7:05 UTC (permalink / raw)
  To: Felix Yasnopolski; +Cc: unicorn-public

Felix: Ping?  Did you ever get a chance to try the things I
suggested in my previous message or figure this out another way?

ref: https://bogomips.org/unicorn-public/20170914091548.GA13634@dcvr/

Thanks.

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2017-09-27  7:05 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-14  8:25 Bug, probably related to Unicoen Felix Yasnopolski
2017-09-14  9:15 ` Eric Wong
2017-09-27  7:05   ` Eric Wong

Code repositories for project(s) associated with this public inbox

	https://yhbt.net/unicorn.git/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).