Date | Commit message (Collapse) |
|
Rainbows! relies on the ERROR_XXX_RESPONSE constants of unicorn
4.x. Changing the constants in unicorn 4.x will break existing
versions of Rainbows!, so remove the dependency on the constants
and generate the error response dynamically.
Unlike Mongrel, unicorn is unlikely to see malicious traffic and
thus unlikely to benefit from making error messages constant.
For unicorn 5.x, we will drop these constants entirely.
(Rainbows! most likely cannot support check_client_connection
consistently across all concurrency models since some of them
pessimistically buffer all writes in userspace. However, the
extra concurrency of Rainbows! makes it less likely to be
overloaded than unicorn, so this feature is likely less useful
for Rainbows!)
|
|
This patch checks incoming connections and avoids calling the application
if the connection has been closed.
It works by sending the beginning of the HTTP response before calling
the application to see if the socket can successfully be written to.
By enabling this feature users can avoid wasting application rendering
time only to find the connection is closed when attempting to write, and
throwing out the result.
When a client disconnects while being queued or processed, Nginx will log
HTTP response 499 but the application will log a 200.
Enabling this feature will minimize the time window during which the problem
can arise.
The feature is disabled by default and can be enabled by adding
'check_client_connection true' to the unicorn config.
[ew: After testing this change, Tom Burns wrote:
So we just finished the US Black Friday / Cyber Monday weekend running
unicorn forked with the last version of the patch I had sent you. It
worked splendidly and helped us handle huge flash sales without
increased response time over the weekend.
Whereas in previous flash traffic scenarios we would see the number of
HTTP 499 responses grow past the number of real HTTP 200 responses,
over the weekend we saw no growth in 499s during flash sales.
Unexpectedly the patch also helped us ward off a DoS attack where the
attackers were disconnecting immediately after making a request.
ref: <CAK4qKG3rkfVYLyeqEqQyuNEh_nZ8yw0X_cwTxJfJ+TOU+y8F+w@mail.gmail.com>
]
Signed-off-by: Eric Wong <normalperson@yhbt.net>
|
|
In the case where preload_app is true, delay binding new
listeners until after loading the application.
Some applications have very long load times (especially Rails
apps with Ruby 1.9.2). Binding listeners early may cause a load
balancer to incorrectly believe the unicorn workers are ready to
serve traffic even while the app is being loaded.
Once a listener is bound, connect() requests from the load
balancer succeed until the listen backlog is filled. This
allows requests to pile up for a bit (depending on backlog size)
before getting rejected by the kernel. By the time the
application is loaded and ready-to-run, requests in the
listen backlog are likely stale and not useful to process.
Processes inheriting listeners do not suffer this effect, as the
old process should still be capable of serving new requests.
This change does not improve the situation for the
preload_app=false (default) use case. There may not be a
solution for preload_app=false users using large applications.
Fortunately Ruby 1.9.3+ improves load times of large
applications significantly over 1.9.2 so this should be less of
a problem in the future.
Reported via private email sent on 2012-06-29T22:59:10Z
|
|
Since there's nothing unicorn can do to avoid this error
on unconnected/halfway-connected clients, ignoring ENOTCONN
is a safe bet.
Rainbows! has long had this rescue as it called getpeername(2)
on untrusted sockets
|
|
Previously we relied on implicit socket shutdown() from the
close() syscall. However, some Rack applications fork()
(without calling exec()), creating a potentially long-lived
reference to the underlying socket in a child process. This
ends up causing nginx to wait on the socket shutdown when the
child process exits.
Calling shutdown() explicitly signals nginx (or whatever client)
that the unicorn worker is done with the socket, regardless of
the number of FD references to the underlying socket in
existence.
This was not an issue for applications which exec() since
FD_CLOEXEC is always set on the client socket.
Thanks to Patrick Wenger for discovering this. Thanks to
Hongli Lai for the tip on using shutdown() as is done in
Passenger.
ref: http://mid.gmane.org/CAOG6bOTseAPbjU5LYchODqjdF3-Ez4+M8jo-D_D2Wq0jkdc4Rw@mail.gmail.com
|
|
In some cases, EPERM may indicate a real configuration problem,
but it can also just mean the pid file is stale.
|
|
If unicorn doesn't get terminated cleanly (for example if the machine
has its power interrupted) and the pid in the pidfile gets used by
another process, the current unicorn code will exit and not start a
server. This tiny patch fixes that behaviour.
Acked-by: Eric Wong <normalperson@yhbt.net>
|
|
No need to duplicate logic here
|
|
It's possible for a SIGUSR1 signal to be received in the
worker immediately before calling IO.select. In that case,
do not clutter logging with IOError and just process the
reopen log request.
|
|
This will also be the foundation of SSL support in Rainbows!
and Zbatery. Some users may also want to use this in
Unicorn on LANs to meet certain security/auditing requirements.
Of course, Nightmare! (in whatever form) should also be able to
use it.
|
|
This prevents the stopping of all workers by SIGWINCH if you're
using a windowing system that will 'exec' unicorn from a process
that's already in a process group.
Acked-by: Eric Wong <normalperson@yhbt.net>
|
|
The old comment was confusing. We only zero the tick counter
when forking because application loading can take a long time.
Otherwise, it's always updated.
ref: http://mid.gmane.org/20110908191352.GA25251@dcvr.yhbt.net
|
|
There is no need to keep extra hashes or Proc objects around in
the heap.
|
|
I've noticed in stderr logs from some folks that (last resort)
timeouts from the master process are taking too long to activate
due to the workarounds for suspend/hibernation.
|
|
The signal handler from the master is still active and will
push the pending signal to SIG_QUEUE if a worker receives
a signal immediately after forking.
|
|
We only need the fileno in the key which we use
to generate the UNICORN_FD env. Otherwise the IO
object is accepted and understood by Ruby.
|
|
Setting the close-on-exec flag by default and closing
non-standard descriptors is proposed for Ruby 1.9.4/2.0.0.
Since Unicorn is one of the few apps to rely on FD inheritance
across exec(), we need to workaround this by redirecting each
listener FD to itself for Kernel#exec.
Ruby supports a hash as the final argument to Kernel#exec since
at least 1.9.1 (nobody cares for 1.9.0 anymore). This allows
users to backport close-on-exec by default patches to older
1.9.x installs without breaking anything.
ref: http://redmine.ruby-lang.org/issues/5041
|
|
This helps close a race condition preventing shutdown if
loading the application (preload_app=false) takes a long
time and the user decides to kil workers instead.
|
|
Future versions of Ruby may change this from the default *nix
behavior, so we need to explicitly allow FD passing via exec().
ref: http://redmine.ruby-lang.org/issues/5041
|
|
The testcase for this was broken, too, so we didn't notice
this :<
Reported-by: ghazel@gmail.com on the Rainbows! mailing list,
http://mid.gmane.org/BANLkTi=oQXK5Casq9SuGD3edeUrDPvRm3A@mail.gmail.com
|
|
It's still O(n) since we don't maintain a reverse mapping of
spawned processes, but at least we avoid the extra overhead of
creating an array every time.
|
|
Some applications/libraries may launch background threads which
can lock up the process. So we can't disable heartbeat checking
just because the main thread is sleeping. This also has the
side effect of reducing master process wakeups when all workers
are idle.
|
|
We don't want the Worker#tick= assignment to trigger after we
accept a client, since we'd drop that request when we raise the
exception that breaks us out of the worker loop.
Also, we don't want to enter IO.select with an empty LISTENERS
array so we can fail with IOError or Errno::EBADF.
|
|
A leftover from the fchmod() days
|
|
Backtraces are now formatted properly (with timestamps) and
exceptions will be logged more consistently and similar to
Logger defaults:
"#{exc.message} (#{e.class})"
backtrace.each { |line| ... }
This may break some existing monitoring scripts, but errors
will be more standardized and easier to check moving forward.
|
|
"app error" is more correct, and consistent with Rainbows!
|
|
rescuing from SystemExit and exit()-ing again is ugly, but
changes made to lower stack depth positively affect _everyone_
so we'll tolerate some ugliness here.
We'll need to disable graceful exit for some tests, too...
|
|
This means we no longer waste an extra file descriptor per
worker process in the master. Now there's no need to set a
higher file descriptor limit for systems running >= 1024
workers.
|
|
There's absolutely no need to keep the OptionParser around in
worker processes.
|
|
We always know we have zero workers at startup, so we don't
need to check before hand. SIGHUP users may suffer a small
performance decrease as a result, but there's not much we
can do about it.
|
|
This should be easier to understand and reduces garbage on
stack, too.
|
|
kgio never does reverse lookup
|
|
Ruby IO.select never raises that, actually
|
|
By avoid Array#each
|
|
ivar references using @ are slightly faster than calling
attribute methods.
|
|
Oops, it messes logging up badly.
|
|
This reduces the size of `caller` by 5 frames,
which should make backtraces easier-to-read, raising
exceptions less expensive, and reduce GC runtime.
|
|
Enabling this flag for an IPv6 TCP listener allows users to
specify IPv6-only listeners regardless of the OS default.
This should be interest to Rainbows! users.
|
|
Rainbows! wants to be able to lower this eventually...
|
|
There's an HTTP status code allocated for it in
<http://www.iana.org/assignments/http-status-codes>, so
return that instead of 400.
|
|
Using the return value of Kernel#srand actually made the
problem worse. Using the value of Kernel#rand is required
to actually get a random value to seed the OpenSSL PRNG.
Thanks to ghazel for the bug report!
|
|
Older Rainbows! redefines the ready_pipe= accessor method
to call internal after_fork hooks.
|
|
Don't clutter up our RDoc/website with things that users
of Unicorn don't need to see. This should make user-relevant
documentation easier to find, especially since Unicorn is
NOT intended to be an API.
|
|
OpenSSL seeds its PRNG with the process ID, so if a process ID
is recycled, there's a chance of indepedent workers getting
repeated PRNG sequences over a long time period iff the same
PID is used.
This only affects deployments that meet both of the following
conditions:
1) OpenSSL::Random.random_bytes is called before forking
2) worker (but not master) processes are die unexpectedly
The SecureRandom module in Ruby (and Rails) uses the OpenSSL
PRNG if available. SecureRandom is used by Rails and called
when the application is loaded, so most Rails apps with
frequently dying worker processes are affected.
Of course dying worker processes are bad and entirely the
fault of bad application/library code, not the fault of
Unicorn.
Thanks for Alexander Dymo for reporting this.
ref: http://redmine.ruby-lang.org/issues/4579
|
|
The current versions of Ruby 1.8 do not reseed the PRNG after
forking, so we'll work around that by calling Kernel#srand.
ref: http://redmine.ruby-lang.org/issues/show/4338
|
|
They should then recover and inherit writable descriptors
from the master when it respawns.
|
|
We don't want to repeatedly reclose the same IOs
and keep raising exceptions this way.
|
|
for i in `git ls-files '*.rb'`; do ruby -w -c $i; done
|
|
Response bodies may capture the block passed to each
and save it for body.close, so don't close the socket
before we have a chance to call body.close
|
|
No need to preserve the response tuplet if we're just
going to unpack it eventually.
|