Date | Commit message (Collapse) |
|
Latency from redirects is painful, and HTTPS can protect privacy
in some cases.
|
|
We have never had any need for pipes with the default 64K
capacity on Linux. Our pipes are only used for tiny writes
in signal handlers and to perform parent shutdown detection.
With the current /proc/sys/fs/pipe-user-pages-soft
default, only 1024 pipes can be created by an unprivileged
user before Linux clamps down the pipe size to 4K (a single page)
for newly-created pipes[1].
So avoid penalizing OTHER pipe users who could benefit from the
increased capacity and use only a single page for ourselves.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/pipe.c?h=v4.18#n642
|
|
Hijackers may capture and reuse `env' indefinitely, so we must
not use it in those cases for future requests. For non-hijack
requests, we continue to reuse the `env' object to reduce
memory recycling.
Reported-and-tested-by: Sam Saffron <sam.saffron@gmail.com>
|
|
The ruby constant Socket::TCP_INFO is only defined if TCP_INFO is defined
in C, so we can just check for the presence of that ruby constant instead
of rescuing SocketError from the call to getsockopt.
|
|
On OpenBSD, getsockopt(2) does not support TCP_INFO. With the current code,
this results in a 500 for all clients if check_client_connection is enabled
on OpenBSD.
This patch rescues SocketError on the first getsockopt call, and
if SocketError is raised, it doesn't check in the future. This
should be the same behavior as if TCP_INFO was supported but
inspect did not return a string in the expected format.
|
|
raindrops 0.18+ will have Raindrops::TCP state hash for portable
mapping of TCP states to their respective numeric values. This
was necessary because TCP state numbers (and even macro names)
differ between FreeBSD and Linux (and possibly other OSes).
Favor using the Raindrops::TCP state hash if available, but
fall back to the hard-coded values since older versions of
raindrops did not support TCP_INFO on non-Linux systems.
While we're in the area, favor "const_defined?" over "defined?"
to reduce the inline constant cache footprint for branches
which are only evaluated once.
Patches to implement Raindrops::TCP for FreeBSD are available at:
https://bogomips.org/raindrops-public/20170316031652.17433-1-e@80x24.org/T/
|
|
Unlike constants and instance variables, class variable access
is not optimized in the mainline Ruby VM. Use a constant
instead, to take advantage of inline constant caching.
This further reduces runtime instruction size by avoiding a
branch by allocating the Raindrops::TCP_Info object up front.
This reduces the method size by roughly 300 bytes on 64-bit.
|
|
Ruby 2.2+ can show "struct tcp_info" as a string via
Socket::Option#inspect, and we can attempt to parse it
out to extract the information we need.
Parsing this string is inefficient, but does not depend on the
ordering of the tcp_info struct.
|
|
We can force kgio_tryaccept to return an internal class
for TCP objects by subclassing Kgio::TCPServer.
This avoids breakage in any unfortunate projects which depend on
our undocumented internal APIs, such as gctools
<https://github.com/tmm1/gctools>
|
|
* Use a frozen empty array and a class variable for TCP_Info to avoid
garbage. As far as I can tell, this shouldn't result in any garbage on
any requests (other than on the first request).
* Pass listener socket to #read to only check the client connection on
a TCP server.
* Short circuit CLOSE_WAIT after ESTABLISHED since in my testing it's
the most common state after ESTABLISHED, it makes the numbers
un-ordered, though. But comment should make it OK.
* Definition of of `check_client_connection` based on whether
Raindrops::TCP_Info is defined, instead of the class variable
approach.
* Changed the unit tests to pass a `nil` listener.
Tested on our staging environment, and still works like a dream.
I should note that I got the idea between this patch into Puma as well!
https://github.com/puma/puma/pull/1227
[ew: squashed in temporary change for oob_gc.rb, but we'll come
up with a different change to avoid breaking gctools
<https://github.com/tmm1/gctools>]
Acked-by: Eric Wong <e@80x24.org>
|
|
This ensures we won't have duplicate objects in Ruby 2.0-2.4.
For Ruby 2.5.0dev+, this avoids any duplicate cleanup
introduced as of r57471: https://bugs.ruby-lang.org/issues/13085
|
|
Ruby (MRI) 2.1 optimizes allocations away on String#freeze with
literal strings.
Furthermore, Ruby 2.2 optimizes away literal string allocations
when they are used as arguments to Hash#[] and Hash#[]=
Thus we can avoid expensive constant lookups and cache overhead
by taking advantage of advancements in Ruby.
Since Ruby 2.2 has been out for 7 months, now; it ought to be safe
to introduce minor performance regressions for folks using older
Rubies (1.9.3+ remains supported) to benefit folks on the latest
Ruby.
This should recover the performance lost in the
"reflect changes in Rack::Utils::HTTP_STATUS_CODES" change
in synthetic benchmarks.
|
|
Combined with the previous commit to eliminate the `@socket'
instance variable, this eliminates the last instance variable
in the Unicorn::HttpRequest class.
Eliminating the last instance variable avoids the creation of a
internal hash table used for implementing the "generic" instance
variables found in non-pure-Ruby classes. Method entry overhead
remains the same.
While this change doesn't do a whole lot for unicorn memory usage
where the HttpRequest is a singleton, it helps other HTTP servers
which rely on this code where thousands of clients may be connected.
|
|
This avoids the expensive generic instance variable for @socket
and exposes the socket as `env["unicorn.socket"]' to the Rack
application.
As as nice side-effect, applications may access
`env["unicorn.socket"]' as part of the API may be useful for
3rd-party bits such as Raindrops::TCP_Info for reading the tcp_info
struct on Linux-based systems.
Yes, `env["unicorn.socket"]' is a proprietary API in unicorn!
News at 11! But then again, unicorn is not the first Rack server
to implement `env["#{servername}.socket"]', either...
|
|
proc creation is expensive, so merely use a 48-byte generic ivar
hash slot for @socket instead.
|
|
Rack 1.4 and earlier will soon die out, so avoid having extra code
The only minor overhead is assigning two hash slots and
the extra hash checks when running ancient versions of Rack,
so it is unlikely anybody cares about that overhead with Rack 1.5
and later.
|
|
Literal String#freeze avoids allocations since Ruby 2.1 via the
opt_str_freeze instruction, so we can start relying on it in
some places as Ruby 2.1 adoption increases. The 100-continue
handling is a good place to start since it is an uncommonly-used
code path which benefits from size reduction and the negative
performance impact is restricted to a handful of users.
HTTP_RESPONSE_START can safely live in http_request.rb as its
usage does not cross namespace boundaries
The goal is to eventually eliminate Unicorn::Const entirely.
|
|
As far as I can tell, this was never necessary.
|
|
commit a9474624a148fe58e0944664190b259787dcf51e in rack.git
|
|
Once a connection is hijacked, we ignore it completely and leave
the connection at the mercy of the application.
|
|
Rack 1.5.0 (protocol version [1,2]) adds support for
hijacking the client socket (removing it from the control
of unicorn (or any other Rack webserver)).
Tested with rack 1.5.0.
|
|
This patch checks incoming connections and avoids calling the application
if the connection has been closed.
It works by sending the beginning of the HTTP response before calling
the application to see if the socket can successfully be written to.
By enabling this feature users can avoid wasting application rendering
time only to find the connection is closed when attempting to write, and
throwing out the result.
When a client disconnects while being queued or processed, Nginx will log
HTTP response 499 but the application will log a 200.
Enabling this feature will minimize the time window during which the problem
can arise.
The feature is disabled by default and can be enabled by adding
'check_client_connection true' to the unicorn config.
[ew: After testing this change, Tom Burns wrote:
So we just finished the US Black Friday / Cyber Monday weekend running
unicorn forked with the last version of the patch I had sent you. It
worked splendidly and helped us handle huge flash sales without
increased response time over the weekend.
Whereas in previous flash traffic scenarios we would see the number of
HTTP 499 responses grow past the number of real HTTP 200 responses,
over the weekend we saw no growth in 499s during flash sales.
Unexpectedly the patch also helped us ward off a DoS attack where the
attackers were disconnecting immediately after making a request.
ref: <CAK4qKG3rkfVYLyeqEqQyuNEh_nZ8yw0X_cwTxJfJ+TOU+y8F+w@mail.gmail.com>
]
Signed-off-by: Eric Wong <normalperson@yhbt.net>
|
|
Combines the following sequence:
http_parser.buf << socket.readpartial(0x4000)
http_parser.parse
Into:
http_parser.add_parse(socket.readpartial(0x4000))
It was too damn redundant otherwise...
|
|
It's not needed for users, so avoid confusing them.
Unicorn itself is not intended to be an API, it just
hosts Rack applications.
|
|
But allows small optimizations to be made to avoid
constant/instance variable lookups later :)
|
|
This allows users to override the current Rack spec and disable
the rewindable input requirement. This can allow applications
to use less I/O to minimize the performance impact when
processing uploads.
|
|
This should be easier for Rainbows! to use
|
|
This provides the kgio_read! method which is like readpartial,
only significantly cheaper when a client disconnects on us.
|
|
This should hopefully make the non-blocking accept()
situation more tolerable under Ruby 1.9.2.
|
|
This hides more HTTP request logic inside our object.
|
|
This should ensure we have less typing to do.
|
|
Rainbows! will be able to reuse this.
|
|
"stringio" is part of the Ruby distro and we use it in multiple
places, so avoid re-requiring it.
|
|
"[]" is slightly faster under Ruby 1.9 (but slightly
slower under 1.8).
|
|
Copy-on-write will always invalidate it regardless, and
the first request is likely to be slow for any app.
|
|
This will match what's in Rack the 1.1.0 release.
|
|
No point in bloating code for an unlikely path (and the memcpy()
vs malloc() tradeoff is debatable...)
|
|
We've started using magic comments to ensure any strings we
create are binary instead. Additionally, ensure we create any
StringIO objects with an explicit string (which default to
binary) to ensure the StringIO object is binary. This is
because StringIO.new (with no arguments) will always use the
process-wide default encoding since it does not know about
magic comments (and couldn't, really...)
|
|
It is simpler and even slightly faster in micro benchmarks
when inlined.
|
|
This gives applications more rope to play with in case they have
any reasons for changing some values of the default constants.
Freezing strings for Hash assignments still speeds up MRI, so
we'll keep on doing that for now (and as long as MRI supports
frozen strings, I expect them to always be faster for Hashes
though I'd be very happy to be proven wrong...)
|
|
This ensures any string literals that pop up in *our* code will
just be a bag of bytes. This shouldn't affect/fix/break
existing apps in most cases, but most constants will always have
the "correct" encoding (none!) to be consistent with HTTP/socket
expectations. Since this comment affects things only on a
per-source basis, it won't affect existing apps with the
exception of strings we pass to the Rack application.
This will eventually allow us to get rid of that Unicorn::Z
constant, too.
|
|
TeeInput being needed is now (once again) an uncommon code path
so there's no point in relying on global constants. While we're
at it, allow StringIO to be used in the presence of small
inputs; too.
|
|
This makes a noticeable difference on light GET/HEAD requests.
Heck, even the tests run a few seconds faster.
|
|
This should be more robust, faster and easier to deal
with than the ugly proof-of-concept regexp-based ones.
|
|
Regexps can be run against nil just fine
|
|
Anything that calls close on a rack.input body is violating
Rack::Lint; so don't waste cycles supporting them. Being
liberal in things we accept tolerates bad behavior and Unicorn
doesn't have a large userbase that would scream bloody murder if
we stopped supporting broken behavior.
|
|
We couldn't do proper namespacing for the C module so there was
a potential conflict with Init_http11() in Mongrel. This was
needed because Mongrel's HTTP parser could be used in some
applications and we may be unfortunate enough need to support
them.
|
|
There's a small memory reduction to be had when forking
oodles of processes and the Perl hacker in me still
gets confused into thinking those are arrays...
|
|
This change gives applications full control to deny clients
from uploading unwanted message bodies. This also paves the
way for doing things like upload progress notification within
applications in a Rack::Lint-compatible manner.
Since we don't support HTTP keepalive, so we have more freedom
here by being able to close TCP connections and deny clients the
ability to write to us (and thus wasting our bandwidth).
While I could've left this feature off by default indefinitely
for maximum backwards compatibility (for arguably broken
applications), Unicorn is not and has never been about
supporting the lowest common denominator.
|
|
This gives the app ability to deny clients with 417 instead of
blindly making the decision for the underlying application. Of
course, apps must be made aware of this.
|