= The Philosophy Behind unicorn

Being a server that only runs on Unix-like platforms, unicorn is
strongly tied to the Unix philosophy of doing one thing and (hopefully)
doing it well.  Despite using HTTP, unicorn is strictly a _backend_
application server for running Rack-based Ruby applications.

== Avoid Complexity

Instead of attempting to be efficient at serving slow clients, unicorn
relies on a buffering reverse proxy to efficiently deal with slow
clients.

unicorn uses an old-fashioned preforking worker model with blocking I/O.
Our processing model is the antithesis of more modern (and theoretically
more efficient) server processing models using threads or non-blocking
I/O with events.
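
The preforking model described above can be sketched in plain Ruby
(an illustrative toy, not unicorn's actual implementation; the port,
worker count, and response body are arbitrary):

```ruby
require "socket"

# Minimal preforking sketch: the master opens the listening socket
# once, then forks workers that each run a blocking accept loop on the
# shared socket.  The kernel distributes incoming connections among
# the blocked workers.
server = TCPServer.new("127.0.0.1", 0)  # port 0: pick any free port
port = server.addr[1]

worker_pids = 2.times.map do
  fork do
    loop do
      client = server.accept  # blocking I/O, one connection at a time
      client.write("HTTP/1.1 200 OK\r\nConnection: close\r\n\r\nok")
      client.close            # no persistent connections
    end
  end
end

# The master would normally just monitor its workers; here we probe one.
sock = TCPSocket.new("127.0.0.1", port)
response = sock.read          # read until the worker closes the socket
sock.close
worker_pids.each { |pid| Process.kill(:TERM, pid); Process.wait(pid) }
puts response.split("\r\n").first
```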

=== Threads and Events Are Hard

...to many developers.  The reasons for this are beyond the scope of this
document.  unicorn avoids concurrency within each worker process so you
have fewer things to worry about when developing your application.  Of
course unicorn can use multiple worker processes to utilize multiple
CPUs or spindles.  Applications can still use threads internally, however.
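
Multiple workers are what the worker_processes directive in a unicorn
config file controls.  A minimal config might look like this
(worker_processes and listen are real unicorn configuration methods;
the count and socket path are placeholders):

```ruby
# config/unicorn.rb (path is conventional, not required)
worker_processes 4           # roughly one per CPU core or spindle
listen "/tmp/unicorn.sock"   # a local socket for the reverse proxy
```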

== Slow Clients Are Problematic

Most benchmarks we've seen don't tell you this, and unicorn doesn't
care about slow clients... but <i>you</i> should.

A "slow client" can be any client outside of your datacenter.  Network
traffic within a local network is always faster than traffic that
crosses outside of it.  The laws of physics do not allow otherwise.

Persistent connections were introduced in HTTP/1.1 to reduce the
latency of connection establishment and TCP slow start.  However, they
also waste server resources when clients are idle.

Persistent connections mean one of the unicorn worker processes
(depending on your application, it can be very memory hungry) would
spend a significant amount of its time idle keeping the connection alive
<i>and not doing anything else</i>.  Being single-threaded and using
blocking I/O, a worker cannot serve other clients while keeping a
connection alive.  Thus unicorn does not implement persistent
connections.
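
The effect can be demonstrated with a toy single-worker blocking
server (illustrative only; the "hi" payload and timeouts are
arbitrary): while one accepted connection sits idle, a second client
can still connect (the kernel's listen backlog accepts it) but is
never served until the idle client hangs up.

```ruby
require "socket"

server = TCPServer.new("127.0.0.1", 0)
port = server.addr[1]

worker = fork do
  loop do
    c = server.accept
    c.write("hi")
    c.read               # blocks until this client hangs up (idle keep-alive)
    c.close
  end
end

idle = TCPSocket.new("127.0.0.1", port)
idle.readpartial(2)                       # the worker is now parked on us
second = TCPSocket.new("127.0.0.1", port) # completes via the listen backlog
waiting = IO.select([second], nil, nil, 0.5).nil?  # no data: starved
idle.close                                # free the worker ...
served = second.readpartial(2)            # ... and the second client is served
Process.kill(:TERM, worker)
Process.wait(worker)
puts "starved while idle connection open: #{waiting}"
puts "served after idle client closed: #{served == 'hi'}"
```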

If your application responses are larger than the socket buffer or if
you're handling large requests (uploads), worker processes will also be
bottlenecked by the speed of the *client* connection.  You should
not allow unicorn to serve clients outside of your local network.

== Application Concurrency != Network Concurrency

Performance is asymmetric across the different subsystems of the machine
and parts of the network.  CPUs and main memory can process gigabytes of
data in a second; clients on the Internet are usually only capable of a
tiny fraction of that.  unicorn deployments should avoid dealing with
slow clients directly and instead rely on a reverse proxy to shield
them from the effects of slow I/O.

== Improved Performance Through Reverse Proxying

By acting as a buffer to shield unicorn from slow I/O, a reverse proxy
will inevitably incur overhead in the form of extra data copies.
However, as I/O within a local network is fast (and faster still
with local sockets), this overhead is negligible for the vast majority
of HTTP requests and responses.

The ideal reverse proxy complements the weaknesses of unicorn.
A reverse proxy for unicorn should meet the following requirements:

1. It should fully buffer all HTTP requests (and large responses).
   Each request should be "corked" in the reverse proxy and sent
   as fast as possible to the backend unicorn processes.  This is
   the most important feature to look for when choosing a
   reverse proxy for unicorn.

2. It should spend minimal time in userspace.  Network (and disk) I/O
   are system-level tasks and usually managed by the kernel.
   This may change if userspace TCP stacks become more popular in the
   future, but the reverse proxy should not waste time with
   application-level logic.  These concerns should be separated.

3. It should avoid context switches and CPU scheduling overhead.
   In many (most?) cases, network devices and their interrupts are
   handled by only one CPU at a time.  It should avoid contention
   within the system by serializing all network I/O into one (or few)
   userspace processes.  Network I/O is not a CPU-intensive task and
   it is not helpful to use multiple CPU cores (at least not for GigE).

4. It should efficiently manage persistent connections (and
   pipelining) to slow clients.  If you care to serve slow clients
   outside your network, then these features of HTTP/1.1 will help.

5. It should (optionally) serve static files.  If you have static
   files on your site (especially large ones), they are served far
   more efficiently with as few data copies as possible (e.g. with
   sendfile() to completely avoid copying the data to userspace).

nginx is the only (Free) solution we know of that meets the above
requirements.
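
A minimal nginx front-end along these lines might look like the
following (a hedged sketch: the directives are standard nginx, but the
socket path, port, and filesystem paths are placeholders):

```nginx
upstream unicorn_app {
  # requirement: fast local I/O -- a Unix socket to the backend
  server unix:/tmp/unicorn.sock fail_timeout=0;
}
server {
  listen 80;
  sendfile on;             # req. 5: zero-copy static file serving
  keepalive_timeout 65s;   # req. 4: persistent connections end here
  location / {
    proxy_buffering on;    # req. 1: fully buffer the slow-client side
    proxy_pass http://unicorn_app;
  }
  location /static/ {
    root /var/www;         # req. 5 (assumed static file location)
  }
}
```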

Indeed, the folks behind unicorn have deployed nginx as a reverse proxy not
only for Ruby applications, but also for production applications running
Apache/mod_perl, Apache/mod_php and Apache Tomcat.  In every single
case, performance improved because application servers were able to use
backend resources more efficiently and spend less time waiting on slow
I/O.

== Worse Is Better

Requirements and scope for applications change frequently and
drastically.  Thus languages like Ruby and frameworks like Rails were
built to give developers fewer things to worry about in the face of
rapid change.

On the other hand, stable protocols which host your applications (HTTP
and TCP) only change rarely.  This is why we recommend you NOT tie your
rapidly-changing application logic directly into the processes that deal
with the stable outside world.  Instead, use HTTP as a common RPC
protocol to communicate between your frontend and backend.

In short: separate your concerns.

Of course a theoretical "perfect" solution would combine the pieces
and _maybe_ give you better performance at the end of the day, but
that is not the Unix way.

== Just Worse in Some Cases

unicorn is not suited for all applications.  unicorn is optimized for
applications that are CPU/memory/disk intensive and spend little time
waiting on external resources (e.g. a database server or external API).

unicorn is highly inefficient for Comet/reverse-HTTP/push applications
where the HTTP connection spends a large amount of time idle.
Nevertheless, the ease of troubleshooting, debugging, and management of
unicorn may still outweigh the drawbacks for these applications.
