Rack HTTP server for Unix and fast clients

== Design

* Simplicity: Unicorn is a traditional UNIX prefork web server.
  No threads are used at all, which makes applications easier to debug
  and fix.  When your application goes awry, a BOFH can just
  "kill -9" the runaway worker process without worrying about tearing
  all clients down; only the one client served by that worker is
  affected.  Only UNIX-like systems supporting fork() and file
  descriptor inheritance are supported.

* The Ragel->C HTTP parser is taken from Mongrel.  This is the
  only non-Ruby part and there are no plans to add any more
  non-Ruby components.

* All HTTP protocol parsing and I/O is done just like Mongrel:
    1. read/parse HTTP request in full
    2. call Rack application
    3. write HTTP response back to the client

* Like Mongrel, neither keepalive nor pipelining is supported.
  These aren't needed since Unicorn is only designed to serve
  fast, low-latency clients directly.  Do one thing, do it well;
  let nginx handle slow clients.

* Configuration is purely in Ruby and eval().  Ruby is less
  ambiguous than YAML and lets lambdas for
  before_fork/after_fork/before_exec hooks be defined inline.  An
  optional, separate config_file may be used to change the supported
  configuration settings (and also gives you plenty of rope if you RTFS
  :>)

* One master process spawns and reaps worker processes.  The
  Rack application itself is called only within the worker process (but
  can be loaded within the master).  A copy-on-write friendly garbage
  collector like Ruby Enterprise Edition can be used to minimize memory
  usage along with the "preload_app true" directive.

* The number of worker processes should be scaled to the number of
  CPUs, memory or even spindles you have.  If you have an existing
  Mongrel cluster, using the same number of processes should work.
  Let a full-HTTP-request-buffering reverse proxy like nginx manage
  concurrency to thousands of slow clients for you.  Unicorn scaling
  should only be concerned about limits of your backend system(s).

* Load balancing between worker processes is done by the OS kernel.
  All workers share a common set of listener sockets and do
  non-blocking accept() on them.  The kernel will decide which worker
  process to give a socket to and workers will sleep if there is
  nothing to accept().

* Since non-blocking accept() is used, there can be a thundering
  herd when an occasional client connects while the application
  *is not busy*.  The thundering herd problem should not affect
  applications that are running all the time since worker processes
  will only select()/accept() outside of the application dispatch.

* Blocking I/O is used for clients.  This allows a simpler code path
  to be followed within the Ruby interpreter and fewer syscalls.
  Applications that use threads should continue to work if Unicorn
  is serving LAN or localhost clients.

* Timeout implementation is done via fchmod(2) in each worker
  on a shared file descriptor to update st_ctime on the inode.
  Master process wakeups for checking on timeouts are throttled
  to once a second to minimize the performance impact and simplify
  the code path within the worker.  Neither futimes(2) nor
  pwrite(2)/pread(2) are supported by base MRI, nor are they as
  portable on UNIX systems as fchmod(2).

* SIGKILL is used to terminate the timed-out workers as reliably
  as possible on a UNIX system.

* The poor performance of select() on large FD sets is avoided
  as few file descriptors are used in each worker.
  There should be no gain from moving to highly scalable but
  unportable event notification solutions for watching few
  file descriptors.

* If the master process dies unexpectedly for any reason,
  workers will notice within :timeout/2 seconds and follow
  the master to its death.
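
The three-step request cycle described above (read/parse the request, call
the Rack application, write the response) might look roughly like the
sketch below.  The read_request helper is a naive stand-in for the real
Ragel->C parser and ignores request bodies; read_request, process_client,
demo_roundtrip and REASONS are hypothetical names used only for
illustration.

```ruby
require "socket"

REASONS = { 200 => "OK", 404 => "Not Found" }.freeze

# Naive stand-in for the real parser: request line and headers only.
def read_request(io)
  request_line = io.gets("\r\n").to_s.chomp("\r\n")
  verb, target, _version = request_line.split(" ", 3)
  path, query = target.to_s.split("?", 2)
  env = {
    "REQUEST_METHOD" => verb,
    "PATH_INFO"      => path,
    "QUERY_STRING"   => query.to_s,
  }
  # Header lines end at the first empty line.
  while (line = io.gets("\r\n")) && line != "\r\n"
    name, value = line.chomp("\r\n").split(/:\s*/, 2)
    env["HTTP_" + name.upcase.tr("-", "_")] = value
  end
  env
end

def process_client(io, app)
  env = read_request(io)                        # 1. read/parse the request
  status, headers, body = app.call(env)         # 2. call the Rack application
  io.write("HTTP/1.1 #{status} #{REASONS.fetch(status, '')}\r\n")
  headers.each { |key, value| io.write("#{key}: #{value}\r\n") }
  io.write("Connection: close\r\n\r\n")         # no keepalive, no pipelining
  body.each { |chunk| io.write(chunk) }         # 3. write the response
  body.close if body.respond_to?(:close)
end

# A socket pair stands in for a real client connection.
def demo_roundtrip(app)
  client, server = UNIXSocket.pair
  client.write("GET /demo?x=1 HTTP/1.1\r\nHost: localhost\r\n\r\n")
  process_client(server, app)
  server.close
  client.read
ensure
  client&.close
end
```

Because the whole exchange is blocking and sequential, a worker handles
exactly one client at a time, which is why slow clients are best left to a
buffering reverse proxy.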
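
To illustrate the "configuration is purely Ruby and eval()" point above,
the sketch below evaluates configuration text in the context of a small
object, so that hooks can be written inline as blocks.  TinyConfig is a
hypothetical class that only mirrors directive names mentioned in this
document; it is not Unicorn's actual configurator.

```ruby
# Hypothetical, minimal configurator: NOT Unicorn's real implementation.
class TinyConfig
  attr_reader :settings, :hooks

  def initialize
    @settings = { worker_processes: 1, preload_app: false }
    @hooks = {}
  end

  # Directives are ordinary Ruby methods.
  def worker_processes(count)
    @settings[:worker_processes] = Integer(count)
  end

  def preload_app(flag)
    @settings[:preload_app] = flag ? true : false
  end

  # Hooks are stored as Procs so they can be called later.
  def before_fork(&block)
    @hooks[:before_fork] = block
  end

  def after_fork(&block)
    @hooks[:after_fork] = block
  end

  # Evaluate configuration text with self set to this object.
  def load_string(source, name = "(config)")
    instance_eval(source, name)
    self
  end
end

# The quoted heredoc keeps "#{...}" literal until the hook actually runs.
CONFIG_TEXT = <<~'RUBY'
  worker_processes 4
  preload_app true
  before_fork { |server, worker| "about to fork worker #{worker}" }
  after_fork  { |server, worker| "forked worker #{worker}" }
RUBY
```

Loading it with TinyConfig.new.load_string(CONFIG_TEXT) leaves the
settings in a Hash and the hooks as callable Procs.  Since the text is
evaluated as Ruby, anything Ruby allows is possible in such a file,
hence the "plenty of rope".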
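
The prefork model with kernel-level load balancing described above can be
sketched as follows: the master opens one listener, forks workers that
inherit it, and each worker sleeps in select() before trying a
non-blocking accept().  This is a minimal demo under simplifying
assumptions (one listener, a one-line "protocol"), not Unicorn's actual
code; run_prefork_demo and its parameters are hypothetical.

```ruby
require "socket"

# Sketch only: a tiny prefork demo, not Unicorn's implementation.
def run_prefork_demo(nr_workers: 2, nr_requests: 4)
  listener = TCPServer.new("127.0.0.1", 0)   # port 0: let the OS pick one
  port = listener.addr[1]

  pids = Array.new(nr_workers) do
    fork do
      # Each worker inherits the listening socket from the master.
      loop do
        IO.select([listener])                # sleep until there is work
        client = listener.accept_nonblock(exception: false)
        next if client == :wait_readable     # another worker won the race
        client.gets                          # read the one-line request
        client.write("handled by #{Process.pid}\n")
        client.close
      end
    end
  end

  # The master never accepts; here it plays the role of the clients.
  replies = Array.new(nr_requests) do
    TCPSocket.open("127.0.0.1", port) do |sock|
      sock.write("ping\n")
      sock.gets.chomp
    end
  end

  replies
ensure
  # SIGKILL the workers and reap them, as a master process would.
  pids&.each { |pid| Process.kill(:KILL, pid) rescue nil }
  pids&.each { |pid| Process.wait(pid) rescue nil }
  listener&.close
end
```

Which worker answers each request is decided by the kernel.  When the
workers are idle, every one of them may wake up from select() for a
single connection and all but one will see :wait_readable, which is the
thundering herd mentioned above.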
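
The fchmod(2)-based timeout check described above can be approximated in
plain Ruby, since File#chmod on an open file goes through fchmod(2) where
the platform provides it.  The Heartbeat class below is a hypothetical
sketch of the idea and not Unicorn's internals; in a real prefork setup
the descriptor would be opened by the master and inherited by the worker
across fork().

```ruby
require "tempfile"

# Hypothetical helper: a sketch of the idea, not Unicorn's internals.
class Heartbeat
  def initialize
    @file = Tempfile.create("heartbeat")   # returns a plain File object
    File.unlink(@file.path)                # only the open descriptor is needed
    @mode = 0
  end

  # Worker side: changing the mode updates st_ctime on the inode.
  def beat
    @mode = @mode.zero? ? 1 : 0
    @file.chmod(@mode)
  end

  # Master side: seconds since the worker last touched the inode.
  def idle_seconds(now = Time.now)
    now - @file.stat.ctime
  end

  def timed_out?(timeout, now = Time.now)
    idle_seconds(now) > timeout
  end
end
```

A master loop built on this would wake up about once a second, call
timed_out? for each worker, and send SIGKILL (for example with
Process.kill(:KILL, pid)) to any worker that has exceeded the timeout.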