cmogstored changelog since v1.0.0

Full changeset information is available at https://yhbt.net/cmogstored.git
See NEWS file for a user-oriented summary of changes

commit e1a4e5d1c0361d31fe771b6b83b0da50690635df (HEAD -> master, tag: v1.8.1)
Author: Eric Wong
Date:   2021-02-13 02:20:04 +0000

    cmogstored 1.8.1 - use default system stack size
    
    This release fixes a segfault on some systems/toolchains where
    our per-thread stack size was too small.  Given the prevalence
    of 64-bit systems nowadays, using a small stack is unlikely to
    yield any benefits.
    
    Users on 32-bit systems who wish to continue with a minimal
    stack should use "ulimit -s" in startup scripts or configure
    their process manager appropriately (e.g. setting the
    "LimitSTACK" directive described in systemd.exec(5)).
    
    Thanks to Xiao Yu for reporting and testing at our public mailbox:
    
      https://yhbt.net/cmogstored-public/CABfxMcW+kb5gwq3pSB_89P49EVv+4UkJXz+mUPQTy19AdrwbAg@mail.gmail.com/T/

commit f441da11c290373e444771d4806dfe58d4d6d972 (origin/master)
Author: Eric Wong
Date:   2021-02-13 01:03:58 +0000

    thrpool: remove stack size changing for all platforms
    
    As compilers and system C libraries change, using a non-default
    stack size is too risky and can lead to difficult-to-diagnose
    problems.
    
    Using the default stack size seems to solve the segfaults at
    http_close reported by Xiao Yu.
    
    Users on modern 64-bit systems were unlikely to find any benefit
    in using a small stack size with this code base.
    
    Users on 32-bit systems who wish to continue with a minimal
    stack should use "ulimit -s" in startup scripts or configure
    their process manager appropriately (e.g. setting the
    "LimitSTACK" directive described in systemd.exec(5)).
    
    Reported-and-tested-by: Xiao Yu
    Link: https://yhbt.net/cmogstored-public/CABfxMcW+kb5gwq3pSB_89P49EVv+4UkJXz+mUPQTy19AdrwbAg@mail.gmail.com/T/

commit fac3a390395520c10d6d0524448c9aa26768a7d1 (tag: v1.8.0)
Author: Eric Wong
Date:   2020-08-13 20:49:37 +0000

    cmogstored 1.8.0
    
    devXXX/usage files are emitted properly for systems where the
    mount point can't be resolved.  This is needed for multi-device
    filesystems such as btrfs.
    
    PUT and DELETE requests now update the in-memory representation
    of these devXXX/usage files, since the 10s update interval may
    be too infrequent for high-traffic situations.
    
    There is a new "USAGE FILES" section in the manpage documenting
    the changes from 1.7.0 and this release.
    
    Our public mail archives are now available over IMAPS.
    
    gnulib is updated to 4e082bffbcc46e68 in the tarball.

commit 3d4f0bbffae8eb4a111f5ef9cfc4f8997e774187
Author: Eric Wong
Date:   2020-08-13 20:44:37 +0000

    doc: add IMAPS, NNTPS and .onion archive URLs
    
    public-inbox.org started supporting IMAP + IMAPS a few months
    ago, and has always supported Tor .onions via NNTP.
    
    Non-TLS is still supported for older systems and users with
    oppressive firewalls.

commit 9958e4ea86dee4a2f65656356759ac537e1bfc47
Author: Eric Wong
Date:   2020-08-13 19:57:29 +0000

    m4/.gitignore: update for gnulib 4e082bffbcc46e68
    
    commit 4e082bffbcc46e68644ae0d59b4f09bf2b5feb84
    ("sys_random: Work around an uClibc bug.")

commit 3a97c98e07fdfc988199fe00f3471bb76620215b
Author: Eric Wong
Date:   2020-07-22 18:40:41 +0000

    http: update in-memory devXX/usage on PUT+DELETE
    
    Under heavy write traffic, free space changes constantly, and
    the periodic updates every 10 (or MOG_DISK_USAGE_INTERVAL)
    seconds can be too far behind.  Since we keep the usage file
    contents in-memory now for out-of-FD situations, we can update
    that without incurring extra VFS traffic.
    
    v2:
    We no longer try to use fstatvfs(2) and instead pay the cost
    of extra name lookups and just update all usage files.
    This was necessary since calculating free space while a file
    is still open can take a long time on some FSes and we need to
    send the HTTP response back ASAP to avoid timeouts on the
    client-side.  This avoids contention in the request worker
    threads and lets the mostly idle main thread do more work.

commit d5451338548c9cbfc159c5f166a4236e70d098aa
Author: Eric Wong
Date:   2020-07-22 20:03:27 +0000

    doc: add "USAGE FILES" section to manpage
    
    And fix formatting of the SIGNALS section while we're at it.

commit fc2f3298da5ad3496ff2ae7c1f6b5b5c4327decd
Author: Eric Wong
Date:   2020-07-20 03:36:48 +0000

    dev.c: emit usage for devices with unknown mount point
    
    LUKS + btrfs on Linux gives an .st_rdev value of `0', so we
    can't reliably figure out what the "device:" field in
    /devXX/usage should be without parsing /proc/partitions.
    
    Since MogileFS::Worker::Monitor only cares about the "used:"
    and "total:" fields, we'll just emit "(?)" in the device field.
    The effort of parsing /proc/partitions to correctly display a
    field that our only known consumer won't use is a waste of our
    time.

commit d7626e6cd2e71ab3587d3facc561187d8f94afa4 (tag: v1.7.3)
Author: Eric Wong
Date:   2020-03-22 00:09:36 +0000

    cmogstored 1.7.3
    
    Improve RFC 7230 conformance w.r.t. Content-Length and
    Transfer-Encoding handling in PUT requests.
    
    We now favor "Transfer-Encoding: chunked" if a Content-Length
    header is also present.  Furthermore, we no longer accept
    Transfer-Encoding values aside from "chunked", since we don't
    support gzip/compress/deflate as described in RFC 7230.

commit a4af139431e74bf0c5d8c0b361c9dc154637cfb2
Author: Eric Wong
Date:   2020-03-17 06:56:52 +0000

    http: favor chunked over Content-Length
    
    RFC 7230 is actually explicit about favoring
    "Transfer-Encoding: chunked" over a Content-Length header when
    a client specifies both.
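
The header precedence described in the 1.7.3 notes above can be sketched in C as follows.  This is only an illustration under stated assumptions, not the cmogstored parser (which is generated by Ragel): the enum, the function name and the idea of passing header values as plain strings are all hypothetical, and whitespace trimming or lists of transfer codings are left out.

```c
#include <assert.h>
#include <string.h>
#include <strings.h>

/* hypothetical result type for this sketch only */
enum body_framing {
	BODY_NONE,		/* neither header given; caller decides */
	BODY_CONTENT_LENGTH,	/* read exactly Content-Length bytes */
	BODY_CHUNKED,		/* decode the chunked transfer coding */
	BODY_REJECT		/* respond 400 and close the connection */
};

/*
 * Either argument may be NULL when the header is absent.
 * Transfer-Encoding wins over Content-Length when both are present,
 * and any transfer coding other than "chunked" is rejected.
 */
static enum body_framing
choose_framing(const char *transfer_encoding, const char *content_length)
{
	if (transfer_encoding) {
		/* header values are compared case-insensitively */
		if (strcasecmp(transfer_encoding, "chunked") == 0)
			return BODY_CHUNKED;
		return BODY_REJECT;
	}
	return content_length ? BODY_CONTENT_LENGTH : BODY_NONE;
}
```

With this shape, choose_framing("chunked", "10") selects chunked decoding even though a Content-Length is present, while choose_framing("gzip", "10") is rejected outright.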

commit 5ba9c3ef8a90a64ff34dc069d4ed89f91d38606a
Author: Eric Wong
Date:   2020-03-17 06:56:51 +0000

    http: reject non-chunked Transfer-Encoding
    
    RFC 7230 3.3.3, point 3 states:
    
    > If a Transfer-Encoding header field
    > is present in a request and the chunked transfer coding is not
    > the final encoding, the message body length cannot be determined
    > reliably; the server MUST respond with the 400 (Bad Request)
    > status code and then close the connection.
    
    And no MogileFS client is known to send "gzip", "deflate", or
    "compress" as part of the Transfer-Encoding, so we'll only
    accept "chunked".

commit 63a57fee9e75c6fad2b146a125ac8f029773a36b
Author: Eric Wong
Date:   2020-02-19 08:17:09 +0000

    build-aux/txt2pre: match '!' and '@' in URLs

commit f98c236b8b70cf8d18c0d52c51bb798bb7f29bac (tag: v1.7.2)
Author: Eric Wong
Date:   2020-02-19 01:29:04 +0000

    cmogstored 1.7.2
    
    s/bogomips.org/yhbt.net/ in all documentation, due to
    bogomips.org expiring.
    
    The tarball is also updated with the latest gnulib changes.

commit 5ee44b19b02016429b065875cf332df90fcffed3
Author: Eric Wong
Date:   2020-02-19 01:09:57 +0000

    update gnulib to f4693b0166bab83ab232dcd3cfd95906411d1110

commit 0abf9c357584c4d25d924871dc41d2b1cb9695c1
Author: Eric Wong
Date:   2020-01-18 21:08:56 +0000

    s/bogomips.org/yhbt.net/, update copyrights for 2020
    
    bogomips.org is due to expire, soon, and I'm not willing to pay
    extortionist fees to Ethos Capital/PIR/ICANN to keep a .org.
    So it's at yhbt.net, for now, but it will change again to
    whatever's affordable...  Identity is overrated.
    
    Tor users can use .onions and kick ICANN to the curb:
    
      torsocks w3m http://cmogstored.ou63pmih66umazou.onion/
      torsocks git clone http://ou63pmih66umazou.onion/cmogstored.git/
      torsocks w3m http://ou63pmih66umazou.onion/cmogstored-public/

commit 14d8e8b1fd9720bf0061a1050ec24e4833fcf8cb
Author: Eric Wong
Date:   2019-09-21 20:57:43 +0000

    TODO: a few low-priority items...

commit 3943bc911e78b0df2308c0ddd8930d9da3c996fd (tag: v1.7.1)
Author: Eric Wong
Date:   2019-05-11 20:13:14 +0000

    cmogstored 1.7.1 - Linux 5.0/5.1 epoll_pwait workaround
    
    The Linux kernel bugfix should hit mainline and stable kernels,
    soon.  But there's no reason for us to be caring if errno is
    EINTR or not...
    
    cf. https://lore.kernel.org/lkml/20190427093319.sgicqik2oqkez3wk@dcvr/
        https://lore.kernel.org/lkml/20190507043954.9020-1-deepa.kernel@gmail.com/
    
    There are also some minor build/test updates since v1.7.0 (2018-12-18):
    
          test/mgmt-usage.rb: fix mismatched indentation warning
          add .gitattributes for Ruby files
          test/mgmt_auto_adjust.rb: improve diagnostic messages
          .gitignore: add extra ignores for gnulib in Debian 9
          notify.c: workaround epoll_pwait bug in current Linux 5.0/5.1
          doc: remove mailing list subscription info

commit a92453217ef516f205e7bfcb81c7c6b2c5b3ac88
Author: Eric Wong
Date:   2019-05-12 00:45:44 +0000

    build: add .gitattributes to EXTRA_DIST

commit a37ba5f890ad9ef17a5845c4c1740c44bfe74784
Author: Eric Wong
Date:   2019-05-11 08:07:34 +0000

    doc: remove mailing list subscription info
    
    Mail subscriber lists are centralized data which is not
    commonly forkable or reproducible.  Mail archives are more
    important.

commit af80cb709474cac2eaae29bf33facdc9e13af20d
Author: Eric Wong
Date:   2019-05-11 07:50:22 +0000

    notify.c: workaround epoll_pwait bug in current Linux 5.0/5.1
    
    The bugfix should hit mainline and stable kernels, soon; but
    there's no reason for us to be caring if errno is EINTR, or
    not...
    
      https://lore.kernel.org/lkml/20190427093319.sgicqik2oqkez3wk@dcvr/
      https://lore.kernel.org/lkml/20190507043954.9020-1-deepa.kernel@gmail.com/

commit 79b949e62b8a6f6f0047edcd2bf20970481a94b1 (meltdown/master)
Author: Eric Wong
Date:   2019-04-27 21:29:43 +0000

    .gitignore: add extra ignores for gnulib in Debian 9
    
    I usually use gnulib.git, but not everybody does and it's worth
    cleaning things up a bit for this common case.
    
    Tested with gnulib 20140202+stable-2+deb9u1 in Debian 9
    (stretch).  Further updates may be needed for other common
    distros which package gnulib.

commit cd9baef52c4d6f03b6d474edf9863fd9bcfb6c31
Author: Eric Wong
Date:   2019-04-27 21:23:58 +0000

    test/mgmt_auto_adjust.rb: improve diagnostic messages
    
    Chasing down a regression in Linux 5.0:
    
      https://lkml.kernel.org/r/20190427093319.sgicqik2oqkez3wk@dcvr

commit 8ad2972f425fa7ebbc46ae9cb95be613a86265b1
Author: Eric Wong
Date:   2019-04-27 20:19:16 +0000

    add .gitattributes for Ruby files
    
    This hopefully makes our other changes easier-to-read.

commit 278e31821308e4a510f03d6c3a316a34361a0e55
Author: Eric Wong
Date:   2019-04-26 19:56:30 +0000

    test/mgmt-usage.rb: fix mismatched indentation warning
    
    Newer versions of Ruby warn on it

commit c1226981ec311d96ccfb3bce259e48538a1dbbf4 (tag: v1.7.0)
Author: Eric Wong
Date:   2018-12-18 03:40:23 +0000

    cmogstored 1.7.0
    
    The big feature in this release is that "devNNN/usage" files
    are served from memory, allowing up-to-date usage information
    even on unwritable/unreadable filesystems.  This can also be
    used to reduce spinups and wear on HDDs.
    
    "devNNN/usage" files are still updated on the FS by default for
    compatibility with existing HTTP servers, but admins may wish
    to disable updates to them by removing all permissions from the
    "usage" files:
    
            chmod 0000 $MOG_DOCROOT/dev*/usage
    
    Filesystem errors from the sendfile(2) syscalls are also logged
    to syslog.
    
    There's also a bugfix for zombies for libkqueue-on-epoll users,
    but that doesn't affect native kqueue users on *BSDs.
    
    And the usual round of gnulib, minor doc and style updates.
    
    18 changes since v1.6.0:
    
          cmogstored.h: remove unused mog_file.mmptr member
          doc: documentation for ioq
          doc: further comment updates around ioq
          build-aux/txt2pre: support '=' in URLs
          test/inherit: fix ambiguous parenthese warning
          test/inherit: stop testing Ruby itself
          doc: update URLs to HTTPS
          compat_sendfile: ensure this works without an offset
          doc/queues.txt: add key point about only retrieving ONE event
          fix trace.h dependency on probes.h
          update to gnulib.git 90f289f249a266b1afb9c63e182f5d979d17df5f
          http_get.c: log filesystem-level errors from sendfile
          serve /dev*/usage requests from memory
          doc: URL updates to reduce redirects and favor HTTPS
          test/inherit.rb: fix syntax error under Ruby 1.8
          update copyrights for 2018 and use SPDX for "GPL-3.0+"
          selfwake: enable self-pipe with kqueue
          http_parser: workaround parsing OOM in Ragel 6.10

commit bd144a77fae53a2c02f2ccda7e309ff46f739fb2
Author: Eric Wong
Date:   2018-12-07 23:56:24 +0000

    http_parser: workaround parsing OOM in Ragel 6.10
    
    Noticed in FreeBSD 11.2 where Ragel 6.10 was OOM-ing,
    this doesn't affect Ragel 6.9.
    
    TODO: make sure this is fixed upstream in Ragel.

commit 86ace01fed5ed39a48e6d21810fec93f976baa97
Author: Eric Wong
Date:   2018-11-29 19:35:55 +0000

    selfwake: enable self-pipe with kqueue
    
    This was causing my libkqueue build to stall on Linux where
    epoll_pwait exists.  We actually favor kqueue in the code for
    testing purposes, so we need to enable the self-wake pipe when
    using libkqueue if epoll_pwait is detected.

commit a4bff7526f9c9f642767e254463f22ba2c10f507
Author: Eric Wong
Date:   2018-11-28 02:03:58 +0000

    update copyrights for 2018 and use SPDX for "GPL-3.0+"
    
    copyrights updated by "update-copyright" in gnulib:
    
      git ls-files | UPDATE_COPYRIGHT_HOLDER='all contributors' \
        UPDATE_COPYRIGHT_USE_INTERVALS=2 \
        xargs /path/to/gnulib/build-aux/update-copyright
    
    While we're at it, SPDX seems to be the accepted way to
    identify licenses nowadays, so let's use it.
    
      git ls-files | xargs perl -i -p -e \
        's,GPLv3 or later.*,GPL-3.0+ ,g'

commit 2c22379e811f9ab3f2692d459c075bb193f05e89
Author: Eric Wong
Date:   2018-07-10 20:59:23 +0000

    test/inherit.rb: fix syntax error under Ruby 1.8
    
    Not sure if it's worth supporting 1.8, anymore, but parts of
    the Ruby VM test and benchmark suite still remain
    1.8-compatible...

commit 7329eab49ef73521be6aaf03d32d0403f60b432c
Author: Eric Wong
Date:   2018-11-28 01:17:12 +0000

    doc: URL updates to reduce redirects and favor HTTPS
    
    HTTPS is usually more secure and redirects slow readers down.

commit ce60ea3e4cf733aadd4ecb8ca08e2f81b498d67c
Author: Eric Wong
Date:   2018-11-27 02:03:17 +0000

    serve /dev*/usage requests from memory
    
    Filesystems may become unwritable and out-of-date "usage" files
    will cause trackers to see out-of-date information.
    
    We still write to the filesystem by default for compatibility
    with existing HTTP servers.  However, giving the "usage" file
    a 0000 mode will prevent cmogstored from overwriting it.  This
    allows admins to also reduce wear on storage devices:
    
            chmod 0000 $mogroot/dev*/usage

commit 6567228f718578373e771f92dd69daaa716fbbed
Author: Eric Wong
Date:   2018-07-09 07:37:52 +0000

    http_get.c: log filesystem-level errors from sendfile
    
    Socket errors are too common to log (especially from malicious
    clients), but filesystem errors are rare and important.

commit 4a62891c7487a6776ed7112184ef54091e04a6a1
Author: Eric Wong
Date:   2018-06-02 08:41:02 +0000

    update to gnulib.git 90f289f249a266b1afb9c63e182f5d979d17df5f

commit 780a9a25bcf657a7a81a28deea28248f7ad19d5a
Author: Eric Wong
Date:   2018-06-02 08:37:38 +0000

    fix trace.h dependency on probes.h
    
    I haven't tested with systemtap, lately; maybe something bit
    rotted.

commit 8c8916c5b7a2afdb5dcb5cc88b4e1d28fe8a5acc
Author: Eric Wong
Date:   2017-10-24 18:55:45 +0000

    doc/queues.txt: add key point about only retrieving ONE event
    
    This had become such second nature to me that I forgot to
    document it :x

commit 93c11990d215c678d254a56a7b3bc63e3a53e0de
Author: Eric Wong
Date:   2017-03-11 00:57:09 +0000

    compat_sendfile: ensure this works without an offset
    
    While we never call sendfile without an offset, some projects
    may copy our code and want to use it without an offset.

commit e5c86cb81fceb3b37019aed701b475bf40802c10
Author: Eric Wong
Date:   2017-02-09 23:07:44 +0000

    doc: update URLs to HTTPS
    
    HTTPS seems to be working well for the rest of bogomips.org
    with Let's Encrypt, so let's use it and hope it protects some
    users from snooping.

commit 5da00e0746fd4984dde43a859271398902b26c52
Author: Eric Wong
Date:   2017-01-03 06:21:00 +0000

    test/inherit: stop testing Ruby itself
    
    TCPSocket.new raises exceptions on failure.

commit d9484e983eed4be2d8457cebf6adeca39e9f1576
Author: Eric Wong
Date:   2017-01-03 06:15:54 +0000

    test/inherit: fix ambiguous parenthese warning
    
    Who tests the tests?

commit 554869605ce9d5987d5689a071e68583ad3d5b98
Author: Eric Wong
Date:   2016-12-24 09:16:42 +0000

    build-aux/txt2pre: support '=' in URLs
    
    We'll just have everything on r******** soon, I hope :p

commit 1333a3de16d7a9192286195da241ef195dbc556a
Author: Eric Wong
Date:   2016-12-24 08:50:34 +0000

    doc: further comment updates around ioq
    
    Let's not forget about this queue, it is a useful design.

commit eb41c7abde797e03dd51c0bc945f0298d0fe235c
Author: Eric Wong
Date:   2016-12-24 08:30:46 +0000

    doc: documentation for ioq
    
    It's a queue that looks like a semaphore, so document it in
    doc/queues.txt and provide pointers to perhaps-forgotten
    documentation.

commit 5752a8d1b051b1cb4e4d62e6fd1afbeb28ce7eaf
Author: Eric Wong
Date:   2016-12-16 11:04:28 +0000

    cmogstored.h: remove unused mog_file.mmptr member
    
    This was intended for zero-copy PUT support, but that is
    probably not worth it due to checksumming and the general
    unpredictability of mmap/munmap performance, especially on
    non-Linux systems.

commit 99958cf048f3f1d234b2155558e0fc672848e8e9 (tag: v1.6.0)
Author: Eric Wong
Date:   2016-08-31 03:05:16 +0000

    cmogstored 1.6.0 - minor fixes on allocation errors
    
    There are minor robustness fixes on handling errors when
    allocating memory or spawn failures on otherwise-hosed systems.
    These bugfixes will not affect real users unless the system is
    already hosed or badly overtaxed, so there's no real need to
    upgrade.
    
    There are minor portability improvements and I now test under
    FreeBSD 10.x.  The iostat test cases are relaxed a bit to
    account for virtualized devices (as iostat is less useful with
    modern
    
    17 changes since 1.5.0 (Nov 2015):
    
          Rakefile: add missing
            for Atom feed
          test/pwrite-wrap: remove unused variable and comment
          test/pwrite_wrap: squelch unnecessary output
          test/pwrite_wrap: reduce space overhead required
          update copyrights for 2016
          build-aux/txt2pre: drop CGI.pm requirement
          stdin is always redirected to /dev/null
          minor vfork/fork safety fixes
          process: try to handle OOM gracefully
          http_put: gracefully handle path allocation errors
          iostat_process: declare environ extern
          test/mgmt: relax checks for iostat mapping
          gnulib copyright update for 2016
          upgrade: avoid syslog call if execve fails
          rely on gnulib for environ portability
          INSTALL: update latest Debian stable version to 8.x
          README: stop mentioning cgit

commit 189e8e5646136d9524f51cfab9081c59584740bb
Author: Eric Wong
Date:   2016-08-31 03:04:28 +0000

    README: stop mentioning cgit
    
    I do not expect to run it much longer since it contains CSS and
    renders poorly without it.

commit d4580f32cc1b78626336cbece796d64a56c55b73
Author: Eric Wong
Date:   2016-08-26 21:09:49 +0000

    INSTALL: update latest Debian stable version to 8.x
    
    Debian 8.x (jessie) was released over a year ago :x

commit 026d9f4d635ac360f9d349ffcb50a8252719730e (origin/gl-env, pre16)
Author: Eric Wong
Date:   2016-07-18 07:17:41 +0000

    rely on gnulib for environ portability
    
    This avoids warnings on my GNU system while still working on
    FreeBSD.

commit 53030c527eaac6ea2d6acbf501569d575fef9d41
Author: Eric Wong
Date:   2016-07-17 12:52:42 +0000

    upgrade: avoid syslog call if execve fails
    
    We cannot safely call syslog on all platforms under vfork; but
    we have normal exit handling to tell us of the presence of
    execve errors, just not which.

commit 1f4e95f5887521d8df3b7cd3d4da612066d03ea6
Author: Eric Wong
Date:   2016-07-17 05:22:49 +0000

    gnulib copyright update for 2016

commit 360ef6aee6b1e4e0855377a343a6e39263b15daa
Author: Eric Wong
Date:   2016-07-17 05:16:24 +0000

    test/mgmt: relax checks for iostat mapping
    
    In the age of virtualized devices and fast solid-state storage,
    iostat information isn't as useful as it was a decade ago and
    probably less useful in tests.  So relax the tests.

commit 7389b9ba076ffd49d5c37113809f46c2bf1f38f3
Author: Eric Wong
Date:   2016-07-17 04:33:19 +0000

    iostat_process: declare environ extern
    
    This is necessary for FreeBSD and probably other non-GNU
    systems.

commit 504a5ced05f48bf8cc1a08b22ce8830b6db98d41
Author: Eric Wong
Date:   2016-06-01 22:32:37 +0000

    http_put: gracefully handle path allocation errors
    
    Failing to allocate memory should be a temporary error and be
    non-fatal.

commit a03ccc608f68f44122f83dbde7bb09e9acbbc185
Author: Eric Wong
Date:   2016-06-01 22:32:29 +0000

    process: try to handle OOM gracefully
    
    If we fail to register a process, it is not fatal since a
    process is already running.  However, we may not know about
    when to restart it when it dies.

commit a19f6bf70866e9fed34c7220f8a83d8486102821
Author: Eric Wong
Date:   2016-06-01 03:06:56 +0000

    minor vfork/fork safety fixes
    
    In case "/bin/sh" or "/dev/null" becomes unavailable during the
    lifetime of cmogstored, we will no longer crash when attempting
    to (re)start iostat.  However, your system is probably hosed
    anyways if "/bin/sh" or "/dev/null" become unavailable.
    
    This also fixes a bug where we would leak the iostat pipe if
    either fork/vfork fails.
    
    We also close an innocuous race condition where the child might
    toggle flags in the parent process and trigger an extra wakeup.
    
    Finally, we use sigprocmask in the child in case
    pthread_sigmask does not work on some systems after forking.
    This is likely only a cosmetic change.

commit d413151f5c0e3ccd3c7c7fe9d1db9112e7e83561
Author: Eric Wong
Date:   2016-06-01 03:06:55 +0000

    stdin is always redirected to /dev/null
    
    There is no reason for stdin to ever be connected to a
    terminal, ensure we have a consistent stdin for iostat
    processes and the like.

commit c6b7757b241baf82be9aec9b937478881ab0d282
Author: Eric Wong
Date:   2016-05-29 12:31:06 +0100

    build-aux/txt2pre: drop CGI.pm requirement
    
    CGI.pm is no longer in the main Perl distro, so depending on it
    is not worth the effort for a few lines.

commit a6ba02f02e4319c0bf5b8aa000ef6851905185b4
Author: Eric Wong
Date:   2016-05-29 06:14:16 +0000

    update copyrights for 2016
    
      git ls-files | UPDATE_COPYRIGHT_HOLDER='all contributors' \
        UPDATE_COPYRIGHT_USE_INTERVALS=2 \
        xargs /path/to/gnulib/build-aux/update-copyright

commit fa172db40c58ddb3894c5eec968b29df466f6b4c
Author: Eric Wong
Date:   2016-05-29 06:14:15 +0000

    test/pwrite_wrap: reduce space overhead required
    
    It's probably overkill to use 100G of space, even if it's
    sparse.

commit f57e755430ca14e2b559cbdb41825e87b82f6225
Author: Eric Wong
Date:   2016-02-01 10:49:29 +0000

    test/pwrite_wrap: squelch unnecessary output
    
    Oops, leftover from development many years ago.

commit f66d2b4739eb9439280ac29dc6e4ef45802157f5
Author: Eric Wong
Date:   2016-02-01 10:23:23 +0000

    test/pwrite-wrap: remove unused variable and comment
    
    They were blindly copied and s/search/replace/-ed from
    epoll-wrap.c

commit 49bd505dd305b381e4fff7f6d2e18d649a446e03
Author: Eric Wong
Date:   2015-11-28 01:39:08 +0000

    Rakefile: add missing
    for Atom feed
    
    Apparently this is needed for proper XHTML rendering in
    iceweasel?

commit 907c073738636b44919dc8d3c79a6cbf38114e64 (tag: v1.5.0)
Author: Eric Wong
Date:   2015-11-20 05:41:03 +0000

    cmogstored 1.5.0 - vfork, systemd, 416 codes
    
    A bunch of minor changes; most notable is systemd-style socket
    activation support.  This was easy-to-add since we've always had
    socket activation support for nginx-style SIGUSR2 upgrades.
    
    This places no link or runtime dependency on libsystemd, so the
    LISTEN_FDS and LISTEN_PID environment variables may be used in other
    init systems as well.  While I have my own reservations about
    systemd itself, I also strongly believe in using socket activation
    to prevent downtime.
    
    Existing behavior with CMOGSTORED_FD (used for SIGUSR2 upgrades)
    is now documented in the manpage and will always be supported.
    
    We've also added vfork support for Linux systems, allowing faster
    spawning of iostat if malloc is using too much memory.
    
    Behavior changes:
    
    Bad Range: headers return 416 responses in more cases for invalid
    ranges (e.g. miscalculated ranges such as "1--1"), while
    completely wrong ones (lacking a "bytes=" prefix) are ignored
    entirely as in nginx.
    
    Bugfixes:
    
    There are also some cleanups to avoid dying on OOM in more places
    on weird systems which trigger OOM.  More work on this is ongoing.
    
    Also updates to the latest gnulib.git
    commit 71d39c1644762745b94e9449c45bfd716a79a5eb
    ("autoupdate")
    along with a change which fixes a memory leak when people
    build from cmogstored.git using gnulib
    commit c6148bca89e9465fd6ba3a10d273ec4cb58c2dbe
    or later ("mountlist: add me_mntroot field on Linux machines").
    This memory leak did not affect any released tarballs of cmogstored.
    
    Note, users building from git (as opposed to the tarball) will need
    gnulib commit 41d1b6c42641a5b9e21486ca2074198ee7909bd7
    ("mountlist: add support for deallocating returned list entries")
    or later (from July 2013).
    
    There are also various documentation updates and our mailing list
    is now readable over NNTP:
    
      nntp://news.public-inbox.org/inbox.comp.file-systems.mogilefs.cmogstored

commit 4f7e8edf9f3bf734ca6bfb56756ef7cd90ffb32e
Author: Eric Wong
Date:   2015-11-20 21:35:34 +0000

    require newer gnulib for free_mount_entry support
    
    gnulib commit 41d1b6c42641a5b9e21486ca2074198ee7909bd7
    ("mountlist: add support for deallocating returned list entries")
    or later (from July 2013) is needed for free_mount_entry support
    introduced in our commit 1225f9ce4c32b3bba61ce92a487d99260a001995
    ("use free_mount_entry from gnulib instead of rolling our own").

commit d3ad6ed40305cecb1abfe30fb9bf9db047b45e07
Author: Eric Wong
Date:   2015-11-20 03:23:22 +0000

    Makefile.am: distribute txt2pre in tarball
    
    Oops.

commit 6e5770899aa986cc6b7e59f084853d87f03166ed
Author: Eric Wong
Date:   2015-11-20 03:15:04 +0000

    add cmogstored manpage to website
    
    Sometimes people will forget to install the manpage, make sure
    it's online in plain-text or HTML format.

commit c0da4eb6eeb4bec9b70aede7176a91f536e5bbe8
Author: Eric Wong
Date:   2015-11-20 01:56:27 +0000

    misc doc updates
    
    Generate pre-formatted HTML which gives us a consistent visual
    style with our mailing list archives and enhance linkability.
    ,
,
    and  are among the few useful HTML tags I'll use :P
    
    Drop the AUTHORS file, it's a pointless maintenance task and users can
    just look at git history instead (and honestly, I have zero interest in
    recognition; I only use my real name to deter GPL violations).

commit fd0a3959bb678e94719bfa454c8b3742635ca98c
Author: Eric Wong <e@80x24.org>
Date:   2015-11-20 01:10:43 +0000

    README: update contact information
    
    Most notably, our mailing list is now available over NNTP.
    Stop advertising ssoma since it's too much to expect users would
    be willing to install and use yet another new tool when
    NNTP is already standardized and our NNTP server is pretty
    efficient.

commit b45c3852dfcc153ef6ac820ec5fd54265845d296
Author: Eric Wong <e@80x24.org>
Date:   2015-11-13 21:47:50 +0000

    use vfork under Linux before execve
    
    Given the prevalence of gigantic VM footprints due to current glibc
    malloc and our potentially large number of threads, vfork can speed
    up fork used for spawning iostat and SIGUSR2 upgrades.
    
    vfork only pauses the spawning thread, so it will not affect other
    I/O threads used in cmogstored; only the non-performance-critical
    master thread.
    
    Swapping 'fork()' for 'vfork()' in the following C test program
    should show a large speedup under Linux.
    
    Changing FILL to increase or decrease memory usage will respectively
    increase or decrease the performance gain from vfork over fork.
    
    -----------------------------8<-------------------------
    /* gcc -o x x.c -Wall -O2 -lpthread && ./x */
    
            #include <sys/types.h>
            #include <sys/time.h>
            #include <unistd.h>
            #include <pthread.h>
            #include <poll.h>
            #include <stdio.h>
            #include <sys/wait.h>
            #include <stdlib.h>
            #include <string.h>
            #define FILL (1024 * 1024)
    
    static void *thfunc(void *p)
    {
            void *ptr = malloc(FILL);
            memset(ptr, 1, FILL);
            poll(0, 0, -1);
            return 0;
    }
    
    int main(void)
    {
            long i;
            void *ptr = malloc(FILL);
            memset(ptr, 1, FILL);
    
            for (i = 0; i < 100; i++) {
                    pthread_t th;
                    pthread_create(&th, 0, thfunc, (void *)i);
            }
    
            poll(0, 0, 1000);
            for (i = 0; i < 100; i++) {
                    /* swapping fork with vfork increases performance on Linux */
                    pid_t pid = fork();
                    if (pid < 0) {
                            fprintf(stderr, "ERROR: forking %m\n");
                            return 1;
                    }
                    if (pid == 0) {
                            char *argv[] = { "/bin/true", 0 };
                            char *env[] = { 0 };
                            execve(argv[0], argv, env);
                            /* _exit, not return: returning is unsafe in a vfork child */
                            _exit(1);
                    } else {
                            int s;
                            waitpid(pid, &s, 0);
                    }
            }
    
            return 0;
    }

commit 4fb84823e327de868a14c4158cedf1dc7751d3fc
Author: Eric Wong <e@80x24.org>
Date:   2015-11-11 22:48:46 +0000

    doc: document CMOGSTORED_FDS in the manpage
    
    This has always been supported internally, and we can't stop
    supporting it since we'll be supporting upgrades from old versions
    indefinitely.  So document it, as it has some minor advantages over
    the LISTEN_{FDS,PID} environment handling of systemd.

commit 1a08a350c0b504ff31acf0e3ac0b6cdfe75ef521 (tag: v1.5.0rc1)
Author: Eric Wong <e@80x24.org>
Date:   2015-11-11 21:13:26 +0000

    cmogstored 1.5.0rc1
    
    A bunch of minor changes; most notable is systemd-style socket
    activation support.  This was easy-to-add since we've always had
    socket activation support for nginx-style SIGUSR2 upgrades.
    
    This places no link or runtime dependency on libsystemd, so the
    LISTEN_FDS and LISTEN_PID environment variables may be used in other
    init systems as well.  While I have my own reservations about
    systemd itself, I also strongly believe in using socket activation
    to prevent downtime.
    
    Behavior changes:
    
    Bad Range: headers return 416 responses in more cases for invalid
    ranges (e.g. miscalculated ranges such as "1--1"), while
    completely wrong ones (lacking a "bytes=" prefix) are ignored
    entirely as in nginx.
    
    Bugfixes:
    
    There are also some cleanups to avoid dying on OOM in more places
    on weird systems which trigger OOM.  More work on this is ongoing.
    
    Also updates to the latest gnulib.git
    commit f197c2c9e5e0d12c373f26d5b3211809457bc972
    ("intprops: new public macro EXPR_SIGNED")
    along with a change which fixes a memory leak when people
    build from cmogstored.git using gnulib
    commit c6148bca89e9465fd6ba3a10d273ec4cb58c2dbe
    or later ("mountlist: add me_mntroot field on Linux machines").
    This memory leak did not affect any released tarballs of cmogstored.
    
    shortlog of changes since 1.4.3:
    
          doc: use "builder" RubyGem to generate Atom feed
          dev.c: fail gracefully on out-of-memory errors
          do not die on OOM when for mgmt paths
          HACKING: update URLs to reduce redirects
          http: return 416 errors in more cases for bad Ranges
          update .gitignores for latest autotools + gnulib
          Rakefile: remove text-only part from the Atom feed
          support systemd-style socket activation via environment
          set TCP listener options on inherited sockets
          doc: add example systemd config files
          use free_mount_entry from gnulib instead of rolling our own
          fix tmpdir dependency for slow Ruby tests
          doc: publish examples directory to website
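
The Range behavior change listed in the release notes above can be pictured with a small classifier.  This is a rough sketch under assumptions, not the cmogstored parser: the names check_range and range_result are hypothetical, only a single "first-last" or suffix range is handled (anything else after "bytes=", including multiple ranges, is reported as 416 here), and numeric overflow is not checked.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* hypothetical outcome type for this sketch only */
enum range_result {
	RANGE_IGNORE,	/* no "bytes=" prefix: serve the whole file */
	RANGE_OK,	/* satisfiable: serve a 206 with *first..*last */
	RANGE_416	/* has the prefix but cannot be satisfied */
};

/* parse unsigned decimal digits only; returns 0 when *s has no digit */
static int parse_off(const char **s, unsigned long long *out)
{
	char *end;

	if (!isdigit((unsigned char)**s))
		return 0;
	*out = strtoull(*s, &end, 10);
	*s = end;
	return 1;
}

static enum range_result
check_range(const char *hdr, unsigned long long size,
	    unsigned long long *first, unsigned long long *last)
{
	const char *p = hdr;

	assert(hdr && first && last);
	if (strncmp(p, "bytes=", 6) != 0)
		return RANGE_IGNORE;
	p += 6;

	if (*p == '-') {	/* suffix form: the final N bytes */
		unsigned long long n;

		p++;
		if (!parse_off(&p, &n) || *p != '\0' || n == 0 || size == 0)
			return RANGE_416;
		*first = n >= size ? 0 : size - n;
		*last = size - 1;
		return RANGE_OK;
	}

	if (!parse_off(&p, first) || *p++ != '-')
		return RANGE_416;
	if (*p == '\0')		/* open-ended: "bytes=N-" */
		*last = size ? size - 1 : 0;
	else if (!parse_off(&p, last) || *p != '\0')
		return RANGE_416;	/* e.g. "bytes=1--1" ends up here */

	if (*first > *last || *first >= size)
		return RANGE_416;
	if (*last >= size)
		*last = size - 1;	/* clamp to the end of the file */
	return RANGE_OK;
}
```

For a 10-byte file, "bytes=1--1" is classified as RANGE_416 and "items=0-4" as RANGE_IGNORE, matching the two cases the notes call out.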

commit 961d5ba545995250c7f2ca26600c0248ac3120f9
Author: Eric Wong <e@80x24.org>
Date:   2015-11-11 21:10:19 +0000

    doc: publish examples directory to website
    
    This might improve visibility of these scripts for use with systemd.

commit 23123554d4bf85246aeb80bc837fd94add5b4269
Author: Eric Wong <e@80x24.org>
Date:   2015-11-11 10:43:29 +0000

    fix tmpdir dependency for slow Ruby tests
    
    .slowrb tests have a different suffix and the test dependencies
    need to be split out separately.

commit 1225f9ce4c32b3bba61ce92a487d99260a001995
Author: Eric Wong <e@80x24.org>
Date:   2015-11-11 03:56:47 +0000

    use free_mount_entry from gnulib instead of rolling our own
    
    gnulib.git added the me_mntroot element in
    commit c6148bca89e9465fd6ba3a10d273ec4cb58c2dbe,
    so we would leak memory during filesystem refreshes as a result :x
    Use the gnulib-provided API (free_mount_entry) instead of freeing
    elements ourselves.

commit 25e23de2bb67ed65abb535a01ea502c78113f83a
Author: Eric Wong <e@80x24.org>
Date:   2015-11-11 03:43:36 +0000

    doc: add example systemd config files
    
    Since we'll support systemd, it's not a bad idea to include
    reasonable example files for users.

commit 42a65a32623158c5bdce234b1b431b9f5093da70
Author: Eric Wong <e@80x24.org>
Date:   2015-11-11 03:38:47 +0000

    set TCP listener options on inherited sockets
    
    systemd users may not set the correct TCP socket options for
    us, so be sure to set TCP_NODELAY, SO_KEEPALIVE, and use
    a sufficiently large listen backlog to avoid hurting performance
    for users who bind sockets outside of cmogstored.
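
    A minimal sketch of applying these options to an inherited
    listener follows.  The function name and the caller-supplied
    backlog are illustrative assumptions, not the actual cmogstored
    code.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Apply listener options to a socket which was bound by another process.
 * Hypothetical helper for illustration only.
 * Returns 0 on success, -1 (with errno set) on the first failure.
 */
int demo_tune_inherited_listener(int fd, int backlog)
{
    int on = 1;

    assert(fd >= 0);

    /* disable Nagle's algorithm */
    if (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on)) != 0)
        return -1;

    /* let the kernel probe idle connections */
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) != 0)
        return -1;

    /* on Linux, listen() on an already-listening socket updates the backlog */
    return listen(fd, backlog);
}
```

    A caller would invoke this once per inherited descriptor before
    accepting connections on it.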

commit 0312c1e6220ef4280268a0f48f24db90738037bd
Author: Eric Wong <e@80x24.org>
Date:   2015-11-11 01:43:06 +0000

    support systemd-style socket activation via environment
    
    While I have my reservations about systemd, socket activation alone
    is a good idea and we already have existing infrastructure for
    supporting it in SIGUSR2 upgrades.
    
    We are intentionally avoiding linkage to libsystemd to avoid dealing
    with ABI compatibility issues between old and new systems.  This
    also allows us to integrate more easily with non-systemd systems
    which use the same environment variables as systemd.
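
    For illustration, the environment protocol can be handled
    without libsystemd roughly as follows.  This is a sketch, not the
    cmogstored implementation: the function name and the 1024
    descriptor limit are assumptions.  LISTEN_PID, LISTEN_FDS and the
    starting descriptor number 3 are the systemd conventions.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>

#define DEMO_LISTEN_FDS_START 3 /* first inherited descriptor number */

/*
 * Return how many listen sockets were passed via the environment, or 0
 * if the variables are absent, malformed, or meant for another process.
 * The arguments are the values of LISTEN_PID and LISTEN_FDS.
 */
int demo_inherited_fd_count(const char *listen_pid, const char *listen_fds,
                            pid_t self)
{
    char *end;
    long pid, nfds;

    assert(self > 0);
    if (!listen_pid || !listen_fds)
        return 0;

    errno = 0;
    pid = strtol(listen_pid, &end, 10);
    if (errno || end == listen_pid || *end != '\0' || pid != (long)self)
        return 0;

    errno = 0;
    nfds = strtol(listen_fds, &end, 10);
    /* 1024 is an arbitrary sanity limit chosen for this sketch */
    if (errno || end == listen_fds || *end != '\0' || nfds <= 0 || nfds > 1024)
        return 0;

    return (int)nfds;
}
```

    A caller would pass getenv("LISTEN_PID"), getenv("LISTEN_FDS")
    and getpid(), then treat descriptors DEMO_LISTEN_FDS_START up to
    DEMO_LISTEN_FDS_START + count - 1 as inherited listeners.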

commit 97ade9d8d5d751c197b61faee5f3ae6589b6b432
Author: Eric Wong <e@80x24.org>
Date:   2015-11-10 20:25:49 +0000

    Rakefile: remove text-only part from the Atom feed
    
    The pre-formatted HTML is readable as raw XML, and feed readers
    tend to have no problem rendering the HTML, so there's no point
    in nearly doubling our bandwidth usage on the text-only part
    given we're already serving XML.
    
    While we're at it, disable XML indentation to avoid wasting space;
    it doesn't significantly hamper readability, either.

commit 0c7c2d0c7d4cb89704c4e75c7194edf2bfd59686
Author: Eric Wong <e@80x24.org>
Date:   2015-11-10 01:22:59 +0000

    update .gitignores for latest autotools + gnulib
    
    Tested on automake 1:1.14.1-4 on Debian jessie,
    and automake 1:1.11.6-1 on Debian wheezy.
    
    gnulib was tested on
    commit 36d982f39b683d0266b9c6ff1e01cbfc94bd97f6
    ("test-timespec: fix typo in previous change") from
    git://git.savannah.gnu.org/gnulib.git

commit f715f6f228f9da83309a515a94de26fa3766b230
Author: Eric Wong <e@80x24.org>
Date:   2015-11-09 00:51:32 +0000

    http: return 416 errors in more cases for bad Ranges
    
    For completely unparseable Range: headers, we'll ignore them
    entirely as nginx does.  However, if /bytes=/ is matched, we'll
    start returning 416 errors instead of 400.
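
    The decision described above can be sketched as below.  This is
    a simplified, hand-written classifier for single ranges only; the
    real parser is generated by Ragel and all names here are
    hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

enum demo_range_result {
    DEMO_RANGE_IGNORE, /* header not understood: serve the whole file */
    DEMO_RANGE_OK,     /* satisfiable: serve 206 Partial Content */
    DEMO_RANGE_416     /* "bytes=" matched, but the range is unusable */
};

static int demo_isdigit(char c)
{
    return c >= '0' && c <= '9';
}

enum demo_range_result demo_classify_range(const char *value, long long size)
{
    const char *p;
    char *end;
    long long first, last;

    assert(size >= 0);
    if (!value || strncmp(value, "bytes=", 6) != 0)
        return DEMO_RANGE_IGNORE;
    p = value + 6;

    if (*p == '-') { /* suffix form: "bytes=-N" is the last N bytes */
        if (!demo_isdigit(p[1]))
            return DEMO_RANGE_416;
        errno = 0;
        last = strtoll(p + 1, &end, 10);
        if (errno || *end != '\0' || last == 0 || size == 0)
            return DEMO_RANGE_416;
        return DEMO_RANGE_OK;
    }

    if (!demo_isdigit(*p))
        return DEMO_RANGE_416;
    errno = 0;
    first = strtoll(p, &end, 10);
    if (errno || *end != '-' || first >= size)
        return DEMO_RANGE_416;

    p = end + 1;
    if (*p == '\0') /* open-ended form: "bytes=N-" */
        return DEMO_RANGE_OK;
    if (!demo_isdigit(*p))
        return DEMO_RANGE_416;
    errno = 0;
    last = strtoll(p, &end, 10);
    if (errno || *end != '\0' || last < first)
        return DEMO_RANGE_416;
    return DEMO_RANGE_OK;
}
```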

commit 225d5fb10474d853261b6ee2f9ceeff9c2bd73c6
Author: Eric Wong <e@80x24.org>
Date:   2015-08-29 05:22:27 +0000

    HACKING: update URLs to reduce redirects
    
    The ragel link no longer worked, actually...

commit 45bfce46d24db91d25b85a5115c2b41d4a1484fc
Author: Eric Wong <e@80x24.org>
Date:   2015-08-23 20:49:52 +0000

    do not die on OOM when for mgmt paths
    
    This also makes trywrite OOM-aware and will simulate a write error
    on allocation.

commit 7754b9ffc1b496170498f78fd2f05409dd0fb962
Author: Eric Wong <e@80x24.org>
Date:   2015-08-17 06:00:30 +0000

    dev.c: fail gracefully on out-of-memory errors
    
    The rest of cmogstored shall be updated to fail gracefully on OOM
    in due time.  It may take a while, since not many systems encounter
    this, but we shall become more robust as time goes on.

commit dc55a5b5bdd60850553ebf01adfe357d2a2a68b8
Author: Eric Wong <e@80x24.org>
Date:   2015-07-28 20:58:49 +0000

    doc: use "builder" RubyGem to generate Atom feed
    
    Nokogiri takes too long to build and install due to the C extension
    and bundled library.  Prefer a widely-used pure-Ruby gem instead.

commit a76af438da94e0d7211d4602b7fb00f2beb5e74e (tag: v1.4.3)
Author: Eric Wong <e@80x24.org>
Date:   2015-03-09 22:51:59 +0000

    cmogstored 1.4.3 - mostly non-GNU/Linux fixups
    
    For all platforms, the device scanning thread at startup
    may not handle EINTR properly.  This bug only manifested at
    startup and does not affect running instances.  However, this
    bug is also readily apparent on newer versions of FreeBSD
    which support the ppoll function call.
    
    Thanks to Mykola Golub <trociny@FreeBSD.org> for the bug report
    which led to this release.
    
    For systems lacking epoll_pwait (older GNU/Linux, all *BSDs),
    there is also a bugfix for systems which experience signal spam
    leading to errno clobbering in the main thread.  This bug was
    only noticed due to a bug report against Ruby:
    
            https://bugs.ruby-lang.org/issues/10866
    
    There is no need to upgrade if 1.4.1 is already running well
    on modern GNU/Linux systems capable of epoll_pwait.  But then
    again nginx-style SIGUSR2 upgrades are transparent to clients.
    
    shortlog since 1.4.2:
    
          Makefile.am: fix publish rule for website
          Fix assertion failure during startup
          avoid relying on ppoll as a cancellation point
          preserve errno when inside sig handler for self-pipe

commit d33c7cba557ae40fb55446d841e084a74eacb425
Author: Eric Wong <e@80x24.org>
Date:   2015-03-09 21:18:00 +0000

    preserve errno when inside sig handler for self-pipe
    
    We must not clobber errno of the main thread inside signal
    handler in case write fails.
    
    This bug only affects systems without epoll_pwait where the
    self-pipe is required, so it does not affect modern GNU/Linux
    systems; but does affect FreeBSD systems and anybody else
    relying on kqueue.
    
    Thanks to Steven Stewart-Gallus for a Ruby bug report which
    inspired this fix: https://bugs.ruby-lang.org/issues/10866
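
    The usual shape of such a fix is to save errno on entry to the
    handler and restore it before returning.  The sketch below shows
    that pattern with hypothetical names; it is not copied from
    cmogstored.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

static int demo_selfpipe_wr = -1; /* write end of the self-pipe */

/* Record the write end; call this before installing the handler. */
void demo_selfpipe_set_writer(int fd)
{
    assert(fd >= -1);
    demo_selfpipe_wr = fd;
}

/* Signal handler: wake the main loop without disturbing its errno. */
void demo_sig_wake(int signum)
{
    int saved_errno = errno;
    ssize_t w;

    (void)signum;
    /* write(2) is async-signal-safe, but may set errno (e.g. EAGAIN) */
    w = write(demo_selfpipe_wr, "", 1);
    (void)w;
    errno = saved_errno;
}
```

    Without the save/restore, a failed write inside the handler could
    replace the errno value that interrupted code in the main thread
    was about to inspect.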
    
    Cc: Mykola Golub <trociny@FreeBSD.org>
    Cc: Steven Stewart-Gallus <sstewartgallus00@mylangara.bc.ca>

commit 58cec82abb8b1e6feea090c72806bd9d8f693a37
Author: Eric Wong <e@80x24.org>
Date:   2015-03-09 20:40:25 +0000

    avoid relying on ppoll as a cancellation point
    
    While glibc supports ppoll, ppoll is not standardized and
    apparently is not a cancellation point in some versions of FreeBSD
    based on Mykola Golub's bug report in
    <20150309151851.GC2195@gmail.com>
    
    Reported-by: Mykola Golub <trociny@FreeBSD.org>

commit c659acbb8d7a6b0c8098646981124a47f15cceae
Author: Eric Wong <e@80x24.org>
Date:   2015-03-09 20:22:23 +0000

    Fix assertion failure during startup
    
    During the initial device scan, it is possible for the waiter to be
    interrupted while awaiting cancellation.  We must account for this
    on all platforms regardless of whether pselect or ppoll is used.
    
    Reported-by: Mykola Golub <trociny@FreeBSD.org>

commit 1a7f32d0d8a48b9f26f595d0fa9f5db0c657bc3a
Author: Eric Wong <e@80x24.org>
Date:   2015-03-06 02:30:07 +0000

    Makefile.am: fix publish rule for website
    
    Oops, we cannot have zero-byte gzipped files :x

commit 03f957c256ca7a686e097779433eca73fcda22a6 (tag: v1.4.2)
Author: Eric Wong <e@80x24.org>
Date:   2015-03-06 02:18:14 +0000

    cmogstored 1.4.2
    
    * Makefile.am: gzip README and associated data
    * manpage: update contact and copyright information
    * update copyrights to 2014 (and all contributors)
    * doc/design.txt: add a few more notes on compromises
    * http_dav: log 500 errors from DELETE requests
    * tapset/http_access_log: note CLF differences
    * copyright updates for 2015

commit 4ca5fa15f45dd8512d1244db1ca24d5624e483d4
Author: Eric Wong <e@80x24.org>
Date:   2015-03-06 02:03:34 +0000

    copyright updates for 2015
    
    Via update-copyright in gnulib, also added a few copyrights
    to non-trivial files.
    
    git ls-files | UPDATE_COPYRIGHT_HOLDER='all contributors' \
      UPDATE_COPYRIGHT_USE_INTERVALS=2 \
      xargs /path/to/gnulib/build-aux/update-copyright

commit 3e14979ef533e41fe72f7e68afd533a3cc87471d
Author: Eric Wong <e@80x24.org>
Date:   2015-02-13 00:49:28 +0000

    tapset/http_access_log: note CLF differences
    
    We have two differences from CLF, note them correctly.

commit 10ae48e0880fc76d1f2044f80e20004491801663
Author: Eric Wong <e@80x24.org>
Date:   2015-02-05 21:52:08 +0000

    http_dav: log 500 errors from DELETE requests
    
    Errors on failed unlink can be a prelude to a bigger problem, so
    log it locally ourselves even if the tracker will notice it.
    
    This commit was tested manually by setting up cmogstored to point to a
    read-only mount point on my system and attempting a DELETE request on
    it.

commit ee4a340bfb304f0270ef3704b09ba7faca6a3c1e
Author: Eric Wong <e@80x24.org>
Date:   2015-01-15 07:48:19 +0000

    doc/design.txt: add a few more notes on compromises
    
    In case I forget, writing this down while my mind is on
    the subject for other projects.

commit 047b0c13e91fe755fe165defc9de3ad0d8843330
Author: Eric Wong <e@80x24.org>
Date:   2014-11-02 09:08:40 +0000

    update copyrights to 2014 (and all contributors)
    
    In the future, we can use the update-copyright tool from gnulib:
    
            git ls-files | UPDATE_COPYRIGHT_HOLDER='all contributors' \
              UPDATE_COPYRIGHT_USE_INTERVALS=2 \
              xargs /path/to/gnulib/build-aux/update-copyright
    
    Neither this project nor any other project I manage has, or ever will
    have, copyright assignment.  All contributors retain copyrights to
    their contributions.

commit f7341063e774a032a679ff2ec69aae3bd8c40281
Author: Eric Wong <e@80x24.org>
Date:   2014-11-02 08:49:47 +0000

    manpage: update contact and copyright information
    
    I'll continue accepting email to my private address,
    but public email is preferred as it is easier for others
    to find messages as well as making it easier to credit
    bug reporters.

commit d5be3d489822cf72fcc491a25b8f1af313435ffc
Author: Eric Wong <e@80x24.org>
Date:   2014-09-20 10:18:33 +0000

    Makefile.am: gzip README and associated data
    
    Speeds up site loading when combined with things like
    try_gzip_static in nginx.

commit 3a39cbd6632a8f80af3d0ef082d5007e28826384 (tag: v1.4.1)
Author: Eric Wong <e@80x24.org>
Date:   2014-09-07 00:43:28 +0000

    cmogstored 1.4.1 - bugfix for neon clients
    
    The PHP PECL MogileFS extension uses neon to handle WebDAV operations,
    and neon seems to send (valid but unfortunate) headers with empty
    string values.  Thanks to Patrice Damezin at Skyrock.com for reporting
    this bug.
    
    There's also a few minor cleanups.  The latest 2.6.34 stable kernel
    release no longer requires our EPOLL_CTL_MOD race workaround.  There
    are also some test suite updates for future releases of Ruby.
    Bigger changes coming later this year...
    
    There's also a new public mailing list at:
    
            cmogstored-public@bogomips.org
    
    No subscription will ever be necessary to post.
    Subscription is optional via:
    
            cmogstored-public+subscribe@bogomips.org
    
    Archives are available at http://bogomips.org/cmogstored-public/
    
    Eric Wong (11):
          minor cleanups for functions which do not return
          remove old fsck_queue declarations
          svc_dev: calling free does not need the lock
          test/mgmt: lengthen test for iostat watch
          test/http: plug race condition in FIFO test
          test/http_chunked_put: test for gigantic trailer
          update address to public mailing list
          Rakefile: remove freecode/freshmeat references
          Rakefile: shorten ChangeLog dump
          queue_epoll: disable buggy epoll workaround for 2.6.34.15+
          http_common: correctly handle empty header values

commit cc46352c76193a2f1732a1f64761eea8b7581e60
Author: Eric Wong <e@80x24.org>
Date:   2014-09-03 17:27:06 +0000

    http_common: correctly handle empty header values
    
    The PHP PECL MogileFS extension uses neon to handle WebDAV operations,
    and neon seems to send (valid but unfortunate) headers with empty
    string values:
    
    ref: http://svn.webdav.org/repos/projects/neon/trunk/src/ne_request.c
    
        else if (!sess->is_http11 && !sess->any_proxy_http) {
            ne_buffer_czappend(req->headers,
                               "Keep-Alive: " EOL
                              "Connection: TE, Keep-Alive" EOL);
        }
        else if (!req->session->is_http11 && !sess->any_proxy_http) {
            ne_buffer_czappend(req->headers,
                               "Keep-Alive: " EOL
                               "Proxy-Connection: Keep-Alive" EOL
                               "Connection: TE" EOL);
        }
    
    Thanks to Patrice Damezin at Skyrock.com for reporting
    the issue.
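
    As an illustration of what "correctly handle" means here, a
    header splitter must accept a zero-length value after the colon.
    The following is a hand-written sketch with hypothetical names;
    the real HTTP parser in cmogstored is generated by Ragel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split "Name: value" (without the trailing CRLF) into its parts
 * without copying.  Returns 0 on success, -1 if there is no colon or
 * the name is empty.  A zero-length value is accepted, as in
 * "Keep-Alive: ".
 */
int demo_split_header(const char *line, size_t len,
                      const char **val, size_t *name_len, size_t *val_len)
{
    const char *colon, *v, *end;

    assert(line && val && name_len && val_len);
    colon = memchr(line, ':', len);
    if (!colon || colon == line)
        return -1;

    end = line + len;
    v = colon + 1;
    /* skip optional whitespace before the value */
    while (v < end && (*v == ' ' || *v == '\t'))
        v++;
    /* trim optional whitespace after the value */
    while (end > v && (end[-1] == ' ' || end[-1] == '\t'))
        end--;

    *val = v;
    *name_len = (size_t)(colon - line);
    *val_len = (size_t)(end - v); /* may legitimately be zero */
    return 0;
}
```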

commit 5087825f3fd0ad59ce7afedaaaaa17d16196e1f6
Author: Eric Wong <e@80x24.org>
Date:   2014-09-05 19:04:31 +0000

    queue_epoll: disable buggy epoll workaround for 2.6.34.15+
    
    commit 356ad39592cfcb537a512b2f88ed44380ae5cd78
    ("epoll: prevent missed events on EPOLL_CTL_MOD")
    in the 2.6.34 stable tree

commit e734132c6710451340e04a765fd08f60b6102771
Author: Eric Wong <e@80x24.org>
Date:   2014-09-05 01:38:16 +0000

    Rakefile: shorten ChangeLog dump
    
    We don't need ChangeLog info going back to 1.0.0

commit 69af3d363a06030466ceca9f7a13252eef0caa81
Author: Eric Wong <e@80x24.org>
Date:   2014-09-04 23:48:49 +0000

    Rakefile: remove freecode/freshmeat references
    
    The site is dead.

commit 1b0b17910da9d3dea3d1f96083743545469db160
Author: Eric Wong <e@80x24.org>
Date:   2014-09-05 00:33:50 +0000

    update address to public mailing list
    
    Receiving bug reports via private email is awkward because I must
    ask reporters if they wish to be credited publicly.  This also
    allows users to help each other in case they're not subscribed to
    the MogileFS list (which requires subscription).
    
    So the new public mail address is at:
    
            cmogstored-public@bogomips.org
    
    No subscription will ever be required to post.
    HTML email is considered spam and blocked.
    
    There's now a public mailing list for reporting issues
    with git clone-able archives (via ssoma[1]) at:
    
            git://bogomips.org/cmogstored-public
    
    [1] http://soma.public-inbox.org/README

commit 4fbe02062007d1ad073a550f5e37b599fc0019e4
Author: Eric Wong <e@80x24.org>
Date:   2014-06-22 22:49:39 +0000

    test/http_chunked_put: test for gigantic trailer
    
    This is a potential attack vector, and we seem to pass.

commit 29bc0766942a92549774d0439d1a6362c53bc26c
Author: Eric Wong <e@80x24.org>
Date:   2014-09-03 07:10:04 +0000

    test/http: plug race condition in FIFO test
    
    This is noticeable in the trunk version of ruby since r47288
    ("io.c: do not swallow exceptions at end of block").

commit 9be3f68b9d8d86379339dc0e6852612061880e38
Author: Eric Wong <normalperson@yhbt.net>
Date:   2014-05-23 08:51:13 +0000

    test/mgmt: lengthen test for iostat watch
    
    The iostat may take a while to notice a new device,
    so let it run a bit.

commit 446a21c9ac664f7456e2e4e739979baab8ba13c1
Author: Eric Wong <normalperson@yhbt.net>
Date:   2014-05-23 10:21:32 +0000

    svc_dev: calling free does not need the lock
    
    We do not need to be holding devstats_lock when releasing
    a local buffer which will never be used by another thread.

commit c53cecda7106e4c7eb14d5c26e28bda82743771d
Author: Eric Wong <e@80x24.org>
Date:   2014-05-30 22:10:16 +0000

    remove old fsck_queue declarations
    
    fsck_queues were replaced by generic ioq for all requests in 1.3,
    but the declarations here were forgotten.

commit cd7b4cbacbc968bd4d7ed5fed9122f75d229793c
Author: Eric Wong <e@80x24.org>
Date:   2014-04-08 03:51:02 +0000

    minor cleanups for functions which do not return
    
    pthread_exit and abort never return, so quiet down some
    warnings when using -Wunreachable-code on clang.
    
    Unfortunately using -Wunreachable-code globally is too noisy due to
    1) Ragel-generated code.
    2) constant branch conditions for build-time options (trace/cork)

commit 3f56e841c3612a113cc5261b01552396cc24ea13 (tag: v1.4.0)
Author: Eric Wong <normalperson@yhbt.net>
Date:   2014-02-22 01:12:55 +0000

    cmogstored 1.4.0
    
    bsd_sendfile is now supported on Debian GNU/kFreeBSD systems.
    This release also fixes a compatibility bug with Perl mogstored config
    files where "daemonize = (0|1)" was not supported properly.
    
    Eric Wong (3):
          check for sys/sendfile.h header instead of __linux__
          allow bsd_sendfile with freebsd-glue on Debian/kFreeBSD
          support "daemonize = 0|1" in the config file

commit af3e5766523110f50cfb5bcdaba82f700d7d7807
Author: Eric Wong <e@80x24.org>
Date:   2014-02-21 22:47:15 +0000

    support "daemonize = 0|1" in the config file
    
    This is expected by Perl mogstored, and our previous support
    of "daemonize" (standalone) was in error (but still supported
    for now).
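
    A sketch of accepting both forms is shown below.  The function
    name and the exact whitespace rules are assumptions made for
    illustration; this is not the cmogstored config parser.

```c
#include <assert.h>
#include <string.h>

/*
 * Parse one config line (no trailing newline) for the "daemonize" key.
 * Returns 1 and stores the setting in *enabled if the line matched,
 * 0 if the line is about something else or is malformed.
 */
int demo_parse_daemonize(const char *line, int *enabled)
{
    static const char key[] = "daemonize";
    const char *p;

    assert(line && enabled);
    if (strncmp(line, key, sizeof(key) - 1) != 0)
        return 0;
    p = line + sizeof(key) - 1;
    while (*p == ' ' || *p == '\t')
        p++;
    if (*p == '\0') { /* standalone "daemonize": older form, enabled */
        *enabled = 1;
        return 1;
    }
    if (*p != '=')
        return 0;
    p++;
    while (*p == ' ' || *p == '\t')
        p++;
    /* only a single "0" or "1" is accepted as the value */
    if ((*p != '0' && *p != '1') || p[1] != '\0')
        return 0;
    *enabled = (*p == '1');
    return 1;
}
```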

commit 35782a1facdc61ae007086657689cf289c96dd92
Author: Eric Wong <e@80x24.org>
Date:   2014-02-21 17:05:55 -0500

    allow bsd_sendfile with freebsd-glue on Debian/kFreeBSD
    
    Debian GNU/kFreeBSD users may ./configure with LIBS=-lfreebsd-glue
    to use the FreeBSD sendfile syscall.

commit 3d96736835c69b3de698bd3cc9ed12bab1da8d73
Author: Eric Wong <e@80x24.org>
Date:   2014-02-17 16:49:57 -0500

    check for sys/sendfile.h header instead of __linux__
    
    Non-Linux OSes may eventually gain a Linux-compatible sendfile.

commit 6eaf13539681dd1d6725021112dc43b69ae2be4d (tag: v1.3.3)
Author: Eric Wong <normalperson@yhbt.net>
Date:   2014-02-09 04:23:40 +0000

    cmogstored 1.3.3 - Debian GNU/kFreeBSD fixes
    
    This release fixes build problems with Debian GNU/kFreeBSD support
    (turns out it's been broken for over a year and nobody noticed :x).
    There are also build system upgrades for automake 1.14 and test case
    cleanups, but no changes to any of the core code.  No changes nor
    need to upgrade if you're on anything other than Debian
    GNU/kFreeBSD.

commit 4d55ec4f7aecfe7a127647f82175d92732879917
Author: Eric Wong <normalperson@yhbt.net>
Date:   2014-02-09 03:56:44 +0000

    m4/gnulib-cache: update for 2014

commit 7a55cf1b1529f487f39d7916b5d3c8188af5eccf
Author: Eric Wong <e@80x24.org>
Date:   2014-02-08 22:53:30 -0500

    test/upgrade: cleanup and robustness improvements
    
    Avoid calling top-level methods inside other tests in case some
    versions of test-unit or minitest can call setup/teardown twice.
    Avoid Timeout, as it is expensive and unnecessary
    in some cases.

commit 6b4cbed9de98b0988692f7855871034cb5f2bb3f
Author: Eric Wong <e@80x24.org>
Date:   2014-02-08 22:50:54 -0500

    Makefile.am: updates for automake 1.14.1
    
    Tested with automake 1:1.14.1-2 on Debian GNU/kFreeBSD

commit 6b974dc9cb48e6af8e4ea9410141168208e7ca06
Author: Eric Wong <e@80x24.org>
Date:   2014-02-08 20:49:12 -0500

    tests: skip iostat-dependent tests
    
    Debian GNU/kFreeBSD still does not have iostat :<

commit fd6722ac69f72bc4783675f055ab567a9902c713
Author: Eric Wong <normalperson@yhbt.net>
Date:   2014-02-08 01:14:19 +0000

    Makefile: do not clobber NOSTD_CFLAGS from configure
    
    This was breaking the Debian kFreeBSD build

commit d6147a83867fb41eabdfdde6d71a23d0e1de5f71
Author: Eric Wong <normalperson@yhbt.net>
Date:   2014-02-04 22:42:02 +0000

    doc/queues.txt: add a note about our non-use of AIO
    
    It was obvious to me to use pthreads up front, hopefully that's
    explained to others, too.

commit 4e48663f6b07954fbcfc34339f44c9f487d9b4c8 (tag: v1.3.2)
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-12-10 20:17:38 +0000

    cmogstored 1.3.2 - FreeBSD shutdown speedup
    
    This release speeds up graceful shutdown on busy systems such
    as FreeBSD.  There is also a minor resource savings for users
    of the undocumented --worker-processes switch.  There are also
    some minor memory error fixes for test cases (which did not
    affect the daemon itself).
    
    Upgrading is optional unless you are affected by these fixes.
    
    Note: GNU/Linux users are encouraged to read the manpage update
    regarding glibc malloc arenas
    
    Eric Wong (9):
          selfwake: do share pipe descriptors with workers
          test/chunk-parser-1: fix uninitialized file structures
          test: fix valgrind warnings in test-only C code
          doc: refer to malloc-related environment variables
          thrpool: sleep instead of yield when poking thread
          test/mgmt-usage: relax regexp for ZFS
          m4/.gitignore: bump for newer gnulib
          doc: fix wording in manpage
          doc: fix link to MogileFS homepage

commit 3420cb228a3aa09b453d7464ef0c7ab4b6a1d0db
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-12-10 22:09:31 +0000

    doc: fix link to MogileFS homepage
    
    mogilefs.org is the correct domain

commit 9ed6f8849d238286b37d8a2f82207d1a9c900b73
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-12-10 22:08:00 +0000

    doc: fix wording in manpage

commit 0aeb2fa7696474c7c578f9fa6d948f4e27b88bb3
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-12-09 11:21:17 +0000

    m4/.gitignore: bump for newer gnulib
    
    Now at gnulib commit 43593319b31e6b0175b8eec4433bac744959822d
    ("md5, sha1, sha256, sha512: add gl_SET_CRYPTO_CHECK_DEFAULT")

commit 3433a482c97913aaddccf83a224cb9cff819d340
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-12-09 10:54:11 +0000

    test/mgmt-usage: relax regexp for ZFS
    
    ZFS device mount points do not start with a leading '/'.
    We already account for this in our internal mountpoint handling,
    but did not account for this in the test case.
    
    Reported-by: Mikolaj Golub

commit f5328d433c588e26a7763266208fe3460ef7ee99
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-12-09 10:50:45 +0000

    thrpool: sleep instead of yield when poking thread
    
    This unfortunate loop burned too much CPU on FreeBSD and caused
    shutdown to take too long when using sched_yield.  nanosleep for
    10ms instead, hopefully allowing the system to accomplish some
    disk I/O and other tasks before we poke it again.
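
    A 10ms pause of this kind can be written with nanosleep as in
    the sketch below.  The helper name is hypothetical and the retry
    on EINTR is a choice made for this example, not necessarily what
    thrpool does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/*
 * Sleep for the given number of milliseconds, resuming if a signal
 * interrupts the sleep.  Returns 0 on success, -1 on other errors.
 */
int demo_sleep_ms(long ms)
{
    struct timespec req, rem;

    assert(ms >= 0);
    req.tv_sec = ms / 1000;
    req.tv_nsec = (ms % 1000) * 1000000L;
    while (nanosleep(&req, &rem) != 0) {
        if (errno != EINTR)
            return -1;
        req = rem; /* continue with the unslept remainder */
    }
    return 0;
}
```

    Calling demo_sleep_ms(10) between pokes lets the scheduler run
    other work, where a sched_yield loop could spin on the CPU.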
    
    Reported-by: Mikolaj Golub

commit fe587418ea7a71f34e5a0f49eb20148e82b9c389
Author: Eric Wong <e@yhbt.net>
Date:   2013-12-02 22:23:07 +0000

    doc: refer to malloc-related environment variables
    
    Using non-portable mallopt/mallctl functions is not feasible because
    detecting them correctly at _link_ time is not easy.  Detecting them
    at compile time is insufficient because malloc implementations can
    be swapped at link time (and even with LD_PRELOAD, unfortunately).

commit ce5cce161d504df849a50ee1080db42a66ca8c42
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-12-01 10:53:56 +0000

    test: fix valgrind warnings in test-only C code
    
    Unfortunately, none of the C-only tests are run with valgrind
    (however all of the Ruby ones are).

commit 2410738dcf00cda49c9f1d5847289f6a48944c2a
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-12-01 10:50:19 +0000

    test/chunk-parser-1: fix uninitialized file structures
    
    This test failed on FreeBSD 11.0-CURRENT with
    MALLOC_DEBUG enabled or if MALLOC_OPTIONS=J is set in the environment.
    
    Reported-by: Mikolaj Golub

commit 1a4a94f338dbe641a3f1b27a080fc34bac7f43d4
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-12-02 04:28:54 +0000

    selfwake: do share pipe descriptors with workers
    
    This only affects users of the undocumented --worker-processes
    switch.  Furthermore, this only affects non-Linux platforms which
    rely on the pipe implementation of selfwake.
    
    This prevents us from wasting one extraneous file descriptor slot
    (and hence potentially wasting 128 bytes in userland).

commit b7bda87ead4a53bb792dbbfb6079aad8cd4170de (tag: v1.3.1)
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-10-12 21:44:08 +0000

    cmogstored 1.3.1 - fix for an undocumented feature
    
    This release fixes a bug which only affects users of the
    undocumented multi-process configuration feature
    (which is also multi-threaded).
    
    * avoid use-after-free with multi-process setups
    
      readdir on the same DIR pointer is undefined if DIR was inherited by
      multiple children.  Using the reentrant readdir_r would not have
      helped, since the underlying file descriptor and kernel file handle
      were still shared (and we need rewinddir, too).
    
      This readdir usage bug existed in cmogstored since the earliest
      releases, but was harmless until the cmogstored 1.3 series.
    
      This misuse of readdir led to hitting a leftover call to free().
      So this bug only manifested since
      commit 1fab1e7a7f03f3bc0abb1b5181117f2d4605ce3b
      (svc: implement top-level by_mog_devid hash)
    
      Fortunately, these bugs only affect users of the undocumented
      multi-process feature (not just multi-threaded).

commit e8217a1fe0cf341b7219a426f23e02cb44281301
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-10-12 07:00:58 +0000

    avoid use-after-free with multi-process setups
    
    readdir on the same DIR pointer is undefined if DIR was inherited by
    multiple children.  Using the reentrant readdir_r would not have
    helped, since the underlying file descriptor and kernel file handle
    were still shared (and we need rewinddir, too).
    
    This readdir usage bug existed in cmogstored since the earliest
    releases, but was harmless until the cmogstored 1.3 series.
    
    This misuse of readdir led to hitting a leftover call to free().
    So this bug only manifested since
    commit 1fab1e7a7f03f3bc0abb1b5181117f2d4605ce3b
    (svc: implement top-level by_mog_devid hash)
    
    Fortunately, these bugs only affect users of the undocumented
    multi-process feature (not just multi-threaded).

commit a4126a4bef3708c6f3b63f8a8877a3ce2213470b (tag: v1.3.0)
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-09-30 08:44:15 +0000

    cmogstored 1.3.0 - many improvements
    
    There are no changes from 1.3.0rc2.
    
    For the most part, cmogstored 1.2.2 works well, but 1.3 contains some
    fairly major changes and improvements.
    
    cmogstored CPU usage may be higher than other servers because it's
    designed to use whatever resources it has at its disposal to
    distribute load to different storage devices.  cmogstored 1.3
    continues this, but it should be safer to lower thread counts
    without hurting performance too much for non-dedicated servers.
    
    cmogstored 1.3 contains improvements for storage hosts at the
    extreme ends of the performance scale.  For large machines with many
    cores, memory/thread usage is reduced because we had too many acceptor
    threads.  There are more improvements for smaller machines, especially
    those with slow/imbalanced drive speeds and few CPUs.  Some of the
    improvements came from my testing with ancient single-core machines,
    others came from testing on 24-core machines :)
    
    Major features in 1.3:
    
    ioq - I/O queues for all MogileFS requests
    ------------------------------------------
    
    The new I/O queue (ioq) implements the equivalent of AIO channels
    functionality from Perlbal/mogstored.  This feature prevents a
    failing/overloaded disk from monopolizing all the threads in the system.
    
    Since cmogstored uses threads directly (and not AIO), the common
    (uncontended case) behaves like a successful sem_wait with POSIX
    semaphores.  Queueing+rescheduling only occurs in the contended case
    (unlike with AIO-style APIs, where requests are always queued).  I
    experimented with, but did not use POSIX semaphores as contention would
    still starve the thread pool.
    
    Unlike the old fsck_queue, ioq is based on the MogileFS devid in the URL
    and not the st_dev ID of the actual underlying file.  This is less
    correct from a systems perspective, but should make no difference for
    normal production deployments (which are expected to use one MogileFS
    devid for each st_dev ID) and has several advantages:
    
    1) testing this feature with mock deploys is easier
    
    2) we do not require any additional filesystem syscall (open/*stat)
       to look up the ioq based on st_dev, so we can use ioq to avoid
       stalls from slow open/openat/stat/fstatat/unlink/unlinkat syscalls.
    
    Otherwise, the implementation of this very closely resembles the old
    fsck queue implementation, but is generic across HTTP and sidechannel
    clients.  The existing fsck queue functionality is now implemented using
    ioq.  Thus, fsck queue functionality is mapped by the MogileFS devid and
    not the system st_dev ID as a result of this change.
    
    One benefit of this feature is the ability to run fewer aio_threads
    safely without worrying about cross-device contention on machines with
    limited resources or few disks (or not solely dedicated to MogileFS
    storage).
    
    The capacity of these I/O queues is automatically scaled to the number
    of available aio_threads, so they can change dynamically while your
    admin is tuning "SERVER aio_threads = XX"
    
    However, on a dedicated storage node, running many aio_threads (as is
    the default) should still be beneficial.  Having more threads can keep
    the internal I/O queues of the kernel and storage hardware more
    populated and can improve throughput.
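
    To illustrate the uncontended and contended cases described
    above, here is a simplified sketch of a per-device queue.  All
    names and the data layout are hypothetical and do not reflect the
    actual cmogstored structures.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct demo_client {
    struct demo_client *next; /* link used while parked in a queue */
};

struct demo_ioq {
    pthread_mutex_t mtx;
    unsigned avail;                   /* free slots, like a semaphore count */
    struct demo_client *head, **tail; /* FIFO of parked clients */
};

void demo_ioq_init(struct demo_ioq *q, unsigned capacity)
{
    assert(q && capacity > 0);
    pthread_mutex_init(&q->mtx, NULL);
    q->avail = capacity;
    q->head = NULL;
    q->tail = &q->head;
}

/*
 * Returns 1 if the caller may start I/O now (the uncontended case).
 * Returns 0 after parking the client; the calling thread does not
 * block and is free to serve clients of other devices.
 */
int demo_ioq_ready(struct demo_ioq *q, struct demo_client *c)
{
    int ok;

    pthread_mutex_lock(&q->mtx);
    ok = q->avail > 0;
    if (ok) {
        q->avail--;
    } else {
        c->next = NULL;
        *q->tail = c;
        q->tail = &c->next;
    }
    pthread_mutex_unlock(&q->mtx);
    return ok;
}

/*
 * Called when a request finishes.  Hands the slot to the oldest parked
 * client (returned for rescheduling), or frees the slot and returns NULL.
 */
struct demo_client *demo_ioq_next(struct demo_ioq *q)
{
    struct demo_client *c;

    pthread_mutex_lock(&q->mtx);
    c = q->head;
    if (c) {
        q->head = c->next;
        if (!q->head)
            q->tail = &q->head;
    } else {
        q->avail++;
    }
    pthread_mutex_unlock(&q->mtx);
    return c;
}
```

    Unlike sem_wait, demo_ioq_ready never blocks the thread, which
    is the property that keeps one slow device from starving the
    thread pool.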
    
    thread shutdown fixes (epoll)
    -----------------------------
    
    Our previous reliance on pthreads cancellation primitives left us open
    to a small race condition where I/O events (from epoll) could be lost
    during graceful shutdown or thread reduction via
    "SERVER aio_threads = XX".  We no longer rely on pthreads cancellation
    for stopping threads and instead implement explicit check points for
    epoll.
    
    This did not affect kqueue users, but the code is simpler and more
    consistent across epoll/kqueue implementations.
    
    Graceful shutdown improvements
    ------------------------------
    
    The addition of our I/O queueing and use of our custom thread shutdown
    API also allowed us to improve the responsiveness and fairness when the
    process enters graceful shutdown mode.  This improves fairness and
    avoids client-side timeouts when large PUT requests are being issued
    over a fast network to slow disks during graceful shutdown.
    
    Currently, graceful shutdown remains single-threaded, but it will likely
    become multi-threaded in the future (like normal runtime).
    
    Miscellaneous fixes and improvements
    ------------------------------------
    
    Further improved matching for (Linux) device-mapper setups where the
    same device (not symlinks) appears multiple times in /dev
    
    aio_threads count is automatically updated when new devices are
    added/removed.  This is currently synced to MOG_DISK_USAGE_INTERVAL, but
    will use inotify (or the kqueue equivalent) in the future.
    
    HTTP read buffers grow monotonically (up to 64K) and always use aligned
    memory.  This allows deployments which pass large HTTP headers to avoid
    triggering unnecessary reallocations.  Deployments which use small HTTP
    headers should notice no memory increase.
    
    Acceptor threads are now limited to two per process instead of being
    scaled to CPU count.  This avoids excessive threads/memory usage and
    contention of kernel-level mutexes for large multi-core machines.
    
    The gnulib version used for building the tarball is now included in the
    tarball for ease-of-reproducibility.
    
    Additional tests for uncommon error conditions using the fault-injection
    capabilities of GNU ld.
    
    The "shutdown" command over the sidechannel is more responsive for epoll
    users.
    
    Improved reporting of failed requests during PUT requests.  Again, I run
    MogileFS instances on some of the most horrible networks on the planet[2]
    
    fix LIB_CLOCK_GETTIME linkage on some toolchains.
    
    "SERVER mogstored.persist_client = (0|1)" over the sidechannel is supported
    for compatibility with Perlbal/mogstored
    
    The Status: header is no longer returned on HTTP responses.  All known
    MogileFS clients parse the HTTP status response correctly without the
    need for the Status: header.  Neither Perlbal nor nginx set the Status:
    header on responses, so this is unlikely to introduce incompatibilities.
    The Status: header was originally inherited from HTTP servers which had
    to deal with a much larger range of (non-compliant) clients.

commit 97a39a02481dc24582aa7317d8d94c21d753d040 (tag: v1.3.0rc2)
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-09-03 07:31:48 +0000

    cmogstored 1.3.0rc2 - fixes since rc1, systemtap
    
    The Status: header is no longer returned on HTTP responses.  All known
    MogileFS clients parse the HTTP status response correctly without the
    need for the Status: header.  Neither Perlbal nor nginx set the Status:
    header on responses, so this is unlikely to introduce incompatibilities.
    The Status: header was originally inherited from HTTP servers which had
    to deal with a much larger range of (non-compliant) clients.
    
    SystemTap support is mostly fleshed out.  There are some bundled awk
    scripts which should make better sense of the output of all.stp, which
    logs just about everything.
    
    Raising aio_threads now correctly increases ioq capacity.  This
    regression was only introduced in the 1.3.0 rc series, as ioq
    was not in 1.2.x.

commit 82fe4d7dfad38e210bb86d2989e9436c267dd81a
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-09-03 08:49:49 +0000

    Makefile: update for systemtap support files

commit 3a9a1c5cada0630c499fcf42dfb5b38d11694844
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-08-31 03:14:40 +0000

    ioq: correctly reenqueue blocked mfds on capacity increase
    
    Otherwise, reenqueue-ing only one mfd at-a-time is pointless
    and prevents cmogstored from utilizing new threads.

commit 3d55af133e1da342a7eb52c3dc099daf4ed6acf6
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-08-31 00:58:47 +0000

    ioq: avoid over-yielding on and after ioq contention
    
    We do not need to set the contended flag again until we're certain
    we have no free slots in the ioq, not when we assume the client
    is the last one to take a slot.  This is because ioq access itself
    is serialized, and the last client taking the ioq could be getting
    a false positive when another thread is waiting on ioq->mtx to
    release the ioq.
    
    This prevents throughput loss while recovering from a situation
    where an ioq is oversubscribed.  This can be reproduced under heavy
    load by switching temporarily to "SERVER aio_threads = 1" and then
    bringing aio_threads back up to a high value.

commit 2b7a572ddd9bcce063e3cd10851fd953f525fe24
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-08-29 19:36:40 +0000

    m4/systemtap.m4: quote cm_cv_sdt_h_usable var
    
    The variable may not be defined at all, so it must be
    quoted to avoid spewing a warning if dtrace/stap are not
    found.

commit 723a81a0e25ff07c2e6dd9dbd6bf838f6bee7411
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-08-27 01:19:53 +0000

    tapset/*awk: document these scripts
    
    Otherwise I will forget what they output one day and will
    have to read the code again.

commit dc35288ce1b6e05e74040aa9e8af1166cfa92bd8
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-08-27 00:37:03 +0000

    TODO: remove item for systemtap/dtrace
    
    systemtap support is implemented, and hopefully dtrace works, too.

commit 37a5071021601480384c2abe20f2d33ad974579d
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-08-07 20:03:34 +0000

    flesh out systemtap support and awk helpers
    
    Our "all.stp" tapset now generates awk-friendly output for feeding
    some sample awk scripts.
    
    Using awk (and gawk) was necessary to avoid reimplementing strftime
    in guru mode for generating CLF (Common Log Format) HTTP access logs.
    
    Using awk also gives us several advantages:
    
    * floating point number support (for time differences)
    
    * a more familiar language to systems administrators
      (given this is for MogileFS, perhaps Perl would be even
       more familiar...).
    
    * fast edit/run cycle, so the slowness of using stap to
      rebuild/reload the kernel module for all.stp changes can
      be avoided when output must be customized.

commit fe1e1200c1541676e6b8402b7972a16105a76a63
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-08-22 23:11:49 +0000

    http: remove Status: header from all responses
    
    This was inherited from a server which needed to deal with
    some broken clients; MogileFS does not have this problem.
    Neither Perlbal nor nginx set this response header, either,
    so let's save ourselves a few bytes.

commit 1199492dd1adb394cf4cc0d599e7f77c52ccbdbf
Author: Eric Wong <e@yhbt.net>
Date:   2013-07-31 20:26:25 +0000

    trywrite: workaround potential inf loops from kernel bugs
    
    While we're fortunate enough to not have encountered a case
    where send/writev returns zero with a non-zero-length buffer,
    it's not inconceivable that it could strike us one day.  In that
    case, error out the connection instead of infinite looping.
    
    Dropping a connection is safer than letting a thread run in
    an infinite loop.
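    
    The guard described above can be sketched roughly as follows.  This
    is only an illustration for a blocking descriptor, not the actual
    trywrite code; the name write_all_or_fail and the choice of EIO are
    assumptions made for the example.
    
    ```c
    #include <errno.h>
    #include <stddef.h>
    #include <unistd.h>
    
    /*
     * Hypothetical helper: write all of buf to a blocking fd.
     * Returns 0 on success, -1 on error (errno is set).
     * A zero return from write(2) with a non-zero length is treated
     * as an error instead of being retried forever.
     */
    static int write_all_or_fail(int fd, const char *buf, size_t len)
    {
    	while (len > 0) {
    		ssize_t w = write(fd, buf, len);
    
    		if (w > 0) {
    			buf += w;
    			len -= (size_t)w;
    		} else if (w == 0) {
    			errno = EIO; /* assumption: any error code would do */
    			return -1;
    		} else if (errno != EINTR) {
    			return -1;
    		}
    	}
    	return 0;
    }
    ```
    
    The caller is expected to drop the connection whenever -1 is
    returned, which bounds the damage to a single client.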

commit 317b979e29774a77fb933c4f42514ff007669b39
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-26 06:32:49 +0000

    test/mgmt: warn about slow mount points on test failure
    
    Unfortunately, slow mount points still cause minor reliability
    issues with the test suite.

commit 596dbef8b4b23657fd78dca4bc55e261c3f6b376
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-26 06:12:34 +0000

    test/mgmt: increase reliability of max devid test
    
    This seems to fail more under heavy load, so wait a bit longer for
    iostat to become aware of the new devices.

commit c49cf315dadbf1cfe2f5e80c1f3c1ae27ad0761e
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-19 02:17:18 +0000

    move trace.h include to global cmogstored.h
    
    We'll have tracing everywhere, so it's too much maintenance overhead
    to add it to every file which wants it.  Increased build-times are
    a problem, but less than the maintenance overhead of finding the
    right headers.

commit 939abdfed71349df87712559553593dc95f406c5
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-19 00:29:47 +0000

    tapset: rename http_request.stp -> all.stp
    
    This tapset will contain every probe point and acts as a
    check/documentation for extracting useful probes.

commit 313a04bd35534a6cd024149d9f2c9b9487f08165
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-19 00:15:31 +0000

    split out {mgmt,http}_parse_continue checks
    
    Incomplete request headers are uncommon, so if we see them,
    something is probably off or strange.  This should make it
    easier to maintain probe points to watch for this behavior.

commit 00d234c6f9362c11938f3b67c03bf208c7638eca
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-18 23:58:23 +0000

    probes: add probes for rbuf growth
    
    Growing the rbufs should be uncommon, but it should set off alarms
    if it happens too often.

commit 4c6a7474a281451b1ef57f686b9b21cbb8216b0d
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-18 20:21:43 +0000

    test/mgmt: cover the large rbuf growth case
    
    mgmt may now encounter large rbufs, so ensure that uncommon case
    is tested.

commit 6d2642bb1a42840e809e7a73896a1631d37b15e6
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-18 00:25:47 +0000

    split out {http,mgmt}_rbuf_grow functions
    
    This should allow easier tracing of rbuf growth, and should
    hopefully make the code more explicit and harder to screw up.

commit 48bbaf84da51644451a3dc0c1254d51c035ccce0
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-17 18:41:48 +0000

    ioq: add probes tracing and documentation
    
    ioq tracing will allow users to notice when devices are saturated
    (from a cmogstored POV) and increase aio_threads if necessary.

commit f8c655bbb3b733a10c6aab9c71246e94652c6cc9
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-16 10:19:29 +0000

    tapset/http_request: log listen address and PID of connection
    
    It is helpful to know the address of the listener on the server
    which accepted the client socket.  Additionally, the PID,FD combination
    should be safely unique for any point in time.

commit 7c49988ebf5c176cadd4a9e287e443d49a2cdeec
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-17 01:39:24 +0000

    document ioq and mog_{mgmt,http}_drop interaction safety
    
    I needed to spend time to convince myself this was safe, so
    leave a note to others (and future self) in case there is
    cause for concern.
    
    Basically, this is highly dependent on our overall one-shot-based
    concurrency model and safe as long as basic rules are followed.

commit 2869d2bf7a24a0b42bde738589221def0289ce54
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-16 23:05:43 +0000

    queue_epoll: EPOLL_CTL_MOD should be safe on 2.6.32.61+
    
    Willy Tarreau cherry-picked the relevant fix into 2.6.32 longterm
    stable tree
    
    ref:
    commit 1c137a47bbdd6e86298627e04f547afd7f35d523
    git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git

commit 800bb2057ce8559eede740816be06cf60d959f39
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-14 08:23:52 +0000

    alloc: remove mog_rbuf_free_and_null
    
    This function is no longer used as we now attempt to reattach
    rbufs to the TLS space of each thread.

commit 4edbdd6ba3686a60a8ddeed8f6f26e55abf0b207
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-14 07:26:36 +0000

    downgrade thread/device-count fields to unsigned int
    
    It's unlikely we'll even come close to seeing 2-4 billion devices in a
    MogileFS instance for a while.  Meanwhile, it's also unlikely the
    kernel will ever run that many threads, either.  So make it easier
    to pack and shrink data structures to save a few bytes and perhaps
    get better memory alignment.
    
    For reference, the POSIX semaphore API specifies initial values
    with unsigned (int) values, too.
    
    This leads to a minor size reduction (and we're not even packing):
    
    $ ~/linux/scripts/bloat-o-meter cmogstored.before cmogstored
    add/remove: 0/0 grow/shrink: 0/13 up/down: 0/-86 (-86)
    function                                     old     new   delta
    mog_svc_dev_quit_prepare                      13      12      -1
    mog_mgmt_fn_aio_threads                      147     146      -1
    mog_dev_user_rescale_i                        27      26      -1
    mog_ioq_requeue_prepare                       52      50      -2
    mog_ioq_init                                  80      78      -2
    mog_thrpool_start                            101      96      -5
    mog_svc_dev_user_rescale                     143     137      -6
    mog_svc_start_each                           264     256      -8
    mog_svc_aio_threads_handler                  257     249      -8
    mog_ioq_ready                                263     255      -8
    mog_ioq_next                                 303     295      -8
    mog_svc_thrpool_rescale                      206     197      -9
    mog_thrpool_set_size                        1028    1001     -27

commit e46c221c47e3cd00edfcae199146cb2f50b9b63f (tag: v1.3.0rc1)
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-14 00:52:14 +0000

    cmogstored 1.3.0rc1
    
    For the most part, cmogstored 1.2.2 works well, but 1.3 contains some
    fairly major changes and improvements.
    
    cmogstored CPU usage may be higher than other servers because it's
    designed to use whatever resources it has at its disposal to distribute
    load to different storage devices.  cmogstored 1.3 will continue this,
    but it should be safer to lower thread counts without hurting
    performance too much for non-dedicated servers.
    
    Unfortunately, the minor, Linux-only bug affecting 1.2.2 for (uncommon)
    thread shutdowns required some fairly intrusive changes to fix, so I'm
    not sure if releasing a 1.2.3 is worth it.  If you're happy with 1.2.x,
    I recommend marking the host down via mogadm before lowering
    "SERVER aio_threads = XX" or sending SIGQUIT to cmogstored.  But
    I think thread shutdown is uncommon enough to not affect normal
    deployments.
    
    cmogstored 1.3 will contain improvements for storage hosts at the
    extreme ends of the performance scale.  For large machines with many
    cores, memory/thread usage is reduced because we had too many acceptor
    threads.  There are more improvements for smaller machines, especially
    those with slow/imbalanced drive speeds and few CPUs.  Some of the
    improvements came from my testing with ancient single-core machines,
    others came from testing on 24-core machines :)
    
    The SystemTap tracing work is still in-progress (although the 1.3 cycle
    was originally intended to focus on this :x).  I expect the remaining
    changes to be non-intrusive and will work on them through the RC cycle.
    
    Major features in 1.3:
    
    ioq - I/O queues for all MogileFS requests
    ------------------------------------------
    
    The new I/O queue (ioq) implements the equivalent of AIO channels
    functionality from Perlbal/mogstored.  This feature prevents a
    failing/overloaded disk from monopolizing all the threads in the system.
    
    Since cmogstored uses threads directly (and not AIO), the common
    (uncontended case) behaves like a successful sem_wait with POSIX
    semaphores.  Queueing+rescheduling only occurs in the contended case
    (unlike with AIO-style APIs, where requests are always queued).  I
    experimented with, but did not use POSIX semaphores as contention would
    still starve the thread pool.
    
    Unlike the old fsck_queue, ioq is based on the MogileFS devid in the URL
    and not the st_dev ID of the actual underlying file.  This is less
    correct from a systems perspective, but should make no difference for
    normal production deployments (which are expected to use one MogileFS
    devid for each st_dev ID) and has several advantages:
    
    1) testing this feature with mock deploys is easier
    
    2) we do not require any additional filesystem syscall (open/*stat)
       to look up the ioq based on st_dev, so we can use ioq to avoid
       stalls from slow open/openat/stat/fstatat/unlink/unlinkat syscalls.
    
    Otherwise, the implementation of this very closely resembles the old
    fsck queue implementation, but is generic across HTTP and sidechannel
    clients.  The existing fsck queue functionality is now implemented using
    ioq.  Thus, fsck queue functionality is mapped by the MogileFS devid and
    not the system st_dev ID as a result of this change.
    
    One benefit of this feature is the ability to run fewer aio_threads
    safely without worrying about cross-device contention on machines with
    limited resources or few disks (or not solely dedicated to MogileFS
    storage).
    
    The capacity of these I/O queues is automatically scaled to the number
    of available aio_threads, so they can change dynamically while your
    admin is tuning "SERVER aio_threads = XX"
    
    However, on a dedicated storage node, running many aio_threads (as is
    the default) should still be beneficial.  Having more threads can keep
    the internal I/O queues of the kernel and storage hardware more
    populated and can improve throughput.
    
    thread shutdown fixes (epoll)
    -----------------------------
    
    Our previous reliance on pthreads cancellation primitives left us open
    to a small race condition where I/O events (from epoll) could be lost
    during graceful shutdown or thread reduction via
    "SERVER aio_threads = XX".  We no longer rely on pthreads cancellation
    for stopping threads and instead implement explicit check points for
    epoll.
    
    This did not affect kqueue users, but the code is simpler and more
    consistent across epoll/kqueue implementations.
    
    Graceful shutdown improvements
    ------------------------------
    
    The addition of our I/O queueing and use of our custom thread shutdown
    API also allowed us to improve responsiveness and fairness when the
    process enters graceful shutdown mode.  This avoids client-side
    timeouts when large PUT requests are being issued over a fast network
    to slow disks during graceful shutdown.
    
    Currently, graceful shutdown remains single-threaded, but it will likely
    become multi-threaded in the future (like normal runtime).
    
    Miscellaneous fixes and improvements
    ------------------------------------
    
    Further improved matching for (Linux) device-mapper setups where the
    same device (not symlinks) appears multiple times in /dev
    
    aio_threads count is automatically updated when new devices are
    added/removed.  This is currently synced to MOG_DISK_USAGE_INTERVAL, but
    will use inotify (or the kqueue equivalent) in the future.
    
    HTTP read buffers grow monotonically (up to 64K) and always use aligned
    memory.  This allows deployments which pass large HTTP headers to avoid
    unnecessary reallocations.  Deployments which use small HTTP headers
    should notice no memory increase.
    
    Acceptor threads are now limited to two per process instead of being
    scaled to CPU count.  This avoids excessive threads/memory usage and
    contention of kernel-level mutexes for large multi-core machines.
    
    The gnulib version used for building the tarball is now included in the
    tarball for ease-of-reproducibility.
    
    Additional tests for uncommon error conditions using the fault-injection
    capabilities of GNU ld.
    
    The "shutdown" command over the sidechannel is more responsive for epoll
    users.
    
    Improved reporting of failed requests during PUT requests.  Again, I run
    MogileFS instances on some of the most horrible networks on the planet[2]
    
    fix LIB_CLOCK_GETTIME linkage on some toolchains.
    
    "SERVER mogstored.persist_client = (0|1)" over the sidechannel is supported
    for compatibility with Perlbal/mogstored

commit 12049de467b52f1c8e4e16b53cb10182d06c6a51
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-14 02:31:41 +0000

    m4/systemtap: require stap for enabling systemtap build
    
    Only relying on dtrace leads to build problems on FreeBSD which
    I haven't had a chance to fix.

commit 8f9b7e28eaf74e5fdc72328f0dfb890d92c02ec1
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-14 00:46:10 +0000

    ioq: reset internal queues during requeue/shutdown
    
    This should avoid concurrency bugs where a client may run in
    multiple threads if we switch to multi-threaded graceful shutdown.

commit b773c55485a7a50904493a0cdc8dd22da9bbfdee
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-13 19:50:14 +0000

    test/pwrite_wrap: disable test under valgrind for now
    
    This test is too slow and timing-sensitive under valgrind, so
    disable it for now until we have a better solution.

commit f3ff911f3cfeb6af3e32513c4301be389a936d76
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-13 10:32:46 +0000

    ioq: set contended flag if we are the last one acquiring the lock
    
    We could be completely out of threads upon acquiring an ioq, so the
    last thread to acquire a lock slot must trigger a yield soon to
    avoid starvation and fairness issues.  Otherwise, all threads
    for a given device could remain pinned indefinitely.

commit 6333dc06a23a80690f60f3659428df88bd19d736
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-13 10:31:22 +0000

    test/mgmt_persist_client: teardown running processes
    
    Tests need to cleanup by stopping running processes.

commit 5b1c49b1cb6c719eb098beae3823cf63d116d8ed
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-13 02:26:17 +0000

    pass mog_accept instead of mog_svc to post-accept callbacks
    
    This allows us to capture/trace the listen address which
    accepted the request without consuming additional stack space.

commit 5c65fa6a053691ffee983b61298f3863b660b408
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-13 02:10:51 +0000

    set addrinfo field for "struct mog_accept"
    
    This will allow us to properly report the listen address the client
    connected to.

commit 22a718de33fef78bab33bc00e52cd230c22e1945
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-13 00:08:00 +0000

    http: pass "struct mog_fd *" more consistently in API
    
    This makes it easier to write tapsets which key objects
    by: PID,FD for uniqueness.  This also avoids some mog_fd_of()
    calls.

commit ec096dc8de3d37f4e33e7bc47bcfbe5207ae6855
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-12 23:22:29 +0000

    m4/ld_wrap: avoid compiler warning for missing declaration
    
    This avoids noise in config.log

commit 5666c4496facb4ad7cfa073cf1d6d849784e06b8
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-13 06:41:59 +0000

    iostat: keep update prefix on stack instead of heap
    
    The update prefix is bounded in size, so this will save us NR_DEVICES
    malloc/free pairs each second from typical iostat output.

commit fe57de9a8b6b9a6f4f840ab5a2ca17c8f803ce20
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-12 22:21:56 +0000

    mgmt_fn: minor cleanup for emitting blank response
    
    No need to recreate mog_mgmt_fn_blank for sending blank responses.

commit 249c82c4080c7adb08c32ebcd6cd74ffec5acd18
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-12 20:52:39 +0000

    test/http: disable time-dependent test under valgrind
    
    test_head_response_time does not test anything which would
    not be otherwise tested by other tests under valgrind.
    This test is only needed for occasional validation of
    fuckups regarding TCP_NOPUSH on FreeBSD, and not necessary
    for general use.

commit 4244fd63ef360a1b5a201d82e323c54842f0db55
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-12 20:49:46 +0000

    http: check persist_client state when parsing starts
    
    We don't want to drop in-flight pipelined requests when disabling
    persistent connections.  Disabling persistent connections will
    always be potentially racy, but hopefully this makes the race
    small enough that lower-level latencies are the only thing
    which affect that.

commit 86c7628b01130559c53dffe1d799f2031a020918 (persist_client)
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-12 02:56:41 +0000

    http: signal connection close during shutdown
    
    While we always properly disconnected clients during shutdown, we
    explicitly set "Connection: close" now to inform clients of our
    pending shutdown.  This avoids potentially confusing clients when we
    disconnect them as there may still be a race condition where we shut
    down a client while their request packets are in-flight.

commit 0e5d6c6f4b28a75853d1020f07e493632031a054
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-12 02:53:17 +0000

    mgmt: support "SET mogstored.persist_client = $BOOL"
    
    This is Perlbal functionality which works in Perl mogstored,
    so we will also support it here, as it makes upgrading to new
    versions easier.

commit 1c9fe8380f14e2b67bed99d16ef465db8d379b41
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-12 00:54:57 +0000

    svc: increase responsiveness of graceful shutdown
    
    By reducing the capacity of each ioq, we force each running worker
    thread to yield the current client and hit an exit point
    (epoll_wait/kqueue) sooner.

commit 56d4a65df3fc011086648563b2235eac49b7ba60
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-12 00:42:55 +0000

    test/mgmt: increase reliability on overloaded systems
    
    Without this, test_iostat_watch fails sometimes under valgrind.

commit f206fc4ee27546c57ebc6b4bf069257c05970cd2
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-12 00:22:29 +0000

    tests: introduce pwrite-wrap test for slow I/O
    
    pwrite can be a slow, blocking function on an overloaded
    system, but a slow pwrite requires a wrapper to simulate.
    
    This allows us to have coverage of the:
    
            if (mog_ioq_contended())
                    return MOG_NEXT_WAIT_RD;
    
    cases in http_put.c

commit e50365f275ada4afcd5f25f2ac3328e341a79d71
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-11 22:15:56 +0000

    ioq: rescale to match user-set aio_threads values
    
    When users reduce or increase thread counts, ioq capacity should be
    rescaled to match.  Otherwise, there's no point in having more or
    fewer threads, as concurrency remains bound by the old ioq capacity.

commit f83d0466afc32542f3f4ff962105c817a1be2c96
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-11 19:06:27 +0000

    mgmt: checksumming is interruptible during thread shutdown
    
    We want to yield dying threads as soon as possible during
    thread shutdown, so we check the quit flag and yield the
    running thread to trigger a MOG_NEXT_ACTIVE.
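    
    The idea of checking a quit flag at explicit points (instead of
    relying on pthreads cancellation) might look like the sketch below.
    The struct and function names are hypothetical and do not correspond
    to the cmogstored sources; one loop iteration stands in for one
    bounded chunk of checksumming work.
    
    ```c
    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stddef.h>
    
    /* hypothetical per-thread state, not the cmogstored structures */
    struct worker_sketch {
    	atomic_bool quit;   /* set by another thread to request exit */
    	size_t chunks_done; /* progress made so far */
    };
    
    /*
     * Process up to nchunks units of work, checking the quit flag
     * before each unit.  Returns true if all work finished, false if
     * the caller should yield because shutdown was requested.
     */
    static bool worker_sketch_run(struct worker_sketch *w, size_t nchunks)
    {
    	for (size_t i = 0; i < nchunks; i++) {
    		if (atomic_load(&w->quit))
    			return false; /* explicit check point */
    		w->chunks_done++; /* stand-in for one unit of work */
    	}
    	return true;
    }
    ```
    
    Because progress is kept in the state object, a yielded job can be
    resumed later by another thread without losing work.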

commit daab757f5e52ce36a47e2d713365d68367a0e6dd
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-11 08:57:02 +0000

    ioq: introduce mog_ioq_contended hint
    
    This will allow us to detect I/O contention on our queue
    and yield the current thread to other clients for fairness.
    This can prevent a client from hogging the thread in situations
    where the network is much faster than the filesystem/disk.

commit 9302d584dcf68489a9c4739a3a42a468323ccda6
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-10 07:55:29 +0000

    struct mog_ni: document reasoning for the ':' in ni_serv
    
    This is somewhat strange, but makes the code base slightly easier
    to reuse for non-HTTP purposes.

commit 9897d28bb57f2aa84f91b1a8594c7ecd30be8446
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-09 00:14:56 +0000

    http: include IP:PORT in "client died" message
    
    This should hopefully make failures easier to track down.

commit 2c24cf070dfc9341462fcba59fab4c6b7b330938
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-03 07:50:00 +0000

    remove assertion for handling iostat death
    
    This only triggered if the (undocumented) --worker-processes
    option is used.  This assertion is no longer valid as of
    commit d5a52618ca1f9b5d7f6998716fbfe7714f927112
    (refactor handling of "server aio_threads = " command)

commit b600fc854d2a813dc7cf08eb58590ada90db4c02
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-03 07:32:52 +0000

    file: embed ioq in the opened mog_file object
    
    This allows us to avoid a redundant hash lookup every time we
    "activate" an open file for reading or writing.

commit 013e903340a75b12523bd795d15fe5f23d725be9
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-29 03:14:54 +0000

    ioq: implement and enable generic I/O queues
    
    This will allow us to limit concurrency on a per-device basis with
    limited impact on HTTP header reading/parsing.  This prevents
    pathological slowness on a single device from bringing down an entire
    host.  This also allows users to more safely run with fewer aio_threads
    (e.g. 1:1 thread:device mapping) on fast devices with smaller low-level
    (kernel/hardware) I/O queues.
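    
    As a rough illustration of per-device concurrency limiting, the
    sketch below keeps a slot counter behind a mutex.  It is a
    simplification under stated assumptions: the ioq_sketch names are
    hypothetical, and the list of waiting clients is left out
    entirely.
    
    ```c
    #include <pthread.h>
    #include <stdbool.h>
    #include <stddef.h>
    
    /* hypothetical simplified queue, one instance per MogileFS devid */
    struct ioq_sketch {
    	pthread_mutex_t mtx;
    	unsigned cur;   /* free slots remaining */
    	unsigned max;   /* capacity, e.g. scaled to aio_threads */
    	bool contended; /* hint: a client failed to get a slot */
    };
    
    static void ioq_sketch_init(struct ioq_sketch *q, unsigned max)
    {
    	pthread_mutex_init(&q->mtx, NULL);
    	q->cur = q->max = max;
    	q->contended = false;
    }
    
    /* true: caller may do disk I/O now; false: caller must be queued */
    static bool ioq_sketch_ready(struct ioq_sketch *q)
    {
    	bool ok;
    
    	pthread_mutex_lock(&q->mtx);
    	ok = q->cur > 0;
    	if (ok)
    		q->cur--;
    	else
    		q->contended = true;
    	pthread_mutex_unlock(&q->mtx);
    	return ok;
    }
    
    /* release a slot previously taken via ioq_sketch_ready */
    static void ioq_sketch_next(struct ioq_sketch *q)
    {
    	pthread_mutex_lock(&q->mtx);
    	if (q->cur < q->max)
    		q->cur++;
    	if (q->cur == q->max)
    		q->contended = false;
    	pthread_mutex_unlock(&q->mtx);
    }
    ```
    
    In the uncontended case this costs one lock/unlock pair and no
    queueing, so a slow device can only tie up as many threads as it
    has slots.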

commit fef978104cf134dc6629115456b27dfa2856ded7
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-29 00:39:49 +0000

    packaddr: simplify mog_sockaddr definition
    
    "struct sockaddr" turns out to be smaller than "struct sockaddr_in6",
    so we can avoid complicated casting and just add that to the union.
    We continue avoiding "struct sockaddr_storage", however, as it is
    unnecessarily large for our needs.
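    
    The size relationship can be checked at compile time with a sketch
    like the one below.  The union name is hypothetical and the real
    mog_sockaddr definition may differ in its members.
    
    ```c
    #include <netinet/in.h>
    #include <sys/socket.h>
    
    /* hypothetical union along the lines described above */
    union sockaddr_sketch {
    	struct sockaddr sa;      /* generic view, no casts needed */
    	struct sockaddr_in in;   /* IPv4 */
    	struct sockaddr_in6 in6; /* IPv6, the largest member */
    };
    
    /* C11 compile-time checks of the size claims */
    _Static_assert(sizeof(struct sockaddr) <= sizeof(struct sockaddr_in6),
    	       "struct sockaddr fits inside the union");
    _Static_assert(sizeof(union sockaddr_sketch) <
    	       sizeof(struct sockaddr_storage),
    	       "union is smaller than sockaddr_storage");
    ```
    
    A pointer to the "sa" member can be passed directly to calls such
    as accept(2) or getnameinfo(3).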

commit 71849ca64134b0cfa197fc4b1ce8fc10c7fb5d98
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-27 22:01:52 +0000

    test/mgmt: remove unused variable
    
    This was triggering warnings with Ruby 2.0.0-p195

commit 160e768fe8d6043af1e435daeb35d5c92e05de11
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-27 03:54:39 +0000

    rbuf: reattach/reuse read buffers when possible
    
    Reattaching/reusing read buffers allows us to avoid repeated
    reallocation/growth/free when clients repeatedly send us large headers.
    This may also increase cache-hits by favoring recently-used buffers as
    long as fragmentation is kept in check.  The fragmentation should be
    no worse than it is currently, due to the existing detach nature of rbufs.

commit 331e7a1300ae59a052763ffecc77b45a56e2deb3
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-27 00:18:08 +0000

    mgmt: remove restriction on large rbuf sizes
    
    We'll be allowing the migration of buffers between threads
    and from waiting clients back to thread-local storage.

commit d9486d154f69be2bbe44dbc8ea74efce1d0195ad
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-26 20:18:34 +0000

    alloc: cache-align all rbuf memory allocations
    
    Some setups use clients which pass large headers (User-Agent, or
    even cookies(!)) to cmogstored, so large rbufs may be used often
    and repeatedly in those cases.
    
    We limit rbuf sizes to 64K anyways, so keeping "larger" buffers
    around should not be much of an issue for modern systems.
    
    This prepares us for reusing/recycling large rbufs as TLS buffers.
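    
    A cache-aligned allocation can be sketched with posix_memalign as
    shown below.  The helper name and the 64-byte line size are
    assumptions for illustration; the actual allocator in alloc.c may
    determine the alignment differently.
    
    ```c
    #include <errno.h>
    #include <stdlib.h>
    
    #define CACHE_LINE_SKETCH 64 /* assumption: typical, not universal */
    
    /*
     * Hypothetical helper: cache-aligned allocation.
     * posix_memalign returns the error number and does not set errno,
     * so copy it into errno for callers expecting malloc-like behavior.
     */
    static void *cacheline_alloc_sketch(size_t size)
    {
    	void *p = NULL;
    	int err = posix_memalign(&p, CACHE_LINE_SKETCH, size);
    
    	if (err != 0) {
    		errno = err;
    		return NULL;
    	}
    	return p;
    }
    ```
    
    Memory obtained this way is released with free(3) as usual.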

commit bb27afc702459d683a6b6ca5822b746142047acc
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-26 02:16:15 +0000

    mgmt: handle disk-using requests outside of the parser
    
    This will allow us to use control flow similar to the http client
    handling code when we queue clients based on I/O channel.

commit ad961733c0afb96a7ab44dc9837a0f8c8fa239a4
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-26 01:03:52 +0000

    introduce generic I/O queue functionality
    
    This replaces the fsck_queue internals with a generic
    ioq implementation which is based on the MogileFS devid,
    and not the operating system devid.

commit 70efa665edeef05f53978f9d541f411b0e1a2b2a
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-26 00:41:54 +0000

    http: add assertion for unused wbuf
    
    We need to ensure we do not introduce code to launch
    http_process_client while we have buffered data (or socket write
    errors).

commit c86b6a2c769c821a64fc14c62a953244b41cb190
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:20 +0000

    dev: shrink and cache-align struct mog_dev
    
    We will have structures inside the dev struct accessed by multiple
    threads frequently, so keep it cache-aligned.
    
    To reduce memory usage for large-numbered devices, avoid storing the
    prefix on output and instead just rely on the printf-family of
    routines to generate stringified output in uncommon code paths.

commit f56b866f92e195ffd24a2f8f80e8e2cef226c775
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-03 16:30:32 +0000

    mgmt: fix case where rbuf->rsize may be uninitialized
    
    Detachers MUST set rsize properly.  This API is unfortunately fragile
    and will eventually be fixed to be more difficult to misuse.

commit 5027df50b5072d964f551414e259c2903778ea36
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-07-04 00:20:01 +0000

    build: fix LIB_CLOCK_GETTIME linkage on some toolchains
    
    According to the m4/clock_gettime.m4 documentation (from gnulib),
    the LIB_CLOCK_GETTIME variable should be added to a *LDADD variable
    and not AM_LDFLAGS.  This is also consistent with GNU automake
    documentation.
    
    Thanks to Cody Pisto for reporting this problem under Ubuntu 12.04.
    
    ref: http://www.gnu.org/software/automake/manual/html_node/Linking.html

commit 212cca976056069d49b120ab196c25e76315a427 (good)
Merge: cb6851f 93c14dd
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-25 22:43:48 +0000

    Merge branch '1.2-stable'
    
    * 1.2-stable:
      cmogstored 1.2.2 - minor maintenance release
      INSTALL: update versions and URLs
      INSTALL: clarify between starting from tarball vs git
      test/cmogstored-cfg: ensure TMPDIR is absolute for valgrind
      iostat_parser: allow '-' for device names
      alloc: posix_memalign does not set errno

commit cb6851fc69a3fb3d47e4e3a350787deef1bfafa6
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:39 +0000

    tests: fault-injection test for ENOSPC on epoll_ctl
    
    For difficult-to-trigger errors, fault injection is necessary for
    testing our error handling.  I have confirmed this test fails with
    "avoid leaks on epoll/kqueue resources exhaustion" reverted.

commit c1ced9e91ddc647a40f343d20d43cf13fe88eeba
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:38 +0000

    avoid leaks on epoll/kqueue resources exhaustion
    
    Simply releasing the descriptor triggering ENOSPC/ENOMEM errors from
    epoll_ctl and kevent is not good enough, as those descriptors may
    have other descriptors (e.g. files to be served) hanging off of them.

commit e12e70b6bd242cb3fea74d1df8b7b44e0a9f7f26
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:36 +0000

    introduce mog_yield wrapper around sched_yield/pthread_yield
    
    While pthread_yield is non-standard, it is relatively common and
    preferable for systems where pthreads are _not_ 1:1 mapped to kernel
    threads.  This also provides a stronger yield to weaken the priority
    of the calling thread wherever we previously used sched_yield.

commit a18a08a0e9a7c472656afc86cbbbfcefda5e456d
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:35 +0000

    call sched_yield repeatedly when terminating threads
    
    This should allow the threads we're terminating to more quickly
    enter a safe state where they're allowed to exit.  On SMP systems,
    we need to yield the signalling thread more times to increase the
    probability the interrupted thread can run (and exit).

commit df9729555394542064d1c9e9d1b67446bf36d3f3
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:32 +0000

    Makefile.am: fix systemtap probes.h distribution
    
    Our tests over-link (to save developer time :P), so we must
    link in probes with our tests.  Also, we must keep probes.h
    around for distclean (but not maintainerclean)

commit f159a33754215eac82b26912bce5592294f9a989
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:31 +0000

    shrink mog_packaddr and improve portability
    
    We cannot assume sa_family_t is the first element of "struct
    sockaddr_in" or "struct sockaddr_in6".  FreeBSD has a "sa_len"
    member as the first element while Linux does not.
    
    So only keep the parts of the "struct sockaddr*" we need and use
    inet_ntop instead of getnameinfo.  This also gives us a little more
    space to add additional fields to "struct mog_http" in the future
    without increasing memory (or CPU cache) use.
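
    A minimal sketch of the idea for IPv4 only.  The names packed_addr4,
    pack_addr4 and format_addr4 are hypothetical and do not reflect the
    actual mog_packaddr layout; the point is to copy fields by name (so
    the position of sin_family or sa_len never matters) and to format
    with inet_ntop:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical compact address: keeps only the fields needed for logging. */
struct packed_addr4 {
	struct in_addr addr; /* network byte order */
	in_port_t port;      /* network byte order */
};

/* Copy fields by name; never assume where sin_family sits in the struct. */
static void pack_addr4(struct packed_addr4 *dst, const struct sockaddr_in *src)
{
	assert(src->sin_family == AF_INET);
	dst->addr = src->sin_addr;
	dst->port = src->sin_port;
}

/* Format as "a.b.c.d:port" via inet_ntop; 0 on success, -1 on error. */
static int format_addr4(const struct packed_addr4 *pa, char *buf, size_t len)
{
	char ip[INET_ADDRSTRLEN];
	int n;

	if (inet_ntop(AF_INET, &pa->addr, ip, sizeof(ip)) == NULL)
		return -1;
	n = snprintf(buf, len, "%s:%u", ip, (unsigned)ntohs(pa->port));
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

    A caller would pack the address once (e.g. right after accept) and
    only stringify it in error paths.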

commit fe593c035d50efb5cee7ad10697172ee4072556d
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:30 +0000

    dist: include newly-added files to the tarball
    
    Tarballs were otherwise unusable.

commit 0b090760e82545b178cdb0b2d63bf03990fc0595
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:29 +0000

    replace pthreads cancellation with explicit checks
    
    Due to data/event loss, we cannot rely on normal syscalls
    (accept/epoll_wait) being cancellation points.  The benefits of
    using a standardized API to terminate threads asynchronously are
    lost when toggling cancellation flags.
    
    This implementation allows us to be more explicit and obvious at the
    few points where our worker threads may exit and reduces the amount
    of code we have.  By avoiding the calls to pthread_setcancelstate,
    we should halve the number of atomic operations required in the
    common case (where the thread is not marked for termination).

commit 328623972837345dbcf3ed372293201e3bc4fe3c
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:28 +0000

    "server aio_threads = XX" no longer requires malloc
    
    This should prevent one class of "accidental" failures.
    (The sidechannel was never meant to be secure or exposed
     to the public.)

commit 40f84cd0924958c619d434a9147e7ed2b6abaadc
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:27 +0000

    fdmap: do not warn on ENOTCONN due to unavoidable race
    
    A client may disconnect at any time, so shutdown may fail harmlessly
    with ENOTCONN.

commit 9f43d3eb8cf6a156108c714551a7eb68472e17a4
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:26 +0000

    fix "shutdown" over sidechannel with epoll_pwait
    
    The "shutdown" command needs to trigger EINTR when using
    epoll_pwait, otherwise the sleeping thread may not wake up properly.

commit 07569135228020880d8092d9aaf7d6325cc48d26
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:25 +0000

    do not rely on normal syscalls as cancellation points
    
    Cancellation with epoll_wait, accept4 (and accept) may cause events
    to be lost, as cancellation relies on signals anyways in glibc/Linux.
    
    So instead, we use signaling ourselves and explicitly test for
    cancellation only if we know we are interrupted and in a state where
    a thread can safely be cancelled.
    
    ref: http://mid.gmane.org/CAE2sS1gxQkqmcywQ07pmgNHM+CyqzMkuASVjmWDL+hgaTMURWQ@mail.gmail.com

commit ba8a3673a6ada7122c89e420455901b6b1288500
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:24 +0000

    avoid needlessly reinitializing common sigset_t
    
    This should hopefully save a few cycles and reduce stack
    usage slightly.

commit df50c675f127c876e8d74be522ddc858aa3795ef
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:22 +0000

    svc: make thr_per_dev per-svc instead of global
    
    We could eventually make this a tunable parameter, as it could
    be advantageous over a global aio_threads value.

commit d5a52618ca1f9b5d7f6998716fbfe7714f927112
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:21 +0000

    refactor handling of "server aio_threads = " command
    
    We're using per-svc-based thread pools, so different MogileFS
    instances we serve no longer affect each other.  This means
    changing the aio_threads count only affects the svc of the
    sidechannel port which triggered the change.

commit 03c2391078e19dc36ea62c75fa6745569b5cbef6
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:18 +0000

    define MOG_DEVID_MAX and MOG_PATH_MAX variables
    
    This improves maintainability in case MogileFS changes these
    limits.

commit 9312bf345a9329137652f91c079a38931211faba
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:17 +0000

    consistently check OOM from hash_initialize/hash_insert
    
    Both hash_initialize and hash_insert may return NULL to indicate
    allocation errors.  So implement a mog_oom_if_null helper function to
    destroy the process instead of attempting to continue and dereferencing
    NULL pointers.
    
    This may affect configurations with limited memory and lacking
    overcommit; but is unlikely to trigger given the small memory footprint
    of cmogstored.
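
    A rough sketch of such a helper.  The name oom_if_null_sketch and its
    signature are assumptions for illustration; the real mog_oom_if_null
    may differ:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: pass through a non-NULL pointer, otherwise
 * terminate the process instead of letting a NULL dereference happen
 * later in an unrelated code path.
 */
static void *oom_if_null_sketch(void *ptr)
{
	if (ptr == NULL) {
		fputs("out of memory\n", stderr);
		abort();
	}
	return ptr;
}
```

    Wrapping each allocating call, as in
    "tbl = oom_if_null_sketch(make_table());" with a hypothetical
    make_table(), keeps the check in one place.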

commit 1fab1e7a7f03f3bc0abb1b5181117f2d4605ce3b
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:16 +0000

    svc: implement top-level by_mog_devid hash
    
    This will allow us to lookup devices for per-(mog)device I/O queues.

commit 6357381200266f4c3e5d8f93403de987db95143c
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:15 +0000

    http_*: fixup long lines from automated conversion
    
    Lines longer than 80 columns aren't readable on my screen
    with gigantic fonts.

commit 89f0cf089b9e68730948ce652b42efaf26b98fd2
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:14 +0000

    parse out mogilefs devid in mgmt/http requests
    
    This will allow us to do lookups for IO queues/semaphores before
    we attempt to fstatat/stat a path.
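
    As an illustration only (the real parsing happens inside the
    mgmt/http parsers), extracting the devid from a "/dev<N>/..." path
    could look like the sketch below.  parse_devid and DEVID_MAX_SKETCH
    are hypothetical names, and the limit value is an assumption rather
    than the actual MOG_DEVID_MAX:

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Assumed upper bound for illustration; not taken from cmogstored. */
#define DEVID_MAX_SKETCH 16777215UL

/* Returns the devid from a "/dev<N>" or "/dev<N>/..." path, or -1. */
static long parse_devid(const char *path)
{
	static const char prefix[] = "/dev";
	const size_t plen = sizeof(prefix) - 1;
	unsigned long id;
	char *end;

	assert(path != NULL);
	if (strncmp(path, prefix, plen) != 0)
		return -1;
	path += plen;
	/* strtoul accepts leading spaces and signs, so demand a digit first */
	if (!isdigit((unsigned char)*path))
		return -1;
	errno = 0;
	id = strtoul(path, &end, 10);
	if (errno != 0 || id > DEVID_MAX_SKETCH)
		return -1;
	/* the number must be a whole path component */
	if (*end != '/' && *end != '\0')
		return -1;
	return (long)id;
}
```

    No filesystem access is needed, so the result can select a queue
    before any stat call is made.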

commit 2376ed3c3da3bd2c9e8326e7dd75be2188fffc35
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:13 +0000

    fix devices/thread count if sidechannel is inactive
    
    If the mogstored sidechannel is inactive (in HTTP-only mode), we should
    still count the number of devices correctly so that the number of
    worker threads scales properly.

commit e90b43119ff33fb591ffb3bc100cf847537ca5fb
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:12 +0000

    switch to per-svc (per-docroot) queues
    
    This simplifies code, reduces contention, and reduces the
    chances of independent MogileFS instances (with one instance
    of cmogstored) stepping over each other.
    
    Most cmogstored deployments are single docroot (for a single
    instance of MogileFS), however cmogstored supports multiple
    docroots for some rare configurations and we support them here.

commit 2acbe7f4001de74091282ee199e3cad50c2e3e7f
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:11 +0000

    thrpool: add comment explaining minimum thread count
    
    I forgot why this bound was necessary, so add a comment
    ensuring I do not forget again.

commit 10a38ab650e3e25e37dd70b310631760d0b2000f
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:10 +0000

    limit acceptors to reduce contention on large machines
    
    Having too many acceptor threads does not help, as it leads to
    lock contention in the accept syscalls and the EPOLL_CTL_ADD
    paths.  The fair FIFO ordering of _blocking_ accept/accept4
    syscalls also means we trigger unnecessary task switching and
    incur cache misses under high load.
    
    Limiting acceptors is safe because it has been almost impossible for
    the acceptor threads to be stuck on disk I/O since
    commit 832316624f7a8f44b3e1d78a8a7a62a399241840
    ("acceptor threads push directly into event queue").

commit 4d112de546a28b99d52435d4fed075f148455826
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:09 +0000

    update aio_threads count when new devices appear
    
    This will help ensure availability when new devices are added,
    without additional user interaction to manually set aio_threads
    via sidechannel.

commit 3d93bd96c92cedd583e14ea58b34bb143c4e9e87
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 03:34:08 +0000

    make mog_fd_get static, favor mog_fd_init
    
    mog_fd_init enforces setting the correct type, so relegate
    mog_fd_get to private usage inside fdmap.c

commit 97ed7a71d216eb4c6cbd1c40f2759e8d8957864a
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-21 11:17:47 +0000

    build: get the gnulib version via autogen.sh
    
    This is useful for:
    a) repeatably generating the same tarball off git
    b) diagnosing and tracking down (rare) gnulib bugs
    c) 3rd parties verifying we do not put malicious code into our tarballs

commit 0ad0f16bce2769a599eb718261e0283e79c57639
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-06-25 19:46:02 +0000

    mnt: attempt to match iostat output by st_rdev
    
    st_rdev matching is necessary for cases where the block devices
    are aliased (not via symlinks), and mountlist returns a different
    name for the device than what iostat uses.  This is the case for
    my cryptmount(8) setup, where /dev/mapper/FOO and /dev/dm-N refer
    to the same device (with matching st_dev and st_rdev numbers),
    but neither is a symlink to the other (nor are they hardlinks).
    
    stat() on block devices in /dev should always be fast and
    non-blocking, as /dev is expected to be non-networked on any
    reasonable system (at least those serving as a MogileFS storage
    node).

commit 93c14ddc0977b82718d8b70c0c0e8a297b8a4211 (tag: v1.2.2, origin/1.2-stable, 1.2-stable)
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-05-11 22:56:46 +0000

    cmogstored 1.2.2 - minor maintenance release
    
    This is a minor maintenance release, no need to upgrade unless
    a) your gcc defaults to -march=i386 (e.g. 32-bit CentOS 5)
    b) your device names include '-' (e.g. Linux device mapper users)
    
    There are also some minor doc updates to clarify tarball vs git
    installation and a trivial error-handling fix which should not
    affect any current users.
    
    Eric Wong (6):
          build: add check for GCC atomics
          alloc: posix_memalign does not set errno
          iostat_parser: allow '-' for device names
          test/cmogstored-cfg: ensure TMPDIR is absolute for valgrind
          INSTALL: clarify between starting from tarball vs git
          INSTALL: update versions and URLs
    
    cmogstored 1.3 will have some fairly intrusive internal changes
    and cleanups to make it easier for users to trace and diagnose
    system and network problems.

commit cdf2128a1e183e8abfa3d4fbf033c4fa46848898
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-05-11 13:57:51 +0000

    INSTALL: update versions and URLs
    
    libkqueue recently migrated to SourceForge and Debian 7.0 is
    the new stable.
    
    We still support Debian 6.0 and will likely support it for years to
    come since CentOS 5.x remains supported.
    (cherry picked from commit 86e5d10649f14fe3b3c8af37fd8ec04cc337fc9e)

commit 9d4347d5c8385fa93b6eb31045f7280a4a228c94
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-05-11 13:57:50 +0000

    INSTALL: clarify between starting from tarball vs git
    
    Users unfamiliar with autotools may not realize bootstrapping
    is required when building from git.
    (cherry picked from commit 1e80ba592ede05fe40b31686142f82294891afd0)

commit 86e5d10649f14fe3b3c8af37fd8ec04cc337fc9e
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-05-11 13:57:51 +0000

    INSTALL: update versions and URLs
    
    libkqueue recently migrated to SourceForge and Debian 7.0 is
    the new stable.
    
    We still support Debian 6.0 and will likely support it for years to
    come since CentOS 5.x remains supported.

commit 1e80ba592ede05fe40b31686142f82294891afd0
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-05-11 13:57:50 +0000

    INSTALL: clarify between starting from tarball vs git
    
    Users unfamiliar with autotools may not realize bootstrapping
    is required when building from git.

commit d698442186bfa7c1b35e68720412c9add422616c
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-05-06 23:45:37 +0000

    test/cmogstored-cfg: ensure TMPDIR is absolute for valgrind
    
    Our use of chdir in this test confuses valgrind which may
    create a temporary file.
    (cherry picked from commit dc801d4a4ded67d74f5306d6dad4aba629045cc8)

commit dc801d4a4ded67d74f5306d6dad4aba629045cc8
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-05-06 23:45:37 +0000

    test/cmogstored-cfg: ensure TMPDIR is absolute for valgrind
    
    Our use of chdir in this test confuses valgrind which may
    create a temporary file.

commit e247cd327850090dca3d500bc4abcafb3d098875
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-04-14 00:50:10 +0000

    iostat_parser: allow '-' for device names
    
    Linux device-mapper names show up as 'dm-0', 'dm-1' and so on.
    This allows users to store MogileFS files on encrypted devices
    using dm-crypt and perhaps other, similar tools.
    (cherry picked from commit 88d34b4686a650dba89674aa302ab13c78e8cef0)

commit 27c299a123597729d011b4ec205acb0e0bc48b83
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-04-15 19:29:31 +0000

    alloc: posix_memalign does not set errno
    
    We must set errno manually for die_errno() if posix_memalign fails
    (cherry picked from commit 8c79cf794f6178b6978743af99d498ca0b449fb1)

commit 0c918c095d8f611f8d0072db468e37683597ef01
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-05-06 22:35:06 +0000

    favor "struct mog_fd" for acceptors over int FDs
    
    There's no reason to be referencing FDs for these acceptors
    since they're infrequently accessed by svc, so this should
    make our internals more consistent.  This also removes our
    use of mog_fd_get (outside of test code).

commit f80c52cfe4e08fba39995830a3fcf5835d0bb846
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-05-06 22:20:05 +0000

    preliminary systemtap support for tracing
    
    We will key most client events by pid() and file descriptors,
    as this is least ambiguous.  There are some minor refactorings
    to pass "struct mog_fd *" around as much as possible instead of
    "struct mog_http *".

commit b60e0eebc4e108f63372f9a0ffe318589599728f
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-04-17 07:59:36 +0000

    http: minor debloat via better alignment
    
    This results in a small size reduction due to better alignment:
    
    $ ~/linux/scripts/bloat-o-meter cmogstored.before cmogstored.after
    add/remove: 0/0 grow/shrink: 2/2 up/down: 20/-56 (-36)
    function                                     old     new   delta
    mog_http_get_open                           1460    1476     +16
    mog_chunk_init                                65      69      +4
    http_forward_in_progress                      63      55      -8
    mog_http_parse                             27171   27123     -48

commit 354eae3bd113e66c863b384765d88680406ed633
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-04-17 03:45:24 +0000

    http_parser: do not differentiate between MD5 sources
    
    It does not matter if the Content-MD5 comes from the trailer or
    header, we process it the same way with the Ragel parser.
    This is obvious when reading our code (and associated hunk
    this commit changes) in http_put.c

commit 7b097d6129a7971197430d817682163adb8e2e8a
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-04-14 00:50:09 +0000

    save socket address on accept/accept4
    
    getpeername() does not work on unconnected sockets.  For error-handling,
    unconnected sockets are a fairly common occurrence, so we want to get
    the address early on when we know the address is still valid.
    
    For IPv4 addresses, this does not increase memory overhead at all.  IPv6
    addresses[1] do require an additional heap allocation, but it does not
    need to be aligned since it is infrequently accessed.  If IPv6 becomes
    common, we may need to expand our per-client storage to 192 bytes (from
    128) on 64-bit (or see if we may pack data more carefully).
    
    [1] IPv6 addresses are rare with MogileFS, as MogileFS does not
        currently support them.

commit 449b85daa42cae1b9542a26e6dd52a1db38cce93
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-04-14 00:50:12 +0000

    allow binding to IPv6 addresses
    
    MogileFS currently does not support IPv6, but maybe one day
    it will.  When it does, we'll be ready.

commit 29342bcd9864e4aabb9e6febef8748a5f51ac944
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-04-16 20:26:57 +0000

    wrap getnameinfo for consistency in error logging
    
    This will allow us to more easily handle error reporting for
    IPv6 addresses and allow for consistent formatting of
    stringified IP addresses.

commit 88d34b4686a650dba89674aa302ab13c78e8cef0
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-04-14 00:50:10 +0000

    iostat_parser: allow '-' for device names
    
    Linux device-mapper names show up as 'dm-0', 'dm-1' and so on.
    This allows users to store MogileFS files on encrypted devices
    using dm-crypt and perhaps other, similar tools.

commit 4d9a4f921c1a79d2d82aae3e104cac43537b1e2d
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-04-14 00:50:08 +0000

    potentially make the mog_sockaddr union smaller
    
    The generic "struct sockaddr" may be padded to be the same size
    as "struct sockaddr_storage" (which is what we were trying to
    avoid in the first place by using mog_sockaddr).  This change
    makes no difference on GNU/Linux.

commit 8c79cf794f6178b6978743af99d498ca0b449fb1
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-04-15 19:29:31 +0000

    alloc: posix_memalign does not set errno
    
    We must set errno manually for die_errno() if posix_memalign fails
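
    posix_memalign reports failure through its return value and leaves
    errno untouched, so an errno-based error reporter needs the value
    copied over.  A minimal sketch, with aligned_alloc_or_errno as a
    hypothetical name (not the function in alloc.c):

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper: returns aligned memory, or NULL with errno set
 * so callers may use errno-based reporting (perror, strerror, ...).
 */
static void *aligned_alloc_or_errno(size_t align, size_t size)
{
	void *ptr = NULL;
	int err = posix_memalign(&ptr, align, size);

	if (err != 0) {
		errno = err; /* the error code is the return value, not errno */
		return NULL;
	}
	return ptr;
}
```

    An invalid alignment (one that is not a power of two multiple of
    sizeof(void *)) yields EINVAL, and allocation failure yields ENOMEM.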

commit 9427f2989eae96106090d77ddff1656f8510957d (origin/attr)
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-03-19 09:33:47 +0000

    http: put parser-private attrs in a private struct
    
    This will allow easy use of memset to reset attributes in
    between requests without clobbering more important data.
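
    The layout idea, sketched with made-up fields (http_sketch and its
    members are hypothetical and do not mirror struct mog_http):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct http_sketch {
	struct {                 /* parser-private, reset between requests */
		unsigned method;
		unsigned has_md5;
		size_t content_len;
	} _p;
	int fd;                  /* must survive across requests */
	void *rbuf;              /* must survive across requests */
};

/* Clear only the parser-private part; everything else is left alone. */
static void http_reset_sketch(struct http_sketch *http)
{
	assert(http != NULL);
	memset(&http->_p, 0, sizeof(http->_p));
}
```

    Grouping the per-request state means new parser fields are reset
    automatically, without touching the reset function.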

commit cce7f3c33207c534f9e5a6c0cb389a97df21235b
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-03-08 10:21:38 +0000

    build: add check for GCC atomics
    
    Andrey Okunev noted undefined references on the MogileFS mailing
    list when building cmogstored 1.2.1 on his 32-bit CentOS5 machine.

commit 08b8d7f1e5101631f642134718871dd2ef24c1e5 (tag: v1.2.1)
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-03-04 01:25:09 +0000

    cmogstored 1.2.1 - fix graceful shutdown failure
    
    This release only fixes an assertion failure during graceful shutdown
    while MogileFS fsck is running with checksumming enabled.
    
    This only affects users running fsck with checksumming enabled during a
    graceful shutdown of cmogstored.  For upgrading cmogstored it is
    recommended to:
    
    1) stop fsck on the trackers (via "mogadm fsck stop")
    2) wait for all tracker queues to drain and stop sending
       fsck traffic to the affected host.  You may wish to
       "!want 0 fsck" on all your trackers and wait for the
       fsck workers to stop.
    3) upgrade cmogstored (in place upgrade works)
    
    There are also several code comment updates for internal
    components of cmogstored which may interest potential hackers.

commit bc82924e5f26f4d72b145185254f563526adb8f9
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-03-04 01:22:42 +0000

    TODO: add a few items for our roadmap
    
    We have a future!

commit f128eea752d51a566996043fd159da9be8d83597
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-03-02 10:56:17 +0000

    alloc: document use of TLS buffers
    
    tls_rbuf allows us to avoid nearly all dynamic allocation
    for common HTTP requests.  However, the mog_rbuf structure
    may be detached from TLS as necessary (and another one
    allocated in its place) when the need arises.

commit 20bcb2ccc3d3d38b0fc2f16c25cad74d8404d5bb
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-03-02 10:46:12 +0000

    fdmap: documentation for the FD-based memory allocation
    
    Avoiding heap allocations in common paths is important
    to high performance server design; document this important
    design decision.

commit adc750ab6600980ba98d77d371efb07b38886f30
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-23 20:41:33 +0000

    mgmt: fix fsck digest assert failure in graceful shutdown
    
    Items in the low-priority fsck queue could trigger an assertion failure
    during graceful shutdown due to improper handling of the MOG_NEXT_IGNORE
    state in mog_mgmt_quit_step().
    
    However, using the fsck queue in graceful shutdown (which is
    single-threaded) is probably a bad idea anyways, as the fsck digest
    could monopolize other requests.  So give no special handling to fsck
    digest queries during graceful shutdown.
    
    This only affects users running fsck with checksumming enabled during a
    graceful shutdown of cmogstored.  For checksums users, it is recommended
    to stop fsck from the trackers and wait for all tracker queues to drain
    before upgrading cmogstored (and using graceful shutdown on the old
    cmogstored).

commit 8757c6458e67e9ab20f9a049a9a68f51b3229816
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-23 01:17:09 +0000

    http_get: comment about snprintf() being a hot spot
    
    cmogstored is pretty fast, but it could be faster.

commit c81abd17fbbbb37c4df13771b485e139c8ab71d9
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-21 03:32:07 +0000

    queue_common: update comments to match code
    
    While we're at it, explain the use of cloexec.

commit f57064cc07d872583f50a04b2421f214304cc483 (tag: v1.2.0)
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-18 23:37:17 +0000

    document/reserve SIGWINCH/SIGHUP for future use
    
    Despite having an extensive test suite and minimal room for user
    error, giving users the option to back out of a hot upgrade may
    be worth supporting.

commit cbab5b9d18f13c22f6d94bdad2490e8d280ea927
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-18 23:30:30 +0000

    copyright comment updates for 2013 (part 2)
    
    Many files were missed the first time around in
    commit 37026af96dec638aa850d604003bf7218d90037d

commit d7a6fe7d93c2e7c771e99f7083d2a59d320da12f
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-18 23:27:13 +0000

    manpage: document SIGUSR2 upgrades
    
    This is a new feature and needs to be documented.

commit f5a6eb5faa0459d6ec4ac9255c0f24d4dbe73583
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-18 12:18:55 +0000

    move cmogstored_exit() prototype to cmogstored.h
    
    This fixes a missing prototype warning for cmogstored_exit()
    when checking exit.c with sparse.

commit 56cb260ed21561c2b84c1ca5dec8b25c738343c8
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-18 05:46:47 +0000

    queue_epoll: fix bad cast for epoll.event
    
    The events field of struct epoll_event is a uint32_t, not int.

commit 43d893ac7043ca69f2e93b987856e22cfa4a3978
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-18 05:46:46 +0000

    tests: add valgrind supp for epoll_ctl on 32-bit arch
    
    The epoll_event.data union is 64-bits on 32-bit systems while
    pointers are 32-bit.  We only use 32-bits of that union, but
    valgrind mistakenly complains about it (the kernel does not
    care about the user-supplied data union at all).

commit 92b8a2091414c0024fe9fd35aed6891308c9dc26
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-18 05:46:45 +0000

    ioutil: fix memory access error from mog_iou_write
    
    sizeof(buf) returns the size of the pointer if buf is a passed
    parameter, even if the function prototype dictates a fixed
    size for buf as we do in mog_iou_write.  While we're at it,
    make our mog_iou_write buf parameter const.
    
    This bug was introduced in:
    commit a960a351b2248a196c91cdbf6256f98e1bc2ef37
    "split iostat util% tracking from mountlist"
    and never affected an official release of cmogstored.
    
    This bug was caught while testing on a 32-bit GNU/Linux machine.
    My normal 32-bit FreeBSD 9.0 environment did not catch this as
    iostat on that platform only reports integer percentages and
    does not need more than 4 bytes.
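
    To illustrate the pitfall: an array parameter decays to a pointer,
    so the length has to come from a shared constant rather than sizeof
    on the parameter.  FIELD_LEN and copy_field are hypothetical names
    and the size is an assumption, not the value used by mog_iou_write:

```c
#include <assert.h>
#include <string.h>

enum { FIELD_LEN = 12 }; /* assumed fixed field size for this sketch */

/*
 * Both parameters are really pointers despite the [FIELD_LEN] notation,
 * so sizeof(src) here would be sizeof(const char *), i.e. 4 on a
 * typical 32-bit system.  Use the constant instead.
 */
static void copy_field(char dst[FIELD_LEN], const char src[FIELD_LEN])
{
	memcpy(dst, src, FIELD_LEN);
}
```

    In the caller, where the buffer is declared as a true array,
    sizeof still gives the full FIELD_LEN bytes.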

commit 719e4fc320e1978bc9ea6ee8be9f8249dcb54dab (next)
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-16 12:55:42 +0000

    handle pthread_create returning ENOMEM on old glibc
    
    Older glibc will return ENOMEM on mprotect() failures.  This bug
    was only fixed in 2011, so the long-term distros and old
    installations may not have the necessary backports.
    
    ref: http://www.sourceware.org/bugzilla/show_bug.cgi?id=386

commit 13cbdcea65248271668562064aafdcc9634ef9ce
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-15 11:04:55 +0000

    graceful handling of pthread_create EAGAIN failure
    
    pthread_create may return EAGAIN as a temporary failure;
    do not abort a running process if this is the case.
    
    For the initial mountlist scan, we must retry indefinitely for
    cmogstored to be usable.  However, with our thread pools, we can
    always run fewer threads (as long as there is at least one
    thread per-pool).

commit fcb41385271818586a162d02aeb23bc3414a602e
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-15 09:27:13 +0000

    test/http_idle_expire: hopefully improve test reliability
    
    This is a tricky test and doesn't always succeed, since
    it's hard to tell how many file descriptors glibc will
    use internally.

commit 476cc380a94db3355f818b2c798cdeeb0c626cc0
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-15 02:54:20 +0000

    sig: avoid pselect if ppoll is present in mog_sleep
    
    We want to favor ppoll over pselect, since ppoll is a better
    interface and we can have a slightly smaller binary with fewer
    dependencies.
    
    While we're at it, use mog_sleep(-1) as an alias for
    mog_selfwake_wait to further reduce binary size.

commit b7403080f0266ac41cecae80adcfa0391f3f93b7
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-15 02:43:51 +0000

    avoid racy sleep on fork failure in master process
    
    We need to atomically enable interrupts and sleep with
    the same syscall.  Fortunately, using pselect (through
    mog_sleep) allows that and is POSIX-compliant, so use
    that.
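
    The pattern, sketched under assumptions (setup_blocked_handler and
    interruptible_sleep are hypothetical names, not mog_sleep itself):
    keep the signal blocked during normal execution and let pselect
    install the unblocking mask only for the duration of the sleep, so
    a signal arriving just before the sleep cannot be missed:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <time.h>

static volatile sig_atomic_t got_signal;

static void on_signal(int signo)
{
	(void)signo;
	got_signal = 1;
}

/* Install the handler and block signo; it stays pending until pselect. */
static int setup_blocked_handler(int signo)
{
	struct sigaction sa;
	sigset_t block;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_signal; /* no SA_RESTART: the sleep fails with EINTR */
	sigemptyset(&sa.sa_mask);
	if (sigaction(signo, &sa, NULL) != 0)
		return -1;
	sigemptyset(&block);
	sigaddset(&block, signo);
	return sigprocmask(SIG_BLOCK, &block, NULL);
}

/* Returns 1 if a signal interrupted the sleep, 0 on timeout, -1 on error. */
static int interruptible_sleep(long sec)
{
	struct timespec ts;
	sigset_t unblocked;
	int rc;

	assert(sec >= 0);
	ts.tv_sec = sec;
	ts.tv_nsec = 0;
	sigemptyset(&unblocked); /* in effect only while inside pselect */
	rc = pselect(0, NULL, NULL, NULL, &ts, &unblocked);
	if (rc == 0)
		return 0;
	return (rc < 0 && errno == EINTR) ? 1 : -1;
}
```

    With a separate "unblock, then sleep" sequence, a signal delivered
    between the two calls would run the handler and then leave the
    process sleeping for the full timeout.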

commit 5629899a12649b9b21f41efc29b92adbd82afe6c
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-15 02:40:50 +0000

    mnt: inform user of slow mountlist scan
    
    This will inform the user of why cmogstored may be slow
    to start, since we need the mountlist to be populated at
    startup.
    
    We also throw a pthread_cancel() in there to load libgcc_s under
    glibc, so we can avoid loading libgcc_s once we're under FD pressure.
    This makes test/http_idle_expire.rb more reliable.

commit 44f4f76d06899b1a0e4719671a4fde3c0851764a
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-14 11:32:11 +0000

    test/http_range: do not allow webrick to perform lookups
    
    DNS lookups cause webrick tests to fail or timeout.  Our
    tests should not have external network dependencies.

commit cfe689f85b0b39d1f3b3e21d9b564d34b2146d88
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-14 10:26:17 +0000

    inherit: avoid DNS lookup on upgrade
    
    A typo caused unnecessary DNS lookups when inheriting sockets.
    While we're at it, fix another typo in the error message, too.

commit f8b30b2846c25461940c99d8fd4432ec49920098
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-14 05:21:27 +0000

    selfwake: use epoll_pwait on Linux instead of eventfd
    
    This saves us a file descriptor in Linux, which provides
    epoll_pwait in 2.6.19+ (and ppoll for 2.6.18, the oldest
    kernel we support).

commit 4ccf06a600ce31c6dbd61d9c44b491233758c18b
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-11 09:48:20 +0000

    mnt: revert to mutex for protecting by_dev hash
    
    Since we now update future copies of by_dev offline and only
    need a lock to swap in the new one, contention for by_dev should
    be less of a problem than it was before.  This should make
    reader-writer locks an unnecessary risk.
    
    Reader-writer locks are riskier since writer starvation can
    potentially be an issue with many readers.

commit f54e27e0ec0a520c0a079d6e8428eeefdcd366ab
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-11 02:07:41 +0000

    test/mogilefs_integration: increase test reliability
    
    Use SO_REUSEADDR, since Linux requires that both the new program
    (cmogstored) and this test set SO_REUSEADDR for it to be
    effective.
    
    Also, minimize the window for port conflicts.  While there are
    hard-to-avoid race windows for conflicts when binding random
    ports, we can minimize those windows by holding those ports open
    in the parent as long as possible.

commit 384c801cced851e782cbe94b548a31b1deaa70f3
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-11 01:25:16 +0000

    kqueue: update NOTIFYRD -> SELFWAKE
    
    This was missed in the earlier changes to allow eventfd
    usage under Linux instead of using a notification pipe.

commit ba24aa82b1c9306e0053089296741f028fafa148
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-11 00:51:06 +0000

    fix signal races when master process is used
    
    In the absence of a pselect/ppoll-like version of waitpid;
    we must use a selfwake descriptor (pipe or eventfd) to
    wake the master up whenever a signal is received.
    
    So wait on the selfwake descriptor and always run waitpid
    with WNOHANG in a loop to ensure all children are reaped.
    
    The: mog_intr_disable(); waitpid(); mog_intr_enable()
    sequence was completely stupid; I can't believe I wrote it.
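    
    A minimal sketch of the reaping loop described above, assuming
    it runs after the selfwake descriptor reports a wakeup.  The name
    reap_all_children() is hypothetical and the per-child handling is
    left out.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: reap every child that has already exited,
 * without blocking.  Returns the number of children reaped.
 */
int reap_all_children(void)
{
	int reaped = 0;
	int status;
	pid_t pid;

	for (;;) {
		pid = waitpid(-1, &status, WNOHANG);
		if (pid > 0) {
			reaped++; /* a real server would inspect status here */
			continue;
		}
		if (pid < 0 && errno == EINTR)
			continue;

		/* 0: children exist but none have exited; <0: no children */
		assert(pid == 0 || errno == ECHILD);
		return reaped;
	}
}
```

    Looping matters because several SIGCHLD deliveries can coalesce
    into a single wakeup.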

commit 5537c96848f483d403da1ed663809681e7b09f3b
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-10 11:37:34 +0000

    allow self-wakeup to use eventfd under modern Linux
    
    eventfd uses fewer resources than a pipe, so create less
    overhead for our users by using eventfd instead of a pipe.
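    
    A minimal, Linux-only sketch of an eventfd used as a self-wakeup
    descriptor.  The selfwake_*() names are hypothetical and do not
    mirror the cmogstored API; a pipe needs two descriptors for the
    same job, while this needs one.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* create a non-blocking, close-on-exec wakeup descriptor */
int selfwake_create(void)
{
	return eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
}

/* add one wakeup; write(2) is async-signal-safe */
int selfwake_trigger(int fd)
{
	uint64_t one = 1;

	assert(fd >= 0);
	return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* return the number of wakeups since the last drain (0 if none) */
uint64_t selfwake_drain(int fd)
{
	uint64_t n = 0;

	assert(fd >= 0);
	if (read(fd, &n, sizeof(n)) != (ssize_t)sizeof(n))
		return 0;
	return n;
}
```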

commit 955991aae8c3da5a13e34e929188db3fd9216a0e
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-10 23:17:41 +0000

    pidfile: delay unlink of old file on aborted upgrades
    
    We don't want to be without any pidfile if writing the new
    pidfile fails.

commit 975a329912818b49f04de15349f6414719430808
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-10 20:39:56 +0000

    upgrade: do not disable interrupts in forked child
    
    The child disables interrupts right away, so there's no
    reason to enable interrupts temporarily.

commit 2163a4c6f09a9813a0e69a9533923623d448dce9
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-09 23:09:14 +0000

    test/upgrade: more thorough PID file checking
    
    We need to ensure the PID file is non-empty, not just
    that it exists.

commit 7d56b023d2aac8530b249b2db7d90a738297a6fc
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-09 23:06:41 +0000

    prioritize upgrade before exit in main loop
    
    If we receive both SIGUSR2 and SIGQUIT in a short
    time period, we should trigger the upgrade before
    graceful exit, as no user will (intentionally) send
    SIGQUIT before SIGUSR2.

commit 3f454ae96e7cc1352f7bf7756a064cf5781154c4
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-09 23:05:23 +0000

    test/upgrade: teardown more careful about killing
    
    We don't want to accidentally kill ourselves by targeting
    PID=0 if the PID file is empty.

commit b96d1018ae5261d8ee9344b959acb04c1be43279
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-09 07:04:13 +0000

    tests: fix several Ruby warnings
    
    Unused variables and unset Content-Type for Net::HTTP requests

commit 7d740e5825e05030b5978ed296fd0b801666b405
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-09 07:03:02 +0000

    test/inherit: fix Ruby 2.0.0 close-on-exec compatibility
    
    FD inheritance from exec() must be done explicitly in Ruby 2.0

commit e427fb773837953c01ebe8dfaf8f8679c7895fc2
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-08 08:48:36 +0000

    mnt: move stat/lstat logic to mnt_usable
    
    This centralizes the mountpoint suitability logic in
    one place.  In the future, it may also allow us to
    parallelize the work of scanning filesystems.

commit 223adf17682765f9e72d3436348700085d823a6e
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-08 03:02:18 +0000

    upgrade: fix env placeholder for valgrind
    
    Having a NULL at the beginning of the list caused
    iteration in the destructor to stop, allowing valgrind
    to detect a memory leak.

commit d8ac1b937647b86342135a91abe933a3f8812909
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-08 02:54:54 +0000

    cfg: require PATH to be set for --daemonize
    
    Maybe some weird users do not have PATH

commit bd37ad7bfae8c9b25a9eef1e1ce9b7c17d1f5257
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-08 02:27:03 +0000

    upgrade: avoid non-async-safe functions in child
    
    execvp may malloc internally in its path lookup, so use
    find_in_path to perform this lookup in the parent instead.
    Additionally, putenv() may not be async-signal-safe either,
    but execve is, so use execve.

commit 117a11e9e2b8a365df90336ae78b61f6562b7bd3
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-07 22:02:41 +0000

    cfg: disallow trailing ':' in PATH with daemonize
    
    Trailing ':' in PATH means using the current directory, which
    is now incompatible with daemonize.

commit 4f45f562180489a97a4572ebd3822e9f15289bd6
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-07 20:29:47 +0000

    upgrade: avoid potential deadlock from post-fork mutex use
    
    Pthreads implementations do not require mutexes to be in a
    consistent/usable state in a forked child.
    Since we don't need the mutex in a single-threaded forked
    child, we can just skip it and avoid reinitializing it
    entirely.

commit 315487f70c90b117aa4e9d63bbb21abae8af80ab
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-07 19:50:49 +0000

    rename fs_usable to mnt_usable
    
    It should be clearer this code is only called from inside
    mnt.c and not fs.c (the latter is for general filesystem
    operations, not operations on a mount point).

commit c3550946c61a43cad54f1aa7c0f0f062a451042f
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-07 11:00:10 +0000

    release memory allocated for upgrade at exit
    
    This is not strictly necessary as this memory is freed anyways,
    but stop valgrind from complaining and avoid unnecessary
    suppressions (since shutdown performance is not important).

commit 03bf577eb3a328f130083d992b180ba72ee1f0b4
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-07 08:36:47 +0000

    forbid relative paths with daemonization
    
    Relative paths are incompatible with daemonization, as it
    does not work for SIGUSR2 upgrades (since daemonize forces
    the server to run in "/").  Relative paths are confusing
    and error-prone anyways, so do not allow users to specify
    them along with --daemonize.

commit 5b7d2608f95332c3bcd69d1eb56044236fc6b978
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-07 09:30:17 +0000

    omit trailing newline from die() and warn() calls
    
    The GNU error() function already emits a newline at
    the end of these messages.

commit 9510a1d244e22725e2710d04571e3d6fbf89b0e8
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-07 09:33:37 +0000

    favor error.h GNU system header over gnulib one
    
    error.h is available on GNU/Linux (and presumably GNU/kFreeBSD
    and GNU/Hurd), so favor that system-wide header over the gnulib
    one.

commit 26432ee7d0cf7f94fdc62804611cdbc7c5ec960c
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-06 02:44:05 +0000

    remove warn module and alias it to error() in gnulib
    
    There is no need to maintain our own code for this.

commit 930da6932ae96b3c5f40324b9f24fc6415f3e500
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-02-06 02:42:58 +0000

    queue_epoll: change fprintf(stderr, ...) to use warn()
    
    This makes it easier to alter how/if we write to stderr.

commit 7abd078c4f7e61e87f9394c6662be027fe0253b2
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-31 23:10:50 +0000

    ioutil: avoid assigned but unused variable
    
    Noticed with gcc 4.7.2 in Debian testing (4.7.2-5)

commit c0931fd23e065521237530cd6f9f6068f259e4e1
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-31 06:47:41 +0000

    cmogstored: initialize syslog before inheriting
    
    This ensures the: inherited $ADDRESS:$PORT on fd=...
    messages are prefixed with the PID in logs.

commit 459d163514766653d28d8964a7e2e25d27f7c873
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-31 06:46:39 +0000

    cfg: daemonize is a boolean, not an integer
    
    This project uses C99 features (and some GNU extensions),
    so bool is usable.

commit dffe6b3dc226cafb0a6107443f9d7e23095dd789
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-31 04:20:35 +0000

    sockaddr*-related data structure size reductions
    
    We do not need all the weight of sockaddr_storage or NI_MAXHOST.
    cmogstored currently only supports IPv4 and IPv6[1] and (like
    any respectable server) will not perform reverse DNS lookups.
    
    This allows us to reduce our stack usage in some places and
    keep caches hotter.
    
    [1] MogileFS does not even support IPv6 yet

commit d68b0ad231192c6ccf701e66be66b6bc956bed2b
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-31 03:52:20 +0000

    minimize interrupt windows for master process
    
    Code is easier to follow when interrupts occur at well-defined
    points.  The worker processes (and master-less standalone) already
    follow this.

commit 088138b235e79fa54a4e3602a4d60975e9581571
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-30 02:55:29 +0000

    implement nginx-style binary upgrade via SIGUSR2
    
    USR2 now forks a new cmogstored process which inherits
    listener file descriptors from the parent.  The parent
    renames its pidfile with a ".oldbin" suffix so the new
    child can use the new PID file.
    
    Clusters may now upgrade to future versions of cmogstored
    without needing to mark hosts down via mogadm.
    
    The behavior of this process should match that of nginx:
    http://wiki.nginx.org/CommandLine#Upgrading_To_a_New_Binary_On_The_Fly

commit 2b252bb6b4704be01d629194aff588b24d579cdd
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-26 00:19:59 +0000

    refactor process management
    
    To support transparent upgrades, we need to be able to reap
    child processes regardless of what the child process was.  So we
    must do away with the iostat/worker-specific waitpid() calls and
    use waitpid(-1) to cast a wide net to reap anything and
    everything.
    
    When we support transparent upgrades, the fork+exec-ed child
    process may die, so the main process (master if
    --worker-processes are used) needs to be capable of reaping
    that new process.

commit e292e0e874a064fcd39f76565f38935449b7f7c8
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-24 12:26:47 +0000

    inherit: preliminary FD inheritance over exec()
    
    This lets us inherit listen sockets from the parent
    process.  In the future, this will allow transparent
    upgrades.

commit 80aef1b3c8e9a20ec047dcf040e594a5e2a23811
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-24 03:28:51 +0000

    move graceful exit functionality into its own file
    
    No need to clutter up the main file with graceful exit
    functionality.

commit 25d7a82d9851c204e6ca47ab8af35fdab9bbd37c
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-24 03:21:05 +0000

    move pidfile preparation function out
    
    cmogstored.c is too big; we can move pidfile functionality
    out to pidfile.c easily.

commit 852ca09524c17dc15ab68a1f85cee008c22a3a76
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-31 00:59:34 +0000

    determine mount point usability via statfs/statvfs
    
    Filesystems with no block size are unusable, so avoid
    stat()-ing them and potentially having problems with
    our subsequent stat() stalling when a network connection
    slows (or goes) down.

commit 9f7c1f7b8326a03c2105328f50df3ce2099de1d5
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-31 00:25:24 +0000

    better error handling when faking memstream
    
    For systems without memstream support, using temporary files to
    emulate memstream opens us up to more common (than ENOMEM)
    errors such as: EIO, ENOSPC, ENFILE and EMFILE.
    
    Since we don't want our server to die completely on these
    (sometimes temporary) error cases, we'll just stop publishing
    iostat data to "watch" subscribers.

commit c25d3846d96ad761e0e71304903dec79ca56424d
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-30 23:02:47 +0000

    mnt: allow concurrent readers on mount list
    
    On refresh, we can (slowly) build a new mount list and replace
    the old one quickly and atomically.  This prevents ioutil
    reader/writers from waiting on slow mount list refreshes;
    as the mount list does not change frequently.
    
    This increases windows where iostat utilization may be
    read/updated, especially when network mounts are temporarily
    unreachable or slow.

commit a3ca090b4b01d44e674f4db5cb13f5d111d0aa32
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-30 21:57:40 +0000

    mnt: cleanup/document mountlist storage/nesting
    
    Moving ioutil out to a separate table allows us to discard our
    private mog_mntent struct.  Data structure simplification also
    allows code simplification.

commit a960a351b2248a196c91cdbf6256f98e1bc2ef37
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-30 11:59:58 +0000

    split iostat util% tracking from mountlist
    
    This prevents us from losing iostat utilization each time the
    mount list is rescanned.
    
    Additionally, this allows us to read iostat utilization (and
    write to sidechannel clients) concurrently while the mount list
    is being refreshed.

commit 2e1958b26b926c42f213ba47b71ec735d81448e7
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-30 10:18:40 +0000

    consistent allocation size for iostat utilization
    
    This is better than open-coding a length everywhere.

commit 45843883077bc0a4ef745d03c3b4241278463f3b
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-30 02:48:44 +0000

    test_helper: expand relative paths in $PATH
    
    --daemonize will chdir("/"), so relative paths must be expanded
    for USR2 upgrades to work.

commit af8fe00640d70aab2e44110c88e4772e4daf4f68
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-30 02:48:45 +0000

    test/mogilefs_integration: reduce chance of socket conflicts
    
    Disable SO_REUSEADDR to avoid recently-used ports.  Additionally,
    only close (via dereference) sockets when all listener sockets
    are bound.

commit 25740ec5e7e1a13e48552b4c6b0e60e8730641ea
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-31 00:59:00 +0000

    move MOG_STR() macro to util.h
    
    We shall use it outside of defaults.h

commit 6ed94e8b4a9219b10f8132078ba2882fddae40f9
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-24 01:21:52 +0000

    mnt: avoid recursion in mount_entry_free
    
    Insane mount point aliasing may result in stack depth explosion.

commit 24a1f80ca96d2c12f0cefaca4f7040dbc4f07919
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-25 20:55:33 +0000

    limit --worker-processes to UINT_MAX
    
    UINT_MAX worker processes should be more than enough for anyone.

commit dff57bf1b16a9435428c57ae26e6a3701f8d2ea9 (tag: v1.1.0)
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-17 21:25:22 +0000

    queue_epoll: fix version check for 2.x kernels
    
    Oops, we could accidentally mark a 2.x kernel as non-buggy.
    2.6.32 and 2.6.34 may eventually get backports.

commit e47a1fe799edc981272cb66a7f52f11d50826a9b
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-17 20:13:47 +0000

    http: avoid MSG_MORE on HEAD responses
    
    We need to signal we do not have more bytes to write to the
    socket when generating HTTP HEAD responses.  This avoids a
    200ms delay between HTTP responses.  This regression only
    appeared in commit 14e0684507c06439ee9c7a731fd6ca90b7b9adcb
    and was never in a release.

commit 1074f0d2c55fee0de4f2ceba2829f6a3e12ce845
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-17 10:55:13 +0000

    tests: additional test for trysend buffering in Linux
    
    This code is hard to exercise in the real world since
    we only send tiny HTTP headers with trysend.

commit be2062a5eb21718c932aaa4d49685e36763842ed
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-17 06:42:18 +0000

    close: ignore ECONNRESET errors (for FreeBSD, maybe others)
    
    FreeBSD (and possibly other BSDs) returns ECONNRESET on close().
    The descriptor still seems to get released and eventually become
    usable again; so retrying close() is dangerous as we allocate
    file descriptors from multiple threads.

commit dc7c6efeac5fb0fb1f44e7f3cee625a6c33f7e26
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-17 02:51:52 +0000

    http_date: time_t pointer is const
    
    This pointer is passed to gmtime_r(), which also takes a const
    time_t *.  This makes it easier for users of mog_http_date() to
    know the timep parameter does not get altered in any way.
    
    I noticed this discrepancy when rereading http_get.c for the
    first time in a few months.

commit 88b5f33f1f7e79d799da34b350f4dc59b875cf40
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-17 02:34:56 +0000

    lazily call mkdir for file creation
    
    There's no need to waste time creating/checking directories
    which already exist.  Since directories tend to hold multiple
    files, we can optimistically run open()/openat() and only call
    mkdir()/mkdirat() on ENOENT.
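    
    A minimal sketch of the optimistic open, assuming a single
    missing directory level.  open_lazy_mkdir(), its parameters and
    the file modes are hypothetical, not the cmogstored function.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h> /* close() for callers */

/*
 * Hypothetical helper: open `path` (which lives directly inside `dir`)
 * for writing, and only create `dir` if the first attempt fails
 * with ENOENT.  Returns a file descriptor, or -1 with errno set.
 */
int open_lazy_mkdir(const char *dir, const char *path)
{
	int fd;

	assert(dir != NULL && path != NULL);

	/* common case: the directory already exists, one syscall total */
	fd = open(path, O_CREAT | O_WRONLY, 0644);
	if (fd >= 0 || errno != ENOENT)
		return fd;

	/* another thread may win the race to mkdir, so accept EEXIST */
	if (mkdir(dir, 0755) != 0 && errno != EEXIST)
		return -1;

	return open(path, O_CREAT | O_WRONLY, 0644);
}
```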

commit c57ff63768cedd72abf74e93a3ba59070de77357
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-17 01:54:32 +0000

    trywrite: build fix for platforms without MSG_MORE
    
    Oops...

commit 0e9a8e156f0a060c7822069c4f69eea3710c793c
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-17 01:37:17 +0000

    simplify TCP_NOPUSH support code (remove TCP_CORK)
    
    Since we no longer use TCP_CORK under Linux (where we use
    MSG_MORE instead), we can cleanup the nomenclature and avoid
    confusing people by mentioning TCP_CORK.

commit 14e0684507c06439ee9c7a731fd6ca90b7b9adcb
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-17 01:11:04 +0000

    linux: favor send() w/MSG_MORE over TCP_CORK
    
    This saves several setsockopt() syscalls and reduces system
    CPU usage.
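    
    A minimal, Linux-only sketch of the difference: MSG_MORE is
    passed per send() call, so no setsockopt() is needed to cork and
    uncork the socket.  send_response() is a hypothetical helper; it
    does not retry short writes and is not the cmogstored code.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: send a header followed by an optional body.
 * MSG_MORE on the header tells the kernel that more data follows,
 * so it may hold the header back and coalesce it with the body.
 * When there is no body, MSG_MORE is omitted so nothing is delayed.
 * Returns the total bytes queued, or -1 on error.
 */
ssize_t send_response(int fd, const void *hdr, size_t hdr_len,
		      const void *body, size_t body_len)
{
	ssize_t h, b;

	assert(hdr != NULL && hdr_len > 0);

	h = send(fd, hdr, hdr_len, body_len > 0 ? MSG_MORE : 0);
	if (h < 0 || body_len == 0)
		return h;

	b = send(fd, body, body_len, 0);
	return b < 0 ? -1 : h + b;
}
```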

commit 37026af96dec638aa850d604003bf7218d90037d
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-17 01:17:48 +0000

    copyright comment updates for 2013
    
    gnulib did it for us in m4/gnulib-cache.m4, we'll match.

commit 96b173c5516ad56ad2ea41d99406d58998be88b2
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-17 00:37:29 +0000

    http_get: disable FADV_SEQUENTIAL for small responses
    
    So there's no need to waste a syscall on small reads which
    would not benefit from readahead at all.  Using 256K as a
    threshold for "small" reads, which is twice the normal
    max readahead window on modern Linux 3.x.

commit 17e2cca675df298c6d99ee2b7f0e099c02eb271c
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-17 00:26:28 +0000

    epoll: update EPOLL_CTL_MOD workaround for stable kernels
    
    Linux v3.2.37+ does not need this fix.
    Linux v3.0.59+, v3.4.26+, v3.5.7.3+, v3.7.3+ are all under
    review and will not need this.

commit 6c4e0c408de81ceb49fe9dac2ad38a1655aea625
Author: Eric Wong <normalperson@yhbt.net>
Date:   2013-01-02 11:19:09 +0000

    epoll: avoid EPOLL_CTL_MOD bug in Linux <= 3.7.1
    
    On SMP machines, EPOLL_CTL_MOD had a race condition under
    Linux <= 3.7.1.  This allowed events to be missed if it
    arrived near the time the EPOLL_CTL_MOD request was issued.
    
    ref: linux.git commit 128dd1759d96ad36c379240f8b9463e8acfd37a1

commit 86477eafdadda573b850cf610a60c5e6eab1c189
Author: Eric Wong <normalperson@yhbt.net>
Date:   2012-12-13 09:29:25 +0000

    Rakefile: fix Regexp encoding issues under 1.9