cmogstored news
---------------

1.8.1 / 2021-02-13 02:25 UTC
----------------------------

  This release fixes a segfault on some systems/toolchains where our
  per-thread stack size was too small. Given the prevalence of 64-bit
  systems nowadays, using a small stack is unlikely to yield any
  benefits.

  Users on 32-bit systems who wish to continue with a minimal stack
  should use "ulimit -s" in startup scripts or configure their process
  manager appropriately (e.g. setting the "LimitSTACK" directive
  described in systemd.exec(5)).

  Thanks to Xiao Yu for reporting and testing at our public mailbox:

    https://yhbt.net/cmogstored-public/CABfxMcW+kb5gwq3pSB_89P49EVv+4UkJXz+mUPQTy19AdrwbAg@mail.gmail.com/T/

1.8.0 / 2020-08-13 20:54 UTC
----------------------------

  devXXX/usage files are emitted properly for systems where the mount
  point can't be resolved. This is needed for multi-device filesystems
  such as btrfs.

  PUT and DELETE requests now update the in-memory representation of
  these devXXX/usage files, since updating only every 10s may be too
  infrequent for high-traffic situations.

  There is a new "USAGE FILES" section in the manpage documenting the
  changes from 1.7.0 and this release.

  Our public mail archives are now available over IMAPS.

  gnulib is updated to 4e082bffbcc46e68 in the tarball.

1.7.3 / 2020-03-22 00:12 UTC
----------------------------

  Improve RFC 7230 conformance w.r.t. Content-Length and
  Transfer-Encoding handling in PUT requests. We now favor
  "Transfer-Encoding: chunked" if a Content-Length header is also
  present. Furthermore, we no longer accept Transfer-Encoding values
  aside from "chunked", since we don't support gzip/compress/deflate
  as described in RFC 7230.

1.7.2 / 2020-02-19 01:30 UTC
----------------------------

  s/bogomips.org/yhbt.net/ in all documentation, due to bogomips.org
  expiring.

  The tarball is also updated with the latest gnulib changes.
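The 1.7.3 note above describes how PUT requests are framed when both
Content-Length and Transfer-Encoding are present. The following Python
sketch illustrates that decision only; it is not the cmogstored parser
(which is written in C), and the function name, the dict of lower-cased
header names, and the return values are assumptions made for this
example.

```python
def put_body_framing(headers):
    """Decide how the body of a PUT request is delimited.

    `headers` is a hypothetical dict mapping lower-cased header names
    to their values.  Returns ("chunked", None), ("length", n) or
    ("reject", reason).
    """
    te = headers.get("transfer-encoding")
    cl = headers.get("content-length")
    if te is not None:
        # Only "chunked" is accepted; gzip/compress/deflate are not.
        if te.strip().lower() != "chunked":
            return ("reject", "unsupported Transfer-Encoding")
        # Chunked framing wins even if Content-Length is also present.
        return ("chunked", None)
    if cl is not None:
        value = cl.strip()
        if not (value.isascii() and value.isdigit()):
            return ("reject", "bad Content-Length")
        return ("length", int(value))
    return ("reject", "no body framing")
```

A request carrying both headers is treated as chunked, and a
"Transfer-Encoding: gzip" request is rejected rather than decoded.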
1.7.1 / 2019-05-12 00:46 UTC
----------------------------

  The Linux kernel bugfix for epoll_pwait should hit mainline and
  stable kernels soon, but there is no reason for us to care whether
  errno is EINTR or not. cf.

    https://lore.kernel.org/lkml/20190427093319.sgicqik2oqkez3wk@dcvr/
    https://lore.kernel.org/lkml/20190507043954.9020-1-deepa.kernel@gmail.com/

  There are also some minor build/test updates since v1.7.0
  (2018-12-18):

      test/mgmt-usage.rb: fix mismatched indentation warning
      add .gitattributes for Ruby files
      test/mgmt_auto_adjust.rb: improve diagnostic messages
      .gitignore: add extra ignores for gnulib in Debian 9
      notify.c: workaround epoll_pwait bug in current Linux 5.0/5.1
      doc: remove mailing list subscription info

1.7.0 / 2018-12-18 04:01 UTC
----------------------------

  The big feature in this release is that "devNNN/usage" files are
  served from memory, allowing up-to-date usage information even on
  unwritable/unreadable filesystems. This can also be used to reduce
  spinups and wear on HDDs.

  "devNNN/usage" files are still updated on the FS by default for
  compatibility with existing HTTP servers, but admins may wish to
  disable updates to them by removing all permissions from the
  "usage" files:

    chmod 0000 $MOG_DOCROOT/dev*/usage

  Filesystem errors from the sendfile(2) syscalls are also logged to
  syslog.

  There's also a bugfix for zombies for libkqueue-on-epoll users, but
  that doesn't affect native kqueue users on *BSDs.

  And the usual round of gnulib, minor doc and style updates.
  18 changes since v1.6.0:

      cmogstored.h: remove unused mog_file.mmptr member
      doc: documentation for ioq
      doc: further comment updates around ioq
      build-aux/txt2pre: support '=' in URLs
      test/inherit: fix ambiguous parenthese warning
      test/inherit: stop testing Ruby itself
      doc: update URLs to HTTPS
      compat_sendfile: ensure this works without an offset
      doc/queues.txt: add key point about only retrieving ONE event
      fix trace.h dependency on probes.h
      update to gnulib.git 90f289f249a266b1afb9c63e182f5d979d17df5f
      http_get.c: log filesystem-level errors from sendfile
      serve /dev*/usage requests from memory
      doc: URL updates to reduce redirects and favor HTTPS
      test/inherit.rb: fix syntax error under Ruby 1.8
      update copyrights for 2018 and use SPDX for "GPL-3.0+"
      selfwake: enable self-pipe with kqueue
      http_parser: workaround parsing OOM in Ragel 6.10

1.6.0 / 2016-08-31 03:14 UTC
----------------------------

  There are minor robustness fixes for handling memory allocation
  errors and spawn failures on otherwise-hosed systems. These
  bugfixes will not affect real users unless the system is already
  hosed or badly overtaxed, so there's no real need to upgrade.

  There are minor portability improvements and I now test under
  FreeBSD 10.x.

  The iostat test cases are relaxed a bit to account for virtualized
  devices (as iostat is less useful with modern

  17 changes since 1.5.0 (Nov 2015):

      Rakefile: add missing
      for Atom feed
      test/pwrite-wrap: remove unused variable and comment
      test/pwrite_wrap: squelch unnecessary output
      test/pwrite_wrap: reduce space overhead required
      update copyrights for 2016
      build-aux/txt2pre: drop CGI.pm requirement
      stdin is always redirected to /dev/null
      minor vfork/fork safety fixes
      process: try to handle OOM gracefully
      http_put: gracefully handle path allocation errors
      iostat_process: declare environ extern
      test/mgmt: relax checks for iostat mapping
      gnulib copyright update for 2016
      upgrade: avoid syslog call if execve fails
      rely on gnulib for environ portability
      INSTALL: update latest Debian stable version to 8.x
      README: stop mentioning cgit

1.5.0 / 2015-11-21 01:33 UTC
----------------------------

  A bunch of minor changes; most notable is systemd-style socket
  activation support. This was easy to add since we've always had
  socket activation support for nginx-style SIGUSR2 upgrades. This
  places no link or runtime dependency on libsystemd, so the
  LISTEN_FDS and LISTEN_PID environment variables may be used in
  other init systems as well. While I have my own reservations about
  systemd itself, I also strongly believe in using socket activation
  to prevent downtime.

  Existing behavior with CMOGSTORED_FD (used for SIGUSR2 upgrades) is
  now documented in the manpage and will always be supported.

  We've also added vfork support for Linux systems, allowing faster
  spawning of iostat if malloc is using too much memory.

  Behavior changes:

  Bad Range: headers return 416 responses in more cases for invalid
  ranges (e.g. miscalculated ranges such as "1--1"), while completely
  wrong ones (lacking a "bytes=" prefix) are ignored entirely, as in
  nginx.

  Bugfixes:

  There are also some cleanups to avoid dying on OOM in more places
  on weird systems which trigger OOM. More work on this is ongoing.
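The socket activation support described in the 1.5.0 notes relies only
on the LISTEN_PID and LISTEN_FDS environment variables. The Python
sketch below shows how a process can interpret those variables under
the systemd convention, where inherited descriptors start at 3. It is
an illustration, not cmogstored's C implementation; the function name
and its parameters are assumptions made for this example.

```python
import os

# First inherited descriptor in the systemd convention.
SD_LISTEN_FDS_START = 3


def inherited_listen_fds(environ=None, pid=None):
    """Return the list of listener descriptors passed by the parent.

    `environ` and `pid` default to the current process; they are
    parameters here only so the logic can be exercised in isolation.
    """
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    try:
        # The variables are only meant for the process named in LISTEN_PID.
        if int(environ.get("LISTEN_PID", "")) != pid:
            return []
        count = int(environ.get("LISTEN_FDS", ""))
    except ValueError:
        # Missing or malformed variables: nothing was inherited.
        return []
    if count <= 0:
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))
```

Because nothing here links against libsystemd, any init system or
wrapper script that sets the two variables can hand over listeners the
same way.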
  Also updates to the latest gnulib.git commit
  71d39c1644762745b94e9449c45bfd716a79a5eb ("autoupdate") along with
  a change which fixes a memory leak when people build from
  cmogstored.git using gnulib commit
  c6148bca89e9465fd6ba3a10d273ec4cb58c2dbe or later ("mountlist: add
  me_mntroot field on Linux machines"). This memory leak did not
  affect any released tarballs of cmogstored.

  Note, users building from git (as opposed to the tarball) will need
  gnulib commit 41d1b6c42641a5b9e21486ca2074198ee7909bd7 ("mountlist:
  add support for deallocating returned list entries") or later (from
  July 2013).

  There are also various documentation updates and our mailing list
  is now readable over NNTP:

    nntp://news.public-inbox.org/inbox.comp.file-systems.mogilefs.cmogstored

1.5.0rc1 / 2015-11-11 21:24 UTC
-------------------------------

  A bunch of minor changes; most notable is systemd-style socket
  activation support. This was easy to add since we've always had
  socket activation support for nginx-style SIGUSR2 upgrades. This
  places no link or runtime dependency on libsystemd, so the
  LISTEN_FDS and LISTEN_PID environment variables may be used in
  other init systems as well. While I have my own reservations about
  systemd itself, I also strongly believe in using socket activation
  to prevent downtime.

  Behavior changes:

  Bad Range: headers return 416 responses in more cases for invalid
  ranges (e.g. miscalculated ranges such as "1--1"), while completely
  wrong ones (lacking a "bytes=" prefix) are ignored entirely, as in
  nginx.

  Bugfixes:

  There are also some cleanups to avoid dying on OOM in more places
  on weird systems which trigger OOM. More work on this is ongoing.

  Also updates to the latest gnulib.git commit
  f197c2c9e5e0d12c373f26d5b3211809457bc972 ("intprops: new public
  macro EXPR_SIGNED") along with a change which fixes a memory leak
  when people build from cmogstored.git using gnulib commit
  c6148bca89e9465fd6ba3a10d273ec4cb58c2dbe or later ("mountlist: add
  me_mntroot field on Linux machines").
  This memory leak did not affect any released tarballs of
  cmogstored.

  shortlog of changes since 1.4.3:

      doc: use "builder" RubyGem to generate Atom feed
      dev.c: fail gracefully on out-of-memory errors
      do not die on OOM when for mgmt paths
      HACKING: update URLs to reduce redirects
      http: return 416 errors in more cases for bad Ranges
      update .gitignores for latest autotools + gnulib
      Rakefile: remove text-only part from the Atom feed
      support systemd-style socket activation via environment
      set TCP listener options on inherited sockets
      doc: add example systemd config files
      use free_mount_entry from gnulib instead of rolling our own
      fix tmpdir dependency for slow Ruby tests
      doc: publish examples directory to website

1.4.3 / 2015-03-09 22:52 UTC
----------------------------

  On all platforms, the device scanning thread run at startup could
  mishandle EINTR. This bug only manifested at startup and does not
  affect running instances. However, this bug is also readily
  apparent on newer versions of FreeBSD which support the ppoll
  function call. Thanks to Mykola Golub for the bug report which led
  to this release.

  For systems lacking epoll_pwait (older GNU/Linux, all *BSDs), there
  is also a bugfix for systems which experience signal spam leading
  to errno clobbering in the main thread. This bug was only noticed
  due to a bug report against Ruby:

    https://bugs.ruby-lang.org/issues/10866

  There is no need to upgrade if 1.4.1 is already running well on
  modern GNU/Linux systems capable of epoll_pwait. But then again
  nginx-style SIGUSR2 upgrades are transparent to clients.
  shortlog since 1.4.2:

      Makefile.am: fix publish rule for website
      Fix assertion failure during startup
      avoid relying on ppoll as a cancellation point
      preserve errno when inside sig handler for self-pipe

1.4.2 / 2015-03-06 02:18 UTC
----------------------------

  * Makefile.am: gzip README and associated data
  * manpage: update contact and copyright information
  * update copyrights to 2014 (and all contributors)
  * doc/design.txt: add a few more notes on compromises
  * http_dav: log 500 errors from DELETE requests
  * tapset/http_access_log: note CLF differences
  * copyright updates for 2015

1.4.1 / 2014-09-07 02:21 UTC
----------------------------

  The PHP PECL MogileFS extension uses neon to handle WebDAV
  operations, and neon seems to send (valid but unfortunate) headers
  with empty string values. Thanks to Patrice Damezin at Skyrock.com
  for reporting this bug.

  There are also a few minor cleanups. The latest 2.6.34 stable
  kernel release no longer requires our EPOLL_CTL_MOD race
  workaround. There are also some test suite updates for future
  releases of Ruby.

  Bigger changes coming later this year...

  There's also a new public mailing list at:

    cmogstored-public@bogomips.org

  No subscription will ever be necessary to post. Subscription is
  optional via:

    cmogstored-public+subscribe@bogomips.org

  Archives are available at http://bogomips.org/cmogstored-public/

  Eric Wong (11):
      minor cleanups for functions which do not return
      remove old fsck_queue declarations
      svc_dev: calling free does not need the lock
      test/mgmt: lengthen test for iostat watch
      test/http: plug race condition in FIFO test
      test/http_chunked_put: test for gigantic trailer
      update address to public mailing list
      Rakefile: remove freecode/freshmeat references
      Rakefile: shorten ChangeLog dump
      queue_epoll: disable buggy epoll workaround for 2.6.34.15+
      http_common: correctly handle empty header values

1.4.0 / 2014-02-24 03:01 UTC
----------------------------

  bsd_sendfile is now supported on Debian GNU/kFreeBSD systems.
  This release also fixes a compatibility bug with Perl mogstored
  config files where "daemonize = (0|1)" was not supported properly.

  Eric Wong (3):
      check for sys/sendfile.h header instead of __linux__
      allow bsd_sendfile with freebsd-glue on Debian/kFreeBSD
      support "daemonize = 0|1" in the config file

1.3.3 / 2014-02-09 04:28 UTC
----------------------------

  This release fixes build problems with Debian GNU/kFreeBSD support
  (turns out it's been broken for over a year and nobody noticed :x).
  There are also build system upgrades for automake 1.14 and test
  case cleanups, but no changes to any of the core code.

  There are no changes and no need to upgrade if you're on anything
  other than Debian GNU/kFreeBSD.

1.3.2 / 2013-12-10 22:10 UTC
----------------------------

  This release speeds up graceful shutdown on busy systems such as
  FreeBSD. There is also a minor resource savings for users of the
  undocumented --worker-processes switch. There are also some minor
  memory error fixes for test cases (which did not affect the daemon
  itself).

  Upgrading is optional unless you are affected by these fixes.

  Note: GNU/Linux users are encouraged to read the manpage update
  regarding glibc malloc arenas

  Eric Wong (9):
      selfwake: do share pipe descriptors with workers
      test/chunk-parser-1: fix uninitialized file structures
      test: fix valgrind warnings in test-only C code
      doc: refer to malloc-related environment variables
      thrpool: sleep instead of yield when poking thread
      test/mgmt-usage: relax regexp for ZFS
      m4/.gitignore: bump for newer gnulib
      doc: fix wording in manpage
      doc: fix link to MogileFS homepage

1.3.1 / 2013-10-12 21:45 UTC
----------------------------

  This release fixes a bug which only affects users of the
  undocumented multi-process configuration feature (which is also
  multi-threaded).

  * avoid use-after-free with multi-process setups

  readdir on the same DIR pointer is undefined if DIR was inherited
  by multiple children.
  Using the reentrant readdir_r would not have helped, since the
  underlying file descriptor and kernel file handle were still shared
  (and we need rewinddir, too).

  This readdir usage bug existed in cmogstored since the earliest
  releases, but was harmless until the cmogstored 1.3 series.

  This misuse of readdir led to hitting a leftover call to free().
  So this bug only manifested since commit
  1fab1e7a7f03f3bc0abb1b5181117f2d4605ce3b
  (svc: implement top-level by_mog_devid hash)

  Fortunately, these bugs only affect users of the undocumented
  multi-process feature (not just multi-threaded).

1.3.0 / 2013-09-30 08:51 UTC
----------------------------

  There are no changes from 1.3.0rc2.

  For the most part, cmogstored 1.2.2 works well, but 1.3 contains
  some fairly major changes and improvements.

  cmogstored CPU usage may be higher than other servers because it's
  designed to use whatever resources it has at its disposal to
  distribute load to different storage devices. cmogstored 1.3
  continues this, but it should be safer to lower thread counts
  without hurting performance too much for non-dedicated servers.

  cmogstored 1.3 contains improvements for storage hosts at the
  extreme ends of the performance scale. For large machines with many
  cores, memory/thread usage is reduced because we had too many
  acceptor threads. There are more improvements for smaller machines,
  especially those with slow/imbalanced drive speeds and few CPUs.
  Some of the improvements came from my testing with ancient
  single-core machines, others came from testing on 24-core
  machines :)

  Major features in 1.3:

  ioq - I/O queues for all MogileFS requests
  ------------------------------------------

  The new I/O queue (ioq) implements the equivalent of AIO channels
  functionality from Perlbal/mogstored. This feature prevents a
  failing/overloaded disk from monopolizing all the threads in the
  system.
  Since cmogstored uses threads directly (and not AIO), the common
  (uncontended) case behaves like a successful sem_wait with POSIX
  semaphores. Queueing+rescheduling only occurs in the contended case
  (unlike with AIO-style APIs, where requests are always queued). I
  experimented with, but did not use, POSIX semaphores, as contention
  would still starve the thread pool.

  Unlike the old fsck_queue, ioq is based on the MogileFS devid in
  the URL and not the st_dev ID of the actual underlying file. This
  is less correct from a systems perspective, but should make no
  difference for normal production deployments (which are expected to
  use one MogileFS devid for each st_dev ID) and has several
  advantages:

  1) testing this feature with mock deploys is easier

  2) we do not require any additional filesystem syscall (open/*stat)
     to look up the ioq based on st_dev, so we can use ioq to avoid
     stalls from slow open/openat/stat/fstatat/unlink/unlinkat
     syscalls.

  Otherwise, the implementation of this very closely resembles the
  old fsck queue implementation, but is generic across HTTP and
  sidechannel clients.

  The existing fsck queue functionality is now implemented using ioq.
  Thus, fsck queue functionality is mapped by the MogileFS devid and
  not the system st_dev ID as a result of this change.

  One benefit of this feature is the ability to run fewer aio_threads
  safely without worrying about cross-device contention on machines
  with limited resources or few disks (or not solely dedicated to
  MogileFS storage).

  The capacity of these I/O queues is automatically scaled to the
  number of available aio_threads, so they can change dynamically
  while your admin is tuning "SERVER aio_threads = XX".

  However, on a dedicated storage node, running many aio_threads (as
  is the default) should still be beneficial. Having more threads can
  keep the internal I/O queues of the kernel and storage hardware
  more populated and can improve throughput.
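The ioq behavior described above (an immediate pass in the uncontended
case, queueing and rescheduling only under contention, one queue per
MogileFS devid) can be sketched in a few lines of Python. This is a
single-threaded illustration under stated assumptions, not the C
implementation: the class name, method names and the `ioq_for` helper
are hypothetical, and real locking is omitted.

```python
from collections import deque


class Ioq:
    """Per-device queue with a fixed number of in-flight slots."""

    def __init__(self, capacity):
        self.free = capacity      # slots currently available
        self.waiting = deque()    # clients parked under contention

    def try_start(self, client):
        """Return True if `client` may do I/O now, else park it."""
        if self.free > 0:
            # Uncontended: behaves like a successful sem_wait.
            self.free -= 1
            return True
        # Contended: the calling thread does not block; the client is
        # queued so the thread can serve other devices meanwhile.
        self.waiting.append(client)
        return False

    def done(self):
        """Finish one request; return a parked client to reschedule."""
        if self.waiting:
            # The slot is handed directly to the next waiter.
            return self.waiting.popleft()
        self.free += 1
        return None


_queues = {}


def ioq_for(devid, capacity):
    """Look up the queue by MogileFS devid (no filesystem syscall)."""
    return _queues.setdefault(devid, Ioq(capacity))
```

Keying the lookup on the devid from the URL is what lets a request be
parked before any open/stat call touches a slow disk.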
  thread shutdown fixes (epoll)
  -----------------------------

  Our previous reliance on pthreads cancellation primitives left us
  open to a small race condition where I/O events (from epoll) could
  be lost during graceful shutdown or thread reduction via
  "SERVER aio_threads = XX". We no longer rely on pthreads
  cancellation for stopping threads and instead implement explicit
  check points for epoll.

  This did not affect kqueue users, but the code is simpler and more
  consistent across epoll/kqueue implementations.

  Graceful shutdown improvements
  ------------------------------

  The addition of our I/O queueing and use of our custom thread
  shutdown API also allowed us to improve the responsiveness and
  fairness when the process enters graceful shutdown mode. This
  improves fairness and avoids client-side timeouts when large PUT
  requests are being issued over a fast network to slow disks during
  graceful shutdown.

  Currently, graceful shutdown remains single-threaded, but it will
  likely become multi-threaded in the future (like normal runtime).

  Miscellaneous fixes and improvements
  ------------------------------------

  Further improved matching for (Linux) device-mapper setups where
  the same device (not symlinks) appears multiple times in /dev

  aio_threads count is automatically updated when new devices are
  added/removed. This is currently synced to MOG_DISK_USAGE_INTERVAL,
  but will use inotify (or the kqueue equivalent) in the future.

  HTTP read buffers grow monotonically (up to 64K) and always use
  aligned memory. This ensures deployments which pass large HTTP
  headers do not trigger unnecessary reallocations. Deployments which
  use small HTTP headers should notice no memory increase.

  Acceptor threads are now limited to two per process instead of
  being scaled to CPU count. This avoids excessive threads/memory
  usage and contention of kernel-level mutexes for large multi-core
  machines.
  The gnulib version used for building the tarball is now included in
  the tarball for ease-of-reproducibility.

  Additional tests for uncommon error conditions using the
  fault-injection capabilities of GNU ld.

  The "shutdown" command over the sidechannel is more responsive for
  epoll users.

  Improved reporting of failed requests during PUT requests. Again, I
  run MogileFS instances on some of the most horrible networks on the
  planet[2]

  Fix LIB_CLOCK_GETTIME linkage on some toolchains.

  "SERVER mogstored.persist_client = (0|1)" over the sidechannel is
  supported for compatibility with Perlbal/mogstored

  The Status: header is no longer returned on HTTP responses. All
  known MogileFS clients parse the HTTP status response correctly
  without the need for the Status: header. Neither Perlbal nor nginx
  set the Status: header on responses, so this is unlikely to
  introduce incompatibilities. The Status: header was originally
  inherited from HTTP servers which had to deal with a much larger
  range of (non-compliant) clients.

1.3.0rc2 / 2013-09-03 09:05 UTC
-------------------------------

  The Status: header is no longer returned on HTTP responses. All
  known MogileFS clients parse the HTTP status response correctly
  without the need for the Status: header. Neither Perlbal nor nginx
  set the Status: header on responses, so this is unlikely to
  introduce incompatibilities. The Status: header was originally
  inherited from HTTP servers which had to deal with a much larger
  range of (non-compliant) clients.

  SystemTap support is mostly fleshed out. There are some bundled awk
  scripts which should make better sense of the all.stp which logs
  just about everything.

  Raising aio_threads now correctly increases ioq capacity. This
  regression was only introduced in the 1.3.0 rc series, as ioq was
  not in 1.2.x.

1.3.0rc1 / 2013-07-14 02:32 UTC
-------------------------------

  For the most part, cmogstored 1.2.2 works well, but 1.3 contains
  some fairly major changes and improvements.
  cmogstored CPU usage may be higher than other servers because it's
  designed to use whatever resources it has at its disposal to
  distribute load to different storage devices. cmogstored 1.3 will
  continue this, but it should be safer to lower thread counts
  without hurting performance too much for non-dedicated servers.

  Unfortunately, the minor, Linux-only bug affecting 1.2.2 for
  (uncommon) thread shutdowns required some fairly intrusive changes
  to fix, so I'm not sure if releasing a 1.2.3 is worth it. If you're
  happy with 1.2.x, I recommend marking the host down via mogadm
  before lowering "SERVER aio_threads = XX" or sending SIGQUIT to
  cmogstored. But I think thread shutdown is uncommon enough to not
  affect normal deployments.

  cmogstored 1.3 will contain improvements for storage hosts at the
  extreme ends of the performance scale. For large machines with many
  cores, memory/thread usage is reduced because we had too many
  acceptor threads. There are more improvements for smaller machines,
  especially those with slow/imbalanced drive speeds and few CPUs.
  Some of the improvements came from my testing with ancient
  single-core machines, others came from testing on 24-core
  machines :)

  The SystemTap tracing work is still in-progress (although the 1.3
  cycle was originally intended to focus on this :x). I expect the
  remaining changes to be non-intrusive and will work on them through
  the RC cycle.

  Major features in 1.3:

  ioq - I/O queues for all MogileFS requests
  ------------------------------------------

  The new I/O queue (ioq) implements the equivalent of AIO channels
  functionality from Perlbal/mogstored. This feature prevents a
  failing/overloaded disk from monopolizing all the threads in the
  system.

  Since cmogstored uses threads directly (and not AIO), the common
  (uncontended) case behaves like a successful sem_wait with POSIX
  semaphores. Queueing+rescheduling only occurs in the contended case
  (unlike with AIO-style APIs, where requests are always queued).
  I experimented with, but did not use, POSIX semaphores, as
  contention would still starve the thread pool.

  Unlike the old fsck_queue, ioq is based on the MogileFS devid in
  the URL and not the st_dev ID of the actual underlying file. This
  is less correct from a systems perspective, but should make no
  difference for normal production deployments (which are expected to
  use one MogileFS devid for each st_dev ID) and has several
  advantages:

  1) testing this feature with mock deploys is easier

  2) we do not require any additional filesystem syscall (open/*stat)
     to look up the ioq based on st_dev, so we can use ioq to avoid
     stalls from slow open/openat/stat/fstatat/unlink/unlinkat
     syscalls.

  Otherwise, the implementation of this very closely resembles the
  old fsck queue implementation, but is generic across HTTP and
  sidechannel clients.

  The existing fsck queue functionality is now implemented using ioq.
  Thus, fsck queue functionality is mapped by the MogileFS devid and
  not the system st_dev ID as a result of this change.

  One benefit of this feature is the ability to run fewer aio_threads
  safely without worrying about cross-device contention on machines
  with limited resources or few disks (or not solely dedicated to
  MogileFS storage).

  The capacity of these I/O queues is automatically scaled to the
  number of available aio_threads, so they can change dynamically
  while your admin is tuning "SERVER aio_threads = XX".

  However, on a dedicated storage node, running many aio_threads (as
  is the default) should still be beneficial. Having more threads can
  keep the internal I/O queues of the kernel and storage hardware
  more populated and can improve throughput.

  thread shutdown fixes (epoll)
  -----------------------------

  Our previous reliance on pthreads cancellation primitives left us
  open to a small race condition where I/O events (from epoll) could
  be lost during graceful shutdown or thread reduction via
  "SERVER aio_threads = XX".
  We no longer rely on pthreads cancellation for stopping threads and
  instead implement explicit check points for epoll.

  This did not affect kqueue users, but the code is simpler and more
  consistent across epoll/kqueue implementations.

  Graceful shutdown improvements
  ------------------------------

  The addition of our I/O queueing and use of our custom thread
  shutdown API also allowed us to improve the responsiveness and
  fairness when the process enters graceful shutdown mode. This
  improves fairness and avoids client-side timeouts when large PUT
  requests are being issued over a fast network to slow disks during
  graceful shutdown.

  Currently, graceful shutdown remains single-threaded, but it will
  likely become multi-threaded in the future (like normal runtime).

  Miscellaneous fixes and improvements
  ------------------------------------

  Further improved matching for (Linux) device-mapper setups where
  the same device (not symlinks) appears multiple times in /dev

  aio_threads count is automatically updated when new devices are
  added/removed. This is currently synced to MOG_DISK_USAGE_INTERVAL,
  but will use inotify (or the kqueue equivalent) in the future.

  HTTP read buffers grow monotonically (up to 64K) and always use
  aligned memory. This ensures deployments which pass large HTTP
  headers do not trigger unnecessary reallocations. Deployments which
  use small HTTP headers should notice no memory increase.

  Acceptor threads are now limited to two per process instead of
  being scaled to CPU count. This avoids excessive threads/memory
  usage and contention of kernel-level mutexes for large multi-core
  machines.

  The gnulib version used for building the tarball is now included in
  the tarball for ease-of-reproducibility.

  Additional tests for uncommon error conditions using the
  fault-injection capabilities of GNU ld.

  The "shutdown" command over the sidechannel is more responsive for
  epoll users.

  Improved reporting of failed requests during PUT requests.
  Again, I run MogileFS instances on some of the most horrible
  networks on the planet[2]

  Fix LIB_CLOCK_GETTIME linkage on some toolchains.

  "SERVER mogstored.persist_client = (0|1)" over the sidechannel is
  supported for compatibility with Perlbal/mogstored

1.2.2 / 2013-05-11 23:04 UTC
----------------------------

  This is a minor maintenance release, no need to upgrade unless

  a) your gcc defaults to -march=i386 (e.g. 32-bit CentOS 5)
  b) your device names include '-' (e.g. Linux device mapper users)

  There are also some minor doc updates to clarify tarball vs git
  installation and a trivial error-handling fix which should not
  affect any current users.

  Eric Wong (6):
      build: add check for GCC atomics
      alloc: posix_memalign does not set errno
      iostat_parser: allow '-' for device names
      test/cmogstored-cfg: ensure TMPDIR is absolute for valgrind
      INSTALL: clarify between starting from tarball vs git
      INSTALL: update versions and URLs

  cmogstored 1.3 will have some fairly intrusive internal changes and
  cleanups to make it easier for users to trace and diagnose system
  and network problems.

1.2.1 / 2013-03-04 01:33 UTC
----------------------------

  This release only fixes an assertion failure during graceful
  shutdown while MogileFS fsck is running with checksumming enabled.
  This only affects users running fsck with checksumming enabled
  during a graceful shutdown of cmogstored.

  For upgrading cmogstored it is recommended to:

  1) stop fsck on the trackers (via "mogadm fsck stop")

  2) wait for all tracker queues to drain and stop sending fsck
     traffic to the affected host. You may wish to "!want 0 fsck" on
     all your trackers and wait for the fsck workers to stop.

  3) upgrade cmogstored (in place upgrade works)

  There are also several code comment updates for internal components
  of cmogstored which may interest potential hackers.

1.2.0 / 2013-02-18 23:39 UTC
----------------------------

  This release supports nginx-style binary upgrades via SIGUSR2.
  The behavior of this process should match that of nginx:

    http://wiki.nginx.org/CommandLine#Upgrading_To_a_New_Binary_On_The_Fly

  SIGWINCH and SIGHUP are currently no-ops and may match nginx
  behavior in the future. They are not required for binary upgrades.

  Slow/unreliable mount points (if you have them) should have less
  effect on iostat sidechannel clients once the process is running.
  Startup is still slow with unreliable mount points, unfortunately.

  Error handling is now graceful for thread creation failures and
  systems lacking stdio memstream support (FreeBSD).

1.1.0 / 2013-01-18 10:54 UTC
----------------------------

  cmogstored now works around an EPOLL_CTL_MOD race condition in old
  kernels. This workaround is unneeded and disabled on Linux
  v3.0.59+, v3.2.37+, v3.4.26+, v3.5.7.3+, v3.7.3+ and v3.8+

  FreeBSD users should no longer see ECONNRESET errors on close(2).

  Unnecessary mkdir/mkdirat syscalls are optimistically avoided on
  PUT.

  MSG_MORE is used (instead of TCP_CORK) on Linux to avoid extra
  syscalls. We also avoid POSIX_FADV_SEQUENTIAL when returning small
  responses.

1.0.0 / 2012-12-13 08:38 UTC
----------------------------

  FreeBSD support is improved:

  - ZFS support (tested on FreeBSD 9.0)
  - Fix disk usage report on FSes w/ small fragment size (ZFS)
  - faster graceful shutdown on kqueue()-based systems.
  - systems lacking open_memstream() no longer misreport OOM on
    SIGQUIT

  Thanks to Ask Bjørn Hansen for helping with the ZFS support

  Linux:

  - several bugfixes for handling of out-of-FD situations

  There are also minor documentation and build improvements.

0.9.0 / 2012-12-05 01:15 UTC
----------------------------

  This release handles out-of-disk-space errors from PUT and usage
  file generation more gracefully. Failed PUTs due to client-side
  disconnects are also logged with additional debugging information.
  There are also minor internal cleanups for shutdowns.
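The 1.1.0 note above lists the stable kernel versions in which the
EPOLL_CTL_MOD workaround is disabled. A version check along those
lines could look like the Python sketch below. It encodes only the
versions named in that note; the function names are hypothetical,
treating unlisted 3.x series as still needing the workaround is an
assumption, and the real check lives in cmogstored's C code.

```python
# First fixed release for each stable series named in the 1.1.0 note.
FIXED_IN = {
    (3, 0): (3, 0, 59),
    (3, 2): (3, 2, 37),
    (3, 4): (3, 4, 26),
    (3, 5): (3, 5, 7, 3),
    (3, 7): (3, 7, 3),
}


def parse_release(release):
    """Turn a release string such as "3.2.37-1-amd64" into (3, 2, 37)."""
    numeric = release.split("-", 1)[0]
    parts = []
    for piece in numeric.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def needs_epoll_mod_workaround(release):
    """Return True if the running kernel predates the listed fixes."""
    version = parse_release(release)
    if version[:2] >= (3, 8):
        return False  # v3.8+ is listed as fixed
    fixed = FIXED_IN.get(version[:2])
    # Assumption: a series without a listed fix keeps the workaround.
    return fixed is None or version < fixed
```

On a live system the release string would typically come from
os.uname().release.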
0.8.0 / 2012-11-14 21:34 UTC
----------------------------

  HTTP connections remain persistent after a failed Content-MD5
  check; this prevents the MogileFS monitor from generating excessive
  TIME-WAIT connections.

  Linux only: Idle HTTP connections are automatically closed under FD
  pressure from systems with too many trackers. This adds zero
  overhead unless the process runs out of file descriptors.

  "server aio_threads = " is supported and attempts to mimic the
  Perlbal interface. There is no error reporting nor reporting of the
  current thread count. Requests are silently capped in the 1-100
  (inclusive) range.

  The "shutdown" command from Perlbal is also supported.

  The Content-Range response header (for partial requests) was
  missing a '\r' and thus not compliant with extremely strict HTTP
  parsers.

0.7.0 / 2012-10-17 00:06 UTC
----------------------------

  This release fixes a boundary error in Content-MD5 trailer
  reading+parsing when receiving chunked PUT requests of certain
  sizes. This bug was found while attempting to upload a 198689228
  byte file using 16K chunks (using the Ruby mogilefs-client 3.4.0 to
  send the Content-MD5 trailer).

  There are also minor cleanups, new test cases and better error
  reporting for ENOSPC errors.

0.6.0 / 2012-10-04 09:05 UTC
----------------------------

  This release fixes a concurrency assertion failure affecting
  sidechannel+checksum users. This bug has existed since 0.2.0, but
  was not noticed without the concurrency simplification in 0.5.0

  There are additional test cases for checksumming and gnulib updates
  for the tarball.
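The 0.8.0 note above says "server aio_threads = " requests are
silently capped to the 1-100 (inclusive) range. A minimal Python
sketch of that capping follows. The regular expression, the
case-insensitive match and the function name are assumptions for
illustration; the actual sidechannel parser is not shown in these
notes.

```python
import re

# Hypothetical pattern for the sidechannel command line.
_AIO_THREADS = re.compile(r"^server\s+aio_threads\s*=\s*(\d+)\s*$",
                          re.IGNORECASE)


def requested_aio_threads(line):
    """Return the capped thread count, or None if `line` is not the command."""
    match = _AIO_THREADS.match(line.strip())
    if match is None:
        return None
    # Out-of-range values are capped silently, with no error reported.
    return max(1, min(100, int(match.group(1))))
```

With this sketch, a request for 500 threads yields 100 and a request
for 0 yields 1.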

0.5.0 / 2012-09-10 01:14 UTC
----------------------------

* I/O utilization is now reported correctly when multiple MogileFS
  devices share the same local filesystem
* attempt to reduce client-side syscalls on low-latency networks with
  TCP_NOPUSH/TCP_CORK
* remove a tiny chance of starvation when sharing work between threads
  while improving theoretical fairness
* minor code cleanups and gnulib updates
* acceptor threads push directly into the event queue to avoid
  poorly-defined TCP_DEFER_ACCEPT/accept filter semantics
* acceptor threads no longer touch the FS, so they are now scaled to
  CPU count, not filesystem count
* absolute URIs are now supported in HTTP/1.1 requests

0.4.0 / 2012-05-19 00:37 UTC
----------------------------

* The kqueue code path now requires no more syscalls than the epoll
  code path on *BSDs.
* avoids malloc() usage after fork() (for iostat(1)). This works
  around some malloc() implementations which fail to reinitialize
  malloc locks after fork().
* iostat(1) is no longer spawned for HTTP-only deployments.
* usage files are no longer created in HTTP-only deployments.
* FIONBIO + ioctl() is no longer required; we'll fall back to the
  slower (but POSIX) fcntl() equivalent if FIONBIO isn't available.
* For testers without native epoll or kqueue support, the current
  stable branch of libkqueue (r549) should work, as should future
  releases.
  - svn://mark.heily.com/libkqueue/branches/stable
  - http://mark.heily.com/project/libkqueue

Users on GNU/Linux should notice no changes from the last release
unless they're using HTTP-only.

I look forward to hearing from our first GNU/Hurd users!

0.3.0 / 2012-03-27 01:13 UTC
----------------------------

This release adds support for SHA-1 digests over the sidechannel.
There are also minor cleanups to signal handling.

0.2.2 / 2012-03-20 01:42 UTC
----------------------------

This release fixes build errors on glibc 2.5 - 2.9 where
_ATFILE_SOURCE is required and defined by _GNU_SOURCE.
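
For readers unfamiliar with the feature-test macro named in the 0.2.2
notes, here is a minimal sketch, assuming glibc, of requesting the
*at() family of calls by defining _GNU_SOURCE before any system header
is included. The helper name is hypothetical and this is not the
actual cmogstored build fix.

```c
/* Must come before the first system header to take effect. */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* on glibc this also enables _ATFILE_SOURCE */
#endif
#include <errno.h>
#include <fcntl.h>    /* AT_FDCWD */
#include <sys/stat.h> /* mkdirat() */

/*
 * Hypothetical helper: create "path" relative to "dirfd", treating an
 * already-existing entry as success. Returns 0 on success and -1 on
 * any other error, with errno left set by mkdirat().
 */
static int ensure_dir_at(int dirfd, const char *path)
{
	if (mkdirat(dirfd, path, 0777) == 0)
		return 0;
	return errno == EEXIST ? 0 : -1;
}
```

Passing AT_FDCWD as dirfd makes the call behave like plain mkdir(2)
for relative paths.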

0.2.1 / 2012-03-20 00:58 UTC
----------------------------

This release fixes build errors on FreeBSD and Debian GNU/kFreeBSD
systems. There are also minor, inconsequential cleanups but no other
changes.

GNU/Linux users may not notice any difference between this release and
0.2.0.

0.2.0 / 2012-03-17 23:36 UTC
----------------------------

* Graceful shutdown support (via SIGQUIT). This prevents process
  termination from breaking outstanding requests but will drop idle,
  persistent HTTP connections. cmogstored stops accepting new
  connections ASAP, so it's possible to start a new cmogstored process
  (or switch back to regular Perl mogstored) almost immediately after
  sending SIGQUIT.

* PUT creates missing directories (except for the toplevel, like
  mogstored). MKCOL is now disabled, as forcing PUT to create missing
  directories speeds up "create_close" in MogileFS as the tracker no
  longer has to ensure directories exist.

* Active clients have thread affinity; this prevents per-client data
  structures from ping-ponging between cores and caches. Idle clients
  that become active still retain the ability to migrate to _any_ idle
  thread and stay on it as long as it's active and there are free
  threads for other clients.

* MD5 + fsck scheduling improvements for checksums testers. This
  limits fsck MD5 requests to one per device. A similar patch is also
  included with the proposed checksum extensions to MogileFS. On Linux
  systems using the CFQ I/O scheduler, we drop the IO priority
  temporarily in an effort to prevent fsck traffic from impacting
  normal traffic.

* Removed hard-to-test/support ENOSYS fallback support. Building
  cmogstored is easy, so fully supporting ENOSYS fallbacks is too much
  maintenance/testing overhead.

* PUT uses a temporary file so incomplete files are not left on the
  filesystem. Likewise, Content-MD5 rejections won't leave files on the
  filesystem, either.

There are also some experimental features (not in Perl mogstored)
documented only in git commit messages.
These features may be removed, changed, or renamed in the future. See
"git log" for full details and rely on them at your own risk.

0.1.0 / 2012-02-12 09:45 UTC
----------------------------

cmogstored now supports enough HTTP to support MogileFS entirely. If
you're willing to live on the bleeding edge, a single executable is now
all that's needed to run a MogileFS storage node.

HTTP features supported include pipelining, persistent connections,
chunked PUT, partial PUT, partial GET, and Content-MD5 handling.

Graceful shutdown and hot upgrades are not supported yet, so it's
recommended you mark storage nodes "down" with mogadm before shutting
down/upgrading cmogstored (you should probably do that regardless).

While GNU/Linux systems with epoll+NPTL remain the platform of choice,
this release is also tested on FreeBSD 9.0 and Debian GNU/kFreeBSD 6.0.
FreeBSD 8.x and other *BSDs are likely to work, too.

There were no fatal bugs found in 0.0.0, but this release (with HTTP
support) is much more complex, so there may still be fatal bugs in it.

0.0.0 / 2012-01-12 03:39 UTC
----------------------------

This release supports the mgmt commands (aka "mogstored_stream_port")
on Linux 2.6 systems for now. This will support (enough of) DAV and
FreeBSD 8+ systems in the future.