cmogstored.git - alternative mogstored implementation for MogileFS

Date	Commit message
2013-05-11	cmogstored 1.2.2 - minor maintenance release v1.2.2 1.2-stable
	This is a minor maintenance release, no need to upgrade unless
	a) your gcc defaults to -march=i386 (e.g. 32-bit CentOS 5)
	b) your device names include '-' (e.g. Linux device mapper users)
	There are also some minor doc updates to clarify tarball vs git installation and a trivial error-handling fix which should not affect any current users.
	Eric Wong (6):
	  build: add check for GCC atomics
	  alloc: posix_memalign does not set errno
	  iostat_parser: allow '-' for device names
	  test/cmogstored-cfg: ensure TMPDIR is absolute for valgrind
	  INSTALL: clarify between starting from tarball vs git
	  INSTALL: update versions and URLs
	cmogstored 1.3 will have some fairly intrusive internal changes and cleanups to make it easier for users to trace and diagnose system and network problems.
2013-05-11	INSTALL: update versions and URLs
	libkqueue recently migrated to SourceForge and Debian 7.0 is the new stable. We still support Debian 6.0 and will likely support it for years to come since CentOS 5.x remains supported. (cherry picked from commit 86e5d10649f14fe3b3c8af37fd8ec04cc337fc9e)
2013-05-11	INSTALL: clarify between starting from tarball vs git
	Users unfamiliar with autotools may not realize bootstrapping is required when building from git. (cherry picked from commit 1e80ba592ede05fe40b31686142f82294891afd0)
2013-05-11	test/cmogstored-cfg: ensure TMPDIR is absolute for valgrind
	Our use of chdir in this test confuses valgrind which may create a temporary file. (cherry picked from commit dc801d4a4ded67d74f5306d6dad4aba629045cc8)
2013-05-11	iostat_parser: allow '-' for device names
	Linux device-mapper names show up as 'dm-0', 'dm-1' and so on. This allows users to store MogileFS files on encrypted devices using dm-crypt and perhaps other, similar tools. (cherry picked from commit 88d34b4686a650dba89674aa302ab13c78e8cef0)
2013-05-11	alloc: posix_memalign does not set errno
	We must set errno manually for die_errno() if posix_memalign fails. (cherry picked from commit 8c79cf794f6178b6978743af99d498ca0b449fb1)
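posix_memalign() reports failure through its return value and does not touch errno, which is why an errno-based reporter needs the value copied over. A minimal sketch of that pattern follows; the wrapper name `memalign_errno` is hypothetical and is not the function used in cmogstored.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper: copy the error code returned by
 * posix_memalign() into errno so that errno-based error reporting
 * sees the real cause.  Returns NULL on failure with errno set.
 */
void *memalign_errno(size_t alignment, size_t size)
{
	void *ptr = NULL;
	int err;

	assert(size > 0);
	err = posix_memalign(&ptr, alignment, size);
	if (err != 0) {
		errno = err;	/* posix_memalign left errno untouched */
		return NULL;
	}
	return ptr;
}
```

A caller can then treat a NULL return like any other errno-reporting failure, for example an invalid alignment shows up as EINVAL.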
2013-03-08	build: add check for GCC atomics
	Andrey Okunev noted undefined references on the MogileFS mailing list when building cmogstored 1.2.1 on his 32-bit CentOS5 machine.
2013-03-04	cmogstored 1.2.1 - fix graceful shutdown failure v1.2.1
	This release only fixes an assertion failure during graceful shutdown while MogileFS fsck is running with checksumming enabled. This only affects users running fsck with checksumming enabled during a graceful shutdown of cmogstored.
	For upgrading cmogstored it is recommended to:
	1) stop fsck on the trackers (via "mogadm fsck stop")
	2) wait for all tracker queues to drain and stop sending fsck traffic to the affected host. You may wish to "!want 0 fsck" on all your trackers and wait for the fsck workers to stop.
	3) upgrade cmogstored (in place upgrade works)
	There are also several code comment updates for internal components of cmogstored which may interest potential hackers.
2013-03-04	TODO: add a few items for our roadmap
	We have a future!
2013-03-02	alloc: document use of TLS buffers
	tls_rbuf allows us to avoid nearly all dynamic allocation for common HTTP requests. However, the mog_rbuf structure may be detached from TLS as necessary (and another one allocated in its place) when the need arises.
2013-03-02	fdmap: documentation for the FD-based memory allocation
	Avoiding heap allocations in common paths is important to high performance server design; document this important design decision.
2013-02-23	mgmt: fix fsck digest assert failure in graceful shutdown
	Items in the low-priority fsck queue could trigger an assertion failure during graceful shutdown due to improper handling of the MOG_NEXT_IGNORE state in mog_mgmt_quit_step(). However, using the fsck queue in graceful shutdown (which is single-threaded) is probably a bad idea anyways, as the fsck digest could monopolize other requests. So give no special handling to fsck digest queries during graceful shutdown. This only affects users running fsck with checksumming enabled during a graceful shutdown of cmogstored. For checksums users, it is recommended to stop fsck from the trackers and wait for all tracker queues to drain before upgrading cmogstored (and using graceful shutdown on the old cmogstored).
2013-02-23	http_get: comment about snprintf() being a hot spot
	cmogstored is pretty fast, but it could be faster.
2013-02-21	queue_common: update comments to match code
	While we're at it, explain the use of cloexec.
2013-02-18	document/reserve SIGWINCH/SIGHUP for future use v1.2.0
	Despite having an extensive test suite and minimal room for user error, giving users the option to back out of a hot upgrade may be worth supporting.
2013-02-18	copyright comment updates for 2013 (part 2)
	Many files were missed the first time around in commit 37026af96dec638aa850d604003bf7218d90037d
2013-02-18	manpage: document SIGUSR2 upgrades
	This is a new feature and needs to be documented.
2013-02-18	move cmogstored_exit() prototype to cmogstored.h
	This fixes a missing prototype warning for cmogstored_exit() when checking exit.c with sparse.
2013-02-18	queue_epoll: fix bad cast for epoll.event
	The events field of struct epoll_event is a uint32_t, not int.
2013-02-18	tests: add valgrind supp for epoll_ctl on 32-bit arch
	The epoll_event.data union is 64-bits on 32-bit systems while pointers are 32-bit. We only use 32-bits of that union, but valgrind mistakenly complains about it (the kernel does not care about the user-supplied data union at all).
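As a sketch of the two points above (Linux only): the events field is passed as a uint32_t, and the whole struct is zeroed before use so that every byte of the data union is initialized even when only a 32-bit pointer is stored. Zeroing is an alternative assumed here for illustration; the commit itself adds a valgrind suppression instead. The helper name `ep_add` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>

/*
 * Hypothetical helper: register fd with epfd.  `events` is a
 * uint32_t to match struct epoll_event, and the struct is zeroed so
 * no byte of the 64-bit data union is left uninitialized.
 */
int ep_add(int epfd, int fd, uint32_t events, void *ptr)
{
	struct epoll_event ev;

	assert(epfd >= 0 && fd >= 0);
	memset(&ev, 0, sizeof(ev));
	ev.events = events;
	ev.data.ptr = ptr;
	return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}
```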
2013-02-18	ioutil: fix memory access error from mog_iou_write
	sizeof(buf) returns the size of the pointer if buf is a passed parameter, even if the function prototype dictates a fixed size for buf as we do in mog_iou_write. While we're at it, make our mog_iou_write buf parameter const. This bug was introduced in: commit a960a351b2248a196c91cdbf6256f98e1bc2ef37 "split iostat util% tracking from mountlist" and never affected an official release of cmogstored. This bug was caught while testing on a 32-bit GNU/Linux machine. My normal 32-bit FreeBSD 9.0 environment did not catch this as iostat on that platform only reports integer percentages and does not need more than 4 bytes.
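The pitfall described above can be sketched as follows. An array parameter is really a pointer, so the length has to be spelled out (or the parameter declared as a pointer to an array, which keeps the size in the type). `UTIL_LEN`, `util_copy` and `util_size` are hypothetical names, not the ones used in ioutil.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define UTIL_LEN 16	/* hypothetical fixed buffer size */

/*
 * Declared with array syntax, but `src` is really `const char *`.
 * sizeof(src) here would be the pointer size (4 on most 32-bit
 * systems), so the length is given explicitly.
 */
void util_copy(char dst[UTIL_LEN], const char src[UTIL_LEN])
{
	assert(dst != NULL && src != NULL);
	memcpy(dst, src, UTIL_LEN);
}

/* A pointer-to-array parameter keeps the array type, so sizeof works. */
size_t util_size(const char (*src)[UTIL_LEN])
{
	return sizeof(*src);
}
```

The second form makes the compiler reject buffers of the wrong size, at the cost of callers writing `&buf` instead of `buf`.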
2013-02-16	handle pthread_create returning ENOMEM on old glibc
	Older glibc will return ENOMEM on mprotect() failures. This bug was only fixed in 2011, so the long-term distros and old installations may not have the necessary backports. ref: http://www.sourceware.org/bugzilla/show_bug.cgi?id=386
2013-02-16	graceful handling of pthread_create EAGAIN failure
	pthread_create may return EAGAIN as a temporary failure; do not abort a running process if this is the case. For the initial mountlist scan, we must retry indefinitely for cmogstored to be usable. However, with our thread pools, we can always run fewer threads (as long as there is at least one thread per pool).
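A rough sketch of the thread-pool half of this policy: keep whatever threads were started when pthread_create() reports EAGAIN, and only report an error if not even one thread could be created. The names `pool_start` and `pool_noop` are hypothetical, and the real code may structure this differently.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/*
 * Hypothetical pool starter: try to create `want` threads and return
 * how many were started.  EAGAIN after at least one success means
 * "run with fewer threads"; *err is set to 0 in that case.  Any other
 * failure, or EAGAIN with zero threads, is stored in *err so the
 * caller can retry or give up.
 */
size_t pool_start(pthread_t *threads, size_t want,
		  void *(*fn)(void *), void *arg, int *err)
{
	size_t started = 0;

	assert(threads != NULL && fn != NULL && err != NULL);
	*err = 0;
	while (started < want) {
		int rc = pthread_create(&threads[started], NULL, fn, arg);

		if (rc == 0) {
			started++;
			continue;
		}
		if (rc != EAGAIN || started == 0)
			*err = rc;
		break;	/* keep the threads we already have */
	}
	return started;
}

/* Trivial worker, only here so the sketch can be exercised. */
void *pool_noop(void *arg)
{
	return arg;
}
```

A caller that needs at least one thread can loop on `pool_start` while it returns 0 with `*err == EAGAIN`.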
2013-02-16	test/http_idle_expire: hopefully improve test reliability
	This is a tricky test and doesn't always succeed, since it's hard to tell how many file descriptors glibc will use internally.
2013-02-15	sig: avoid pselect if ppoll is present in mog_sleep
	We want to favor ppoll over pselect, since ppoll is a better interface and we can have a slightly smaller binary with fewer dependencies. While we're at it, use mog_sleep(-1) as an alias for mog_selfwake_wait to further reduce binary size.
2013-02-15	avoid racy sleep on fork failure in master process
	We need to atomically enable interrupts and sleep with the same syscall. Fortunately, using pselect (through mog_sleep) allows that and is POSIX-compliant, so use that.
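The race being avoided is a signal arriving between "unblock signals" and "go to sleep". A minimal sketch of the pselect() approach, assuming a single handled signal and hypothetical names (`intr_setup`, `intr_sleep`, `intr_seen`) that are not those of mog_sleep:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <time.h>

static volatile sig_atomic_t intr_flag;

static void intr_handler(int sig)
{
	(void)sig;
	intr_flag = 1;
}

/*
 * Install the handler for `sig` and block it during normal execution.
 * On success, *sleep_mask is the mask to use while sleeping: the
 * previous mask with `sig` unblocked.
 */
int intr_setup(int sig, sigset_t *sleep_mask)
{
	struct sigaction sa;
	sigset_t block;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = intr_handler;
	sigemptyset(&sa.sa_mask);
	if (sigaction(sig, &sa, NULL) != 0)
		return -1;

	sigemptyset(&block);
	sigaddset(&block, sig);
	if (sigprocmask(SIG_BLOCK, &block, sleep_mask) != 0)
		return -1;
	return sigdelset(sleep_mask, sig);
}

/*
 * Sleep for up to `msec` milliseconds.  pselect() installs sleep_mask
 * and sleeps as one atomic step.  Returns 0 on timeout, or -1 with
 * errno set to EINTR if a signal was delivered.
 */
int intr_sleep(const sigset_t *sleep_mask, long msec)
{
	struct timespec ts;

	assert(sleep_mask != NULL && msec >= 0);
	ts.tv_sec = msec / 1000;
	ts.tv_nsec = (msec % 1000) * 1000000L;
	return pselect(0, NULL, NULL, NULL, &ts, sleep_mask);
}

/* Non-zero once the handler has run. */
int intr_seen(void)
{
	return intr_flag != 0;
}
```

A signal raised while blocked stays pending and is delivered inside the next `intr_sleep` call, so it can never be missed by the sleep.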
2013-02-15	mnt: inform user of slow mountlist scan
	This will inform the user of why cmogstored may be slow to start, since we need the mountlist to be populated at startup. We also throw a pthread_cancel() in there to load libgcc_s under glibc, so we can avoid loading libgcc_s once we're under FD pressure. This makes test/http_idle_expire.rb more reliable.
2013-02-14	test/http_range: do not allow webrick to perform lookups
	DNS lookups cause webrick tests to fail or timeout. Our tests should not have external network dependencies.
2013-02-14	inherit: avoid DNS lookup on upgrade
	A typo caused unnecessary DNS lookups when inheriting sockets. While we're at it, fix another typo in the error message, too.
2013-02-14	selfwake: use epoll_pwait on Linux instead of eventfd
	This saves us a file descriptor in Linux, which provides epoll_pwait in 2.6.19+ (and ppoll for 2.6.18, the oldest kernel we support).
2013-02-11	mnt: revert to mutex for protecting by_dev hash
	Since we now update future copies of by_dev offline and only need a lock to swap in the new one, contention for by_dev should be less of a problem than it was before. This should make reader-writer locks an unnecessary risk. Reader-writer locks are riskier since writer starvation can potentially be an issue with many readers.
2013-02-11	test/mogilefs_integration: increase test reliability
	Use SO_REUSEADDR, since Linux requires that both the new program (cmogstored) and this test use SO_REUSEADDR for SO_REUSEADDR to be effective. Also, minimize the window for port conflicts. While there are hard-to-avoid race windows for conflicts when binding random ports, we can minimize those windows by holding those ports open in the parent as long as possible.
2013-02-11	kqueue: update NOTIFYRD -> SELFWAKE
	This was missed in the earlier changes to allow eventfd usage under Linux instead of using a notification pipe.
2013-02-11	fix signal races when master process is used
	In the absence of a pselect/ppoll-like version of waitpid, we must use a selfwake descriptor (pipe or eventfd) to wake the master up whenever a signal is received. So wait on the selfwake descriptor and always run waitpid with WNOHANG in a loop to ensure all children are reaped. The mog_intr_disable(); waitpid(); mog_intr_enable() sequence was completely stupid; I can't believe I wrote it.
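The reaping half of this approach can be sketched as a non-blocking loop that runs after the selfwake descriptor becomes readable. The function names are hypothetical, and waiting on the descriptor itself is left out of the sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Reap every child that has already exited, without blocking.
 * Returns the number of children reaped.
 */
int reap_children(void)
{
	int reaped = 0;

	for (;;) {
		int status;
		pid_t pid = waitpid(-1, &status, WNOHANG);

		if (pid > 0) {
			reaped++;	/* collected one exited child */
			continue;
		}
		if (pid < 0 && errno == EINTR)
			continue;	/* interrupted, try again */
		break;	/* 0: children still running; -1: none left */
	}
	return reaped;
}

/* Hypothetical helper for trying the loop: fork a child that exits at once. */
pid_t spawn_exiting_child(int code)
{
	pid_t pid = fork();

	if (pid == 0)
		_exit(code);
	return pid;
}
```

Because several children may exit before the master wakes up once, looping until waitpid() stops returning a PID is what guarantees none are left as zombies.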
2013-02-11	allow self-wakeup to use eventfd under modern Linux
	eventfd uses fewer resources than a pipe, so create less overhead for our users by using eventfd instead of a pipe.
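For illustration, a Linux-only sketch of a self-wakeup built on eventfd: one descriptor serves both the notifying and the waiting side, where a pipe needs two. The `selfwake_*` names are hypothetical and do not come from the cmogstored source.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Create a non-blocking, close-on-exec wakeup descriptor (or -1). */
int selfwake_open(void)
{
	return eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
}

/* Add one to the counter; pollers watching fd then see it as readable. */
int selfwake_notify(int fd)
{
	uint64_t one = 1;
	ssize_t w;

	assert(fd >= 0);
	do {
		w = write(fd, &one, sizeof(one));
	} while (w < 0 && errno == EINTR);
	return w == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Read and reset the counter; returns 0 if nothing was pending. */
uint64_t selfwake_drain(int fd)
{
	uint64_t count = 0;
	ssize_t r;

	assert(fd >= 0);
	do {
		r = read(fd, &count, sizeof(count));
	} while (r < 0 && errno == EINTR);
	return r == (ssize_t)sizeof(count) ? count : 0;
}
```

Multiple notifications before a drain simply accumulate in the counter, so a single read clears all pending wakeups.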
2013-02-11	pidfile: delay unlink of old file on aborted upgrades
	We don't want to be without any pidfile if writing the new pidfile fails.
2013-02-11	upgrade: do not disable interrupts in forked child
	The child disables interrupts right away, so there's no reason to enable interrupts temporarily.
2013-02-11	test/upgrade: more thorough PID file checking
	We need to ensure the PID file is non-empty, not just that it exists.
2013-02-11	prioritize upgrade before exit in main loop
	If we receive both SIGUSR2 and SIGQUIT in a short time period, we should trigger the upgrade before graceful exit, since no user will (intentionally) send SIGQUIT before SIGUSR2.
2013-02-11	test/upgrade: teardown more careful about killing
	We don't want to accidentally kill ourselves by targeting PID=0 if the PID file is empty.
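The test in question is not written in C, but the hazard is the same in any language: an empty PID file parses as 0, and kill(0, sig) signals the caller's whole process group. A sketch of a defensive parser, with the hypothetical name `pid_from_text`:

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Hypothetical parser for PID file contents.  Returns -1 unless the
 * text is a positive decimal number, optionally followed by a
 * newline, so an empty file can never yield 0.
 */
pid_t pid_from_text(const char *text)
{
	char *end;
	long val;

	assert(text != NULL);
	errno = 0;
	val = strtol(text, &end, 10);
	if (end == text || errno != 0)
		return -1;	/* empty or not a number */
	if (*end != '\0' && *end != '\n')
		return -1;	/* trailing garbage */
	if (val <= 0 || val > INT_MAX)
		return -1;	/* 0 and negatives address process groups */
	return (pid_t)val;
}
```

Callers would only pass the result to kill() when it is greater than zero.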
2013-02-09	tests: fix several Ruby warnings
	Unused variables and unset Content-Type for Net::HTTP requests
2013-02-09	test/inherit: fix Ruby 2.0.0 close-on-exec compatibility
	FD inheritance from exec() must be done explicitly in Ruby 2.0
2013-02-08	mnt: move stat/lstat logic to mnt_usable
	This centralizes the mountpoint suitability logic in one place. In the future, it may also allow us to parallelize the work of scanning filesystems.
2013-02-08	upgrade: fix env placeholder for valgrind
	Having a NULL at the beginning of the list caused iteration in the destructor to stop, allowing valgrind to detect a memory leak.
2013-02-08	cfg: require PATH to be set for --daemonize
	Maybe some weird users do not have PATH
2013-02-08	upgrade: avoid non-async-safe functions in child
	execvp may malloc internally in its path lookup, so use find_in_path to perform this lookup in the parent instead. Additionally, putenv() may not be async-signal-safe either, but execve is, so use execve.
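A sketch of the split described above: the PATH lookup, which may allocate, happens in the parent, and the child calls only execve() and _exit(). `resolve_in_path` is a hypothetical stand-in for the find_in_path lookup mentioned in the commit, and `spawn_resolved` is likewise an illustrative name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical PATH lookup done in the parent, where malloc() is
 * safe.  Returns a malloc()ed path, or NULL if nothing executable
 * named `name` is found in the ':'-separated list `path`.
 */
char *resolve_in_path(const char *name, const char *path)
{
	size_t name_len;

	assert(name != NULL && path != NULL);
	name_len = strlen(name);

	while (*path != '\0') {
		size_t dir_len = strcspn(path, ":");

		if (dir_len > 0) {	/* skip empty components */
			size_t len = dir_len + 1 + name_len + 1;
			char *full = malloc(len);

			if (full == NULL)
				return NULL;
			snprintf(full, len, "%.*s/%s",
				 (int)dir_len, path, name);
			if (access(full, X_OK) == 0)
				return full;
			free(full);
		}
		path += dir_len;
		if (*path == ':')
			path++;
	}
	return NULL;
}

/*
 * Fork and exec a pre-resolved path.  Between fork() and execve()
 * the child calls only async-signal-safe functions.
 */
pid_t spawn_resolved(const char *full_path, char *const argv[],
		     char *const envp[])
{
	pid_t pid = fork();

	if (pid == 0) {
		execve(full_path, argv, envp);
		_exit(127);	/* exec failed */
	}
	return pid;
}
```

Passing the environment as an explicit `envp` array, built in the parent, is what removes the need for putenv() in the child.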
2013-02-07	cfg: disallow trailing ':' in PATH with daemonize
	Trailing ':' in PATH means using the current path, which is now incompatible with daemonize.
2013-02-07	upgrade: avoid potential deadlock from post-fork mutex use
	Pthreads implementations do not require mutexes be in a consistent/usable state in a forked child. Since we don't need the mutex in a single-threaded forked child, we can just skip it and avoid reinitializing it entirely.
2013-02-07	rename fs_usable to mnt_usable
	It should be clearer this code is only called from inside mnt.c and not fs.c (the latter is for general filesystem operations, not operations on a mount point).
2013-02-07	release memory allocated for upgrade at exit
	This is not strictly necessary as this memory is freed anyways, but stop valgrind from complaining and avoid unnecessary suppressions (since shutdown performance is not important).