Date | Commit message (Collapse) |
|
Clients may start an fsck checksum request and not be around
to read the response. So detect client death and abort
checksumming if we have a dead socket.
This is not extensively tested and may be overkill.
|
|
Items in the low-priority fsck queue could trigger an assertion failure
during graceful shutdown due to improper handling of the MOG_NEXT_IGNORE
state in mog_mgmt_quit_step().
However, using the fsck queue in graceful shutdown (which is
single-threaded) is probably a bad idea anyways, as the fsck digest
could monopolize other requests. So give no special handling to fsck
digest queries during graceful shutdown.
This only affects users running fsck with checksumming enabled during a
graceful shutdown of cmogstored. For checksum users, it is recommended
to stop fsck from the trackers and wait for all tracker queues to drain
before upgrading cmogstored (and using graceful shutdown on the old
cmogstored).
|
|
cmogstored is pretty fast, but it could be faster.
|
|
While we're at it, explain the use of cloexec.
|
|
Despite having an extensive test suite and minimal room for user
error, giving users the option to back out of a hot upgrade may
be worth supporting.
|
|
Many files were missed the first time around in
commit 37026af96dec638aa850d604003bf7218d90037d
|
|
This is a new feature and needs to be documented.
|
|
This fixes a missing prototype warning for cmogstored_exit()
when checking exit.c with sparse.
|
|
The events field of struct epoll_event is a uint32_t, not int.
|
|
The epoll_event.data union is 64-bits on 32-bit systems while
pointers are 32-bit. We only use 32-bits of that union, but
valgrind mistakenly complains about it (the kernel does not
care about the user-supplied data union at all).
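
A minimal sketch of one way to quiet valgrind here, assuming the approach is to define all 64 bits of the union before storing the pointer. mk_epoll_event is a hypothetical helper for illustration, not a cmogstored function:

```c
#include <stdint.h>
#include <sys/epoll.h>

/* hypothetical helper: build an epoll_event with the whole data union defined */
static struct epoll_event mk_epoll_event(uint32_t events, void *ptr)
{
	struct epoll_event ev;

	ev.events = events;   /* uint32_t, matching the struct definition */
	ev.data.u64 = 0;      /* define all 64 bits, even on 32-bit systems */
	ev.data.ptr = ptr;    /* then overwrite the pointer-sized part */

	return ev;
}
```

The kernel hands the union back untouched, so the extra store only matters to tools that track uninitialized bytes.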
|
|
sizeof(buf) returns the size of the pointer if buf is a passed
parameter, even if the function prototype dictates a fixed
size for buf as we do in mog_iou_write. While we're at it,
make our mog_iou_write buf parameter const.
This bug was introduced in:
commit a960a351b2248a196c91cdbf6256f98e1bc2ef37
"split iostat util% tracking from mountlist"
and never affected an official release of cmogstored.
This bug was caught while testing on a 32-bit GNU/Linux machine.
My normal 32-bit FreeBSD 9.0 environment did not catch this as
iostat on that platform only reports integer percentages and
does not need more than 4 bytes.
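
The pitfall can be reproduced in isolation. This is a sketch with hypothetical names (IOU_BUF_LEN and both functions are made up for illustration and are not the real mog_iou_write signature):

```c
#include <stddef.h>

#define IOU_BUF_LEN 16 /* hypothetical size, for illustration only */

/*
 * An array declarator in a parameter list is adjusted to a pointer,
 * so sizeof(buf) here is sizeof(const char *), not IOU_BUF_LEN.
 * Current GCC and Clang warn about this line.
 */
static size_t size_seen_by_callee(const char buf[IOU_BUF_LEN])
{
	return sizeof(buf);
}

/* Safer: use the named constant (or pass the length explicitly) */
static size_t size_intended(const char buf[IOU_BUF_LEN])
{
	(void)buf;
	return IOU_BUF_LEN;
}
```

On a 32-bit system the first function returns 4, which is why short integer percentages still fit and hid the bug.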
|
|
Older glibc will return ENOMEM on mprotect() failures. This bug
was only fixed in 2011, so the long-term distros and old
installations may not have the necessary backports.
ref: http://www.sourceware.org/bugzilla/show_bug.cgi?id=386
|
|
pthread_create may return EAGAIN as a temporary failure,
do not abort a running process if this is the case.
For the initial mountlist scan, we must retry indefinitely for
cmogstored to be usable. However, with our thread pools, we can
always run fewer threads (as long as there is at least one
thread per-pool).
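
A bounded retry could look roughly like the sketch below. spawn_retry and echo_arg are hypothetical names, and the retry limit and the use of sched_yield() are assumptions rather than what cmogstored does:

```c
#include <errno.h>
#include <pthread.h>
#include <sched.h>

/*
 * hypothetical helper: retry pthread_create() while it reports EAGAIN,
 * giving up after max_tries attempts so the caller can run with fewer
 * threads. Returns 0 on success or the last pthread_create() error.
 */
static int spawn_retry(pthread_t *thr, void *(*fn)(void *), void *arg,
                       unsigned max_tries)
{
	int rc = EAGAIN;

	while (max_tries-- > 0) {
		rc = pthread_create(thr, NULL, fn, arg);
		if (rc != EAGAIN)
			break;          /* success or a permanent error */
		sched_yield();          /* give other threads a chance to exit */
	}
	return rc;
}

/* trivial thread body used only to exercise the helper */
static void *echo_arg(void *arg)
{
	return arg;
}
```

A thread pool caller would stop growing the pool when this returns EAGAIN, while the initial mountlist scan would keep calling it until it succeeds.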
|
|
This is a tricky test and doesn't always succeed, since
it's hard to tell how many file descriptors glibc will
use internally.
|
|
We want to favor ppoll over pselect, since ppoll is a better
interface and we can have a slightly smaller binary with fewer
dependencies.
While we're at it, use mog_sleep(-1) as an alias for
mog_selfwake_wait to further reduce binary size.
|
|
We need to atomically enable interrupts and sleep with
the same syscall. Fortunately, using pselect (through
mog_sleep) allows that and is POSIX-compliant, so use
that.
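
The idea can be sketched as follows. sleep_interruptibly and note_signal are hypothetical names, not the actual mog_sleep implementation:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/select.h>
#include <time.h>

static volatile sig_atomic_t got_signal;

static void note_signal(int signo)
{
	(void)signo;
	got_signal = 1;
}

/*
 * hypothetical helper: sleep for up to `sec` seconds with `unblocked`
 * installed as the signal mask for the duration of the call only.
 * Returns 1 if a signal woke us, 0 on timeout, -1 on other errors.
 */
static int sleep_interruptibly(const sigset_t *unblocked, time_t sec)
{
	struct timespec ts;
	int rc;

	ts.tv_sec = sec;
	ts.tv_nsec = 0;

	/* the mask swap and the sleep happen in one call, leaving no gap */
	rc = pselect(0, NULL, NULL, NULL, &ts, unblocked);
	if (rc == 0)
		return 0;
	return (rc < 0 && errno == EINTR) ? 1 : -1;
}
```

The caller keeps the signals blocked at all other times, so a signal that arrives just before the sleep stays pending and is delivered as soon as pselect installs the temporary mask.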
|
|
This will inform the user of why cmogstored may be slow
to start, since we need the mountlist to be populated at
startup.
We also throw a pthread_cancel() in there to load libgcc_s under
glibc, so we can avoid loading libgcc_s once we're under FD pressure.
This makes test/http_idle_expire.rb more reliable.
|
|
DNS lookups cause webrick tests to fail or timeout. Our
tests should not have external network dependencies.
|
|
A typo caused unnecessary DNS lookups when inheriting sockets.
While we're at it, fix another typo in the error message, too.
|
|
This saves us a file descriptor in Linux, which provides
epoll_pwait in 2.6.19+ (and ppoll for 2.6.18, the oldest
kernel we support).
|
|
Since we now update future copies of by_dev offline and only
need a lock to swap in the new one, contention for by_dev should
be less of a problem than it was before. This should make
reader-writer locks an unnecessary risk.
Reader-writer locks are riskier since writer starvation can
potentially be an issue with many readers.
|
|
Use SO_REUSEADDR, since Linux requires that both the new program
(cmogstored) and this test use SO_REUSEADDR for it to be
effective.
Also, minimize the window for port conflicts. While there are
hard-to-avoid race windows for conflicts when binding random
ports, we can minimize those windows by holding those ports open
in the parent as long as possible.
|
|
This was missed in the earlier changes to allow eventfd
usage under Linux instead of using a notification pipe.
|
|
In the absence of a pselect/ppoll-like version of waitpid,
we must use a selfwake descriptor (pipe or eventfd) to
wake the master up whenever a signal is received.
So wait on the selfwake descriptor and always run waitpid
with WNOHANG in a loop to ensure all children are reaped.
The: mog_intr_disable(); waitpid(); mog_intr_enable()
sequence was completely stupid; I can't believe I wrote it.
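
The reaping half of this can be sketched as below. reap_exited_children is a hypothetical name, and the wait on the selfwake descriptor is left out:

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * hypothetical helper: reap every child that has already exited,
 * without blocking. Returns the number of children reaped.
 */
static int reap_exited_children(void)
{
	int reaped = 0;

	for (;;) {
		int status;
		pid_t pid = waitpid(-1, &status, WNOHANG);

		if (pid > 0) {
			reaped++;       /* one child collected, look for more */
			continue;
		}
		if (pid < 0 && errno == EINTR)
			continue;       /* interrupted, retry */
		break;                  /* 0: none ready; -1/ECHILD: no children */
	}
	return reaped;
}
```

Looping matters because several SIGCHLD deliveries can collapse into a single wakeup, so one wakeup may correspond to more than one dead child.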
|
|
eventfd uses fewer resources than a pipe, so create less
overhead for our users by using eventfd instead of a pipe.
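
For comparison with a pipe, a single eventfd can carry wakeups in both directions of the API. This is a Linux-only sketch with hypothetical helper names; the EFD_CLOEXEC and EFD_NONBLOCK flags are an assumption and need Linux 2.6.27 or later:

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* hypothetical helper: one descriptor replaces both ends of a pipe */
static int wake_open(void)
{
	return eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
}

/* add one to the counter, waking anything polling the descriptor */
static int wake_post(int fd)
{
	uint64_t one = 1;

	return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* returns the number of posts since the last drain, or 0 if none */
static uint64_t wake_drain(int fd)
{
	uint64_t count = 0;

	if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count))
		return 0;
	return count;
}
```

Repeated posts accumulate in one 8-byte counter and are cleared by a single read.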
|
|
We don't want to be without any pidfile if writing the new
pidfile fails.
|
|
The child disables interrupts right away, so there's no
reason to enable interrupts temporarily.
|
|
We need to ensure the PID file is non-empty, not just
that it exists.
|
|
If we receive both SIGUSR2 and SIGQUIT in a short
time period, we should trigger the upgrade before the
graceful exit, as no user will (intentionally) send
SIGQUIT before SIGUSR2.
|
|
We don't want to accidentally kill ourselves by targeting
PID=0 if the PID file is empty.
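
An empty file parsed naively yields 0, and kill(0, sig) signals every process in the caller's process group. A strict parser avoids that. The sketch below uses a hypothetical helper name and is not the code used in the test suite:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * hypothetical helper: parse the contents of a PID file.
 * Returns a positive PID, or -1 if the text is empty, malformed,
 * or not a positive number (so callers never kill(0, ...) by accident).
 */
static pid_t parse_pidfile_text(const char *text)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(text, &end, 10);
	if (end == text)
		return -1;              /* empty file or no digits */
	if (*end != '\0' && *end != '\n')
		return -1;              /* trailing garbage */
	if (errno == ERANGE || val <= 0 || val > INT_MAX)
		return -1;              /* out of range, zero, or negative */

	return (pid_t)val;
}
```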
|
|
Unused variables and unset Content-Type for Net::HTTP requests
|
|
FD inheritance from exec() must be done explicitly in Ruby 2.0
|
|
This centralizes the mountpoint suitability logic in
one place. In the future, it may also allow us to
parallelize the work of scanning filesystems.
|
|
Having a NULL at the beginning of the list caused
iteration in the destructor to stop, allowing valgrind
to detect a memory leak.
|
|
Maybe some weird users do not have PATH set.
|
|
execvp may malloc internally in its path lookup, so use
find_in_path to perform this lookup in the parent instead.
Additionally, putenv() may not be async-signal-safe either,
but execve is, so use execve.
|
|
Trailing ':' in PATH means using the current directory, which
is now incompatible with daemonize.
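
A check for this could look like the following sketch. path_has_empty_element is a hypothetical name, and treating an entirely empty PATH as unsafe is a conservative assumption:

```c
#include <stdbool.h>
#include <string.h>

/*
 * hypothetical helper: report whether a PATH-style string contains an
 * empty element (leading ':', trailing ':', or "::"), which path
 * lookup treats as the current working directory.
 */
static bool path_has_empty_element(const char *path)
{
	size_t len = strlen(path);

	if (len == 0)
		return true;                    /* assumption: reject empty PATH */
	if (path[0] == ':' || path[len - 1] == ':')
		return true;                    /* leading or trailing ':' */
	return strstr(path, "::") != NULL;      /* empty element in the middle */
}
```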
|
|
Pthreads implementations do not require mutexes be in a
consistent/usable state in a forked child.
Since we don't need the mutex in a single-threaded forked
child, we can just skip it and avoid reinitializing it
entirely.
|
|
It should be clearer this code is only called from inside
mnt.c and not fs.c (the latter is for general filesystem
operations, not operations on a mount point).
|
|
This is not strictly necessary as this memory is freed anyways,
but stop valgrind from complaining and avoid unnecessary
suppressions (since shutdown performance is not important).
|
|
Relative paths are incompatible with daemonization, as it
does not work for SIGUSR2 upgrades (since daemonize forces
the server to run in "/"). Relative paths are confusing
and error-prone anyways, so do not allow users to specify
them along with --daemonize.
|
|
The GNU error() function already emits a newline at
the end of these messages.
|
|
error.h is available on GNU/Linux (and presumably GNU/kFreeBSD
and GNU/Hurd), so favor that system-wide header over the gnulib
one.
|
|
There is no need to maintain our own code for this.
|
|
This makes it easier to alter how/if we write to stderr.
|
|
Noticed with gcc 4.7.2 in Debian testing (4.7.2-5)
|
|
This ensures the: inherited $ADDRESS:$PORT on fd=...
messages are prefixed with the PID in logs.
|
|
This project uses C99 features (and some GNU extensions),
so bool is usable.
|
|
We do not need all the weight of sockaddr_storage or NI_MAXHOST.
cmogstored currently only supports IPv4 and IPv6[1] and (like
any respectable server) will not perform reverse DNS lookups.
This allows us to reduce our stack usage in some places and
keep caches hotter.
[1] MogileFS does not support IPv6, yet, even
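
The space saving can be illustrated with a sketch. The union and struct below are hypothetical and are not the types cmogstored defines:

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* hypothetical union: just large enough for IPv4 and IPv6 addresses */
union ip_sockaddr {
	struct sockaddr sa;
	struct sockaddr_in in;
	struct sockaddr_in6 in6;
};

/* hypothetical buffer for a numeric address only, never a hostname */
struct ip_text {
	char addr[INET6_ADDRSTRLEN];
};
```

Since no reverse lookups are done, the text buffer only needs to hold a numeric address, so INET6_ADDRSTRLEN bytes are enough.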
|
|
Code is easier to follow when interrupts occur at well-defined
points. The worker processes (and master-less standalone) already
follow this.
|