Date | Commit message |
|
This is somewhat strange, but makes the code base slightly easier
to reuse for non-HTTP purposes.
|
|
This should hopefully make failures easier to track down.
|
|
This only triggered if the (undocumented) --worker-processes
option is used. This assertion is no longer valid as of
commit d5a52618ca1f9b5d7f6998716fbfe7714f927112
(refactor handling of "server aio_threads = " command)
|
|
This allows us to avoid a redundant hash lookup every time we
"activate" an open file for reading or writing.
|
|
This will allow us to limit concurrency on a per-device basis with
limited impact on HTTP header reading/parsing. This prevents
pathological slowness on a single device from bringing down an entire
host. This also allows users to more safely run with fewer aio_threads
(e.g. 1:1 thread:device mapping) on fast devices with smaller low-level
(kernel/hardware) I/O queues.
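
A minimal sketch of a per-device concurrency limit, not the actual
cmogstored code: "struct ioq" and the ioq_* function names are
hypothetical, and a real implementation would also queue the clients
that fail to acquire a slot.

```c
#include <pthread.h>
#include <stdbool.h>

/* hypothetical per-device I/O queue; not the actual cmogstored struct */
struct ioq {
	pthread_mutex_t mtx;
	unsigned cur; /* slots still available on this device */
};

static void ioq_init(struct ioq *q, unsigned max)
{
	pthread_mutex_init(&q->mtx, NULL);
	q->cur = max;
}

/* returns true if the caller may start I/O on this device now */
static bool ioq_try_acquire(struct ioq *q)
{
	bool ok = false;

	pthread_mutex_lock(&q->mtx);
	if (q->cur > 0) {
		q->cur--;
		ok = true;
	}
	pthread_mutex_unlock(&q->mtx);
	return ok; /* false: the caller should park the client and retry */
}

/* give the slot back once the I/O on this device has finished */
static void ioq_release(struct ioq *q)
{
	pthread_mutex_lock(&q->mtx);
	q->cur++;
	pthread_mutex_unlock(&q->mtx);
}
```

With one such queue per device, a slow device only exhausts its own
slots while threads remain free to serve other devices.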
|
|
"struct sockaddr" turns out to be smaller than "struct sockaddr_in6",
so we can avoid complicated casting and just add that to the union.
We continue avoiding "struct sockaddr_storage", however, as it is
unnecessarily large for our needs.
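
The idea can be sketched as follows. The union name "sa_any" and the
helper are hypothetical, and the size assertions are assumptions that
hold on common platforms (e.g. Linux and the BSDs), not guarantees
from any standard.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* hypothetical union; the name is not taken from cmogstored */
union sa_any {
	struct sockaddr sa; /* generic view, no casting needed */
	struct sockaddr_in in;
	struct sockaddr_in6 in6;
};

/* the generic member fits inside the IPv6 member, so it costs nothing */
_Static_assert(sizeof(struct sockaddr) <= sizeof(struct sockaddr_in6),
	"struct sockaddr fits inside struct sockaddr_in6");

/* and the whole union stays well below sockaddr_storage */
_Static_assert(sizeof(union sa_any) < sizeof(struct sockaddr_storage),
	"union is smaller than struct sockaddr_storage");

/* read the address family through the generic member */
static int sa_any_family(const union sa_any *u)
{
	return u->sa.sa_family;
}
```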
|
|
This was triggering warnings with Ruby 2.0.0-p195
|
|
Reattaching/reusing read buffers allows us to avoid repeated
reallocation/growth/free when clients repeatedly send us large headers.
This may also increase cache-hits by favoring recently-used buffers as
long as fragmentation is kept in check. The fragmentation should be
no worse than it is currently, due to the existing detach nature of rbufs.
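
A rough sketch of the reuse pattern, assuming a single cached buffer
per thread. "struct rbuf" here, its fields, and rbuf_get/rbuf_put are
hypothetical names for illustration, not the real cmogstored API.

```c
#include <stdlib.h>

/* hypothetical read buffer; field names are illustrative only */
struct rbuf {
	size_t capa;
	char data[];
};

static _Thread_local struct rbuf *tls_rbuf; /* one cached buffer per thread */

/* take the cached buffer if it is big enough, otherwise allocate */
static struct rbuf *rbuf_get(size_t capa)
{
	struct rbuf *rb = tls_rbuf;

	if (rb && rb->capa >= capa) {
		tls_rbuf = NULL;
		return rb;
	}
	rb = malloc(sizeof(*rb) + capa);
	if (rb)
		rb->capa = capa;
	return rb;
}

/* reattach: keep whichever buffer is larger, free the other one */
static void rbuf_put(struct rbuf *rb)
{
	if (tls_rbuf && tls_rbuf->capa >= rb->capa) {
		free(rb);
		return;
	}
	free(tls_rbuf); /* free(NULL) is a no-op */
	tls_rbuf = rb;
}
```

A client sending large headers repeatedly then gets the same
allocation back instead of triggering a new malloc/free cycle.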
|
|
We'll be allowing the migration of buffers between threads
and from waiting clients back to thread-local storage.
|
|
Some setups use clients which pass large headers (User-Agent, or
even cookies(!)) to cmogstored, so large rbufs may be used often
and repeatedly in those cases.
We limit rbuf sizes to 64K anyways, so keeping "larger" buffers
around should not be much of an issue for modern systems.
This prepares us for reusing/recycling large rbufs as TLS buffers.
|
|
This will allow us to use control flow similar to the http client
handling code when we queue clients based on I/O channel.
|
|
This replaces the fsck_queue internals with a generic
ioq implementation which is based on the MogileFS devid,
and not the operating system devid.
|
|
We need to ensure we do not introduce code to launch
http_process_client while we have buffered data (or socket write
errors).
|
|
We will have structures inside the dev struct accessed by multiple
threads frequently, so keep it cache-aligned.
To reduce memory usage for large-numbered devices, avoid storing the
prefix on output and instead just rely on the printf-family of
routines to generate stringified output in uncommon code paths.
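
As a sketch only: the struct layout, the 64-byte cache line, and the
"/dev<N>/" prefix format below are assumptions for illustration, not
the real definitions.

```c
#include <stdalign.h>
#include <stdint.h>
#include <stdio.h>

#define CACHE_LINE 64 /* assumption: typical x86-64 cache line size */

/* hypothetical device struct; only the numeric devid is stored */
struct dev {
	alignas(CACHE_LINE) uint32_t devid;
	unsigned in_flight; /* stand-in for state shared between threads */
};

/* build the prefix on demand instead of storing it in every struct */
static int dev_prefix(const struct dev *d, char *dst, size_t len)
{
	return snprintf(dst, len, "/dev%u/", (unsigned)d->devid);
}
```

Aligning the first member aligns the whole struct, so two devices
never share a cache line.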
|
|
Detachers MUST set rsize properly. This API is unfortunately fragile
and will eventually be fixed to be more difficult to misuse.
|
|
According to the m4/clock_gettime.m4 documentation (from gnulib),
the LIB_CLOCK_GETTIME variable should be added to a *LDADD variable
and not AM_LDFLAGS. This is also consistent with GNU automake
documentation.
Thanks to Cody Pisto for reporting this problem under Ubuntu 12.04
ref: http://www.gnu.org/software/automake/manual/html_node/Linking.html
|
|
* 1.2-stable:
cmogstored 1.2.2 - minor maintenance release
INSTALL: update versions and URLs
INSTALL: clarify between starting from tarball vs git
test/cmogstored-cfg: ensure TMPDIR is absolute for valgrind
iostat_parser: allow '-' for device names
alloc: posix_memalign does not set errno
|
|
For difficult-to-trigger errors, fault injection is necessary for
testing our error handling. I have confirmed this test fails with
"avoid leaks on epoll/kqueue resources exhaustion" reverted.
|
|
Simply releasing the descriptor triggering ENOSPC/ENOMEM errors from
epoll_ctl and kevent is not good enough, as those descriptors may
have other descriptors (e.g. files to be served) hanging off of them.
|
|
While pthread_yield is non-standard, it is relatively common and
preferable for systems where pthreads are _not_ 1:1 mapped to kernel
threads. This also provides a stronger yield to weaken the priority
of the calling thread wherever we previously used sched_yield.
|
|
This should allow the threads we're terminating to more quickly
enter a safe state where they're allowed to exit. On SMP systems,
we need to yield the signalling thread more times to increase the
probability the interrupted thread can run (and exit).
|
|
Our tests over-link (to save developer time :P), so we must
link in probes with our tests. Also, we must keep probes.h
around for distclean (but not maintainerclean)
|
|
We cannot assume sa_family_t is the first element of "struct
sockaddr_in" or "struct sockaddr_in6". FreeBSD has a "sa_len"
member as the first element while Linux does not.
So only keep the parts of the "struct sockaddr*" we need and use
inet_ntop instead of getnameinfo. This also gives us a little more
space to add additional fields to "struct mog_http" in the future
without increasing memory (or CPU cache) use.
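
A minimal sketch of the approach, assuming only the family, port and
raw address need to be kept. "struct peer_addr" and both helpers are
hypothetical names; the actual fields in "struct mog_http" may differ.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* hypothetical compact peer address */
struct peer_addr {
	sa_family_t family;
	in_port_t port; /* network byte order */
	union {
		struct in_addr in;
		struct in6_addr in6;
	} addr;
};

/* copy via the named members; never assume what lives at offset 0 */
static int peer_addr_set(struct peer_addr *pa, const struct sockaddr *sa)
{
	if (sa->sa_family == AF_INET) {
		const struct sockaddr_in *in = (const struct sockaddr_in *)sa;

		pa->port = in->sin_port;
		pa->addr.in = in->sin_addr;
	} else if (sa->sa_family == AF_INET6) {
		const struct sockaddr_in6 *in6 = (const struct sockaddr_in6 *)sa;

		pa->port = in6->sin6_port;
		pa->addr.in6 = in6->sin6_addr;
	} else {
		return -1; /* unsupported family */
	}
	pa->family = sa->sa_family;
	return 0;
}

/* inet_ntop only needs the family and the raw address bytes */
static const char *peer_addr_str(const struct peer_addr *pa,
				char *dst, socklen_t len)
{
	return inet_ntop(pa->family, &pa->addr, dst, len);
}
```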
|
|
Tarballs were otherwise unusable.
|
|
Due to data/event loss, we cannot rely on normal syscalls
(accept/epoll_wait) being cancellation points. The benefits of
using a standardized API to terminate threads asynchronously are
lost when toggling cancellation flags.
This implementation allows us to be more explicit and obvious at the
few points where our worker threads may exit and reduces the amount
of code we have. By avoiding the calls to pthread_setcancelstate,
we should halve the number of atomic operations required in the
common case (where the thread is not marked for termination).
|
|
This should prevent one class of "accidental" failures.
(The sidechannel was never meant to be secure or exposed
to the public).
|
|
A client may disconnect at any time, so shutdown may fail harmlessly
with ENOTCONN.
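
One way to express this, as a sketch (the wrapper name
"shutdown_quiet" is hypothetical):

```c
#include <errno.h>
#include <sys/socket.h>

/* hypothetical wrapper: treat "peer already gone" as success */
static int shutdown_quiet(int fd, int how)
{
	if (shutdown(fd, how) == 0 || errno == ENOTCONN)
		return 0;
	return -1; /* errno stays set for the caller, e.g. EBADF */
}
```

Other errors are still reported, since those usually point to a bug
on our side rather than a client going away.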
|
|
The "shutdown" command needs to trigger EINTR when using
epoll_pwait, otherwise the sleeping thread may not wake up properly.
|
|
Cancellation with epoll_wait, accept4 (and accept) may cause events
to be lost, as cancellation relies on signals anyways in glibc/Linux.
So instead, we use signaling ourselves and explicitly test for
cancellation only if we know we are interrupted and in a state where
a thread can safely be cancelled.
ref: http://mid.gmane.org/CAE2sS1gxQkqmcywQ07pmgNHM+CyqzMkuASVjmWDL+hgaTMURWQ@mail.gmail.com
|
|
This should hopefully save a few cycles and reduce stack
usage slightly.
|
|
We could eventually make this a tunable parameter, as it could
be advantageous over a global aio_threads value.
|
|
We're using per-svc-based thread pools, so different MogileFS
instances we serve no longer affect each other. This means
changing the aio_threads count only affects the svc of the
sidechannel port which triggered the change.
|
|
This improves maintainability in case MogileFS changes these
limits.
|
|
Both hash_initialize and hash_insert may return NULL to indicate
allocation errors. So implement a mog_oom_if_null helper function to
destroy the process instead of attempting to continue and dereferencing
NULL pointers.
This may affect configurations with limited memory and lacking
overcommit; but is unlikely to trigger given the small memory footprint
of cmogstored.
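
A sketch of what such a helper could look like. The name
"oom_if_null" and its signature are stand-ins here; the real
mog_oom_if_null may differ.

```c
#include <stdio.h>
#include <stdlib.h>

/* hypothetical stand-in for the helper; the real signature may differ */
static void *oom_if_null(void *ptr)
{
	if (!ptr) {
		fputs("out of memory\n", stderr);
		abort(); /* die here rather than dereference NULL later */
	}
	return ptr;
}
```

Wrapping each allocating call in the helper keeps the error check in
one place and lets callers use the result unconditionally.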
|
|
This will allow us to look up devices for per-(mog)device I/O queues.
|
|
Lines longer than 80 columns aren't readable on my screen
with gigantic fonts.
|
|
This will allow us to do lookups for IO queues/semaphores before
we attempt to fstatat/stat a path.
|
|
If the mogstored sidechannel is inactive (in HTTP-only mode), we should
still count the number of devices correctly so that the number of
worker threads scales properly.
|
|
This simplifies code, reduces contention, and reduces the
chances of independent MogileFS instances (with one instance
of cmogstored) stepping on each other.
Most cmogstored deployments are single docroot (for a single
instance of MogileFS), however cmogstored supports multiple
docroots for some rare configurations and we support them here.
|
|
I forgot why this bound was necessary, so add a comment
ensuring I do not forget again.
|
|
Having too many acceptor threads does not help, as it leads to
lock contention in the accept syscalls and the EPOLL_CTL_ADD
paths. The fair FIFO ordering of _blocking_ accept/accept4
syscalls also means we trigger unnecessary task switching and
incur cache misses under high load.
It has been almost impossible for the acceptor threads to
be stuck on disk I/O since
commit 832316624f7a8f44b3e1d78a8a7a62a399241840
("acceptor threads push directly into event queue")
|
|
This will help ensure availability when new devices are added,
without additional user interaction to manually set aio_threads
via sidechannel.
|
|
mog_fd_init enforces setting the correct type, so relegate
mog_fd_get to private usage inside fdmap.c
|
|
This is useful for:
a) repeatably generating the same tarball off git
b) diagnosing and tracking down (rare) gnulib bugs
c) 3rd parties verifying we do not put malicious code into our tarballs
|
|
st_rdev matching is necessary for cases where the block devices
are aliased (not via symlinks), and mountlist returns a different
name for the device than what iostat uses. This is the case for
my cryptmount(8) setup, where /dev/mapper/FOO and /dev/dm-N refer
to the same device (with matching st_dev and st_rdev numbers),
but neither is a symlink to the other (nor are they hardlinks).
stat() on block devices in /dev should always be fast and
non-blocking, as /dev is expected to be non-networked on any
reasonable system (at least those serving as a MogileFS storage
node).
|
|
This is a minor maintenance release, no need to upgrade unless
a) your gcc defaults to -march=i386 (e.g. 32-bit CentOS 5)
b) your device names include '-' (e.g. Linux device mapper users)
There are also some minor doc updates to clarify tarball vs git
installation and a trivial error-handling fix which should not
affect any current users.
Eric Wong (6):
build: add check for GCC atomics
alloc: posix_memalign does not set errno
iostat_parser: allow '-' for device names
test/cmogstored-cfg: ensure TMPDIR is absolute for valgrind
INSTALL: clarify between starting from tarball vs git
INSTALL: update versions and URLs
cmogstored 1.3 will have some fairly intrusive internal changes
and cleanups to make it easier for users to trace and diagnose
system and network problems.
|
|
libkqueue recently migrated to SourceForge and Debian 7.0 is
the new stable.
We still support Debian 6.0 and will likely support it for years to
come since CentOS 5.x remains supported.
(cherry picked from commit 86e5d10649f14fe3b3c8af37fd8ec04cc337fc9e)
|
|
Users unfamiliar with autotools may not realize bootstrapping
is required when building from git.
(cherry picked from commit 1e80ba592ede05fe40b31686142f82294891afd0)
|
|
libkqueue recently migrated to SourceForge and Debian 7.0 is
the new stable.
We still support Debian 6.0 and will likely support it for years to
come since CentOS 5.x remains supported.
|
|
Users unfamiliar with autotools may not realize bootstrapping
is required when building from git.
|