Re: Shared Metrics Between Workers

unicorn Ruby/Rack server user+dev discussion/patches/pulls/bugs/help

From: Eric Wong <e@80x24.org>
To: Jeff Utter <jeff.utter@firespring.com>
Cc: unicorn-public@bogomips.org
Subject: Re: Shared Metrics Between Workers
Date: Fri, 13 Nov 2015 20:54:27 +0000
Message-ID: <20151113205427.GA7237@dcvr.yhbt.net>
In-Reply-To: <etPan.5645f4a3.231a9b16.5a5@jharris.hq.firespring.com>

Jeff Utter <jeff.utter@firespring.com> wrote:
> On November 12, 2015 at 7:23:13 PM, Eric Wong (e@80x24.org) wrote:
> > You don't have to return all the data you'd aggregate with raindrops,  
> > though. Just what was requested.  
> 
> Just to make sure I understand this correctly though, in order for the
> metrics to be available between workers, the raindrops structs would
> need to be setup for each metric before unicorn forks? 

Yes.  But most (if not all) metrics you'd care about will need
aggregation, and thus must be known/aggregated for the lifetime
of a process, correct?

> > GDBM (in the stdlib), SQLite, RRD, or any "on-filesystem"[1] data store  
> > should work, even. If you do have a lot of stats; batch the updates  
> > locally (per-worker) and write them to a shared area periodically.
> 
> Are you suggesting this data store would be shared between workers or
> one per worker (and whatever displays the metrics would read all the
> stores)? I tried sharing between workers with DBM and GDBM and both of
> them end up losing metrics due to being overwritten by other threads.
> I imagine I would have to lock the file whenever one is writing, which
> would block other workers (not ideal). Out of the box PStore works
> fine for this (surprisingly). I'm guessing it does file locks behind
> the scenes.

The data in the on-filesystem store would be shared across processes.
But you'd probably want to aggregate locally in a hash before flushing
periodically.
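
A minimal sketch of that batching, assuming a plain JSON file guarded by
flock stands in for GDBM/SQLite/RRD (so it needs only core Ruby plus the
json library).  The class name, file layout, and flush interval are all
hypothetical; the point is that increments stay in a per-process Hash and
the shared file is locked only for the short merge:

```ruby
require "json"

# Hypothetical helper: counts in a per-process Hash, then merges them into
# a shared JSON file under an exclusive lock every `interval` seconds.
class BatchedStats
  def initialize(path, interval: 5)
    @path = path
    @interval = interval
    @local = Hash.new(0)
    @last_flush = now
  end

  # Cheap in-memory update; touches the file only once the interval elapses.
  def incr(name, by = 1)
    @local[name.to_s] += by
    flush if now - @last_flush >= @interval
  end

  # Merge local counts into the shared file and reset the local Hash.
  # The file is opened here, per flush, so no descriptor is held across fork.
  def flush
    File.open(@path, File::RDWR | File::CREAT, 0o644) do |f|
      f.flock(File::LOCK_EX)   # released when the block closes the file
      raw = f.read
      totals = raw.empty? ? {} : JSON.parse(raw)
      @local.each { |key, count| totals[key] = totals.fetch(key, 0) + count }
      f.rewind
      f.truncate(0)
      f.write(JSON.generate(totals))
      f.flush
    end
    @local.clear
    @last_flush = now
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end
```

Each worker would create its own BatchedStats after forking and call flush
once more on shutdown, so the last partial batch is not lost.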

You're probably losing data because DB file descriptors are shared
across fork.  You need to open DBs/connections after forking.  With any
DB, you can't expect to open/share open file descriptors across fork.
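
To illustrate the ordering, here is a small sketch using plain fork and an
append-only file as a stand-in for a real DB handle (open_store and
run_workers are hypothetical names).  Each child opens its own handle after
forking, so nothing is inherited from the parent.  In unicorn, the
equivalent place to open the handle is the after_fork hook in the config
file:

```ruby
require "tmpdir"

# Hypothetical stand-in for "open a DB connection": every call returns a
# fresh handle with its own file descriptor.
def open_store(path)
  File.open(path, File::WRONLY | File::APPEND | File::CREAT, 0o644)
end

# Fork `workers` children, let each write `writes` records through its own
# handle, and return the number of records that reached the file.
def run_workers(path, workers: 3, writes: 100)
  pids = workers.times.map do |n|
    fork do
      store = open_store(path)   # opened AFTER fork, never inherited
      writes.times do |i|
        store.flock(File::LOCK_EX)
        store.syswrite("worker=#{n} i=#{i}\n")
        store.flock(File::LOCK_UN)
      end
      store.close
      exit!(0)                   # skip at_exit handlers copied from the parent
    end
  end
  pids.each { |pid| Process.wait(pid) }
  File.readlines(path).size
end

if __FILE__ == $PROGRAM_NAME
  Dir.mktmpdir do |dir|
    puts run_workers(File.join(dir, "store.log"))
  end
end
```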

You can safely share UDP sockets across fork, and likely SCTP if
implemented in the kernel (I haven't tried).  But any userland wrappers
on top of the UDP socket (e.g. statsd, as Michael mentioned) will need
to be checked for fork-friendliness.
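
A rough sketch of the bare-socket case, using only the stdlib socket
library: one UDPSocket is created before fork and every child sends through
the inherited descriptor.  The loopback receiver and the statsd-style
payload are assumptions for the demo, not any particular client's behaviour:

```ruby
require "socket"

# Create one UDP socket, fork `workers` children that each send a single
# datagram through it, and return the datagrams the receiver collected.
def udp_fork_demo(workers: 3)
  receiver = UDPSocket.new
  receiver.bind("127.0.0.1", 0)   # port 0: the kernel picks a free port
  port = receiver.addr[1]

  sender = UDPSocket.new          # created BEFORE fork, inherited by children
  sender.connect("127.0.0.1", port)

  pids = workers.times.map do |n|
    fork do
      # Each send is one self-contained datagram, so concurrent writers
      # cannot interleave bytes the way they could on a stream.
      sender.send("worker.#{n}.requests:1|c", 0)
      exit!(0)                    # skip at_exit handlers copied from the parent
    end
  end
  pids.each { |pid| Process.wait(pid) }

  messages = []
  # Wait up to one second per datagram so a lost packet cannot hang the demo.
  while messages.size < workers && IO.select([receiver], nil, nil, 1)
    data, _addr = receiver.recvfrom(512)
    messages << data
  end
  messages.sort
ensure
  sender&.close
  receiver&.close
end
```

A wrapper library that buffers in userland or runs a background thread
would not get this for free, which is why it needs checking.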

> Right now I'm thinking that the best way to handle this would be one
> data store per worker and then whatever reads the metrics scrapes them
> all read-only. My biggest concern with this approach is knowing which
> data-stores are valid. I suppose I could put them all in a folder
> based off the parent's pid. However, would it be possible that some
> could be orphaned if a worker is killed by the master? I would need
> some way for the master to communicate to the collector (probably in a
> worker) what other workers are actively running. Is that possible?

I don't think you need to worry about all that.  You'd want stats even
for dead workers to stick around if they were running the same code as
the current workers.

OTOH, you probably want to reset/drop stats on new deploys;
so maybe key the stats based on the version of the app you're running.
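
Something like the following would do, as a sketch: stats_key is a
hypothetical helper and APP_VERSION an assumed environment variable set by
the deploy process.  Prefixing every metric with the version means a new
deploy starts from zero without deleting the old numbers:

```ruby
# Hypothetical helper: namespace a metric by the running app version.
def stats_key(app_version, metric)
  "#{app_version}/#{metric}"
end

# APP_VERSION is an assumption; any identifier that changes on each
# deploy (a git revision, a release tag) would serve.
version = ENV.fetch("APP_VERSION", "dev")

stats = Hash.new(0)
stats[stats_key(version, "requests")] += 1
```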

I also forgot to mention I've used memcached for some stats, too.  It's
great when the data is fast-expiring, disposable and needs to be shared
across several machines; not just processes within the same host.


Thread overview: 6+ messages
2015-11-13  0:51 Shared Metrics Between Workers Jeff Utter
2015-11-13  1:23 ` Eric Wong
2015-11-13 14:33   ` Jeff Utter
2015-11-13 20:54     ` Eric Wong [this message]
2015-11-13 11:04 ` Michael Fischer
2015-11-13 14:37   ` Jeff Utter
