This is the second and final article in a set of two. The first covered
a pluggable locking toolkit. Here, we'll explore more advanced patterns.
At Booking.com, we use Perl and IPC::ConcurrencyLimit
[3].
You may use another language and another locking API: the strategies
described here should be transferable.
Writing daemons correctly and dealing with all of their maintenance and reliability requirements isn't easy. In our primary development language, Perl, you additionally suffer from rather unpredictable handling of signals[1], relating both to the interruption of system calls and to the non-interruption of pathologically slow internal operations (e.g. slow regular expressions). In short, it can be very convenient to avoid having to deploy daemons or system services at all.
At Booking.com, we occasionally resort to a rather curious replacement for daemons in certain classes of services. The basic recipe is to run a cron job frequently and to encode a notion of life time into the program, so that no process lives forever and each is replaced by subsequent runs. Sounds odd? Consider the benefits:
- Slow memory leaks no longer kill your service (and machine) dead. You should still fix the leaks, but you can do so on your own schedule instead of on the spot.
- Rolling out new code does not require an explicit daemon restart. You don't even need shell access to the machines you roll out to, nor do you have to encode knowledge of the service into your generic roll-out infrastructure[2].
- Further down the road, we'll show that this strategy also allows for advanced patterns like scaling capacity with demand and distributing across multiple machines.
- The system is also somewhat resilient to crashes, but so is an init-based daemon.
At the heart of the strategy is, once again, locking. The running process obtains a lock which it releases when it exits. The candidates attempt to get the lock at start-up, fail, and terminate. Here is the basic recipe in pseudo-code:
- Run once per minute (or once every couple).
- Attempt to get run lock.
- If failed to get run lock, then exit.
- If succeeded, then execute main loop until life time reached, then exit.
Simple enough! Using the IPC::ConcurrencyLimit
Perl module, we can
implement this trivially:
use 5.14.2;
use warnings;
use IPC::ConcurrencyLimit;

use constant LIFE_TIME => 5 * 60; # 5 min process life time

my $limit = IPC::ConcurrencyLimit->new(
    type      => 'Flock',
    max_procs => 1,
    path      => '/var/run/myapp',
);

my $lock_id = $limit->get_lock;
if (not $lock_id) {
    # Other process running.
    exit(0);
}
else {
    my $end_time = time() + LIFE_TIME;
    while (1) {
        process_work_unit();
        last if time() >= $end_time;
    }
    exit(0);
}
In real code, you would obviously have logging facilities and pull the
life time and lock path out of your configuration system. This simple
setup works fine if it is acceptable for your service to be down for
up to a minute at a time. That happens whenever the running process
exits right after a cron invocation, so the next candidate is almost
a full minute away. With the above code, we practically guarantee this
to happen every time, since the cron interval and the life time share
the same basic unit (minutes). This is easy to improve on with some
jitter. Replace the LIFE_TIME constant with this:
use constant LIFE_TIME => 5 * 60 + int(rand(60)); # 5-6 min life time
Of course, we can still run into the same gaps in process availability by chance. Since we started from the premise of replacing a daemon, that is really rather unlikely to be acceptable, so there is room for some improvement. We can reduce the problem to the level of seconds by introducing the notion of a standby process that takes over from the recently deceased main process:
- Run once per minute (or once every couple).
- Attempt to get run lock.
- If succeeded to get run lock, then execute main loop until life time reached, then exit.
- If failed to get run lock, then:
  - Attempt to get standby lock.
  - If failed, exit.
  - If succeeded, then attempt to get run lock in a loop with a short retry interval.
In other words, a second process stays in memory to replace the
main process on short notice. The cron iteration time simply needs to
be lower than the main process life time and we will have virtually
no appreciable gaps in availability[4].
Thankfully, this kind of logic is already available out of the box
from the IPC::ConcurrencyLimit distribution:
use 5.14.2;
use warnings;
use IPC::ConcurrencyLimit;
use IPC::ConcurrencyLimit::WithStandby;

# 5-6 min process life time
use constant LIFE_TIME => 5 * 60 + int(rand(60)); # in seconds

# Attempt to get main lock every 1/2 second
use constant MAIN_LOCK_INTERVAL => 1/2; # in seconds

# Keep retrying for ~3x lifetime of the worker process
use constant MAIN_LOCK_RETRIES => 1 + 3 * int(LIFE_TIME / MAIN_LOCK_INTERVAL);

my $limit = IPC::ConcurrencyLimit::WithStandby->new(
    type              => 'Flock',
    path              => '/var/run/myapp',
    max_procs         => 1,
    standby_path      => '/var/run/myapp/standby',
    standby_max_procs => 1,
    interval          => MAIN_LOCK_INTERVAL,
    retries           => MAIN_LOCK_RETRIES,
);

my $lock_id = $limit->get_lock;
if (not $lock_id) {
    # Other process running.
    exit(0);
}
else {
    my $end_time = time() + LIFE_TIME;
    while (1) {
        process_work_unit();
        last if time() >= $end_time;
    }
    exit(0);
}
Only the setup of the lock object has changed in this modified example.
All the standby lock logic is hidden within the
IPC::ConcurrencyLimit::WithStandby interface. We did have to configure
it a bit more thoroughly, however. IPC::ConcurrencyLimit::WithStandby
uses two locks internally: one for the actual worker process and one
for the standby process that can take over the main process's
responsibilities on short notice. What "short notice" means is defined
by the interval parameter: the standby process re-attempts to get the
main lock at the specified interval, retrying up to "retries" times
before giving up.
Don't be fooled into thinking that this means we can end up with gaps in the time coverage longer than one retry interval. When the main process is done, the standby process gets promoted, and cron spawns a new standby process. This is reliable as long as there are no early crashes and the following relation holds:
S > L > C (1)
where C is the time interval at which cron spawns new processes,
L is the maximum life time of the main worker, and S is the minimum
wait time of the standby process. The relation above is easy to show
if you consider that a standby process needs to stick around long
enough to replace the main process when it exits (thus S > L), while
cron only has to spawn new standby processes often enough to replenish
the standby slot before the main process might exit (thus L > C).
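To make this concrete, here is relation (1) worked through with the constants from the standby example above. The snippet is purely illustrative; it merely restates those settings as literals:

# Illustrative check of relation (1) with the values used above.
my $C = 60;          # cron interval: one run per minute
my $L = 5 * 60 + 60; # worst-case main process life time: 6 minutes
# Worst-case standby wait: retries * interval, i.e. roughly 3 * LIFE_TIME.
my $S = (1 + 3 * int((5 * 60 + 60) / 0.5)) * 0.5; # about 18 minutes
warn "standby gives up before the main process exits" unless $S > $L;
warn "cron cannot replenish the standby slot in time" unless $L > $C;

With roughly 18 minutes of standby patience, a 6 minute worker life time, and a 1 minute cron interval, relation (1) holds with plenty of headroom.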
You can increase S
and decrease C
to your heart's content to allow
for contingency in the real world of rarely (but occasionally)
crashing worker processes[5].
After this bit of consideration, let's look at another extension to our
example. We can trivially support multiple main processes at the same
time to extend capacity beyond one core. It only takes two small
changes to our setup. The max_procs
and standby_max_procs
parameters can be increased to allow for more processes of this type
to run simultaneously[6]. In order to satisfy relation (1) above,
you will now have to increase the rate at which cron spawns new
processes, too. Since that might not be possible, you can instead
extend the life time of the standby and main processes, or opt
to spawn multiple processes from cron per iteration. This yields a
modified form of relation (1) to assert full capacity:
S/k > L/k > C (2)
where k is understood to be the number of concurrent processes
required to guarantee full capacity. If cron is set up to spawn
many processes at a time, but at a lower rate (e.g. ten workers at a
time, but only every ten minutes), then we still require relation (1)
to hold independently in order to guarantee that there is always at
least one process available. This is easy to see in the extreme case:
if you spawn a very large number of processes at a low rate, they
fill all available worker and standby process slots, and the remainder
simply exits; the workers and standby processes then all die off before
the next cron run replenishes the pools. A multi-worker fork version of
the previous example follows.
use 5.14.2;
use warnings;
use IPC::ConcurrencyLimit;
use IPC::ConcurrencyLimit::WithStandby;
use POSIX qw(ceil);
use Time::HiRes qw(sleep);

# external settings
use constant LIFE_TIME   => 5 * 60; # base life time in seconds
use constant MAX_WORKERS => 16;     # max concurrent workers
use constant CRON_RATE   => 60;     # cron spawns more every 60s

# This asserts relation (2), but not necessarily relation (1).
use constant NCHILDREN => 2 + ceil(2 * MAX_WORKERS * CRON_RATE / LIFE_TIME);

# Attempt to get main lock every 1/2 second
use constant MAIN_LOCK_INTERVAL => 1/2; # in seconds

# Set the standby life time: Keep retrying for ~3x lifetime of the worker process
use constant MAIN_LOCK_RETRIES => 1 + int(3 * LIFE_TIME / MAIN_LOCK_INTERVAL);

# Fork the decided number of workers
my $is_child;
for (1 .. NCHILDREN) {
    $is_child = fork_child();
    last if $is_child;
    sleep(0.2); # no need to have artificial contention on the locks
}
exit(0) if not $is_child; # all children daemonized

my $limit = IPC::ConcurrencyLimit::WithStandby->new(
    type                => 'Flock',
    path                => '/var/run/myapp',
    max_procs           => MAX_WORKERS,
    standby_path        => '/var/run/myapp/standby',
    standby_max_procs   => MAX_WORKERS,
    interval            => MAIN_LOCK_INTERVAL,
    retries             => MAIN_LOCK_RETRIES,
    process_name_change => 1,
);

my $lock_id = $limit->get_lock;
if (not $lock_id) {
    # Other process running.
    exit(0);
}
else {
    my $end_time = time() + LIFE_TIME * (1 + rand(0.1));
    while (1) {
        process_work_unit();
        last if time() >= $end_time;
    }
    exit(0);
}

# mostly standard daemonization from the perlipc manpage
sub fork_child {
    use autodie;
    defined(my $pid = fork) or die "Can't fork: $!";
    return() if $pid;
    chdir '/';
    open STDIN,  '/dev/null';
    open STDOUT, '>/dev/null';
    die "Can't start a new session: $!" if POSIX::setsid == -1;
    open STDERR, '>&STDOUT';
    return 1;
}
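To put concrete numbers on the NCHILDREN formula (plain arithmetic on the constants above): with MAX_WORKERS = 16, CRON_RATE = 60 and LIFE_TIME = 300, we get NCHILDREN = 2 + ceil(2 * 16 * 60 / 300) = 2 + ceil(6.4) = 9, so each cron run forks nine children, which then compete for the worker and standby slots.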
Thanks to the pluggable IPC::ConcurrencyLimit::Lock locking back-ends,
you can use the exact same technique to scale your application across
multiple machines. All it takes is swapping out the lock type for a
non-local lock. Do note, however, that distributed locking comes at a
price. Some implementations only offer very coarse-grained locking
(e.g. Apache ZooKeeper), some come at very high cost (e.g. the NFS
locking back-end), and most of them have non-trivial complexity in edge cases.
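As a rough sketch of what such a swap could look like, assuming the machines share an NFS mount at the hypothetical path /shared, only the lock configuration changes while the rest of the program stays as before. Do check the documentation of whichever back-end you pick for its exact parameters:

# Hypothetical sketch: the same standby setup, with lock storage moved
# to a shared NFS mount so that processes on several machines compete
# for the same worker slots. '/shared' is a placeholder path.
my $limit = IPC::ConcurrencyLimit::WithStandby->new(
    type              => 'NFS',    # swapped in for 'Flock'
    path              => '/shared/locks/myapp',
    standby_path      => '/shared/locks/myapp/standby',
    max_procs         => 16,
    standby_max_procs => 16,
    interval          => 1/2,
    retries           => 3600,
);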
The process_name_change option introduced in the multi-worker example
simply makes IPC::ConcurrencyLimit::WithStandby modify the process
name of the standby processes to note that they are on standby. This
helps when trying to tell stuck workers apart from workers on standby.
As a final example of how the method can be adapted to many needs, we'll visit dynamic scaling, that is, changing the processing capacity (i.e. the number of processes) depending on demand. All the ingredients were covered earlier. The only significant change is giving an active process some control over its own life time: once it has passed the minimum life time required for full availability, it may choose to continue processing up to some maximum life time if demand is high. By setting the maximum number of running jobs higher than the number of processes expected from the cron spawn rate and the minimum life time, one obtains a system that scales the number of workers up to what is required to satisfy demand.
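A minimal sketch of such a demand-driven main loop follows. The MIN_LIFE_TIME, MAX_LIFE_TIME, and BACKLOG_THRESHOLD constants as well as the pending_work_count() helper are hypothetical stand-ins for your own configuration and queue inspection logic; the lock acquisition is unchanged from the earlier examples:

use 5.14.2;
use warnings;

# Hypothetical knobs for this sketch: a guaranteed minimum life time
# and a hard cap for busy periods.
use constant MIN_LIFE_TIME     => 5 * 60;  # seconds
use constant MAX_LIFE_TIME     => 30 * 60; # seconds
use constant BACKLOG_THRESHOLD => 100;     # queue size that counts as high demand

# ... obtain $lock_id via IPC::ConcurrencyLimit::WithStandby as before ...

my $min_end = time() + MIN_LIFE_TIME;
my $max_end = time() + MAX_LIFE_TIME;
while (1) {
    process_work_unit();
    my $now = time();
    # Hard cap: eventually make room for freshly spawned processes.
    last if $now >= $max_end;
    # Past the minimum life time, keep going only while demand is high.
    # pending_work_count() is a hypothetical hook into your work queue.
    last if $now >= $min_end and pending_work_count() < BACKLOG_THRESHOLD;
}
exit(0);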
In this article, we've explored several advanced techniques around the pattern of replacing true daemons with cron- and lock-based multi-processing. While parts of the discussion went into considerable detail, the complexity of operating such a system is relatively low and depends on the exact requirements. The technique is easy to adapt to many situations and avoids many of the problems associated with writing daemons.
[1] For details, see the previous article about Devel::TrackSIG.
[2] By the way, ours is driven by git-deploy.
[3] IPC::ConcurrencyLimit was covered in the first article.
[4] Remember the premise "replacing daemons in certain classes of services"? This is the main limitation: with this setup, we still get holes in the time coverage on the order of a second or a fraction thereof, so it's not appropriate for low-latency, very-high-availability situations.
[5] Applying probability theory to quantify this is left as an exercise to the reader, but in practice it's fine to be generous with the life time of the standby process and to choose the life time of the main process to be much larger than the cron interval.
[6] Writing the application to allow for concurrency is, once again, left as an exercise to the reader.