PerfectSched

PerfectSched is a highly available distributed cron built on top of RDBMS.

It provides at-least-once semantics: even if a worker node fails while processing a task, the task is retried by another worker.

PerfectSched also guarantees that only one worker server processes a task as long as that server is alive.

All you have to do is make your worker programs idempotent, because a task may be processed more than once. It's recommended to use PerfectQueue with PerfectSched.
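As a rough illustration of what an idempotent worker can look like, the sketch below skips work that has already been completed for a given schedule id and scheduled time. The `handle_once` helper and the in-memory `Set` are hypothetical; a real worker would keep this record in durable storage, and nothing here is part of the PerfectSched API.

```ruby
require 'set'

# Hypothetical helper: runs the block at most once per (schedule id, time)
# pair that has completed successfully. `done` stands in for durable storage.
def handle_once(done, sched_id, scheduled_time)
  key = "#{sched_id}:#{scheduled_time}"
  return :skipped if done.include?(key)

  yield          # the actual side effect
  done.add(key)  # recorded only after the side effect succeeded
  :done
end

done = Set.new
first  = handle_once(done, 'sched-id', 1356966000) { puts 'sending report' }
second = handle_once(done, 'sched-id', 1356966000) { puts 'sending report' }
p [first, second]
```

If the worker dies inside the block, the key is never recorded, so a retry on another worker repeats the work. That matches the at-least-once behaviour described above.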

API overview

# open a schedule collection
PerfectSched.open(config, &block)  #=> #<ScheduleCollection>

# add a schedule
ScheduleCollection#add(task_id, type, options)

# poll a scheduled task
# (you don't have to use this method directly. see following sections)
ScheduleCollection#poll  #=> #<Task>

# get data associated with a task
Task#data  #=> #<Hash>

# finish a task
Task#finish!

# retry a task
Task#retry!

# create a schedule reference
ScheduleCollection#[](key)  #=> #<Schedule>

# check the existence of the schedule
Schedule#exists?

# delete a schedule
Schedule#delete!

Error classes

ScheduleError < StandardError

##
# Workers may get these errors:
#

AlreadyFinishedError < ScheduleError

NotFoundError < ScheduleError

PreemptedError < ScheduleError

ProcessStopError < RuntimeError

##
# Client or other situation:
#

ConfigError < RuntimeError

AlreadyExistsError < ScheduleError

NotSupportedError < ScheduleError
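One consequence of this hierarchy is that `rescue ScheduleError` does not catch `ProcessStopError` or `ConfigError`, since both derive from `RuntimeError`. The sketch below shows this with stand-in classes in a hypothetical `SchedSketch` module so that it runs without the gem. The `classify` helper and the meaning attached to each group are assumptions, not documented behaviour.

```ruby
# Stand-in classes mirroring the hierarchy listed above (assumption: a real
# worker would use the classes provided by the perfectsched gem instead).
module SchedSketch
  class ScheduleError < StandardError; end
  class AlreadyFinishedError < ScheduleError; end
  class NotFoundError < ScheduleError; end
  class PreemptedError < ScheduleError; end
  class ProcessStopError < RuntimeError; end
end

# Hypothetical helper: raises the error and reports which clause caught it.
def classify(error)
  raise error
rescue SchedSketch::AlreadyFinishedError, SchedSketch::NotFoundError,
       SchedSketch::PreemptedError
  :task_level       # one of the worker-side ScheduleError subclasses
rescue SchedSketch::ScheduleError
  :schedule_error   # any other ScheduleError
rescue SchedSketch::ProcessStopError
  :stopping         # not a ScheduleError, so it needs its own clause
end

p classify(SchedSketch::PreemptedError.new)
p classify(SchedSketch::ProcessStopError.new)
```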

Example

require 'perfectsched'
require 'time'  # for Time.parse

# submit a task
PerfectSched.open(config) {|sc|
  data = {'key'=>"value"}
  options = {
    :cron => '0 * * * *',
    :delay => 30,
    :timezone => 'Asia/Tokyo',
    :next_time => Time.parse('2013-01-01 00:00:00 +0900').to_i,
    :data => data,
  }
  sc.add("sched-id", "type1", options)
}
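To make the relationship between `:next_time` and `:delay` concrete, here is a small sketch in plain Ruby. It assumes the first run time is the schedule time plus the delay, which is how the command line tool documents the default of its `--at` option (start+delay). Whether the library computes it exactly this way internally is not confirmed by this README.

```ruby
require 'time'

# First schedule time from the example above, in seconds since the epoch.
next_time = Time.parse('2013-01-01 00:00:00 +0900').to_i
delay = 30

# Assumption: first run time = schedule time + delay.
next_run_time = next_time + delay

puts Time.at(next_time).utc.iso8601      # 2012-12-31T15:00:00Z
puts Time.at(next_run_time).utc.iso8601  # 2012-12-31T15:00:30Z
```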

Writing a worker application

1. Implement PerfectSched::Application::Base

class TestHandler < PerfectSched::Application::Base
  # implement run method
  def run
    # do something ...
    puts "acquired task: #{task.inspect}"

    # call task.finish!, task.retry! or task.release!
    task.finish!
  end
end

2. Implement PerfectSched::Application::Dispatch

class Dispatch < PerfectSched::Application::Dispatch
  # describe routing
  route "type1" => TestHandler
  route /^regexp-.*$/ => :TestHandler  # String or Regexp => Class or Symbol
end
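The routing declarations above map a task type to a handler by either an exact String or a Regexp. The following sketch imitates that matching in plain Ruby. `MiniRouter`, `resolve`, and the first-match-wins rule are hypothetical and only illustrate the idea; this is not the code behind `PerfectSched::Application::Dispatch`.

```ruby
# Illustration only: a tiny router with String-or-Regexp patterns.
class MiniRouter
  def initialize
    @routes = []
  end

  # Accepts a Hash of pattern => handler, like the `route` lines above.
  def route(mapping)
    mapping.each { |pattern, handler| @routes << [pattern, handler] }
  end

  # Returns the handler of the first matching route, or nil if none match.
  # `===` compares Strings for equality and tests Regexps for a match.
  def resolve(type)
    pair = @routes.find { |pattern, _handler| pattern === type }
    pair && pair[1]
  end
end

router = MiniRouter.new
router.route('type1' => :TestHandler)
router.route(/^regexp-.*$/ => :RegexpHandler)

p router.resolve('type1')
p router.resolve('regexp-nightly')
p router.resolve('unknown')
```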

3. Run the worker

In a launcher script or rake file:

system('perfectsched run -I. -rapp/schedules/dispatch Dispatch')

or:

require 'perfectsched'
require 'yaml'
require 'app/schedules/dispatch'

PerfectSched::Worker.run(Dispatch) {
  # this block returns the configuration; it is called again
  # when the worker process is restarted
  raw = File.read('config/perfectsched.yml')
  yml = YAML.load(raw)
  yml[ENV['RAILS_ENV'] || 'development']
}

Signal handlers

  • TERM,INT,QUIT: shutdown
  • USR1,HUP: restart
  • USR2: reopen log files

Configuration

  • type: backend type (required; see following sections)
  • log: log file path (default: use stderr)
  • poll_interval: interval to poll tasks in seconds (default: 1.0 sec)
  • timezone: default timezone (default: 'UTC')
  • alive_time: duration to continue a heartbeat request (default: 300 sec)
  • retry_wait: time to wait before retrying a task (default: 300 sec)

Backend types

rdb_compat

additional configuration:

  • url: URL to the RDBMS (example: 'mysql://user:password@host:port/database')
  • table: name of the table to use
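Putting the options together, a configuration for the rdb_compat backend might look like the sketch below. The per-environment layout follows the worker example, which selects a section by `RAILS_ENV`. The host, database name, and table name are made up, and the use of string keys is an assumption.

```ruby
require 'yaml'

# Hypothetical contents of config/perfectsched.yml, one section per environment.
raw = <<~YAML
  development:
    type: rdb_compat
    url: mysql://user:password@localhost:3306/perfectsched_dev
    table: perfectsched
    poll_interval: 1.0
    timezone: UTC
    alive_time: 300
    retry_wait: 300
YAML

# Pick the section for one environment.
config = YAML.load(raw).fetch('development')
p config['type']
p config['poll_interval']
```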

rdb

Not implemented yet.

Command line management tool

Usage: perfectsched [options] <command>

commands:
    list                             Show list of registered schedules
    add <key> <type> <cron> <data>   Register a new schedule
    delete <key>                     Delete a registered schedule
    run <class>                      Run a worker process
    init                             Initialize a backend database

options:
    -e, --environment ENV            Framework environment (default: development)
    -c, --config PATH.yml            Path to a configuration file (default: config/perfectsched.yml)

options for add:
    -d, --delay SEC                  Delay time before running a schedule (default: 0)
    -t, --timezone NAME              Set timezone (default: UTC)
    -s, --start UNIXTIME             Set the first schedule time (default: now)
    -a, --at UNIXTIME                Set the first run time (default: start+delay)

options for run:
    -I, --include PATH               Add $LOAD_PATH directory
    -r, --require PATH               Require files before starting

initializing a database

# assume that the config/perfectsched.yml exists
$ perfectsched init

submitting a task

$ perfectsched add s1 user_task '* * * * *' '{}'

listing tasks

$ perfectsched list
                           key            type               cron   delay    timezone                    next_time                next_run_time  data
                            s1       user_task          * * * * *       0         UTC      2012-05-18 22:04:00 UTC      2012-05-18 22:04:00 UTC  {}
1 entries.

deleting a schedule

$ perfectsched delete s1

running a worker

$ perfectsched run -I. -Ilib -rconfig/boot.rb -rapps/schedules/schedule_dispatch.rb ScheduleDispatch
