go get github.com/catmullet/go-workers
Giving the import an alias helps, since the name go-workers doesn't exactly follow Go package naming conventions.
(If you're using a JetBrains IDE, it should add the alias automatically.)
import (
workers "github.com/catmullet/go-workers"
)
The NewRunner factory function returns a new runner.
(Calls can be chained on it, for example calling .Work() immediately after.)
type MyWorker struct{}
func NewMyWorker() workers.Worker {
	return &MyWorker{}
}
func (my *MyWorker) Work(in interface{}, out chan<- interface{}) error {
	// work iteration here
	return nil
}
runner := workers.NewRunner(ctx, NewMyWorker(), numberOfWorkers)
Send accepts an interface{}, so you can send it any value.
runner.Send("Hello World")
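Because the value arrives in Work as an interface{}, the worker usually has to check its concrete type before using it. The following is a minimal sketch of that idea using only the standard library. UpperWorker and runOnce are hypothetical names made up for illustration; Work is called directly here instead of going through a runner.

```go
package main

import (
	"fmt"
	"strings"
)

// UpperWorker is a hypothetical worker used only for illustration.
type UpperWorker struct{}

// Work has the same signature as the Work method shown above. Because in is
// an interface{}, its concrete type is checked with a type assertion first.
func (w *UpperWorker) Work(in interface{}, out chan<- interface{}) error {
	s, ok := in.(string)
	if !ok {
		return fmt.Errorf("expected string, got %T", in)
	}
	out <- strings.ToUpper(s)
	return nil
}

// runOnce calls Work directly (no runner involved) and returns its result.
func runOnce(in interface{}) (interface{}, error) {
	out := make(chan interface{}, 1) // buffered so Work does not block
	w := &UpperWorker{}
	if err := w.Work(in, out); err != nil {
		return nil, err
	}
	return <-out, nil
}

func main() {
	got, err := runOnce("Hello World")
	fmt.Println(got, err)

	// A value of the wrong type is reported as an error instead of panicking.
	_, err = runOnce(42)
	fmt.Println(err)
}
```

Returning an error for an unexpected type, rather than letting a failed assertion panic, keeps the failure visible to whoever collects the worker's errors.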
Any error that bubbles up from your worker functions is returned here.
if err := runner.Wait(); err != nil {
//Handle error
}
By using the InFrom method you can tell runnerTwo to accept output from runnerOne.
runnerOne := workers.NewRunner(ctx, NewMyWorker(), 100).Work()
runnerTwo := workers.NewRunner(ctx, NewMyWorkerTwo(), 100).InFrom(runnerOne).Work()
It is possible to accept output from more than one runner, but it is up to you to determine which output came from which one. (They all send on the same channel.)
runnerOne := workers.NewRunner(ctx, NewMyWorker(), 100).Work()
runnerTwo := workers.NewRunner(ctx, NewMyWorkerTwo(), 100).Work()
runnerThree := workers.NewRunner(ctx, NewMyWorkerThree(), 100).InFrom(runnerOne, runnerTwo).Work()
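One way to tell the sources apart on a shared channel is to wrap each value in a small envelope that names its origin. The sketch below shows the idea with plain goroutines and channels from the standard library; it does not use go-workers, and Tagged, produce, countBySource and fanIn are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// Tagged is a hypothetical envelope recording which stage produced a value.
type Tagged struct {
	Source string
	Value  interface{}
}

// produce sends n tagged values on the shared channel.
func produce(source string, n int, out chan<- interface{}, wg *sync.WaitGroup) {
	for i := 0; i < n; i++ {
		out <- Tagged{Source: source, Value: i}
	}
	wg.Done()
}

// countBySource drains the shared channel and counts messages per source.
func countBySource(in <-chan interface{}) map[string]int {
	counts := make(map[string]int)
	for msg := range in {
		if t, ok := msg.(Tagged); ok {
			counts[t.Source]++
		}
	}
	return counts
}

// fanIn runs two producers that share one channel, then tallies the results.
func fanIn() map[string]int {
	shared := make(chan interface{})
	var wg sync.WaitGroup
	wg.Add(2)
	go produce("one", 3, shared, &wg)
	go produce("two", 5, shared, &wg)
	// Close the channel once both producers have finished.
	go func() {
		wg.Wait()
		close(shared)
	}()
	return countBySource(shared)
}

func main() {
	counts := fanIn()
	fmt.Println(counts["one"], counts["two"]) // prints "3 5"
}
```

In a real pipeline the upstream Work methods would send the envelope on out, and the downstream Work method would type-assert it from in.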
Fields can be passed via the worker object. As with any concurrency in Go, make sure your variables are safe for concurrent use. The Go documentation usually states when a package, or part of it, is safe for concurrent use; if it does not say so, there is a good chance it isn't. Use the sync package to lock and unlock around writes to unsafe variables. (It is good practice NOT to defer in the work function.)
ONLY use the Send() method to get data into your worker. Unlike the worker object's field values, data passed through Send() is not shared memory.
type MyWorker struct {
	message string
}
func NewMyWorker(message string) workers.Worker {
	return &MyWorker{message}
}
func (my *MyWorker) Work(in interface{}, out chan<- interface{}) error {
	fmt.Println(my.message)
	return nil
}
runner := workers.NewRunner(ctx, NewMyWorker("Hello World"), 100).Work()
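When a worker field is written from Work, every goroutine running that worker touches the same memory, so the write needs a lock. Here is a minimal sketch of guarding such a field with sync.Mutex. CountingWorker and runConcurrently are hypothetical names; the goroutines are started by hand here rather than by a runner.

```go
package main

import (
	"fmt"
	"sync"
)

// CountingWorker is a hypothetical worker whose fields are shared by every
// goroutine that calls Work, so writes to count are guarded by a mutex.
type CountingWorker struct {
	mu    sync.Mutex
	count int
}

// Work increments the shared counter under the lock.
func (w *CountingWorker) Work(in interface{}, out chan<- interface{}) error {
	w.mu.Lock()
	w.count++
	w.mu.Unlock() // unlocked explicitly rather than with defer
	return nil
}

// Count returns the current count, read under the lock.
func (w *CountingWorker) Count() int {
	w.mu.Lock()
	n := w.count
	w.mu.Unlock()
	return n
}

// runConcurrently calls Work from n goroutines and returns the final count.
func runConcurrently(n int) int {
	w := &CountingWorker{}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			_ = w.Work(nil, nil)
			wg.Done()
		}()
	}
	wg.Wait()
	return w.Count()
}

func main() {
	fmt.Println(runConcurrently(1000)) // prints 1000
}
```

Without the mutex the increments would race and the final count could come up short; running with `go run -race` is a quick way to check for that.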
If your workers need to stop at a deadline, or you just need a timeout, use the SetTimeout or SetDeadline methods. (These must be set before the workers start working.)
// Setting a timeout of 2 seconds
timeoutRunner.SetTimeout(2 * time.Second)
// Setting a deadline of 4 hours from now
deadlineRunner.SetDeadline(time.Now().Add(4 * time.Hour))
func workerFunction(in interface{}, out chan<- interface{}) error {
	fmt.Println(in)
	time.Sleep(1 * time.Second)
	return nil
}
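To illustrate what a timeout means for work in progress, here is a sketch built on the standard context package. It is an analogy only: how go-workers implements SetTimeout and SetDeadline internally is not shown in this document, and processUntil and withTimeoutMillis are hypothetical helpers.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// processUntil handles items one at a time until ctx is done, and returns
// how many items were completed. Each item takes perItem to process.
func processUntil(ctx context.Context, items int, perItem time.Duration) int {
	done := 0
	for i := 0; i < items; i++ {
		select {
		case <-ctx.Done():
			return done // timeout or cancellation: stop early
		case <-time.After(perItem):
			done++
		}
	}
	return done
}

// withTimeoutMillis runs processUntil under a timeout given in milliseconds.
func withTimeoutMillis(timeoutMs, items, perItemMs int) int {
	ctx, cancel := context.WithTimeout(
		context.Background(), time.Duration(timeoutMs)*time.Millisecond)
	n := processUntil(ctx, items, time.Duration(perItemMs)*time.Millisecond)
	cancel() // release the timer held by the context
	return n
}

func main() {
	// 100 items at 20ms each cannot finish within a 50ms timeout.
	n := withTimeoutMillis(50, 100, 20)
	fmt.Println("processed before timeout:", n)
}
```

A deadline works the same way with context.WithDeadline and an absolute time.Time instead of a duration.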
If you want to write out to a file or just stdout, you can use SetWriterOut(writer io.Writer). The runner will then have the following methods available:
runner.Println()
runner.Printf()
runner.Print()
The workers use a buffered writer for output, which can be up to 3 times faster than the fmt package. Just be mindful that output won't reach the console as quickly as with an unbuffered writer. It will sync and eventually flush everything at the end, which makes it ideal for writing out to a file.
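The delay described above is a general property of buffered writers. The sketch below uses the standard library's bufio.Writer, not the library's own writer, to show that nothing reaches the destination until the buffer is flushed. bufferedDemo is a hypothetical helper.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
)

// bufferedDemo writes one line through a bufio.Writer and reports how many
// bytes had reached the destination before and after Flush.
func bufferedDemo() (before, after int) {
	var dst bytes.Buffer
	w := bufio.NewWriter(&dst)

	fmt.Fprintln(w, "Hello World") // held in the buffer, not yet in dst
	before = dst.Len()

	_ = w.Flush() // push the buffered bytes to dst
	after = dst.Len()
	return before, after
}

func main() {
	before, after := bufferedDemo()
	fmt.Println(before, after) // prints "0 12"
}
```

Batching many small writes into fewer large ones is where the speedup comes from, at the cost of output appearing later.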
If your application is based solely around using workers, consider raising the percentage at which the garbage collector triggers (e.g. GOGC=200). 200% to 300% is a good starting point. Make sure your machine has enough memory to support it. With a higher percentage, garbage collection interrupts the workers less often, so they get more work done. However, be aware of the rest of your application's needs when modifying this variable.
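Besides the GOGC environment variable, the same setting can be changed from inside the program with debug.SetGCPercent from the standard library. A minimal sketch follows; setGCPercent is a hypothetical wrapper, and 200 is just the starting value suggested above.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// setGCPercent applies a new garbage collection target percentage and
// returns the previous setting.
func setGCPercent(percent int) int {
	return debug.SetGCPercent(percent)
}

func main() {
	// Equivalent in effect to starting the program with GOGC=200.
	prev := setGCPercent(200)
	fmt.Println("previous GC percent:", prev)

	// Restore the earlier setting when the worker-heavy phase is over.
	setGCPercent(prev)
}
```

Setting it in code makes it possible to raise the percentage only while the workers are running and restore it afterwards.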
For workers that run quick bursts of lots of simple data, consider lowering GOMAXPROCS. Be careful though, this can affect your entire application's performance. Profile your application and benchmark it to see where it runs best.
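A rough way to compare settings is to time the same burst of small jobs under different GOMAXPROCS values. The sketch below does this with runtime.GOMAXPROCS and plain goroutines. It does not use go-workers, burst and currentProcs are hypothetical helpers, and a proper comparison would use `go test -bench` and the profiler rather than a single timing.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
	"time"
)

// currentProcs reports the current GOMAXPROCS value without changing it.
func currentProcs() int { return runtime.GOMAXPROCS(0) }

// burst runs tasks tiny jobs with GOMAXPROCS set to procs, restores the
// previous setting, and returns the number of jobs completed and time taken.
func burst(procs, tasks int) (int64, time.Duration) {
	prev := runtime.GOMAXPROCS(procs)
	var done int64
	var wg sync.WaitGroup

	start := time.Now()
	for i := 0; i < tasks; i++ {
		wg.Add(1)
		go func() {
			atomic.AddInt64(&done, 1) // stand-in for a very small unit of work
			wg.Done()
		}()
	}
	wg.Wait()
	elapsed := time.Since(start)

	runtime.GOMAXPROCS(prev) // restore the earlier setting
	return done, elapsed
}

func main() {
	for _, procs := range []int{1, 2, currentProcs()} {
		n, elapsed := burst(procs, 100000)
		fmt.Printf("GOMAXPROCS=%d: %d jobs in %v\n", procs, n, elapsed)
	}
}
```

The timings vary by machine and load, so treat the numbers as a hint about where to benchmark more carefully.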