From c5cbe6741d08e4134e65f591ae31737485298ccf Mon Sep 17 00:00:00 2001 From: Tiago Katcipis Date: Fri, 10 Nov 2017 18:55:59 -0200 Subject: [PATCH 1/3] Add first notes on rfork --- proposal/2-concurrency.md | 83 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 83 insertions(+) create mode 100644 proposal/2-concurrency.md diff --git a/proposal/2-concurrency.md b/proposal/2-concurrency.md new file mode 100644 index 00000000..f7a6cf9d --- /dev/null +++ b/proposal/2-concurrency.md @@ -0,0 +1,83 @@ +# Proposal: Concurrency on Nash + +There has been some discussion on how to provide concurrency to nash. +There is a [discussion here](https://github.com/NeowayLabs/nash/issues/224) +on how concurrency could be added as a set of built-in functions. + +As we progressed discussing it seemed desirable to have a concurrency +that enforced no sharing between concurrent functions. It eliminates +races and forces all communication to happen explicitly, and the +performance overhead would not be a problem to a high level language +as nash. + +Converging to a no shared state between concurrent functions initiated +the idea of using the current rfork built-in as a means to express +concurrency on Nash. This would already be possible today, the idea +is just to make it even easier, specially the communication between +different concurrent processes. + +This idea enables an even greater amount of isolation between concurrent +processes since rfork enables different namespaces isolation (besides memory), +but it has the obvious fallback of not being very lightweight. + +Since the idea of nash is to write simple scripts this does not seem +to be a problem. If it is on the future we can create lightweight concurrent +processes (green threads) that works orthogonally with rfork. + +The prototype for the new rfork would be something like this: + +```sh +chan <= rfork [ns_param1, ns_param2] (chan) { + //some code +} +``` + +The code on the rfork block does not have access to the +lexical outer scope but it receives as a parameter a channel +instance. + +This channel instance can be used by the forked processes and +by the creator of the process to communicate. We could use built-in functions: + +```sh +chan <= rfork [ns_param1, ns_param2] (chan) { + cwrite($chan, "hi") +} + +a <= cread($chan) +``` + +Or some syntactic extension: + +```sh +chan <= rfork [ns_param1, ns_param2] (chan) { + $chan <- "hi" +} + +a <= <-$chan +``` + +Since this channel is meant only to be used to communicate with +the created process, it will be closed when the process exit: + +```sh +chan <= rfork [ns_param1, ns_param2] (chan) { +} + +# returns empty string when channel is closed +<-$chan +``` + +Fan out and fan in should be pretty trivial: + +```sh +chan1 <= rfork [ns_param1, ns_param2] (chan) { +} + +chan2 <= rfork [ns_param1, ns_param2] (chan) { +} + +# waiting for both to finish +<-$chan1 +<-$chan2 +``` From 941b2c1beb262d84dfa158f2e312ca61973f132a Mon Sep 17 00:00:00 2001 From: Tiago Katcipis Date: Thu, 18 Jan 2018 01:10:58 -0200 Subject: [PATCH 2/3] Add initial draft of concurrency with actor model --- proposal/2-concurrency.md | 60 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 60 insertions(+) diff --git a/proposal/2-concurrency.md b/proposal/2-concurrency.md index f7a6cf9d..feca1bf8 100644 --- a/proposal/2-concurrency.md +++ b/proposal/2-concurrency.md @@ -10,6 +10,66 @@ races and forces all communication to happen explicitly, and the performance overhead would not be a problem to a high level language as nash. +## Lightweight Processes + +This idea is inspired on Erlang concurrency model. Since Nash does +not aspire to do everything that Erlang does (like distributed programming) +so this is not a copy, we just take some things as inspiration. + +Why call this a process ? On the [Erlang docs](http://erlang.org/doc/getting_started/conc_prog.html) +there is a interesting definition of process: + +``` +the term "process" is usually used when the threads of execution share no +data with each other and the term "thread" when they share data in some way. +Threads of execution in Erlang share no data, +that is why they are called processes +``` + +In this context the process word is used to mean a concurrent thread of +execution that does not share any data. The only means of communication +are through message passing. Since these processes are lightweight +creating a lot of them will be cheap (at least must cheaper than +OS processes). + +Instead of using channel instances in this model you send messages +to processes (actor model), it works pretty much like a networking +model using UDP datagrams. + +The idea is to leverage this as a syntactic construction of the language +to make it as explicit and easy as possible to use. + +This idea introduces 4 new concepts, 3 built-in functions and one +new keyword. + +The keyword **spawn** is used to spawn a function as a new process. +The function **send** is used to send messages to a process. +The function **receive** is used to receive messages from a process. +The function **self** returns the pid of the process calling it. + +An example of a simple ping/pong: + +``` +pid <= spawn fn () { + ping, senderpid <= receive() + echo $ping + send($senderpid, "pong") +}() + +send($pid, "ping", self()) +pong <= receive() + +echo $pong +``` + +TODO: + +* If send is never blocking, what if process queue gets too big ? just go on until memory exhausts ? +* What happens when you send to a invalid pid ? (or a pid of a process that is not running anymore) +* Example on how would fan-out/fan-in look with this idea + +## Extend rfork + Converging to a no shared state between concurrent functions initiated the idea of using the current rfork built-in as a means to express concurrency on Nash. This would already be possible today, the idea From ec7bdb4b40389bec81653d220d4c445fac430d33 Mon Sep 17 00:00:00 2001 From: Tiago Katcipis Date: Thu, 18 Jan 2018 16:38:29 -0200 Subject: [PATCH 3/3] Add fan-out fan-in example --- proposal/2-concurrency.md | 57 ++++++++++++++++++++++++++++++++++++--- 1 file changed, 53 insertions(+), 4 deletions(-) diff --git a/proposal/2-concurrency.md b/proposal/2-concurrency.md index feca1bf8..f7708baa 100644 --- a/proposal/2-concurrency.md +++ b/proposal/2-concurrency.md @@ -62,11 +62,60 @@ pong <= receive() echo $pong ``` -TODO: +Spawned functions can also receive parameters (always deep copies): + +``` +pid <= spawn fn (answerpid) { + send($answerpid, "pong") +}(self()) + +pong <= receive() +echo $pong +``` + +A simple fan-out/fan-in implementation: + +``` +jobs = ("1" "2" "3" "4" "5") + +for job in $jobs { + spawn fn (job, answerpid) { + import io + + io_println("job[%s] done", $job) + send($answerpid, format("result [%s]", $job)) + }($job, self()) +} + +for job in $jobs { + result <= receive() + echo $result +} +``` + +### Error Handling + +Error handling on this concurrency model is very similar to +how we do it on a distributed system. If a remote service fails and +just dies and you are using UDP you will never be informed of it, +the behavior will be to timeout the request and try again (possibly +to another service instance through a load balancer). + + +### TODO + +Spawned functions should have access to imported modules ? +(seems like no, but some usages of this may seem odd) + +If send is never blocking, what if process queue gets too big ? +just go on until memory exhausts ? + +Not sure if passing parameters in spawn will not make things +inconsistent with function calls + +What happens when you send to a invalid pid ? +(or a pid of a process that is not running anymore). -* If send is never blocking, what if process queue gets too big ? just go on until memory exhausts ? -* What happens when you send to a invalid pid ? (or a pid of a process that is not running anymore) -* Example on how would fan-out/fan-in look with this idea ## Extend rfork