WIP: Concurrency design #247

Merged — 8 commits merged into master on May 4, 2018
Conversation

katcipis (Member)

No description provided.

codecov-io commented Nov 10, 2017

Codecov Report

❗ No coverage uploaded for pull request base (master@2d426d7).
The diff coverage is n/a.

@@            Coverage Diff            @@
##             master     #247   +/-   ##
=========================================
  Coverage          ?   56.12%
=========================================
  Files             ?       26
  Lines             ?     4269
  Branches          ?        0
=========================================
  Hits              ?     2396
  Misses            ?     1646
  Partials          ?      227

Last update 2d426d7...ddb99b0.

<-$chan
```

Fan out and fan in should be pretty trivial:
Collaborator:

What about the blocking/nonblocking types of channel?

If only blocking channels are supported, then some builtin function to 'select' between them will be required. What do you think?

Member Author:

Select would be great =D. Nothing against buffered channels either... I'm just not thinking about them right now. Actually I'm assuming you are talking about buffered channels: nonblocking channels do not exist in Go. When a buffered channel is full it also blocks; the only way to guarantee that you will never block is select + default.
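
A minimal Go sketch of the point above (plain Go, nothing nash-specific; the channel and messages are just illustrative): a buffered channel blocks once its buffer is full, and select + default is the only way to guarantee the sender never blocks.

```
package main

import "fmt"

func main() {
	// A buffered channel only avoids blocking while there is free space:
	// once the buffer is full, a plain send blocks just like an unbuffered one.
	ch := make(chan string, 1)
	ch <- "first" // fills the single buffer slot

	// select + default guarantees the sender never blocks.
	select {
	case ch <- "second":
		fmt.Println("sent")
	default:
		fmt.Println("buffer full, gave up instead of blocking")
	}
}
```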

Collaborator:

What about the blocking/nonblocking types of channel?

What I meant was blocking/non-blocking types of communication, sorry. Buffered channels and/or select are ways to achieve this in Go, but I don't know how much of Go's semantics makes sense to export to nash.

Member Author:

Oh, I see. Erlang has only message delivery, like network datagrams: sends are non-blocking. The problem with non-blocking sends is that they imply some sort of queue on the receiver (in Erlang it is the process mailbox), because I can just send 10000 trillion messages without blocking.

Go makes this queue explicit and will still block the sender... not sure which model is better right now.

i4ki (Collaborator) commented Jan 18, 2018

send($senderpid, "pong")
}()

send($pid, "ping", self())
Member:

I feel this is always going to be ..., self()). Do you agree? Maybe we can remove this third argument.

Member Author:

Actually I don't. This concurrency model is highly decoupled, and I'm not sure it is a good idea to assume that a child process will always talk to its parent. The parent could actually send the pid of another process that wants the answer, like starting a job source and N workers: the answers may go to the job source so it knows whether more jobs need to be sent.

But I share your feeling that this may make the parent->child relationship a little more verbose.
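
A rough Go analogy of the decoupled model described above, with channels standing in for pids (names are illustrative, this is not nash or its planned API): the workers reply to whatever destination they were handed, which does not have to be the parent.

```
package main

import "fmt"

// worker doubles each job and sends the result to whatever destination it
// was handed, which does not have to be the process that spawned it.
func worker(jobs <-chan int, replyTo chan<- int) {
	for j := range jobs {
		replyTo <- j * 2
	}
}

func main() {
	jobs := make(chan int)
	results := make(chan int) // the "job source" listens here, not the parent

	for i := 0; i < 3; i++ {
		go worker(jobs, results)
	}

	go func() {
		for j := 1; j <= 5; j++ {
			jobs <- j
		}
		close(jobs)
	}()

	// the job source collects the answers and could decide to send more jobs
	for i := 0; i < 5; i++ {
		fmt.Println(<-results)
	}
}
```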

Member Author:

Perhaps we could go to a more specific model that always imposes a parent/child relationship; it would make some things easier... but I was liking the idea of the spawned function having no pre-defined signature for its arguments (full flexibility), though I may be biased. @tiago4orion what do you think?

Collaborator:

Maybe name this builtin sendto and add send to the stdlib, like:

fn send(pid, data) {
    return sendto($pid, $data, self())
}

Member Author:

Good idea. But I'm rethinking the ability to send multiple args and receive them. The send part is ok, since there is no timeout for sending (it will not block). But receiving is a little more complicated. It seems useful to have a timeout when you run a receive, and you need to be notified whether the receive worked or the timeout expired, so receiving a dynamic number of args on the return makes it a little awkward to also receive the error (the return arity will always be the send arity + 1).

Perhaps if receive were a syntactic construct instead of a function and we had something similar to pattern matching, but only on arity, like (WARNING, heavily improvised syntax =P):

receive timeout {
    onearg <= {
        # code that uses onearg here
    }
    two1, two2 <= {
        # code that uses two args here
    }
    timeout <= {
        # code that handles timeout here
    }
}
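
On the implementation side, a receive with a timeout maps naturally onto Go's select with time.After. A minimal sketch, assuming each process gets its own mailbox channel (the receive helper and the message type are hypothetical):

```
package main

import (
	"fmt"
	"time"
)

// receive waits for one message on the mailbox or gives up after the given
// timeout, reporting ok=false on expiration. Hypothetical helper, only to
// show how the timeout branch maps onto Go's select.
func receive(mailbox <-chan interface{}, timeout time.Duration) (interface{}, bool) {
	select {
	case msg := <-mailbox:
		return msg, true
	case <-time.After(timeout):
		return nil, false
	}
}

func main() {
	mailbox := make(chan interface{}, 1)
	mailbox <- "ping"

	if msg, ok := receive(mailbox, time.Second); ok {
		fmt.Println("got:", msg)
	}
	if _, ok := receive(mailbox, 100*time.Millisecond); !ok {
		fmt.Println("timed out")
	}
}
```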

Member Author:

@tiago4orion @vitorarins What do you think about the syntactic receive?

If we don't go for some syntactic support I think we need to always send/receive only one value (which would not be that bad if we get maps soon).

Member:

I like the idea but the example is kind of confusing to me.
I would be more comfortable with something like:

receive timeout {
    fn(onearg) {
        # code that uses onearg here
    }
    fn(two1, two2) {
        # code that uses two args here
    }
    fn() {
        # code that handles timeout here
    }
}

Member Author:

For now neither makes me very happy... but if the idea is good we can search for something that looks nice =D

i4ki (Collaborator) commented Jan 19, 2018

In this context the process word is used to mean a concurrent thread of
execution that does not share any data. The only means of communication
are through message passing. Since these processes are lightweight
creating a lot of them will be cheap (at least much cheaper than
Collaborator:

That really depends on the implementation (I think).
For example, take the code below:

import io
import fmt
import hugelibrary
import anotherhugelibrary

fn worker() {
    # uses io, fmt and hugelibrary
}

spawn worker()

In the code above, should the entire environment of the parent interpreter be copied to every lightweight process? Then maybe it would not be lightweight anymore and we could have a big performance penalty at each process spawn. Note that the worker function doesn't use anotherhugelibrary, but it gets copied anyway.

But if the parent environment is not copied, having each process import its own libraries could be painful:

fn worker1() {
    import io
    import fmt
    import hugelibrary
}

fn worker2() {
    import io
    import fmt
    import anotherhugelibrary
    # some code
}

spawn worker1()
spawn worker2()

This way the processes are really lightweight and could bootstrap really fast, but they need to be self-contained and import everything they need every time.

I don't know what's better. Maybe a mixed approach? Copy the stdlib to every process but leave other libraries to explicit imports? I don't know.
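
A tiny Go sketch of the trade-off being discussed, using a hypothetical Env type rather than the real interpreter structures: sharing the parent's module map is cheap but shares module state between processes, while re-initializing the modules for each child isolates them at a cost proportional to the imports.

```
package main

import "fmt"

// Env is a hypothetical interpreter environment, only to illustrate the
// difference between sharing the parent's module map and re-initializing it.
type Env struct {
	modules map[string]string // module name -> module state
}

// spawnShared reuses the parent's modules: cheap, but module state is shared.
func spawnShared(parent *Env) *Env {
	return &Env{modules: parent.modules}
}

// spawnIsolated re-initializes every imported module for the child:
// isolated, but the cost grows with the number and size of the imports.
func spawnIsolated(parent *Env) *Env {
	child := &Env{modules: make(map[string]string, len(parent.modules))}
	for name := range parent.modules {
		child.modules[name] = "freshly initialized"
	}
	return child
}

func main() {
	parent := &Env{modules: map[string]string{"io": "loaded", "fmt": "loaded"}}

	shared := spawnShared(parent)
	shared.modules["io"] = "mutated by child" // the parent sees this too

	isolated := spawnIsolated(parent)
	isolated.modules["fmt"] = "mutated by child" // the parent is unaffected

	fmt.Println(parent.modules["io"])  // "mutated by child"
	fmt.Println(parent.modules["fmt"]) // "loaded"
}
```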

Member Author:

I'm not sure either; I thought about that after I wrote this. My problem is not even with huge dependencies in the sense that they would make things not lightweight, my problem is the lack of isolation and the sharing of state. If the same imported module is shared then the module state is also shared, and that is a violation of the idea. I'm in doubt between two behaviors:

1 - Automatically reload the modules of the parent, but as freshly loaded modules (all module initialization is executed again). Perhaps that is already what you have in mind.

2 - Don't load anything, import again.

Perhaps in a lot of cases the code executed concurrently will not use the dependencies; it will depend a lot on the use case. az cli use cases will be a mess with any model, since the login action affects the user's home directory... but this is pretty much bad coding from Microsoft... as usual. In this case the only thing that guarantees isolation is the rfork approach (we are going to have both anyway, since they have pretty different use cases).

Member Author:

Not a huge fan of the hybrid =P

the behavior will be to timeout the request and try again (possibly
to another service instance through a load balancer).

To implement this idea we can add a timeout to the receive an add
Collaborator:

typo: 'and add'


Instead of using channel instances in this model you send messages
to processes (actor model), it works pretty much like a networking
model using UDP datagrams.
Collaborator:

Something must be written about the queuing of data. Should every process have a queue of incoming data? What happens if data is sent to some process but it never reads it (never invokes receive)? Is it buffered? Discarded? And so on. I think it should be buffered and we must document a maximum buffer size.
What do you guys think?

Collaborator:

Sorry, now I saw the TODO section :-)

Member Author:

I don't think that writing code that depends on queues being infinite is a good idea, but when we were talking about implementing more high-level languages the idea of a natural number that gets as big as it can get did not sound like a problem =P. To be fair, though, exhausting memory through the queue will indeed be easier.

For me, either it is always infinite (as in Erlang) and you should never just send trillions of messages without waiting for some answer, or on spawn we can pass the queue size. An internal fixed magic value does not seem like a good choice to me x_x.
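
A sketch of the "pass the queue size on spawn" option in Go, using a buffered channel as the mailbox (spawn and trySend are hypothetical helpers, not nash builtins): the sender never blocks and simply learns that the queue is full.

```
package main

import "fmt"

// spawn starts a process whose mailbox size is chosen by the caller, as in
// the "pass the queue size on spawn" option. Hypothetical sketch, not nash.
func spawn(queueSize int, body func(mb <-chan string)) chan<- string {
	mailbox := make(chan string, queueSize)
	go body(mailbox)
	return mailbox
}

// trySend never blocks: it reports false when the mailbox is full.
func trySend(mailbox chan<- string, msg string) bool {
	select {
	case mailbox <- msg:
		return true
	default:
		return false
	}
}

func main() {
	// a slow process that never gets around to receiving
	mailbox := spawn(2, func(mb <-chan string) {
		select {}
	})

	for i := 0; i < 5; i++ {
		fmt.Println("sent:", trySend(mailbox, "job")) // true, true, then false
	}
}
```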

error will always involve a pid that has no owner (the process never
existed or already exited).

We could add a more specific error message if we decide that
Collaborator:

👍


### TODO

Spawned functions should have access to imported modules ?
Collaborator:

We need to think carefully about this; there are too many considerations. What about Erlang?

Member Author:

Unable to find anything:

http://erlang.org/doc/reference_manual/processes.html

There are a lot of interesting concepts like linking and successful exit vs. errored exit... but they are useful for building more robust/complex distributed systems where supervisors restart failed processes. For now I think we can live without them =D. Perhaps just providing a way to check if a process terminated would be interesting.

(seems like no, but some usages of this may seem odd)

If send is never blocking, what if process queue gets too big ?
just go on until memory exhausts ?
Collaborator:

I don't think so. I like the idea of limits in the runtime; maybe the user could change them with builtin functions or env vars?

Member Author:

In networking you usually get to know that some other process is unable to answer you because the answer never gets received. This can happen for a lot of reasons (even packets being dropped because the queue is too big). So it would be a severe error to just send a lot of messages without ever waiting for some kind of response (it seems odd to me, but perhaps there is some use case). I think that is why Erlang has no limit on the process mailbox. But I'm not against having a queue size either; I'm just not as against infinite mailboxes as I used to be =P. For them to generate problems you must already be doing something wrong (exhausting memory through messages requires really big messages and a lot of them).

The only thing I'm certain of is that writing code that depends on infinite queue sizes is really a bad idea; perhaps we will help people avoid idiotic problems.

In this case send will not return a boolean "ok" anymore, since it is important to differentiate between a process whose queue is full and a process that is dead. Perhaps this is the kind of complexity that Erlang avoided with unlimited mailboxes.

Member Author:

Thinking about datagram networking with UDP, the only error handling that exists is the ICMP packet indicating that there is no program listening on that port... all other errors are detected by never receiving an answer (if you care for one).

Perhaps we can still stick with the boolean and use a queue size as a parameter... like OSes do?

Collaborator:

Well, we can also have infinite queues, but then we'll need some API for queue monitoring. If memory usage is too high, I want an easy way to know how much data is pending to be processed, for debugging slow processes.
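
If the mailbox ends up backed by a Go channel, the monitoring hook could be as simple as exposing len() on it. A minimal sketch with illustrative names, not an existing API:

```
package main

import "fmt"

// pending reports how many messages are sitting unread in a mailbox; with a
// Go channel as the mailbox this is just len(). A sketch of the monitoring
// hook discussed above, not an existing nash API.
func pending(mailbox chan string) int {
	return len(mailbox)
}

func main() {
	mailbox := make(chan string, 100)
	mailbox <- "a"
	mailbox <- "b"
	fmt.Println("messages waiting:", pending(mailbox)) // 2
}
```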

inconsistent with function calls

What happens when something is written on the stdout of a spawned
process ? redirect to parent shell ?
Collaborator:

Yes, by default I think this is the right behaviour: redirect to the parent's stdout.

i4ki (Collaborator) commented Jan 24, 2018

@katcipis katcipis merged commit 4a37709 into master May 4, 2018
@katcipis katcipis deleted the addConcurrencyDesign branch May 4, 2018 03:55