local goto #101

Closed
StefanKarpinski opened this issue Jul 8, 2011 · 14 comments
Labels
speculative Whether the change will be implemented is speculative

Comments

@StefanKarpinski
Member

Basically a jmp instruction within a function, much like C's local goto. Requires two pieces of syntax:

  1. labels
  2. goto statements

One issue to keep in mind is that goto might get used anywhere return does, including this:

x = a < b ? -1 :
    a > b ? +1 : goto equal

Also: is there any way we can hack this in with minimal changes using the existing macro syntax?

@ghost ghost assigned JeffBezanson Jul 8, 2011
@StefanKarpinski
Member Author

See http://kernel.org/doc/Documentation/CodingStyle, Chapter 7, for the rationale behind this kind of goto. Local gotos are also, in my experience, the only sane way to handle interactive input with error handling where you might need to unwind the input back to various input stages.

@Borgvall

A goto in an assignment!? What is supposed to happen if x is used afterwards?

The Linux CodingStyle Chapter 7 has a very specialized use case for goto ("centralized exiting of functions"). It might be good to have a language feature rather than relying on convention. Maybe a "prereturn" keyword:

function foobar(...)
    # allocate memory
    ...
    # do something
    ...
    if (isGood(foo)) then return foo else return bar

prereturn
    # clean up my mess
    ...
end

Note that this is only needed because of the "stop the function execution" characteristic of return. For most other use cases a "local function/method" would be a less clumsy way. Proper error handling should be done by some sort of "try/catch". In short, gotos, even local ones, probably generate more problems than they solve.
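
The "clean up on every exit path" idea can be sketched with Julia's `try`/`finally`, which runs the cleanup block even when the body returns early. This is only an illustration: the function `foobar_with_cleanup` and the `log` vector are hypothetical names standing in for the allocation and cleanup steps above.

```julia
# Hypothetical sketch: `log` records the order in which the steps run.
function foobar_with_cleanup(log::Vector{String}, good::Bool)
    push!(log, "allocate")          # stands in for "allocate memory"
    try
        push!(log, "work")          # stands in for "do something"
        return good ? "foo" : "bar" # early return from inside the body
    finally
        push!(log, "cleanup")       # runs on every exit path, including return
    end
end
```

The `finally` block plays the role of the proposed "prereturn" section without needing a new keyword.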

@StefanKarpinski
Member Author

If x were used after that, I would say the same thing would happen as if x simply hadn't been assigned: you get an "x not defined" error. The kernel-style single exit point is only one particular use case for a local goto; there are others. Handling interactive input with error handling with the option of unwinding multiple inputs, for example:

preA:
  A = get_input_A()
  if bad(A); goto preA; end
preB:
  B = get_input_B()
  if bad(B); goto preB; end
  if bad(A,B); goto preA; end
preC:
  C = get_input_C()
  if bad(C); goto preC; end
  if bad(A,B,C); goto preA; end

Something like that. This isn't a contrived hypothetical situation; I've written code like this many times. Doing it with if/then is a complete nightmare.

I'm also not sure what problems goto causes when confined to a single method body. Sure, you can write insane, incomprehensible code with goto, but you can do that without goto just as easily. You can also use goto in a variety of perfectly reasonable ways. Moreover it's very efficient and easy to implement, so why not have it for those cases where it's appropriate?
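
For reference, the unwinding pattern above can be sketched with the `@label` and `@goto` macros that Julia's Base provides. Everything else here is an assumption made for illustration: the queue-backed input stands in for `get_input_A()` and friends, and the three `bad` methods are made-up validity checks.

```julia
# Hypothetical validity checks standing in for bad(A), bad(A,B), bad(A,B,C).
bad(a) = a < 0
bad(a, b) = a > b
bad(a, b, c) = a + b > c

# Each "input stage" pops the next value from `queue` instead of prompting a user.
function read_inputs(queue::Vector{Int})
    @label preA
    A = popfirst!(queue)
    if bad(A)
        @goto preA          # retry the first stage
    end

    @label preB
    B = popfirst!(queue)
    if bad(B)
        @goto preB          # retry only the second stage
    end
    if bad(A, B)
        @goto preA          # unwind to the first stage
    end

    @label preC
    C = popfirst!(queue)
    if bad(C)
        @goto preC
    end
    if bad(A, B, C)
        @goto preA
    end

    return (A, B, C)
end
```

With these stand-in checks, `read_inputs([-1, 5, -2, 3, 4, 6, 20])` rejects `-1` and `-2`, unwinds once because `5 > 3`, and returns `(4, 6, 20)`.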

@Borgvall

That is a typical nested "do{A=getA()}while(bad(A))" loop:

do
{
  do
  {
    do{A=getA()}while(bad(A))
    do{B=getB()}while(bad(B))
  }while(bad(A,B))
  do{C=getC()}while(bad(C))
}while(bad(A,B,C))

However it is not my intention to discuss more examples of that kind. I know one can express any control structure with gotos. In a well-designed language, goto should rarely be needed or useful. For most sane things such a language contains a "matching" control structure (actually Julia lacks do-while loops).

Update: misses -> lacks; code indentation;
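
Since Julia has no do-while, one way to sketch the nested loops above is `while true` with the loop test moved to the end as a `break`. The function name `read_pair`, the queue-backed input, and the concrete checks (`< 0` for a bad single value, `A > B` for a bad pair) are assumptions for illustration only.

```julia
# Hypothetical sketch of do { ... } while (cond) using while true + break.
function read_pair(queue::Vector{Int})
    local A, B                      # declared here so they outlive the loops
    while true
        while true
            A = popfirst!(queue)    # stands in for getA()
            A < 0 || break          # repeat while the value is bad
        end
        while true
            B = popfirst!(queue)    # stands in for getB()
            B < 0 || break
        end
        A > B || break              # repeat both stages while the pair is bad
    end
    return (A, B)
end
```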

@vtjnash
Member

vtjnash commented Nov 7, 2012

I've always thought that return makes a nice non-local control construct, with the added benefit that functions stay shorter and more composable. (although the net length of code might be a bit longer)

Nested loops on the stack:

checked_getA() = while true; A = getA(); if !(badA(A)) return A; end; end
checked_getB() = while true; B = getB(); if !(badB(B)) return B; end; end

function checked_getAB()
    while true
        A = checked_getA()
        B = checked_getB()
        if !(badAB(A,B)); return A,B; end
    end
end

Function cleanup (in most cases this could probably be done with a regular function instead of an anonymous one; to some extent, this is like implementing Python's function decorators):

function foobar(...)
    # allocate memory
    ...
    f = function ()
        # do something
        ...
        if (isGood(foo)) then return foo else return bar
    end
    f()
    # clean up my mess
    ...
end

@pygy
Contributor

pygy commented May 8, 2013

Local gotos also make it possible to cleanly emulate tail recursion.
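
As a rough illustration of that point, a tail call can be replaced by rebinding the arguments and jumping back to the top of the function. The sketch below uses the `@label`/`@goto` macros from Julia's Base; the functions `gcd_rec` and `gcd_goto` are made-up examples, not anything proposed in this thread.

```julia
# Tail-recursive reference version: the recursive call is the last thing done.
gcd_rec(a::Int, b::Int) = b == 0 ? a : gcd_rec(b, a % b)

# Same computation with the tail call turned into "rebind arguments, jump to top".
function gcd_goto(a::Int, b::Int)
    @label top
    if b != 0
        a, b = b, a % b     # what the recursive call would have received
        @goto top
    end
    return a
end
```

The goto version uses constant stack space regardless of how many steps the computation takes.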

@dcjones
Contributor

dcjones commented Feb 5, 2014

I'm revisiting a project I had explored earlier to write a Julia backend for the Ragel parser generator, and am finding myself more or less blocked by the lack of goto in Julia. The really fast parsers that Ragel generates assume the host language has gotos, and trying to emulate them with functions and ifs is going to be ugly and slow.

Is there any chance of gotos in 0.3?

Alternately, if llvmcall gets merged, could that conceivably be used to manually emit br instructions to mimic gotos? That would be good enough for me, since this is generated code that need not necessarily be readable.
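
To make the request concrete, here is a hand-written sketch of the goto-driven style such generated parsers rely on, using the `@label`/`@goto` macros from Julia's Base. It is not Ragel output: the function name `matches_a_plus_b`, the state labels, and the toy language (one or more `a` followed by a single `b`) are assumptions chosen for illustration; `p` and `pe` only mimic Ragel's naming for the position and end index.

```julia
# Hypothetical recognizer for the regular language a+b, one label per state.
function matches_a_plus_b(input::AbstractString)
    data = codeunits(input)
    p = 1                   # current position
    pe = length(data)       # last valid position

    @label st_start         # expect the first 'a'
    if p > pe || data[p] != UInt8('a')
        return false
    end
    p += 1

    @label st_as            # seen at least one 'a'
    if p > pe
        return false
    end
    if data[p] == UInt8('a')
        p += 1
        @goto st_as         # stay in this state
    end
    if data[p] == UInt8('b')
        p += 1
        @goto st_final
    end
    return false

    @label st_final         # accept only if the input is exhausted
    return p > pe
end
```

Each state transition is a single jump, with no dispatch table or per-state function call in between.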

@Keno
Member

Keno commented Feb 5, 2014

llvmcall wouldn't allow non-local gotos (where "local" in this case means within one llvmcall block).

@StefanKarpinski
Member Author

I tried just writing macros to emit LabelNode and GotoNode expressions, but that doesn't quite work – the goto part actually does, it's the label that seems to get stripped out in the lowering process.

@dcjones
Contributor

dcjones commented Feb 6, 2014

I didn't know about LabelNode and GotoNode. After fumbling around I found that the label doesn't get stripped, it's just decremented, so this totally works:

macro label(n)
    LabelNode(eval(n))
end

macro goto(n)
    GotoNode(eval(n) - 1)
end

function infinite_loop()
    @label 1
    @goto 1
    return
end

infinite_loop()

That's totally satisfactory for my purposes.

@vtjnash
Member

vtjnash commented Feb 6, 2014

There shouldn't be an eval in the macro. But aside from that, I think this will cause errors in the function lowering code, which does not expect a label 1 to already exist in the code.

@dcjones
Contributor

dcjones commented Feb 6, 2014

Ok, I see. So proper goto support does need to be baked in. Yet, if I were feeling dangerous I could just assign my labels backwards from some very large number and they'd be unlikely to collide with julia's which count upwards from 0.

@jakebolewski
Member

I would love to see the function that made that plan even remotely dangerous...

@ivarne
Member

ivarne commented Feb 6, 2014

@jakebolewski I think it looks like this:

MemoryError()

As a side note I actually like the @label and @goto syntax. It does not complicate the parser, it is a bit weird to discourage usage, and it feels somewhat in line with @inbounds.

Could we use a hash function to enable the use of symbols as labels?
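
For comparison, the `@label` and `@goto` macros in current Julia's Base take a symbol name rather than an integer, so no manual numbering is needed. The sketch below is a bounded variant of the `infinite_loop` example above; the function name `count_to` is made up for illustration.

```julia
# Hypothetical bounded loop written with a symbolic label.
function count_to(n::Int)
    i = 0
    @label again
    if i < n
        i += 1
        @goto again     # jump back to the symbolic label
    end
    return i
end
```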

dcjones added a commit to dcjones/julia that referenced this issue Feb 6, 2014
dcjones added a commit to dcjones/julia that referenced this issue Feb 6, 2014
dcjones added a commit to dcjones/julia that referenced this issue Feb 26, 2014
dcjones added a commit to dcjones/julia that referenced this issue Feb 26, 2014
dcjones added a commit to dcjones/julia that referenced this issue Mar 30, 2014
JeffBezanson added a commit that referenced this issue Jun 18, 2014
Add labels and gotos. (Fixes #101)
mauro3 pushed a commit to mauro3/julia that referenced this issue Jun 19, 2014
StefanKarpinski pushed a commit that referenced this issue Feb 8, 2018
StefanKarpinski pushed a commit that referenced this issue Feb 8, 2018
Fixes a problem in julia 0.7

Also a minor fix to keep the julia name in uuid_to_name
tanmaykm added a commit to tanmaykm/julia that referenced this issue Aug 13, 2019
Occasionally, when adding a large number of workers, particularly when worker-worker connections are not lazy, it is possible to encounter the following error:

```
ERROR (unhandled task failure): MethodError: no method matching manage(::Base.Distributed.DefaultClusterManager, ::Int64, ::WorkerConfig, ::Symbol)
Closest candidates are:
  manage(!Matched::Base.Distributed.SSHManager, ::Integer, ::WorkerConfig, ::Symbol) at distributed/managers.jl:224
  manage(!Matched::Base.Distributed.LocalManager, ::Integer, ::WorkerConfig, ::Symbol) at distributed/managers.jl:337
  manage(!Matched::Union{ClusterManagers.PBSManager, ClusterManagers.QRSHManager, ClusterManagers.SGEManager}, ::Int64, ::WorkerConfig, ::Symbol) at /home/jrun/.julia/v0.6/ClusterManagers/src/qsub.jl:115
  ...
Stacktrace:
 [1] deregister_worker(::Base.Distributed.ProcessGroup, ::Int64) at ./distributed/cluster.jl:903
 [2] message_handler_loop(::TCPSocket, ::TCPSocket, ::Bool) at ./distributed/process_messages.jl:220
 [3] process_tcp_streams(::TCPSocket, ::TCPSocket, ::Bool) at ./distributed/process_messages.jl:118
 [4] (::Base.Distributed.##101#102{TCPSocket,TCPSocket,Bool})() at ./event.jl:73
```

It can be simulated with this exact sequence of events:
- worker2 in process of connecting to master
    - master has received worker2's listen port, connected to it, sent the JoinPGRP message to it
    - master is now aware of worker2, and has added it to its list of workers
    - worker2 has still not processed the JoinPGRP message, so it is still unaware of its worker id
- worker3 now connects to master
    - master sends the JoinPGRP message along with list of existing workers that includes worker2
- worker3 connects to worker2
- worker2 receives a new connection from worker3 and attempts to process it
- worker3 faces an error and exits, thus breaking the connection
- worker2 gets an error processing message from worker3
    - goes into error handling
    - the current error handling code sees the self pid as 1 and incorrectly thinks it is the master
    - attempts to process the worker disconnection as a master and gets the error we see

The MethodError prevents proper cleanup at the worker where it happens. To me the issue seems to be that it is not correct to identify whether a Julia process is master or worker by looking at the process id. Instead we should have a dedicated indicator for that. This fix adds a new local process type variable that is set to `:master` by default, but is set to `:worker` when `start_worker` is invoked. This allows a process to know that it is running as a worker irrespective of whether it has received a process id or not.
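
The idea of a dedicated role indicator can be sketched as follows. This is not the actual Distributed code: the `ProcessState` type, its fields, and the helpers `start_worker!` and `is_master` are hypothetical names used only to show why checking the role is more robust than checking whether the process id is 1.

```julia
# Hypothetical sketch: the role is tracked separately from the process id.
mutable struct ProcessState
    role::Symbol    # :master by default, :worker once started as a worker
    id::Int         # stays 1 until the worker has processed its join message
end

ProcessState() = ProcessState(:master, 1)

# Stand-in for what invoking `start_worker` would do to the role indicator.
start_worker!(s::ProcessState) = (s.role = :worker; s)

# Role check that does not depend on the id having been assigned yet.
is_master(s::ProcessState) = s.role === :master
```

A process that has called `start_worker!` but not yet received its id still has `id == 1`, yet `is_master` correctly reports `false`.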
tanmaykm added a commit to tanmaykm/julia that referenced this issue Aug 14, 2019
JeffBezanson pushed a commit that referenced this issue Aug 15, 2019
cmcaine pushed a commit to cmcaine/julia that referenced this issue Sep 24, 2020
LilithHafner pushed a commit to LilithHafner/julia that referenced this issue Oct 11, 2021
Removed checknan argument from scalarstats.rst
Keno pushed a commit that referenced this issue Oct 9, 2023
Keno pushed a commit that referenced this issue Oct 9, 2023
Keno pushed a commit that referenced this issue Jun 5, 2024
9 participants