Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ORC calls destructor hook twice on Linux (arm64) #24578

Open
elcritch opened this issue Dec 28, 2024 · 1 comment
Open

ORC calls destructor hook twice on Linux (arm64) #24578

elcritch opened this issue Dec 28, 2024 · 1 comment

Comments

@elcritch
Copy link
Contributor

elcritch commented Dec 28, 2024

Description

This code sometimes calls the destructor on the AgentObj twice, but only sporadically. Not really sure if this is a bug in ORC, or just programmer error on my part since I'm doing somewhat hacky stuff. The issue appears to occur when ORC runs the destructor a second time when a temporary copy of the agent is created.

Compiled using nim c --cc:clang --mm:orc -d:breakOrc -d:debug -r torc_badness.nim and run multiple times for i in seq 1 100; do "./torc_badness"; done. Usually 3-5 runs is sufficient.

## torc_badness.nim
import std/tables
import std/unittest

type WeakRef*[T] {.acyclic.} = object
  pt* {.cursor.}: T # cursor effectively is just a pointer, also happens w/ pointer type

template `[]`*[T](r: WeakRef[T]): lent T =
  ## using this in the destructor is fine because it's lent
  cast[T](r.pt)

proc toRef*[T: ref](obj: WeakRef[T]): T =
  ## using this in the destructor breaks ORC
  result = cast[T](obj)

type
  AgentObj = object of RootObj
    subscribers*: Table[int, WeakRef[Agent]] ## agents listening to me
    when defined(debug):
      freed*: bool
      moved*: bool

  Agent* = ref object of AgentObj
  # Agent* {.acyclic.} = ref object of AgentObj ## this also avoids the issue

proc `=wasMoved`(agent: var AgentObj) =
  echo "agent was moved"
  agent.moved = true

proc `=destroy`*(agentObj: AgentObj) =
  let xid: WeakRef[Agent] = WeakRef[Agent](pt: cast[Agent](addr agentObj)) ##\
    ## This is pretty hacky, but we need to get the address of the original
    ## Agent (ref object) since it's used to unsubscribe from other agents in the actual code,
    ## Luckily the agent address is the same as `addr agent` of the agent object here.
  echo "Destroying agent: ",
          " pt: ", cast[pointer](xid.pt).repr,
          " freed: ", agentObj.freed,
          " moved: ", agentObj.moved,
          " lstCnt: ", xid[].subscribers.len()
  when defined(debug):
    if agentObj.freed:
      raise newException(Defect, "already freed!")

  xid[].freed = true

  ## remove subscribers via their WeakRef's
  ## this is where we create a problem
  ## by using `toRef` which creates a *new* Agent reference
  ## which gets added to ORC as a potential cycle check (?)
  ## adding `{.acyclic.}` to 
  when defined(breakOrc):
    if xid.toRef().subscribers.len() > 0:
      echo "has subscribers"
  else:
    if xid[].subscribers.len() > 0:
      echo "has subscribers"

  `=destroy`(xid[].subscribers)
  echo "finished destroy: agent: ", " pt: ", cast[pointer](xid.pt).repr

type
  Counter* = ref object of Agent
    value: int

suite "threaded agent slots":
  test "sigil object thread runner":

    block:
      var b = Counter.new()
    
    GC_fullCollect()

Nim Version

Nim Compiler Version 2.0.14 [Linux: arm64]
Compiled at 2024-12-23
Copyright (c) 2006-2023 by Andreas Rumpf

git hash: bf4de6a394e040d9810cba8c69fb2829ff04dcc6
active boot switches: -d:release -d:danger

However it does not appear to happen on Mac:

Nim Compiler Version 2.0.14 [MacOSX: arm64]
Compiled at 2024-12-23
Copyright (c) 2006-2023 by Andreas Rumpf

git hash: bf4de6a394e040d9810cba8c69fb2829ff04dcc6
active boot switches: -d:release -d:danger

Current Output

➜  sigils git:(thread-lifetimes-alt1) ✗ for i in `seq 1 100`; do "/home/elcritch.linux/sigils/tests/torc_badness" ; done

[Suite] threaded agent slots
Destroying agent:  pt: 0000BB1E90C58770 freed: false moved: false lstCnt: 0
finished destroy: agent:  pt: 0000BB1E90C58770
  [OK] sigil object thread runner

[Suite] threaded agent slots
Destroying agent:  pt: 0000C7C275B5D770 freed: false moved: false lstCnt: 0
finished destroy: agent:  pt: 0000C7C275B5D770
Destroying agent:  pt: 0000C7C275B5D770 freed: true moved: false lstCnt: 0
/home/elcritch.linux/sigils/tests/torc_badness.nim(71) torc_badness
/home/elcritch.linux/.asdf/installs/nim/2.0.14/lib/system/orc.nim(458) GC_fullCollect
/home/elcritch.linux/.asdf/installs/nim/2.0.14/lib/system/orc.nim(396) collectCycles
/home/elcritch.linux/.asdf/installs/nim/2.0.14/lib/system/orc.nim(358) collectCyclesBacon
/home/elcritch.linux/.asdf/installs/nim/2.0.14/lib/system/orc.nim(97) free
/home/elcritch.linux/sigils/tests/torc_badness.nim(42) =destroy

    Unhandled exception: already freed! [Defect]
  [FAILED] sigil object thread runner


### Expected Output

```text
[Suite] threaded agent slots
Destroying agent:  pt: 0000BB1E90C58770 freed: false moved: false lstCnt: 0
finished destroy: agent:  pt: 0000BB1E90C58770
  [OK] sigil object thread runner
...


### Known Workarounds

_No response_

### Additional Information

_No response_
@elcritch
Copy link
Contributor Author

Also it still occurs randomly when using valgrind and -d:useMalloc on Linux.

Also valgrind complains about bad reads/writes from ORC procs even when it doesn't crash.

Hmmm, I wonder if this is possibly related to #24586 as xid.toRef() creates a temporary object.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants