Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

repr missing pointer when applied to seq/string with arc #16052

Closed
pietroppeter opened this issue Nov 19, 2020 · 11 comments
Closed

repr missing pointer when applied to seq/string with arc #16052

pietroppeter opened this issue Nov 19, 2020 · 11 comments

Comments

@pietroppeter
Copy link
Collaborator

system.repr (docs) when applied to a sequence and compiled with gc:arc does not output the pointer (it does with default gc).

Example

example.nim:

echo @[0].repr

Current Output

nim r example:

0000000000419AA0@[0]

nim r --gc:arc example:

@[0]

Expected Output

I would expect the same output when compiled with both gc options

Additional Information

$ nim -v
Nim Compiler Version 1.4.0 [Windows: amd64]
Compiled at 2020-10-16
Copyright (c) 2006-2020 by Andreas Rumpf

active boot switches: -d:release
@timotheecour
Copy link
Member

timotheecour commented Nov 19, 2020

@pietroppeter please send a PR if you can, it shouldn't be complicated.

see tests I had added in tests/stdlib/trepr.nim (https://github.com/nim-lang/Nim/pull/16034/files) noticing this, and that should be edited when this issue is fixed.
the point of repr is debugging, so those addresses are indeed needed.

the following can be done in followup work:

for repr in vm:

  • the address should also be shown for string/cstring, since it's also backed by a string in vm.

  • for seq in VM, a bit more controversial since it's a seq[PNode] and not a seq[T] so addr(a.sons[0]) != addr(a) (where a: PNode)

  • for repr in js at CT: same as VM

  • for repr in js at RT:
    in debug mode, an object id could be attached to each object for identification purposes; see related stackoverflow topics

@pietroppeter
Copy link
Collaborator Author

mmh, I would be happy to be able to submit a PR but I might need some hand holding, since I never touched the compiler (not even run koch or testament for which this could be a good opportunity to try).

If you or someone wants to help me, here is my analysis of the issue:

  • I see repr as defined in systems.nim is defined as magic. I do not know what that precisely entails, only that this will use some compiler magic.
  • I see in lib/system there are two files repr.nim and repr_v2. I am assuming repr.nim is actually responsible and that I should avoid touching repr_v2.
  • I guess the proc responsible is reprSequence...
  • ...and the issue should happen with reprPointer (which is a compilerproc whatever that means) and possibly in the lines that call c_sprintf (I guess that this would be a reference to C function sprintf, so I would deduce that this code is only used for c and cpp targets).
  • btw, it is also the first time I see pointer type, but I get an idea what that might be from https://forum.nim-lang.org/t/823

Now I ask myself: how could c_sprintf have different behaviour when gc:arc is set? I gotta say I do not have a real clue about that. Only I vaguely guess that the pointer should have been different from the start. And I have no clue where this pointer is produced.

@pietroppeter
Copy link
Collaborator Author

Also, for context: I ran into this while I was working on a fix for a bigints issue.

The minimal example I had there (playground):

import sugar

var
  a = @[0]

proc op(a: var seq[int], b: seq[int]) =
  dump a.repr
  dump b.repr
  a.setLen(2)
  dump a.repr
  dump b.repr

op(a, a)

outputs:

a.repr = 0x7ff35fdfd050@[0]

b.repr = 0x7ff35fdfd050@[0]

a.repr = 0x7ff35fdfd1a0@[0, 0]

b.repr = 0x7ff35fdfd050@[]

apart the fact that I am not sure if this is a bug or expected behaviour (the fact that b is empty), when running that example with arc I get

a.repr = @[0]
b.repr = @[0]
a.repr = @[0, 0]
b.repr = @[3169088]

which, apart for missing pointer in representation, is actually worse than the above (b having an arbitrary value inside).

@timotheecour
Copy link
Member

I see in lib/system there are two files repr.nim and repr_v2. I am assuming repr.nim is actually responsible and that I should avoid touching repr_v2.
I guess the proc responsible is reprSequence...

once I finish cleaning up #15827 you'll be able to use:

when true:
  import std/exectraces
  proc main()=
    var a = [10,11]
    var b = a[0].addr
    enableRuntimeTracing(true)
    let s = repr(b)
    enableRuntimeTracing(false)
  main()

nim r --exectrace:on --gc:arc main.nim
and then get a trace of function calls, pointing to this:

[ 0]  | main 0 /Users/timothee/git_clone/nim/timn/tests/nim/all/t11328.nim:12
[ 1]   > repr 1 /Users/timothee/git_clone/nim/Nim_prs/lib/system/repr_v2.nim:127
[ 2]     > nimErrorFlag 2 /Users/timothee/git_clone/nim/Nim_prs/lib/system/excpt.nim:416
[ 2]     | nimErrorFlag 2 /Users/timothee/git_clone/nim/Nim_prs/lib/system/excpt.nim:417
[ 1]   | repr 2 /Users/timothee/git_clone/nim/Nim_prs/lib/system/repr_v2.nim:128
[ 1]   | repr 2 /Users/timothee/git_clone/nim/Nim_prs/lib/system/repr_v2.nim:134
[ 2]     > nimIntToStr 3 /Users/timothee/git_clone/nim/Nim_prs/lib/system/strmantle.nim:69
[ 2]     | nimIntToStr 3 /Users/timothee/git_clone/nim/Nim_prs/lib/system/strmantle.nim:70
[ 3]       > rawNewString 4 /Users/timothee/git_clone/nim/Nim_prs/lib/system/strs_v2.nim:104
...

=> for gc:arc it's in lib/system/repr_v2.nim:127

will answer other questions later

@pietroppeter
Copy link
Collaborator Author

thanks, that --exectrace:on looks awesome! and of course my first assumption was wrong! 😄

looking at repr_v2 it is know much clearer where the difference is. the repr proc for seq container clearly is missing the pointer as prefix.

I was then curious on how does the repr of pointer looks like and I would say that for arc is also bad, so probably this should be fixed contextually:

let s = @[1]
echo s.repr
echo unsafeAddr(s[0]).repr
# default gc:

# 000000000091F050@[1]
#
# ptr 000000000091F060 --> 1

# arc:

# @[1]
# ptr 1

I guess for this the guilty part is this: https://github.com/nim-lang/Nim/blob/version-1-4/lib/system/repr_v2.nim#L127

Now the unclear part for me is how to produce the correct representation of the pointer. Should I use something as reprPointer in repr.nim (or exactly that? and why then there are two versions repr and repr_v2?).

@timotheecour
Copy link
Member

timotheecour commented Nov 20, 2020

why then there are two versions repr and repr_v2?

  • repr uses type info (PNimType)
  • repr_v2 doesn't, with the rationale (IIRC) that PNimType was hurting performance

but beyond that some parts should be factorizable to avoid code duplication.
Since this is your 2nd PR, I'd advise to do the minimal targetted change to achieve the goal at first, without doing much refactoring, but yes, reprPointer should be refactored, not duplicated; if needed maybe some system/repr_common.nim imported from both

@pietroppeter
Copy link
Collaborator Author

pietroppeter commented Nov 20, 2020

thanks again. yes, I will definitely follow the advice of targeting minimal change (you could very well say that this is my actual 1st PR to code in nim codebase, the other one was mostly to understand git workflow and was basically my first PR ever in github).

And actually, given the explanation on why the two version and giving a better look my impression is that repr_v2 looks like a better version of repr (modern proc signatures without the type in the name...) that I guess it might be able to supplant it at some point.
If this is indeed the case, it might not hurt to add two lines of comments at the beginning of repr_v2 explaining the rationale.

Also I noticed that there is in fact already a proc repr*(p: pointer): string that can be used for the fix (we might want to add also 0x as prefix). So I do not think there is a need for a repr_common.

All considered I think that, thanks to your support, I do not see at the moment other obstacles on my side for making the PR. Of course more questions might come up when actually writing it. I plan to have it ready at some point during the weekend.

@timotheecour
Copy link
Member

feel free to send early draft (mark it as draft) for early feedback

@Clyybber
Copy link
Contributor

Clyybber commented Nov 20, 2020

apart the fact that I am not sure if this is a bug or expected behaviour (the fact that b is empty), when running that example with arc I get ...

This behaviour is intended (not printing the pointer), but maybe it should be changed.

which, apart for missing pointer in representation, is actually worse than the above (b having an arbitrary value inside).

The reason for this is not related to repr, it's because arc is missing error messages for illegal aliasing/not introducing a copy to prevent it.

@Araq
Copy link
Member

Araq commented Nov 23, 2020

I took the liberty and re-implemented repr for ARC without caring too much about the repr's old ugliness. I don't remember the seq's address to be helpful, hence I didn't bother.

@timotheecour timotheecour changed the title repr missing pointer when applied to seq with arc repr missing pointer when applied to seq/string with arc Apr 17, 2021
@Araq Araq added the Won't Fix label Jul 13, 2021
@Araq
Copy link
Member

Araq commented Jul 13, 2021

The new repr is better than the old one, at least you can write unit tests when the address is left out.

@Araq Araq closed this as completed Jul 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants