Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bug] Compiler stuck on building x86-linux target #308

Closed
wants to merge 7 commits into from

Conversation

linxuanm
Copy link
Contributor

@linxuanm linxuanm commented Dec 4, 2024

Compiling the code in this PR halts the compiler. This was originally discovered when adding tests to the JsonParser in the standard library.

Reproduce: cd into test/lib and run TEST_TARGETS=x86-linux ./test.bash.

@k-sareen
Copy link
Contributor

k-sareen commented Dec 6, 2024

Okay so it's the register allocator going into an infinite loop here:

def recordLive(index: int) {
for (l = activeList; l != null; l = l.next) {
livemap[index, l.varNum] = true;
}
}

So what has happened is the activeList points to itself so it's just gotten into an infinite loop setting the same bit again and again in the bit matrix.

I confirmed this by adding:

if (l == l.next) {
  System.error("", "The current VReg points to itself!");
}

to the for loop above. I get:

!: The current VReg points to itself!
        in LinearScanRegAlloc.recordLive() [aeneas/src/mach/LinearScanRegAlloc.v3 @ 104:45]
        in LinearScanRegAlloc.assignRegs() [aeneas/src/mach/LinearScanRegAlloc.v3 @ 96:58]
        in X86CodeGen.genCode() [aeneas/src/x86/X86CodeGen.v3 @ 80:32]
        in X86Backend.genCodeFromSsa() [aeneas/src/x86/X86Backend.v3 @ 99:28]
        in MachBackend.genAllCode() [aeneas/src/mach/MachBackend.v3 @ 1343:39]
        in LinuxTarget.emit() [aeneas/src/os/Linux.v3 @ 128:35]
        in Compilation.emit() [aeneas/src/main/Compiler.v3 @ 268:45]
        in Compilation.compile() [aeneas/src/main/Compiler.v3 @ 184:32]
        in Aeneas.compile() [aeneas/src/main/Aeneas.v3 @ 81:56]
        in Aeneas.main() [aeneas/src/main/Aeneas.v3 @ 25:39]

P.S. @titzer Running

$ V3C_OPTS="-dwarf=true -symbols=true" aeneas bootstrap

gives me

Compiling (/home/$USER/git/virgil/bin/bootstrap/x86-linux/Aeneas -> /home/$USER/git/virgil/bin/current/x86-64-linux/Aeneas)...
!BoundsCheckException
        in SsaX86_64Gen.assemble() [aeneas/src/x86-64/SsaX86_64Gen.v3 @ 879:49]
        in SsaMachGen.assembleInstrs() [aeneas/src/mach/MachBackend.v3 @ 1069:33]
        in X86_64Backend.genCodeFromSsa() [aeneas/src/x86-64/X86_64Backend.v3 @ 145:39]
        in MachBackend.genAllCode() [aeneas/src/mach/MachBackend.v3 @ 1343:39]
        in LinuxTarget.emit() [aeneas/src/os/Linux.v3 @ 128:35]
        in Compilation.emit() [aeneas/src/main/Compiler.v3 @ 268:45]
        in Compilation.compile() [aeneas/src/main/Compiler.v3 @ 184:32]
        in Aeneas.compile() [aeneas/src/main/Aeneas.v3 @ 81:56]
        in Aeneas.main() [aeneas/src/main/Aeneas.v3 @ 25:39]

for the x86-64-linux target.

@k-sareen
Copy link
Contributor

k-sareen commented Dec 6, 2024

Print the SSA and the SSA uid of the failing register says it's the JsonValue.parse_array method with the allocation of the Vector<JsonValue>.new():

!The current VReg points to itself!: VReg from SSAInstr 70003
SSA dump of the parse_array method
Machine SSA for: JsonParser.parse_array [] -> [JsonValue]
        values @44680#91 @44682#-1 @44687##2690:JsonValue.Null @44695#44 @44702#93 @70002#12 @70004#256 @70006#&TextReader.req1[code] @70008#ં[data] @70009#&JsonParser.parse_value[code] @70011#&Vector.put[code] @70013#36 @70016#&TextReader.opt1[code] @70023#&Vector.extract[code] @70025#8 @70027#240 @70029#&JsonValue.JArray.new[code]
        param $0@44677: JsonParser [V_NON_ZERO]
        block #44678
                @70003:Alloc>: Vector (@70002=#12:int) [V_NON_ZERO] (uses: @70024 @70020 @70012 @44690 @44709 @44714 @70005)
                @70005:PtrStore, int>: void (@70003, @70004=#256:int) [O_NO_NULL_CHECK]
                @70007:CallAddress<$p32<&((TextReader, byte) -> int)>>: int (@70006=#&TextReader.req1[code]:$p32<&((TextReader, byte) -> int)>, $0@44677, @44680=#91:int) [O_NO_NULL_CHECK] (uses: @44683)
                @44683:IntEq: bool (@70007, @44682=#-1:int) [O_PURE O_FOLDABLE O_COMMUTATIVE] (uses: @44686)
                @44686:if     (@44683)
                         true  => #44684
                         false => #44685
        block #44684 #44686->#44684
                @44688:ret    (@70008=#ં[data]:JsonValue.Null)
        block #44685 #44686->#44685
                @70010:CallAddress<$p32<&(JsonParser -> JsonValue)>>: JsonValue (@70009=#&JsonParser.parse_value[code]:$p32<&(JsonParser -> JsonValue)>, $0@44677) [O_NO_NULL_CHECK] (uses: @70012 @44690)
                @70012:CallAddress<$p32<&((Vector, JsonValue) -> Vector)>>: Vector (@70011=#&Vector.put[code]:$p32<&((Vector, JsonValue) -> Vector)>, @70003, @70010) [V_NON_ZERO O_NO_NULL_CHECK]
                @70014:PtrAdd: JsonParser ($0@44677, @70013=#36:int) [O_PURE O_FOLDABLE] (uses: @70015)
                @70015:PtrLoad: bool (@70014) [O_NO_NULL_CHECK] (uses: @44694)
                @44694:if     (@70015)
                         true  => #44692
                         false => #44693
        block #44692 #44694->#44692 #44712->#44692
                @70017:CallAddress<$p32<&((TextReader, byte) -> int)>>: int (@70016=#&TextReader.opt1[code]:$p32<&((TextReader, byte) -> int)>, $0@44677, @44695=#44:int) [O_NO_NULL_CHECK] (uses: @44697)
                @44697:IntEq: bool (@70017, @44682=#-1:int) [O_PURE O_FOLDABLE O_COMMUTATIVE] (uses: @44700)
                @44700:if     (@44697)
                         true  => #44698
                         false => #44699
        block #44693 #44694->#44693
                @44701:ret    (@70008=#ં[data]:JsonValue.Null)
        block #44698 #44700->#44698
                @70018:CallAddress<$p32<&((TextReader, byte) -> int)>>: int (@70006=#&TextReader.req1[code]:$p32<&((TextReader, byte) -> int)>, $0@44677, @44702=#93:int) [O_NO_NULL_CHECK] (uses: @44704)
                @44704:IntEq: bool (@70018, @44682=#-1:int) [O_PURE O_FOLDABLE O_COMMUTATIVE] (uses: @44707)
                @44707:if     (@44704)
                         true  => #44705
                         false => #44706
        block #44699 #44700->#44699
                @70019:CallAddress<$p32<&(JsonParser -> JsonValue)>>: JsonValue (@70009=#&JsonParser.parse_value[code]:$p32<&(JsonParser -> JsonValue)>, $0@44677) [O_NO_NULL_CHECK] (uses: @70020 @44709)
                @70020:CallAddress<$p32<&((Vector, JsonValue) -> Vector)>>: Vector (@70011=#&Vector.put[code]:$p32<&((Vector, JsonValue) -> Vector)>, @70003, @70019) [V_NON_ZERO O_NO_NULL_CHECK]
                @70021:PtrAdd: JsonParser ($0@44677, @70013=#36:int) [O_PURE O_FOLDABLE] (uses: @70022)
                @70022:PtrLoad: bool (@70021) [O_NO_NULL_CHECK] (uses: @44712)
                @44712:if     (@70022)
                         true  => #44692
                         false => #44711
        block #44705 #44707->#44705
                @44713:ret    (@70008=#ં[data]:JsonValue.Null)
        block #44706 #44707->#44706
                @70024:CallAddress<$p32<&(Vector -> Array)>>: Array (@70023=#&Vector.extract[code]:$p32<&(Vector -> Array)>, @70003) [O_NO_NULL_CHECK] (uses: @70030 @44715)
                @70026:Alloc: JsonValue.JArray (@70025=#8:int)  (uses: @70030 @70028)
                @70028:PtrStore: void (@70026, @70027=#240:int) [O_NO_NULL_CHECK]
                @70030:CallAddress<$p32<&((JsonValue.JArray, Array) -> JsonValue.JArray)>>: JsonValue.JArray (@70029=#&JsonValue.JArray.new[code]:$p32<&((JsonValue.JArray, Array) -> JsonValue.JArray)>, @70026, @70024) [V_NON_ZERO] (uses: @44717)
                @44717:ret    (@70030)
        block #44711 #44712->#44711
                @44718:ret    (@70008=#ં[data]:JsonValue.Null)

Using rr I was able to figure out that the LinearScanRegAlloc.assignEnd method is creating the circular list here:

if (p != null) p.next = n;
if (n != null) n.prev = p;

What is happening is currently p == n and there is a circular linked list of size 2 already with p and vreg both pointing to each other. So when vreg is removed from the list, we get a circular linked list of size 1. Though I imagine a circular linked list of any kind should be wrong here, not just of size 1. It's just in our case we see a list of size 1.

EDIT: The circular linked list comes from SSAInstr @44677 from above:

!Have a circular linked list!: SSAInstr 44677 <-> 70003 linked list

@k-sareen
Copy link
Contributor

k-sareen commented Dec 6, 2024

Okay. So @titzer the high-level reason for the bug is that the live range of the VReg in SSAInstr @70003 is made active twice without the previous range being ended. That is to say it is nested like so: [ [ ] ] instead of [ ] [ ] where the brackets represent live ranges. This doesn't sound right for a linear scan register allocator, but I may be misunderstanding.

I'm pretty much at my limit of virgil internal debugging abilities. Hope this was useful. I would recommend creating an assertion library and adding assertions. This would have been very easy to detect if there was an assertion against creating a circular linked list.

EDIT: Did a bit more digging and exactly what is happening is that assignDef makes the VReg #7 for SSAInstr @70003 active here:

if ((uses[i] & gen.UNUSED_MASK) == 0) makeActive(vreg);

Later on, when the reg allocator actually processes SSAInstr @70003 itself, it results in creating the circular list like mentioned in the above comment.

The quick fix, of course, is to just break from the infinite for-loop like so: if (l == l.next) break; (and this does indeed work when I tested it), but that's not the real fix imo. You could have a circular linked list of size 2 and then this fix would break. The real issue is the overlapping makeActives/circular linked list creation I think.

@titzer
Copy link
Owner

titzer commented Dec 6, 2024

Hi @k-sareen I did a similar analysis and came to the conclusion that the test in question has a critical edge that the register allocator doesn't handle properly. One fix is to split all critical edges before codegen, but I don't like that, because the other targets and other allocators (other than the Wasm target, which needs to split critical edges for shadow stack spilling) don't need this. There is a branch https://github.com/titzer/virgil/tree/fix_lsra where I was tinkering with this.

The real fix is to dive deeper into why the allocator doesn't handle critical edges, particularly a critical edge that points back to a loop header.

@k-sareen
Copy link
Contributor

k-sareen commented Dec 6, 2024

There is a branch https://github.com/titzer/virgil/tree/fix_lsra where I was tinkering with this.

Oh I didn't realize that. Apologies for the walls of text then haha.

particularly a critical edge that points back to a loop header.

I'm not sure what do you mean here. There is no loop in this particular case right. Or do you mean you've been able to reproduce a failing case with a loop.

Also maybe it's the GC person in me talking, but surely one solution is to just avoid adding/processing something to the list that is already there? Maybe all we need is a bit in VReg to prevent this from happening.

@titzer
Copy link
Owner

titzer commented Dec 9, 2024

There is a branch https://github.com/titzer/virgil/tree/fix_lsra where I was tinkering with this.

Oh I didn't realize that. Apologies for the walls of text then haha.

particularly a critical edge that points back to a loop header.

I'm not sure what do you mean here. There is no loop in this particular case right. Or do you mean you've been able to reproduce a failing case with a loop.

Also maybe it's the GC person in me talking, but surely one solution is to just avoid adding/processing something to the list that is already there? Maybe all we need is a bit in VReg to prevent this from happening.

I'd like to understand why the variable in question has a nested live range like that and fix it in the allocator. One fix would be to have a counter instead of a bit, but I don't want to do that without understanding how the example gave rise to this. The examples I wrote in the branch where I was investigating ending up triggering a paranoid assertion, but I couldn't reproduce the infinite loop except in the original.

I'm planning to do a stable release, so I'd like to fix this before that release. Any help would be appreciated. (I was working in this branch: https://github.com/titzer/virgil/tree/fix_lsra)

@k-sareen
Copy link
Contributor

Right. I'll try and look into it. It'd be great if we could have a MWE instead -- I'll see if I can trigger this bug with a smaller test case.

@k-sareen
Copy link
Contributor

k-sareen commented Dec 10, 2024

It's quite odd. Printing the live interval range (using the in-built printer) doesn't seem to suggest 70003 has a nested live range.

EDIT: No actually. See LSRA pos 18 and 54 below:

18: start   #1
18: start   #0
18: use     use:@44677#0u36@edi
18: kill
18: live
[...]
54: start   #1
54: start   #0
54: end     #0
54: use     use:@44677#0u110@edi use:@44702#26u112@eax
54: kill
54: live

This is the nested live range. The ranges for 70003 (#1) and 44677 (#0) are started at LSRA pos 18 but never ended. Then 54 starts the same live range again.

These are the instructions:

18:    +                                     call     def:@70010#10u30@eax live:@70002#2u32 kill:u34@{all} use:@44677#0u36@edi
[...]
54:    + |           +                       call     def:@70018#25u104@eax live:@44686#7u106 kill:u108@{all} use:@44677#0u110@edi use:@44702#26u112@eax

These instructions correspond to SsaInstrs:

                @70010:CallAddress<$p32<&(JsonParser -> JsonValue)>>: JsonValue (@70009=#&JsonParser.parse_value[code]:$p32<&(JsonParser -> JsonValue)>, $0@44677) [O_NO_NULL_CHECK] (uses: @70012 @44690)
[...]
                @70018:CallAddress<$p32<&((TextReader, byte) -> int)>>: int (@70006=#&TextReader.req1[code]:$p32<&((TextReader, byte) -> int)>, $0@44677, @44702=#93:int) [O_NO_NULL_CHECK] (uses: @44704)

So correct me if I'm wrong, but what is happening then is that the LsraKill is not ending the live ranges for these variables?

Live interval range print
Processing broken block: 44678
liveness for block #44678 at bottom
   XX... ..... ..... ..... ..... ..... ..... ..... ..... ..... ..... ..... ..... .....
liveness for block #44678 at top
   X.... ..... ..... ..... ..... ..... ..... ..... ..... ..... ..... ..... ..... .....
1: def      def:@44677#0u0@edi
2: use      use:@70002#2u8@eax
2: kill
2: live
3: def      def:@70003#1u2@eax
4: use      use:@70003#1u10@{gpr}
6: use      use:@44677#0u18@edi use:@44680#5u20@eax
6: kill
6: live
7: def      def:@70007#4u12@eax
12: end     #4
12: use     use:@70007#4u26
16: end     #1
16: end     #0
16: use     use:@70008#9u28@eax
18: start   #1
18: start   #0
18: use     use:@44677#0u36@edi
18: kill
18: live
19: def     def:@70010#10u30@eax
20: end     #10
20: use     use:@70003#1u44@edi use:@70010#10u46@eax
20: kill
20: live
21: def     def:@70012#11u38@eax
24: use     use:@44677#0u54@{gpr}
25: def     def:@70015#13u52@{gpr}
26: end     #13
26: use     use:@70015#13u56
30: use     use:@44677#0u64@edi use:@44695#16u66@eax
30: kill
30: live
31: def     def:@70017#15u58@eax
36: end     #15
36: use     use:@70017#15u72
40: use     use:@44677#0u80@edi
40: kill
40: live
41: def     def:@70019#19u74@eax
42: end     #19
42: use     use:@70003#1u88@edi use:@70019#19u90@eax
42: kill
42: live
43: def     def:@70020#20u82@eax
46: use     use:@44677#0u98@{gpr}
47: def     def:@70022#22u96@{gpr}
48: end     #22
48: use     use:@70022#22u100
52: use     use:@70008#9u102@eax
54: start   #1
54: start   #0
54: end     #0
54: use     use:@44677#0u110@edi use:@44702#26u112@eax
54: kill
54: live
55: def     def:@70018#25u104@eax
60: end     #25
60: use     use:@70018#25u118
64: end     #1
64: use     use:@70008#9u120@eax
66: start   #1
66: end     #1
66: use     use:@70003#1u128@edi
66: kill
66: live
67: def     def:@70024#30u122@eax
68: use     use:@70025#32u136@eax
68: kill
68: live
69: def     def:@70026#31u130@eax
70: use     use:@70026#31u138@{gpr}
72: end     #30
72: end     #31
72: use     use:@70026#31u146@edi use:@70024#30u148@eax
72: kill
72: live
73: def     def:@70030#34u140@eax
74: end     #34
74: use     use:@70030#34u150@eax
76: use     use:@70008#9u152@eax
0:                                           entry    def:@44677#0u0@edi
1:     =
2:     |+                                    alloc    def:@70003#1u2@eax live:@44677#0u4 kill:u6@{all} use:@70002#2u8@eax
3:     | =
4:     | +                                   stored_i use:@70003#1u10@{gpr}
5:     | |
6:     + |+                                  call     def:@70007#4u12@eax live:@70003#1u14 kill:u16@{all} use:@44677#0u18@edi use:@44680#5u20@eax
7:     | | =
8:     | | |
9:     | | |
10:    | | |
11:    | | |
12:    | | +                                 cmp_i    use:@70007#4u26
13:    | |
14:    | |                                   br
15:    | |
16:         +                                ret      use:@70008#9u28@eax
17:
18:    +                                     call     def:@70010#10u30@eax live:@70002#2u32 kill:u34@{all} use:@44677#0u36@edi
19:    | |   =
20:    | +   +                               call     def:@70012#11u38@eax live:@70005#3u40 kill:u42@{all} use:@70003#1u44@edi use:@70010#10u46@eax
21:    | |    =
22:    | |
23:    | |
24:    + |                                   loadub   def:@70015#13u52@{gpr} use:@44677#0u54@{gpr}
25:    | |     =
26:    | |     +                             cmp_i    use:@70015#13u56
27:    | |
28:    | |                                   br
29:    | |
30:    + |      +                            call     def:@70017#15u58@eax live:@70007#4u60 kill:u62@{all} use:@44677#0u64@edi use:@44695#16u66@eax
31:    | |       =
32:    | |       |
33:    | |       |
34:    | |       |
35:    | |       |
36:    | |       +                           cmp_i    use:@70017#15u72
37:    | |
38:    | |                                   br
39:    | |
40:    + |                                   call     def:@70019#19u74@eax live:@44680#5u76 kill:u78@{all} use:@44677#0u80@edi
41:    | |        =
42:    | +        +                          call     def:@70020#20u82@eax live:@44683#6u84 kill:u86@{all} use:@70003#1u88@edi use:@70019#19u90@eax
43:    | |         =
44:    | |
45:    | |
46:    + |                                   loadub   def:@70022#22u96@{gpr} use:@44677#0u98@{gpr}
47:    | |          =
48:    | |          +                        cmp_i    use:@70022#22u100
49:    | |
50:    | |                                   br
51:    | |
52:    | |  +                                ret      use:@70008#9u102@eax
53:    | |
54:    + |           +                       call     def:@70018#25u104@eax live:@44686#7u106 kill:u108@{all} use:@44677#0u110@edi use:@44702#26u112@eax
55:      |            =
56:      |            |
57:      |            |
58:      |            |
59:      |            |
60:      |            +                      cmp_i    use:@70018#25u118
61:      |
62:      |                                   br
63:      |
64:         +                                ret      use:@70008#9u120@eax
65:
66:      +                                   call     def:@70024#30u122@eax live:@44688#8u124 kill:u126@{all} use:@70003#1u128@edi
67:                    =
68:                    |+                    alloc    def:@70026#31u130@eax live:@70008#9u132 kill:u134@{all} use:@70025#32u136@eax
69:                    | =
70:                    | +                   stored_i use:@70026#31u138@{gpr}
71:                    | |
72:                    + +                   call     def:@70030#34u140@eax live:@70010#10u142 kill:u144@{all} use:@70026#31u146@edi use:@70024#30u148@eax
73:                       =
74:                       +                  ret      use:@70030#34u150@eax
75:
76:         +                                ret      use:@70008#9u152@eax
77:

@k-sareen
Copy link
Contributor

k-sareen commented Dec 10, 2024

Surely LSRA pos 18 should say live: @70003 instead of 70002? 70002 is not used right. It's just 0/whatever is the default for Vector.new():

18:    +                                     call     def:@70010#10u30@eax live:@70002#2u32 kill:u34@{all} use:@44677#0u36@edi

Also a quick point of clarification: is re-defining the same variable in SSA expected? I see, for example, that the generated SSA uses (or more precisely, defines) 70009 for all locations where parse_value is called (70010 and 70019) and 70006 for all locations where req1 is called (70007 and 70018) when ostensibly these should be different variables at each call-site.

@titzer
Copy link
Owner

titzer commented Dec 10, 2024

Surely LSRA pos 18 should say live: @70003 instead of 70002? 70002 is not used right. It's just 0/whatever is the default for Vector.new():

18:    +                                     call     def:@70010#10u30@eax live:@70002#2u32 kill:u34@{all} use:@44677#0u36@edi

Also a quick point of clarification: is re-defining the same variable in SSA expected? I see, for example, that the generated SSA uses (or more precisely, defines) 70009 for all locations where parse_value is called (70010 and 70019) and 70006 for all locations where req1 is called (70007 and 70018) when ostensibly these should be different variables at each call-site.

Obviously no, redefined variables in SSA should not be allowed. However in the machine representation, VREGs might be redefined, but I think that should only happen on the Wasm target when using the shadow stack spiller. There are some smaller tests cases in the branch I referenced before. I find it a lot easier to work on smaller examples.

@k-sareen
Copy link
Contributor

k-sareen commented Dec 10, 2024

There are some smaller tests cases in the branch I referenced before. I find it a lot easier to work on smaller examples.

Yeah -- I'll try them out. I just felt like I was close to resolving it.

Obviously no, redefined variables in SSA should not be allowed. However in the machine representation, VREGs might be redefined, but I think that should only happen on the Wasm target when using the shadow stack spiller.

Right. That's what confused me. So why is Virgil re-defining 70009 and 70006? Does this mean the entire input SSA is broken for the instr selector/reg allocator? Because I was working under the assumption that the SSA is correct and debugging the reg allocator/liveness propagator.

Also another quick question Re: DWARF debug info. Does the DWARF debug info Virgil generates embed values of variables or is it just for debug symbols because I haven't been able to get it to print variable values/names in gdb/rr. Maybe I'm just using the DWARF functionality incorrectly.

EDIT: I mean this:

                @70010:CallAddress<$p32<&(JsonParser -> JsonValue)>>: JsonValue (@70009=#&JsonParser.parse_value[code]:$p32<&(JsonParser -> JsonValue)>, $0@44677) [O_NO_NULL_CHECK] (uses: @70012 @44690)
[...]
                @70019:CallAddress<$p32<&(JsonParser -> JsonValue)>>: JsonValue (@70009=#&JsonParser.parse_value[code]:$p32<&(JsonParser -> JsonValue)>, $0@44677) [O_NO_NULL_CHECK] (uses: @70020 @44709)

My understanding of @70009=JsonParser.parse_value in both of these is that the variable @70009 is being defined as the return value of JsonParser.parse_value.

@titzer
Copy link
Owner

titzer commented Dec 10, 2024

My understanding of @70009=JsonParser.parse_value in both of these is that the variable @70009 is being defined as the return value of JsonParser.parse_value.

Oh, I see why you could conclude that. Both of these are SSA constants and are arguments to the SSA call instructions. The trace output includes the @<id>=<value> as convenience; it doesn't mean that @id is being defined. SSA constants are global in the graph and are printed out before the first basic block. They don't show up in the machine representation because these constants are the target methods and have been turned into immediates, which the register allocator ignores.

@k-sareen
Copy link
Contributor

k-sareen commented Dec 10, 2024

Okay so in the end, I think it's a very subtle bug caused by an interaction of finishBlock, finishLoop, and the sorted block order. This is with the SSA for test/core/zjson00.v3:

SSA for JsonParser.parse_array
Machine SSA for: JsonParser.parse_array [] -> [JsonValue]
        values @5634#4 @3511#91 @3513#-1 @3518##71:JsonValue.Null @3526#44 @3533#93 @5636#32 @5638#&JsonParser.req1[code] @5640#G[data] @5641#&JsonParser.parse_value[code] @5643#&Vector.put[code] @5647#&JsonParser.opt1[code] @5654#&Vector.extract[code] @5656#8 @5658#20 @5660#&JsonValue.JArray.new[code]
        param $0@3508: JsonParser [V_NON_ZERO]
        block #3509
                @5635:Alloc>: Vector (@5634=#4:int) [V_NON_ZERO] (uses: @5655 @5651 @5644 @3521 @3540 @3545 @5637)
                @5637:PtrStore, int>: void (@5635, @5636=#32:int) [O_NO_NULL_CHECK]
                @5639:CallAddress<$p32<&((JsonParser, byte) -> int)>>: int (@5638=#&JsonParser.req1[code]:$p32<&((JsonParser, byte) -> int)>, $0@3508, @3511=#91:int) [O_NO_NULL_CHECK] (uses: @3514)
                @3514:IntEq: bool (@5639, @3513=#-1:int) [O_PURE O_FOLDABLE O_COMMUTATIVE] (uses: @3517)
                @3517:if     (@3514)
                         true  => #3515
                         false => #3516
        block #3515 #3517->#3515
                @3519:ret    (@5640=#G[data]:JsonValue.Null)
        block #3516 #3517->#3516
                @5642:CallAddress<$p32<&(JsonParser -> JsonValue)>>: JsonValue (@5641=#&JsonParser.parse_value[code]:$p32<&(JsonParser -> JsonValue)>, $0@3508) [O_NO_NULL_CHECK] (uses: @5644 @3521)
                @5644:CallAddress<$p32<&((Vector, JsonValue) -> void)>>: void (@5643=#&Vector.put[code]:$p32<&((Vector, JsonValue) -> void)>, @5635, @5642) [O_NO_NULL_CHECK]
                @5645:PtrAdd: JsonParser ($0@3508, @5634=#4:int) [O_PURE O_FOLDABLE] (uses: @5646)
                @5646:PtrLoad: bool (@5645) [O_NO_NULL_CHECK] (uses: @3525)
                @3525:if     (@5646)
                         true  => #3523
                         false => #3524
        block #3523 #3525->#3523 #3543->#3523
                @5648:CallAddress<$p32<&((JsonParser, byte) -> int)>>: int (@5647=#&JsonParser.opt1[code]:$p32<&((JsonParser, byte) -> int)>, $0@3508, @3526=#44:int) [O_NO_NULL_CHECK] (uses: @3528)
                @3528:IntEq: bool (@5648, @3513=#-1:int) [O_PURE O_FOLDABLE O_COMMUTATIVE] (uses: @3531)
                @3531:if     (@3528)
                         true  => #3529
                         false => #3530
        block #3524 #3525->#3524
                @3532:ret    (@5640=#G[data]:JsonValue.Null)
        block #3529 #3531->#3529
                @5649:CallAddress<$p32<&((JsonParser, byte) -> int)>>: int (@5638=#&JsonParser.req1[code]:$p32<&((JsonParser, byte) -> int)>, $0@3508, @3533=#93:int) [O_NO_NULL_CHECK] (uses: @3535)
                @3535:IntEq: bool (@5649, @3513=#-1:int) [O_PURE O_FOLDABLE O_COMMUTATIVE] (uses: @3538)
                @3538:if     (@3535)
                         true  => #3536
                         false => #3537
        block #3530 #3531->#3530
                @5650:CallAddress<$p32<&(JsonParser -> JsonValue)>>: JsonValue (@5641=#&JsonParser.parse_value[code]:$p32<&(JsonParser -> JsonValue)>, $0@3508) [O_NO_NULL_CHECK] (uses: @5651 @3540)
                @5651:CallAddress<$p32<&((Vector, JsonValue) -> void)>>: void (@5643=#&Vector.put[code]:$p32<&((Vector, JsonValue) -> void)>, @5635, @5650) [O_NO_NULL_CHECK]
                @5652:PtrAdd: JsonParser ($0@3508, @5634=#4:int) [O_PURE O_FOLDABLE] (uses: @5653)
                @5653:PtrLoad: bool (@5652) [O_NO_NULL_CHECK] (uses: @3543)
                @3543:if     (@5653)
                         true  => #3523
                         false => #3542
        block #3536 #3538->#3536
                @3544:ret    (@5640=#G[data]:JsonValue.Null)
        block #3537 #3538->#3537
                @5655:CallAddress<$p32<&(Vector -> Array)>>: Array (@5654=#&Vector.extract[code]:$p32<&(Vector -> Array)>, @5635) [O_NO_NULL_CHECK] (uses: @5661 @3546)
                @5657:Alloc: JsonValue.JArray (@5656=#8:int)  (uses: @5661 @5659)
                @5659:PtrStore: void (@5657, @5658=#20:int) [O_NO_NULL_CHECK]
                @5661:CallAddress<$p32<&((JsonValue.JArray, Array) -> JsonValue.JArray)>>: JsonValue.JArray (@5660=#&JsonValue.JArray.new[code]:$p32<&((JsonValue.JArray, Array) -> JsonValue.JArray)>, @5657, @5655) [V_NON_ZERO] (uses: @3548)
                @3548:ret    (@5661)
        block #3542 #3543->#3542
                @3549:ret    (@5640=#G[data]:JsonValue.Null)

The reverse order goes 3524 -> 3537 -> 3536 -> 3529 -> 3542 -> 3530 -> 3523 -> 3516 -> 3515 -> 3509.

At the start of block 3529, both this and vars (i.e. 3508 (VReg #0) and 5635 (VReg #1)) are alive, but at the end of block 3542, both are dead. So finishBlock adds two LsraStart points to start the live range of these variables (the LsraStarts at pos 54). These variables are dead at the start of block 3542, but they are also dead at the end of 3530 so no LsraEnds are added.

Block 3530 actually generates the LsraEnds for the two variables, but it's the loop body, so all LsraStart and LsraEnds are removed when running finishLoop. Then since 3530 is the last block of the loop and is the pred of 3542, no LsraEnds are added as the live ranges at the end of 3530 and start of 3542 already match so the function returns early.

Blocks 3523 and 3516 have this and vars as live at their start and end, but they are dead at the end of 3515. So finishBlock adds two LsraStart points again (the LsraStarts at pos 18) creating an outer nested live range for the two variables.

I think it's finishLoop that is broken. I tried a fix which I thought should have worked, but didn't (or well something is still broken). The livein.row(0).apply here is skipped because of the early return. So I commented out the early return to see what happens and some of my other assertions failed/compilation is still broken. But I think this early return without adding the LsraEnds is the cause of the bug.

@titzer does this make sense to you? What do you think the solution should be?

EDIT: Yeah so I special cased that exact finishLoop to make it not return early and it works. No infinite loop.

@k-sareen
Copy link
Contributor

Yeah so I special cased that exact finishLoop to make it not return early and it works. No infinite loop.

The question essentially is what are the conditions under which we should not return early and we should insert the LsraEnds. I can think of ways of tackling this when doing forward data-flow (simple bit for if a range is active or not), but liveness is a backwards data-flow analysis.

@titzer
Copy link
Owner

titzer commented Dec 12, 2024

@titzer does this make sense to you? What do you think the solution should be?

EDIT: Yeah so I special cased that exact finishLoop to make it not return early and it works. No infinite loop.

finishLoop was one of the trickiest, buggiest pieces of junk I ever wrote. The bug is almost certainly in there but I haven't had time to sit down and reason it out.

@k-sareen
Copy link
Contributor

k-sareen commented Dec 12, 2024

I've fixed the bug in my branch but it's a bit ugly and definitely can be cleaned up: https://github.com/k-sareen/virgil/tree/fix/lsra-ks. I can make a PR and clean up all the debugging code if you want. See #312 for more information.

PS I'll kinda be AWOL until Tuesday IST.

k-sareen added a commit to k-sareen/virgil that referenced this pull request Dec 12, 2024
In certain edge cases, `finishLoop` would not insert `LsraEnd`s for
active variables resulting in nested live ranges which manifested as an
infinite loop in the register allocator (see discussion here [1]).

This commit fixes this bug by keeping track of whether a variable will
be made active in the future or not (i.e. if it will have a live range).
If a variable will have a live range in the future, then `finishLoop`
will make sure to insert `LsraEnd`s for them to avoid the above bug.

[1]: titzer#308
@titzer titzer closed this Dec 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants