Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow method idx + 1 index #6425

Merged
merged 3 commits into from
Apr 5, 2019
Merged

allow method idx + 1 index #6425

merged 3 commits into from
Apr 5, 2019

Conversation

dsyme
Copy link
Contributor

@dsyme dsyme commented Apr 3, 2019

Fixes #6417 , which happens when we write a binary with exactly 65535 methods.

I checked the binary writer code in Mono.Cecil and our computations for sizes are correct, as is our token writing code. The issue happens because an empty type definition (no methods) emits a "method table idx + 1" value, but that is expected.

https://github.com/jbevain/cecil/blob/ab5075b3a885ffd5bee5d347b79ce71aabeb32d1/Mono.Cecil.Metadata/Buffers.cs#L99

I'm testing the compiler referenced in the bug by hand

// idx 0x10000 is allowed for method table idx + 1 for just beyond last index of method table
if idx > 0x10000 then
System.Diagnostics.Debug.Assert (false, "EmitZUntaggedIndex: too big for small address or simple index")
buf.EmitInt32AsUInt16 idx
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So when an idx of 0x10000 is passed in, 0x0000 is written to buffer by this, which is identical to passing an idx of 0x0000. Is that actually what we want?

@dsyme
Copy link
Contributor Author

dsyme commented Apr 3, 2019

So when an idx of 0x10000 is passed in, 0x0000 is written to buffer by this, which is identical to passing an idx of 0x0000. Is that actually what we want?

Yeah it is weird isn't it.

The above is certainly what Mono.Cecil does, at least as far as I can tell. We can check a few other binary writers too, e.g. create a binary with exactly 65535 methods using ILASM.

I was hoping to verify by simply executing but we are currently hitting this problem on small variations of FSHarp.Compiler.Service.dll and executing that's a little awkward except in tests. Also testing it that way isn't really good enough is it.

[ I'm still searching for where the end+1 index is coming from... ]

Another approach would be to refuse to emit a binary with 65535 methods and emit a dummy.

I'll create a fake C# binary where one class has 65535 methods and compare.

@dsyme
Copy link
Contributor Author

dsyme commented Apr 3, 2019

Actually I think the problem can only occur when

  1. there are precisely 65535 methods (or fields) and
  2. the last type definition has no methods (or fields)

Sure enough, the last type we're emitting is <PrivateImplementationDetails$FSharp-Compiler-Service>, which has no methods because there are no private implementation details for that file in the compilation.

In the ECMA 335 spec, it says that each TypeDef row has the following MethodList value:

 MethodList (an index into the MethodDef table; it marks the first of a continguous
run of Methods owned by this Type). The run continues to the smaller of:
1. the last row of the MethodDef table or
2. the next run of Methods, found by inspecting the MethodList of the next
row in this TypeDef table 

Since there is no next row in the TypeDef table (this is the last type definition), then the second bullet doesn't apply, and the end of the method range will be the last row in the MethodDef table. But what should the start be? No number in the index range 0 - 65535 is valid to represent the absence of methods. We really want MethodDefTable.Length (65536). Any other index is wrong.

I'm actually beginning to believe this is a flaw in the .NET IL binary format.... Wow.... You simply can't represent this case, can you?

[ Update: @jbevian and I have concluded that index 0 is right value to represent the empty list of methods in this case ]

@jbevain
Copy link

jbevain commented Apr 3, 2019

:)

@dsyme
Copy link
Contributor Author

dsyme commented Apr 3, 2019

@jbevain Have I got this wrong?

@jbevain
Copy link

jbevain commented Apr 3, 2019

@dsyme no I think you're spot on, I was just amused. Thinking about it this could show up elsewhere as well, I think properties and events also use ranges in a list.

@dsyme
Copy link
Contributor Author

dsyme commented Apr 3, 2019

@jbevain Should 0 be emitted to represent the absence of methods in this case? It's the only value that seems to make sense to emit, but I'm not sure if binary readers respect it - ilread.fs doesn't appear to. It's not in ECMA 335 AFAICS.

@dsyme
Copy link
Contributor Author

dsyme commented Apr 3, 2019

@dsyme no I think you're spot on, I was just amused. Thinking about it this could show up elsewhere as well, I think properties and events also use ranges in a list.

Yeah, also method parameters. Though maybe the fact that all methods have a return parameter mean that's never hit.

@jbevain
Copy link

jbevain commented Apr 3, 2019

@dsyme that might work. That reminds me of an issue in Cecil that was triggered by the portable pdb support in fsc where we were not expecting 0 for a *List, but where that's actually valid to represent the absence of a list. But I'd need to look into it again to be sure.

@dsyme
Copy link
Contributor Author

dsyme commented Apr 3, 2019

It looks like Mono.Cecil handles a list start range of zero:

https://github.com/jbevain/cecil/blob/ab5075b3a885ffd5bee5d347b79ce71aabeb32d1/Mono.Cecil/AssemblyReader.cs#L971

So I think 0 is the right value to emit here. But Ecma 335 just doesn't document it

@dsyme
Copy link
Contributor Author

dsyme commented Apr 3, 2019

There is this note in ECMA 335 on a different section. So yes, it looks like 0 is the right thing to emit.

18. MethodList can be null or non-null 
If MethodList is non-null, it shall index a valid row in the MethodDef table, where
valid means 1 <= row <= rowcount+1 [ERROR] 

@jbevain
Copy link

jbevain commented Apr 3, 2019

That was added in August 2017:

jbevain/cecil@22b36c3

So yes, not a common case, but 0 should be valid. This was added as a follow up case for:

jbevain/cecil@22b36c3#diff-017638b7843cd5d111c30dbc445d39ec

Where fsc can create a 0 list of constants for a scope, and we'd just never since that.

@dsyme
Copy link
Contributor Author

dsyme commented Apr 4, 2019

OK, I updated the PR so our binary reader respects '0' in the start-of-list entries to mean an empty list.

I think the PR is sound now, but we should really double check the code for the .NET runtime binary reader - I'm sure it will record a zero entry as an empty list to.

@KevinRansom
Copy link
Member

@dsyme, I think you are right that they use 0 as the start row id to indicate 0 methods, fields and properties.

GetMethodRange for a type returns the start and end row in the method table: https://github.com/dotnet/corefx/blob/master/src/System.Reflection.Metadata/src/System/Reflection/Metadata/MetadataReader.cs#L864

In methoddefinitionhandlecollection here: https://github.com/dotnet/corefx/blob/master/src/System.Reflection.Metadata/src/System/Reflection/Metadata/TypeSystem/HandleCollections.TypeSystem.cs#L456

Notice the count property is last - first + 1, (0 - 1 +1) = 0

@KevinRansom
Copy link
Member

@dsyme, this looks ready to pull, are you good with it?

@KevinRansom KevinRansom merged commit 2715f16 into dotnet:master Apr 5, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants