Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ExtendedCfgNodeMethods: rename type parameter #4936

Merged
merged 1 commit into from
Sep 19, 2024

Conversation

xavierpinho
Copy link
Contributor

@tajobe has been doing some byte code surgery in order to create a Kotlin interoperability layer and found an interesting edge case when type parameters are shadowed.


Taking a simpler example, consider the following Scala code:

class Example[NodeType <: Object](node: NodeType) extends AnyVal{
  def foo[NodeType](x: List[NodeType]): List[NodeType] = x
}

When decompiling the foo method, we'll find the following:

public static <NodeType extends java.lang.Object, NodeType extends java.lang.Object> scala.collection.immutable.List<NodeType> foo$extension(java.lang.Object, scala.collection.immutable.List<NodeType>);
    descriptor: (Ljava/lang/Object;Lscala/collection/immutable/List;)Lscala/collection/immutable/List;
    flags: (0x0009) ACC_PUBLIC, ACC_STATIC
    Code:
      stack=3, locals=2, args_size=2
         0: getstatic     #18                 // Field Example$.MODULE$:LExample$;
         3: aload_0
         4: aload_1
         5: invokevirtual #25                 // Method Example$.foo$extension:(Ljava/lang/Object;Lscala/collection/immutable/List;)Lscala/collection/immutable/List;
         8: areturn
    Signature: #23                          // <NodeType:Ljava/lang/Object;NodeType:Ljava/lang/Object;>(Ljava/lang/Object;Lscala/collection/immutable/List<TNodeType;>;)Lscala/collection/immutable/List<TNodeType;>;

 
Notice how NodeType (the type parameter) occurs twice in the definition of foo$extension. Apparently the JVM (assuming Oracle as canonical) is designed to accept this when loading/linking, but guarantees nothing if the byte code is interpreted through reflection/etc, cf. https://docs.oracle.com/javase/specs/jvms/se23/html/jvms-4.html#jvms-4.7.9:

Oracle's Java Virtual Machine implementation does not check the well-formedness of Signature attributes during class loading or linking. Instead, Signature attributes are checked by methods of the Java SE Platform class libraries which expose generic signatures of classes, interfaces, constructors, methods, and fields. Examples include getGenericSuperclass in Class and toGenericString in java.lang.reflect.Executable.

This byte code poses a problem that is trivially fixed if we prevent shadowing by just renaming one of the type parameters, e.g.

class Example[CfgNodeType <: Object](node: CfgNodeType) extends AnyVal{
  def foo[NodeType](x: List[NodeType]): List[NodeType] = x
}

resulting in

public static <NodeType extends java.lang.Object, CfgNodeType extends java.lang.Object> scala.collection.immutable.List<NodeType> foo$extension(java.lang.Object, scala.collection.immutable.List<NodeType>);
    descriptor: (Ljava/lang/Object;Lscala/collection/immutable/List;)Lscala/collection/immutable/List;
    flags: (0x0009) ACC_PUBLIC, ACC_STATIC
    Code:
      stack=3, locals=2, args_size=2
         0: getstatic     #18                 // Field Example$.MODULE$:LExample$;
         3: aload_0
         4: aload_1
         5: invokevirtual #25                 // Method Example$.foo$extension:(Ljava/lang/Object;Lscala/collection/immutable/List;)Lscala/collection/immutable/List;
         8: areturn
    Signature: #23                          // <NodeType:Ljava/lang/Object;CfgNodeType:Ljava/lang/Object;>(Ljava/lang/Object;Lscala/collection/immutable/List<TNodeType;>;)Lscala/collection/immutable/List<TNodeType;>;

Compiling Joern with -Wshadow:type-parameter-shadow only warns about 2 sites: the one in this PR and another one buried in javasrc2cpg. The latter is not public-facing, so I won't even bother proposing to change it at this point.

This rename is purely syntactical and seems to easily solve a bigger hurdle. Wdyt?

Copy link
Contributor

@mpollmeier mpollmeier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I absolutely think we should merge this and also enable the type param shadowing warning by default (will create a separate PR shortly), for the sake of readability and disambiguity.

That being said, I'm not sure I understand which specific problem this PR solves, maybe you can elaborate..?

From my POV, the two type parameters in your Example class are not related at all. They happen to have the same name, which only means that within def foo we cannot reference the NodeType type from the class Example, but apart from that there's no connection at all between them.

Semantically it's similar to something like this on the value level: the two bar variables are not related at all, they just happen to have the same name, and therefor we cannot reference the class-level bar variable from within def baz.

class Foo {
    val bar = 42

    def baz: Unit = {
      val bar = "123"
      ()
  }
}

mpollmeier added a commit that referenced this pull request Sep 19, 2024
as brought up in #4936 it can
lead to confusing situations if type parameters are shadowed, so for
the sake of readability and disambiguity alone we should enable this
compiler warning IMO.

That being said, I'd like to stress that it's not something
fundamentally complicated, afaics. Simple example copied from #4936:
```scala
class Example[NodeType <: Object](node: NodeType) extends AnyVal{
  def foo[NodeType](x: List[NodeType]): List[NodeType] = x
}
```

These two type parameters in your Example class are not related at all. They happen to have the same name, which only means that within `def foo` we cannot reference the `NodeType` type from the `class Example`, but apart from that there's no connection at all between them.

Semantically it's similar to something like this on the value level: the two `bar` variables are not related at all, they just happen to have the same name, and therefor we cannot reference the class-level `bar` variable from within `def baz`.
```scala
class Foo {
    val bar = 42

    def baz: Unit = {
      val bar = "123"
      ()
  }
}
```
mpollmeier added a commit that referenced this pull request Sep 19, 2024
as brought up in #4936 it can
lead to confusing situations if type parameters are shadowed, so for
the sake of readability and disambiguity alone we should enable this
compiler warning IMO.

That being said, I'd like to stress that it's not something
fundamentally complicated, afaics. Simple example copied from #4936:
```scala
class Example[NodeType <: Object](node: NodeType) extends AnyVal{
  def foo[NodeType](x: List[NodeType]): List[NodeType] = x
}
```

These two type parameters in your Example class are not related at all. They happen to have the same name, which only means that within `def foo` we cannot reference the `NodeType` type from the `class Example`, but apart from that there's no connection at all between them.

Semantically it's similar to something like this on the value level: the two `bar` variables are not related at all, they just happen to have the same name, and therefor we cannot reference the class-level `bar` variable from within `def baz`.
```scala
class Foo {
    val bar = 42

    def baz: Unit = {
      val bar = "123"
      ()
  }
}
```
@xavierpinho
Copy link
Contributor Author

Indeed, there's no ambiguity at Scala level: it's just another form of good ol' lexical scoping. It's only when it gets down to byte code that we find ourselves in a pickle, as we get 2 same-named type parameters in the same method, causing havoc with (I think) kotlin-reflect.

Briefly, and in the context of #4796, @tajobe has been working on a plugin to automatically generate ergonomic Kotlin bindings for the most relevant APIs. It starts by finding which methods to codegen for (based on their signature), and proceeds emitting a more ergonomic (Kotlin-wise) method definition for it. So for so good... until a wild method signature with 2 same-named type parameters appeared and kotlin-reflect bailed out.

Changing the type parameter name was the trivial workaround. No one was expecting to find a method with 2 same-named type parameters. I couldn't find anything about this edge case in various JVM-related docs online: not sure if it's undefined behaviour or I just missed something. At any rate, if someone knows more about this, I'd love to hear!

@mpollmeier
Copy link
Contributor

Thanks for elaborating, yeah that all makes sense.

@mpollmeier mpollmeier merged commit 236bd8f into master Sep 19, 2024
5 checks passed
@xavierpinho xavierpinho deleted the xavierp/alpha-conversion branch September 19, 2024 13:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants