Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Misbehavior of macro-generated match when matching case object with lowercase name #20350

Open
MateuszKubuszok opened this issue May 7, 2024 · 9 comments
Assignees
Labels

Comments

@MateuszKubuszok
Copy link

Compiler version

3.3.3

Minimized code

repro.scala

//> using scala 3.3.3

import scala.quoted.*

object Macros {

  def matchOnImpl[A: Type](a: Expr[A])(using quotes: Quotes): Expr[Unit] = {
    import quotes.*, quotes.reflect.*

    // workaround to contain @experimental from polluting the whole codebase
    object FreshTerm {
      private val impl = quotes.reflect.Symbol.getClass.getMethod("freshName", classOf[String])

      def generate(prefix: String): String = impl.invoke(quotes.reflect.Symbol, prefix).asInstanceOf[String]
    }

    extension (sym: Symbol)
     def isPublic: Boolean = !sym.isNoSymbol &&
          !(sym.flags.is(Flags.Private) || sym.flags.is(Flags.PrivateLocal) || sym.flags.is(Flags.Protected) ||
            sym.privateWithin.isDefined || sym.protectedWithin.isDefined)

    def isSealed[A: Type]: Boolean =
      TypeRepr.of[A].typeSymbol.flags.is(Flags.Sealed)

    def extractSealedSubtypes[A: Type]: List[Type[?]] = {
      def extractRecursively(sym: Symbol): List[Symbol] =
        if sym.flags.is(Flags.Sealed) then sym.children.flatMap(extractRecursively)
        else if sym.flags.is(Flags.Enum) then List(sym.typeRef.typeSymbol)
        else if sym.flags.is(Flags.Module) then List(sym.typeRef.typeSymbol.moduleClass)
        else List(sym)

      extractRecursively(TypeRepr.of[A].typeSymbol).distinct.map(typeSymbol =>
        typeSymbol.typeRef.asType
      )
    }

    if isSealed[A] then {
      val cases = extractSealedSubtypes[A].map { tpe =>
        val sym = TypeRepr.of(using tpe).typeSymbol
        val bindName = Symbol.newVal(Symbol.spliceOwner, FreshTerm.generate(sym.name.toLowerCase), TypeRepr.of[A],Flags.EmptyFlags, Symbol.noSymbol)
        val body = '{ println(${ Expr(sym.name) }) }.asTerm

        if sym.flags.is(Flags.Enum | Flags.JavaStatic) then
          CaseDef(Bind(bindName, Ident(sym.termRef)), None, body)
        else if sym.flags.is(Flags.Module) then
          CaseDef(
            Bind(bindName, Ident(sym.companionModule.termRef)),
            None,
            body
          )
        else
          CaseDef(Bind(bindName, Typed(Wildcard(), TypeTree.of(using tpe))), None, body)
      }
      Match(a.asTerm, cases).asExprOf[Unit]
    } else '{ () }
  }

  inline def matchOn[A](a: A): Unit = ${ matchOnImpl[A]('{ a }) }
}

repro.test.scala

//> using dep org.scalameta::munit:1.0.0-RC1

package test

sealed trait Upper
object Upper {
  case object A extends Upper
  case object B extends Upper
  case object C extends Upper
}

sealed trait lower
object lower {
  case object a extends lower
  case object b extends lower
  case object c extends lower
}

class Test extends munit.FunSuite {

  test("should print its own name") {
    Macros.matchOn[Upper](Upper.A)
    Macros.matchOn[Upper](Upper.B)
    Macros.matchOn[Upper](Upper.C)

    Macros.matchOn[lower](lower.a)
    Macros.matchOn[lower](lower.b)
    Macros.matchOn[lower](lower.c)
  }
}
scala-cli test .

Output

A$
B$
C$
a$
a$
a$

Expectation

A$
B$
C$
a$
b$
c$

When case object with lowercased names is used match seem to fall through on the first case. For the same macro code the behavior is correct if the name of case object starts with an upper case.

I haven't observed such issue with enums.

@Gedochao
Copy link
Contributor

Gedochao commented May 9, 2024

behaviour still present on 3.5.0-RC1-bin-20240508-b10d64e-NIGHTLY

@mbovel mbovel added the Spree Suitable for a future Spree label May 20, 2024
@mbovel
Copy link
Member

mbovel commented Jul 1, 2024

@hamzaremmal do you think that would be appropriate for a spree?

@mbovel
Copy link
Member

mbovel commented Sep 23, 2024

Still not sure if this is appropriate for a spree and how hard that would be to fix.
cc @hamzaremmal

@hamzaremmal
Copy link
Member

Sorry @mbovel, I haven't seen the first ping. I don't think it's suitable for a spree.

@hamzaremmal hamzaremmal self-assigned this Sep 23, 2024
@mbovel mbovel removed the Spree Suitable for a future Spree label Sep 23, 2024
@dos65
Copy link
Contributor

dos65 commented Sep 23, 2024

It's not a bug in compiler. The actual issue is in macros. This version ( without a branch for module) works fine

Fixed version
//> using scala 3.3.3

import scala.quoted.*

object Macros {

  def matchOnImpl[A: Type](a: Expr[A])(using quotes: Quotes): Expr[Unit] = {
    import quotes.*, quotes.reflect.*

    // workaround to contain @experimental from polluting the whole codebase
    object FreshTerm {
      private val impl = quotes.reflect.Symbol.getClass.getMethod("freshName", classOf[String])

      def generate(prefix: String): String = impl.invoke(quotes.reflect.Symbol, prefix).asInstanceOf[String]
    }

    extension (sym: Symbol)
     def isPublic: Boolean = !sym.isNoSymbol &&
          !(sym.flags.is(Flags.Private) || sym.flags.is(Flags.PrivateLocal) || sym.flags.is(Flags.Protected) ||
            sym.privateWithin.isDefined || sym.protectedWithin.isDefined)

    def isSealed[A: Type]: Boolean =
      TypeRepr.of[A].typeSymbol.flags.is(Flags.Sealed)

    def extractSealedSubtypes[A: Type]: List[Type[?]] = {
      def extractRecursively(sym: Symbol): List[Symbol] =
        if sym.flags.is(Flags.Sealed) then sym.children.flatMap(extractRecursively)
        else if sym.flags.is(Flags.Enum) then List(sym.typeRef.typeSymbol)
        else if sym.flags.is(Flags.Module) then List(sym.typeRef.typeSymbol.moduleClass)
        else List(sym)

      extractRecursively(TypeRepr.of[A].typeSymbol).distinct.map(typeSymbol =>
        typeSymbol.typeRef.asType
      )
    }

    if isSealed[A] then {
      val cases = extractSealedSubtypes[A].map { tpe =>
        val sym = TypeRepr.of(using tpe).typeSymbol
        val bindName = Symbol.newVal(Symbol.spliceOwner, FreshTerm.generate(sym.name.toLowerCase), TypeRepr.of[A],Flags.EmptyFlags, Symbol.noSymbol)
        val body = '{ println(${ Expr(sym.name) }) }.asTerm

        if sym.flags.is(Flags.Enum | Flags.JavaStatic) then
          CaseDef(Bind(bindName, Ident(sym.termRef)), None, body)
        else
          CaseDef(Bind(bindName, Typed(Wildcard(), TypeTree.of(using tpe))), None, body)
      }
      Match(a.asTerm, cases).asExprOf[Unit]
    } else '{ () }
  }

Explanation:

The removed from above code:

CaseDef(
    Bind(bindName, Ident(sym.companionModule.termRef)),
    None,
    body
)

generates the following code:

b match
  case a$$macro1 @ a => println("a")
  case b$$macro1 @ b => println("b")

which is equivavlent to case _ => println("a"). See this section about varid at https://www.scala-lang.org/files/archive/spec/3.4/08-pattern-matching.html#variable-patterns

Example in scastie - https://scastie.scala-lang.org/s0mcdSvpTqOJ3fPXDpnEtQ

@jchyb
Copy link
Contributor

jchyb commented Sep 23, 2024

So in summary, we might have thought that this macro generates this (which would work correctly):

(lower.b: lower) match
    case amacro @ lower.a => (...)
    case bmacro @ lower.b => (...)
    case bmacro @ lower.c => (...)

but it actually interprets it as this:

(lower.b: lower) match
    case amacro @ a => (...)
    case bmacro @ b => (...)
    case bmacro @ c => (...)

This would make a lot of sense, as that explains the different behavior of Upper and lower, however the worrying part are the compiler printouts showing what gets generated:

lower.b match
        {
          case a$$macro$5 @ lower.a => println("a$")
          case b$$macro$5 @ lower.b => println("b$")
          case c$$macro$5 @ lower.c => println("c$")
        }

which looks correct.
I guess there may be a node missing needed for PatternMatcher that does not usually get printed out in the compiler, but gets added in Typer or other phases (like the Typed node) - which is why the macro example doesn't work, but the regular example does.

@dos65
Copy link
Contributor

dos65 commented Sep 23, 2024

@jchyb

however the worrying part are the compiler printouts showing what gets generated:

Created Ident in macros goes with type lower.a - that's why it's printed like a Select.

However, it's an Ident and at PatternMatcher stage it sees Bind(" a$$macro$5", Ident("a")) and then it falls into WildcardPattern -

case x: Ident => x.name.isVarPattern && !isBackquoted(x)

@dwijnand
Copy link
Member

dwijnand commented Sep 23, 2024

Yeah, we did tackle this at some point - I think it was a spree. And basically there's some code branch (I think it was within Quotes) that considers that lower is stable, so the tree representation can be folded down into just an Ident with the correct type. So on the one hand we can't write that correctly in source, on the other hand the prefix is indeed pure, so folding it isn't wrong per se... So I wasn't sure where the correct fix is; that is from the assumption that the macro user's code is reasonable - the workaround is changing the macro code.

@jchyb
Copy link
Contributor

jchyb commented Sep 26, 2024

@dos65 @dwijnand Thank you for the explanations!

I imagine the best course of action would be to add a check for incorrect/unexpected Idents in -Xcheck-macros (the main fear here is there could be a lot of those in the already published libraries), alternatively we could sanitize Idents into correct forms in Quotes IdentModule.apply() depending on the termRef (this might be unpopular/cause other unexpected issues)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

7 participants