Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong junction generated when using OR with EXISTS subquries #3305

Closed
Incanus3 opened this issue Jan 17, 2024 · 11 comments · Fixed by #3308
Closed

Wrong junction generated when using OR with EXISTS subquries #3305

Incanus3 opened this issue Jan 17, 2024 · 11 comments · Fixed by #3308
Assignees
Labels
Milestone

Comments

@Incanus3
Copy link
Contributor

Incanus3 commented Jan 17, 2024

Expected behavior

When I do a query like this

val childSubQuery = QEaObject(database).stereotype.eq("car")
val containedSubQuery = QEaObject(database).stereotype.eq("car")
val containerPackageSubQuery = QEaPackage(database).alias("container").exists(
    containedSubQuery.alias("contained")
        .raw("contained.PACKAGE_ID = container.PACKAGE_ID").query(),
)

QEaObject(database)
    .alias("parent")
    .or()
    .exists(
        childSubQuery.alias("child")
            .raw("child.PARENTID = parent.OBJECT_ID").query(),
    )
    .eaGuid.isIn(containerPackageSubQuery.select(QEaPackage._alias.eaGuid).query())
    .endOr()
    .select(QEaObject._alias.id)
    .findList()

I expect the two WHERE clauses to be joined using OR junction, but AND is generated instead.

Actual behavior

The actual generated query is

select parent.OBJECT_ID from T_OBJECT parent
where (
  parent.EA_GUID in (
    select container.EA_GUID from T_PACKAGE container where exists (
      select 1 from T_OBJECT contained where contained.STEREOTYPE = 'car' and contained.PACKAGE_ID = container.PACKAGE_ID
    )
  )
) AND exists (
  select 1 from T_OBJECT child where child.STEREOTYPE = 'car' and child.PARENTID = parent.OBJECT_ID
);

As you can se, it has AND instead of OR.

Steps to reproduce

I reproduced this issue in a minimal test repo here https://github.com/Incanus3/ebean-test/blob/main/ea/db/src/test/kotlin/cz/sentica/qwazar/ea/db/ExistSubqueryJunctionsTest.kt#L33

Rationale behind this use-case

Our application accesses a database, whose schema is managed by Enterprise Architect, and is thus out of our control. In EA there are two kinds of parent-child relationships:

  1. direct relationship where the child EaObject references the parent EaObject using child.parent_id = parent.object_id
  2. containment in a package where
    1. the containing EaPackage is referenced by the contained EaObject.package_id
    2. each EaPackage has an associated EaObject which has the same EA_GUID as the EaPackage (yeah, no pkey relationship there)

so if we want to find all EaObjects wich "contain" the given EaObject (either directly as parents or though a package), we must do it using a query like this. I concede that ideally I shouldn't need a query like this - we could create a view that would precompute these parent-child and package-contained object relationship, nevertheless I still think this is a bug and you should know about it.

Side question

Is there a way to avoid using raw when doing these exists subqueries where the subquery needs to reference the "parent" query?

@rob-bygrave
Copy link
Contributor

Did you try moving the .alias("parent") to be outside the OR ? Looks like this is ending the OR ...

QEaObject(database)
  .alias("parent") // <!-- HERE - outside the OR
  .or()
    .exists( ...
    .eaGuid.isIn(containerPackageSubQuery.select(QEaPackage._alias.eaGuid).query())
  .endOr()

@Incanus3
Copy link
Contributor Author

yeah, that was one of the first things that occurred to me, but the result is the same

@rob-bygrave
Copy link
Contributor

You need to move the .alias("parent") because it is conceptually incorrect where it is in this example and in the test at the moment (so we need to move it so that people do not read this and make the same mistake / same misunderstanding) or remove it completely of course.

Once you do that ... then you need to produce 2 other tests with 1 not using exists and 1 swapping the order of the exists and isIn. Then report back on those 3 tests and update the description on this bug report to reflect what works/passes and what does not pass.

@Incanus3
Copy link
Contributor Author

updated both the original comment and the test file. did I get it right?

@rob-bygrave
Copy link
Contributor

Yes - awesome.


Now this bug is in TQRootBean (Type Query Root Bean) in that we see:

  public R exists(Query<?> subQuery) {
    query.where().exists(subQuery);   // <!-- The query.where() is the bug here !!
    return root;
  }

And the fix for this would be for it to be instead:

  public R exists(Query<?> subQuery) {
    this.peekExprList().exists(subQuery);  // <!-- Fix uses peekExprList()
    return root;
  }

The fix is to replace query.where() with this.peekExprList(). The query.where() is returning the top level ExpressionList ... and the peekExprList() is returning the current ExpressionList.

So the bug here that the exists(subQuery) expression is always added to the top level ExpressionList when we instead want it added to the current ExpressionList that represents the OR in this case. [An ExpressionList is a list of expressions added together by either AND or OR]

We can also note that the notExists(subQuery) also has this same bug and that these bugs are specifically for query beans / these 2 bugs are on TQRootBean.

@rob-bygrave
Copy link
Contributor

Now, do you want to have a go at doing a PR for this one or would you rather leave it to me?

If you do a PR that just has the fix I'll add a failing test case to the PR (we want the failing test - typically we would assert on the generated SQL asserting that the where clause has the expected clause with the exists inside the OR).

@Incanus3
Copy link
Contributor Author

I can definitely give it a try. I'm just not sure I understand about the failing test - do you mean it should fail before I apply the fix (and pass afterwards), or should it actually assert the current behavior and fail after I apply it (as some sort of paper trail of the development)?

@rob-bygrave
Copy link
Contributor

it should fail before I apply the fix (and pass afterwards)

Yes - this one.

@Incanus3
Copy link
Contributor Author

Incanus3 commented Jan 18, 2024

So, I tried to do this fix including the test (only for .exists(Query<?>) for now. Applying the actual fix, per your instructions, was no problem of course, but adding the test and making it pass was a different story. I found this test file, which seems like one I could add this case to, but...

First, I didn't manage to succeed running the test from IDEA (using the "play" button). First I had to add the slf4j-api dependency to the ebean-querybean module, because ebean-api declares this as an optional dependency, so I was getting ClassNotFoundError for LoggerFactory. After I got past this, I started getting this error

this error and couldn't get rid of it. I tried to pass --add-exports io.ebean.core/io.ebeaninternal.api=ALL-UNNAMED and even --illegal-access=permit argLine to the maven-surefire-plugin, but it didn't help. Finally I decided to quit and just run mvn test from console, which works fine.

When I was able to run the test file, I added this test case, which after applying your fix, almost passes - the order is correct and the right junction is used - but it seems I'm hitting some other strange issue - although I specify the "parent-child" join condition as .raw("contact.customer_id = customer.id"), in the generated query this gets changed to contact.customer_id = contact.customer_id as you can see here.

@rbygrave
Copy link
Member

rbygrave commented Jan 18, 2024

running the test from IDEA

Ah yeah. With ebean we run all the tests using classpath and not using module-path. [Long story but yes using module-path when running tests is more pain and we basically don't do that, sorry you went through that pain there].

For running ebean tests with IntelliJ IDEA we have:

IntelliJ IDEA

We want to get IntelliJ to run tests using classpath similar to Maven Surefire. To do this set: JUnit -> modify options -> Do not use module-path option

To set this option as the global default for IntelliJ use:

Run - Edit Configurations -> Edit configuration templates -> JUnit -> modify options - Do not use module-path option

... which is hidden at the very bottom of the README at https://github.com/ebean-orm/ebean#intellij-idea

hitting some other strange issue

This is that there is a name collision in that .raw("contact.customer_id = customer.id") the customer.id path is a valid logical property path for Contact - and this clashes with the physical table alias of customer. We can change the test so that the table alias does not clash with the logical property name there.

Great work, well done.

@rbygrave rbygrave added this to the 13.25.3 milestone Jan 18, 2024
@rbygrave rbygrave linked a pull request Jan 18, 2024 that will close this issue
@rbygrave rbygrave added the bug label Jan 18, 2024
@Incanus3
Copy link
Contributor Author

Incanus3 commented Jan 18, 2024

With ebean we run all the tests using classpath and not using module-path

Oh, ok, I'm glad there's a way to fix this then. I went through the howto-test.md, but it didn't occur to me to look in the general readme.

name collision

Makes sense. I keep forgetting that even though it's called raw, it's actually still pre-processed by ebean to get these niceties. But if I guess correctly, if I qualify the names using alias, this property name -> column name is no longer done (I mean when the alias doesn't clash with actual property name), right?

well done

Glad to be marginally useful, maybe in a few years, I'll actually be able to come up with the solution myself :) For now, I'm very grateful for your guidance.

BTW, do you have any ideas concerning this?

Is there a way to avoid using raw when doing these exists subqueries where the subquery needs to reference the "parent" query?

I know I can replace the exists condition with in test, e.g. instead of

select from x parent where exists (select 1 from x child where child.parent_id = parent.id and ...)

I can do

select from x where id in (select parent_id from x where ...)

which I can build easily with the typed querybean API, but I'm not sure this is as performant as the exists variant. In any case it would be nice if it was possible to avoid using raw for these "outer-query-referencing" subqueries (I don't mean this as a feature request, just as a question if there's already a way that I don't know of).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants