General SIZE transformation enhancements #90

Closed
Mobe91 opened this issue Oct 13, 2014 · 4 comments

@Mobe91
Contributor

Mobe91 commented Oct 13, 2014

At the moment we generate subqueries as soon as the query contains more than one SIZE or any other collection. We could replace these subqueries with COUNT(DISTINCT elem).
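To make the idea concrete, here is a minimal sketch using an in-memory SQLite database. The tables (`person`, `cat`, `dog`) and their columns are made up for illustration and are not part of the project; the SQL only approximates what the query builder would render. It compares the current subquery form with a joined COUNT(DISTINCT ...) form, and also shows why a plain COUNT is not enough once two collections are joined.

```python
import sqlite3

# Hypothetical schema: one person owns a collection of cats and a collection of dogs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE cat (id INTEGER PRIMARY KEY, owner_id INTEGER);
    CREATE TABLE dog (id INTEGER PRIMARY KEY, owner_id INTEGER);
    INSERT INTO person VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO cat VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO dog VALUES (1, 1), (2, 1), (3, 1);
""")

# Current behaviour: one correlated subquery per SIZE.
subquery_sql = """
    SELECT p.id,
           (SELECT COUNT(*) FROM cat c WHERE c.owner_id = p.id),
           (SELECT COUNT(*) FROM dog d WHERE d.owner_id = p.id)
    FROM person p
    ORDER BY p.id
"""

# Proposed: join both collections and count distinct elements.
count_distinct_sql = """
    SELECT p.id, COUNT(DISTINCT c.id), COUNT(DISTINCT d.id)
    FROM person p
    LEFT JOIN cat c ON c.owner_id = p.id
    LEFT JOIN dog d ON d.owner_id = p.id
    GROUP BY p.id
    ORDER BY p.id
"""

# Same joins without DISTINCT: the two collections multiply each other's rows.
plain_count_sql = count_distinct_sql.replace("DISTINCT ", "")

via_subquery = conn.execute(subquery_sql).fetchall()
via_count_distinct = conn.execute(count_distinct_sql).fetchall()
via_plain_count = conn.execute(plain_count_sql).fetchall()

print(via_subquery)
print(via_count_distinct)
print(via_plain_count)
```

On this toy data the first two queries agree, while the plain COUNT variant overcounts for the person that has elements in both collections.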

@Mobe91 Mobe91 added this to the 1.1.0 milestone Oct 13, 2014
@beikov
Member

beikov commented Oct 14, 2014

This is related to #92
Beware that composite ids need some special handling.

@beikov beikov modified the milestones: 1.2.0, 1.1.0 Feb 3, 2016
@beikov
Member

beikov commented Feb 9, 2016

Also note that the count-distinct-to-subquery transformation is only necessary for very few DBs, such as MySQL.

@beikov
Member

beikov commented Feb 9, 2016

Please also help me define what the transformation should do. So far I came up with the following, analogous to what was defined in #188.

Instead of creating a subquery for the SIZE function when it appears in the select clause, we can transform it into

  • A newly generated JOIN node
  • A GROUP BY on the parent JOIN node
  • A COUNT(newAlias) that replaces the SIZE

as long as there are no group bys present.
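A rough illustration of that rewrite for a single SIZE in the select clause, again with SQLite and a hypothetical `orders` / `order_item` schema. The alias `size_alias` stands in for the generated `newAlias`. Using a LEFT JOIN is my assumption here, so that parents with an empty collection still show up with a count of 0.

```python
import sqlite3

# Hypothetical schema: an order owns a collection of items.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE order_item (id INTEGER PRIMARY KEY, order_id INTEGER);
    INSERT INTO orders VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO order_item VALUES (10, 1), (11, 1), (12, 2);
""")

# Before: SIZE rendered as a correlated subquery.
before = conn.execute("""
    SELECT o.id,
           (SELECT COUNT(*) FROM order_item i WHERE i.order_id = o.id)
    FROM orders o
    ORDER BY o.id
""").fetchall()

# After: generated join + GROUP BY on the parent + COUNT(newAlias).
# COUNT(size_alias.id) ignores the NULL row produced by the LEFT JOIN
# for an order without items, so that order gets 0.
after = conn.execute("""
    SELECT o.id, COUNT(size_alias.id)
    FROM orders o
    LEFT JOIN order_item size_alias ON size_alias.order_id = o.id
    GROUP BY o.id
    ORDER BY o.id
""").fetchall()

print(before)
print(after)
```

Both forms return the same sizes on this data, including 0 for the order without items.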

Multiple SIZE usages require COUNT(DISTINCT newAlias), which some DBs do not support when the relation uses a composite id.
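The composite id case could be handled with a fallback along these lines. This is only a sketch: the schema (`doc`, `version` with a composite key, `tag`) and both helper functions are hypothetical, and SQLite is used purely because it ships with Python. The code does not assume whether the database accepts a multi-column COUNT(DISTINCT ...); it tries it and falls back to subqueries if the statement is rejected.

```python
import sqlite3

# Hypothetical schema: version has a composite primary key, tag has a simple one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE doc (id INTEGER PRIMARY KEY);
    CREATE TABLE version (doc_id INTEGER, major INTEGER, minor INTEGER,
                          PRIMARY KEY (doc_id, major, minor));
    CREATE TABLE tag (id INTEGER PRIMARY KEY, doc_id INTEGER);
    INSERT INTO doc VALUES (1), (2);
    INSERT INTO version VALUES (1, 1, 0), (1, 1, 1), (1, 2, 0), (2, 1, 0);
    INSERT INTO tag VALUES (1, 1), (2, 1);
""")


def sizes_with_count_distinct(connection):
    """Joined form; needs COUNT(DISTINCT ...) over two columns."""
    return connection.execute("""
        SELECT d.id, COUNT(DISTINCT v.major, v.minor), COUNT(DISTINCT t.id)
        FROM doc d
        LEFT JOIN version v ON v.doc_id = d.id
        LEFT JOIN tag t ON t.doc_id = d.id
        GROUP BY d.id
        ORDER BY d.id
    """).fetchall()


def sizes_with_subqueries(connection):
    """Fallback form; works regardless of how the id is composed."""
    return connection.execute("""
        SELECT d.id,
               (SELECT COUNT(*) FROM version v WHERE v.doc_id = d.id),
               (SELECT COUNT(*) FROM tag t WHERE t.doc_id = d.id)
        FROM doc d
        ORDER BY d.id
    """).fetchall()


try:
    sizes = sizes_with_count_distinct(conn)
    used_fallback = False
except sqlite3.OperationalError:
    # The database rejected the multi-column COUNT(DISTINCT ...).
    sizes = sizes_with_subqueries(conn)
    used_fallback = True

print(sizes, used_fallback)
```

In the real transformation the decision would presumably be made up front from the dialect rather than by catching an error at runtime; the try/except just keeps the sketch independent of any particular database.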

This transformation should be deactivatable via a property.

@Mobe91
Contributor Author

Mobe91 commented Feb 9, 2016

I would like to close #188 and have this issue for the planned SIZE transformation enhancements in general.

I came up with the following

General

  • Generate a subquery if at least one explicit group by exists and none of them equals the group by that the transformation would generate (which is a group by the IDs along the path of the collection)
  • If the query contains multiple join roots, generate a subquery
  • If an explicit join already exists for a collection, a separate join must be generated and COUNT DISTINCT should be used
  • Multiple uses of SIZE with different collections require COUNT DISTINCT
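The general rules above could be read as a small decision function. The sketch below is my interpretation, not existing project code: `QueryShape`, `Strategy` and `choose_strategy` are hypothetical names, and group bys are modelled as plain sets of expression strings so that "equal to the generated group by" becomes a set comparison.

```python
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    SUBQUERY = "subquery"
    COUNT = "count"
    COUNT_DISTINCT = "count distinct"


@dataclass(frozen=True)
class QueryShape:
    """Hypothetical summary of the query features the rules look at."""
    explicit_group_by: frozenset = frozenset()   # group bys written by the user
    generated_group_by: frozenset = frozenset()  # IDs along the collection path
    join_roots: int = 1
    collection_already_joined: bool = False      # explicit join on the SIZE collection
    distinct_size_collections: int = 1           # different collections used in SIZE


def choose_strategy(query: QueryShape) -> Strategy:
    # Explicit group bys that differ from the generated one force a subquery.
    if query.explicit_group_by and query.explicit_group_by != query.generated_group_by:
        return Strategy.SUBQUERY
    # Multiple join roots also force a subquery.
    if query.join_roots > 1:
        return Strategy.SUBQUERY
    # An existing explicit join, or several collections, need COUNT DISTINCT
    # (on a separately generated join in the first case).
    if query.collection_already_joined or query.distinct_size_collections > 1:
        return Strategy.COUNT_DISTINCT
    return Strategy.COUNT


print(choose_strategy(QueryShape()))
print(choose_strategy(QueryShape(join_roots=2)))
print(choose_strategy(QueryShape(distinct_size_collections=2)))
```

The order of the checks matters: the subquery conditions are tested first, so a query with two join roots and two SIZE collections still ends up with subqueries.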

JOIN

  • Use only subqueries in ON clauses

WHERE

  • SIZE uses in the WHERE clause can be transformed to COUNT and moved to the HAVING clause
  • A separate collection join has to be generated in order to compensate for records filtered out by the remaining WHERE clause predicates
  • Only allowed for atomic top-level disjuncts
  • Other attributes used in the predicate expression have to be added to the group by

@Mobe91 Mobe91 changed the title from "Use count for multiple SIZE-selects" to "General SIZE transformation enhancements" Feb 9, 2016
@Mobe91 Mobe91 closed this as completed in 49155a6 Feb 10, 2016