Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the minus operation #45

Closed
wants to merge 3 commits into from
Closed

Add the minus operation #45

wants to merge 3 commits into from

Conversation

felixyz
Copy link
Collaborator

@felixyz felixyz commented May 14, 2024

  • No optimizations apart from the existing ones for n-ary operations

  * No optimizations apart from the existing ones for n-ary operations
Copy link
Contributor

@blambeau blambeau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Major comment: we should probably add tests for summarize.union and that kind of chaining, that requires introducing a CTE/SELECT-FROM-SELECT construction in SQL


def all?
false
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why all?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was under the impression some part of translation of nary operations expected it, but you're right it's not needed, will remove.

initial = operands[0].to_a
tuples = operands.drop(1).inject(initial) do |agg, op|
agg - op.to_a
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely not the optimal implementation, but I suppose we don't really care about it (?)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe using Ruby's Set? Which I presume uses object hashes? That should be reasonably good.
I'll do a perf test to check that it's actually faster (although the code will be very similar).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed this. My benchmark went from ~25s to ~1s.

See: https://gist.github.com/felixyz/7151d6acc1ef3d0d674caeacc9538862

Slight worry: the reported remaining number of tuples went down from ~55k to ~42k.

The optimization commit is here: d75847b

check_union_compatible(other, "Minus")
dup.tap{|x|
### attrlist stays the same
x.predicate = self.predicate & predicate
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure that's correct.

  • self.predicate is certainly correct and safe (but not very strong).
  • Intuitively, I would say self.predicate & !predicate but it should be further checked though...

dup.tap{|x|
### attrlist stays the same
x.predicate = self.predicate & predicate
x.keys = self._keys.union(self, x, other) if knows_keys?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's correct. We simply keep the keys of the left operand I would say, and gain no other one.

@@ -2,7 +2,7 @@ module Bmg
module Version
MAJOR = 0
MINOR = 21
TINY = 5
TINY = 6
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly speaking we should upgrade to 0.22.0

@blambeau
Copy link
Contributor

@felixyz no obligation to add optimizations for now, most of the time I do it on a opportunistic way when we discover clearly unoptimal query plans.

Pushing restrictions and projections down the tree come to mind, though, if you want an exercice ;)

@blambeau blambeau closed this May 17, 2024
@blambeau blambeau deleted the add-minus branch May 17, 2024 20:17
@blambeau
Copy link
Contributor

@felixyz cleaned & merged. I had to release 0.22.0 for a customer project, and took that as an opportunity to include minus.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants