feat(hog): fix elements matching #24331

Merged
merged 47 commits into from
Sep 5, 2024

Conversation

mariusandra
Collaborator

@mariusandra mariusandra commented Aug 13, 2024

Problem

It started from this:
image

This exposes two problems:

  • we don't have the new materialized elements_chain_* columns in Hog globals
  • we can't run functions that use lambdas, like arrayExists (solved)

Changes

  • Update the "expected volume" calculation to take into account global properties (those not on any one series)
  • Remove some unused code from the plugin server (ParsedClichouseEvent was not used anywhere 🤷)
  • Add the 4 new fields elements_chain_href, elements_chain_texts, elements_chain_ids, elements_chain_elements into Hog filter matching.
  • These fields are stored in the database as materialized columns. We use the same regex that's used by ClickHouse to calculate them in the plugin server.
  • The fields are added lazily, meaning they're calculated only when accessed, or when the bytecode is serialized.
  • Add the functions indexOf, position, positionCaseInsensitive and arrayCount.
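
As a rough illustration of what the four new functions are expected to return, here is a minimal TypeScript sketch. It assumes they follow the ClickHouse functions of the same name (1-based results, 0 when nothing is found). It is not the HogVM implementation, and the argument order for arrayCount (lambda first, array second) is likewise taken from ClickHouse.

```typescript
// Illustrative only: assumes ClickHouse-style semantics (1-based, 0 = not found).

// Index of the first element equal to `value`, or 0 if it is absent.
function indexOf<T>(arr: T[], value: T): number {
    return arr.indexOf(value) + 1
}

// Position of `needle` in `haystack`, or 0 if it is absent.
// Note: ClickHouse counts bytes, while this sketch counts UTF-16 code units.
function position(haystack: string, needle: string): number {
    return haystack.indexOf(needle) + 1
}

// Same as position, but ignores case.
function positionCaseInsensitive(haystack: string, needle: string): number {
    return haystack.toLowerCase().indexOf(needle.toLowerCase()) + 1
}

// Number of elements for which the lambda returns true.
function arrayCount<T>(predicate: (item: T) => boolean, arr: T[]): number {
    return arr.filter(predicate).length
}
```

Under these assumptions, position('Hello world', 'world') returns 7, and indexOf(['a'], 'z') returns 0.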

Extra

  • I'd like to add "elements_chain" higher up on the event field of HogFunctionInvocationGlobals, to better match the actual database schema. However, since that field has other discrepancies (name instead of event for the actual event name, etc.), I decided to leave it for now.

Alternatives considered

  • Always calculating these materialized columns into event instead of getting them lazily. Rejected, as this would be wasteful if you have no functions that look for element texts.
  • Moving the regex calculation into Hog. This is a lot more work as Hog doesn't (yet) support lazy properties, and I'd then have to inject some "global hog" code before each filter.
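
To make the eager vs. lazy trade-off concrete, here is a minimal sketch of the lazy approach using a getter. The names buildFilterGlobals and extractTexts, the sample chain format, and the regex are hypothetical stand-ins: the real code reuses the regex from the ClickHouse materialized columns, which is not reproduced here.

```typescript
// Counts how often the (hypothetical) extraction actually runs.
let extractionCalls = 0

// Hypothetical stand-in for the real extraction logic.
function extractTexts(elementsChain: string): string[] {
    extractionCalls += 1
    const re = /text="(.*?)"/g
    const texts: string[] = []
    let match: RegExpExecArray | null
    while ((match = re.exec(elementsChain)) !== null) {
        // Keep each distinct text once.
        if (texts.indexOf(match[1]) === -1) {
            texts.push(match[1])
        }
    }
    return texts
}

// Builds globals where elements_chain_texts is only computed on access.
function buildFilterGlobals(elementsChain: string): Record<string, unknown> {
    const globals: Record<string, unknown> = { elements_chain: elementsChain }
    Object.defineProperties(globals, {
        elements_chain_texts: {
            get: () => extractTexts(elementsChain),
            // Enumerable, so JSON.stringify also triggers the getter.
            enumerable: true,
        },
    })
    return globals
}
```

With this shape, a filter that never reads elements_chain_texts never pays for the regex. In this sketch every access recomputes the value; there is no caching.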

How did you test this code?

Added a test in hog-executor.test.ts, poked around in the UI.

@mariusandra mariusandra changed the title feat(hogvm): add "has" feat(elements): use "has" for simplest element text match Aug 13, 2024
Base automatically changed from hogvm-has to master August 13, 2024 09:57
Contributor

github-actions bot commented Aug 13, 2024

Size Change: 0 B

Total Size: 1.12 MB

ℹ️ View Unchanged
Filename | Size
frontend/dist/toolbar.js | 1.12 MB

compressed-size-action

@PostHog PostHog deleted a comment from posthog-bot Aug 13, 2024
@PostHog PostHog deleted a comment from posthog-bot Aug 13, 2024
@PostHog PostHog deleted a comment from posthog-bot Aug 13, 2024
@mariusandra mariusandra changed the title feat(elements): use "has" for simplest element text match feat(hog): fix elements matching Aug 13, 2024
@posthog-bot
Contributor

📸 UI snapshots have been updated

2 snapshot changes in total. 0 added, 2 modified, 0 deleted:

  • chromium: 0 added, 2 modified, 0 deleted (wasn't pushed!)
  • webkit: 0 added, 0 modified, 0 deleted

Triggered by this commit.

👉 Review this PR's diff of snapshots.

@mariusandra mariusandra removed the stale label Aug 22, 2024
@mariusandra mariusandra requested review from benjackwhite and a team September 4, 2024 07:40
if (elementsChain) {
    Object.defineProperties(response, {
        elements_chain_href: {
            get: () => getElementsChainHref(elementsChain),
Contributor

Love this solution. I'm wondering though if we should also cache these values? Maybe not worth it for now...

Also a comment here would be helpful for future travellers to understand what this is all for (I barely get it right now)

Contributor

Also it's not awesome that there is only a test for one path...

Collaborator Author

I think I found a rather neat way to cache this: e2e1bc7

More tests coming soon...

Contributor

wooohhhhhh self destructing function :o
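
For readers following along, one common way to get a "self-destructing" cached getter looks roughly like the sketch below. This is an assumption about the general pattern, not necessarily what e2e1bc7 does, and defineCachedLazy is a hypothetical helper name.

```typescript
// Defines a property that is computed on first access, then replaces its own
// getter with a plain value so that compute() never runs again.
function defineCachedLazy<T>(target: object, key: string, compute: () => T): void {
    Object.defineProperty(target, key, {
        configurable: true, // required so the getter can be replaced later
        enumerable: true,
        get() {
            const value = compute()
            // Swap the accessor for a data property holding the result.
            Object.defineProperty(target, key, {
                value,
                writable: false,
                configurable: true,
                enumerable: true,
            })
            return value
        },
    })
}
```

Reading the property twice calls compute() only once; the second read hits the stored value directly.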

@mariusandra
Collaborator Author

Ready for a re-review

@mariusandra mariusandra merged commit 70c6ce8 into master Sep 5, 2024
97 checks passed
@mariusandra mariusandra deleted the element-text-match branch September 5, 2024 10:07
3 participants