Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding a "/" at the end can reduce the amount of "OR" queries #563

Closed
KaviiSuri opened this issue Dec 24, 2021 · 4 comments
Closed

Adding a "/" at the end can reduce the amount of "OR" queries #563

KaviiSuri opened this issue Dec 24, 2021 · 4 comments

Comments

@KaviiSuri
Copy link

It is pretty common in my experience to get all descendants of a record in the tree.
Currently, a descendant of a record can have "ancestry" column of the following forms:

  • "{id}"
  • "{id}/{another_id}/.."
    Because of this behaviour, the query to get descendants of id 3 is
SELECT "{table_name}".* FROM "{table_name}" WHERE ("{table_name}"."ancestry" LIKE '3/%' OR "{table_name}"."ancestry" = '3')

The "OR" in the query seems unnecessary and can be easily removed if we just add "/" at the end of ancestry always, effectively changing the forms to

  • "{id}/"
  • "{id}/{another_id}/.."
    This way, the query is just
SELECT "{table_name}".* FROM "{table_name}" WHERE "{table_name}"."ancestry" LIKE '3/%'

Is there a specific reason ancestry doesn't do this and has decided to not have a "/" at the end of single ancestors?

@kbrock
Copy link
Collaborator

kbrock commented Jan 4, 2022

Outstanding resources

The research materials/presentations for tree structures in a number of programming languages all had the same implementation. And that implementation without the trailing slash looks obviously wrong.

Implementation

I have been pulling the materialized path implementation into a separate class so we could support multiple tree implementations: traditional materialized path, the trailing slash, postgres arrays, ltree, path with the id in it.

Check out: https://github.com/kbrock/ancestry/tree/materialized_path2
I just rebased so it should be in a usable state.

Results

When I ran the numbers for trailing slash it was not any faster.

Issues I see:

  • we need to stop using unicode encoding for the column.
  • We still have way too many OR clauses
  • We do not have good data sets to show if we actually have improved anything.

Future steps

While I totally agree that the current implementation that everyone uses is probably wrong, I would like to actually know which actions/queries use indexes and which table scan.

So I keep getting derailed by wanting a test suite to actually test this stuff out.

@kbrock kbrock mentioned this issue Jan 5, 2022
@dimvic
Copy link

dimvic commented Feb 11, 2022

@kbrock Were the results using the pg array approach better? In fact, I was looking into this not because of speed, but because having a trailing and maybe a leading slash would make implementing custom scopes a much cleaner and easier task.

@kbrock
Copy link
Collaborator

kbrock commented Feb 25, 2022

@dimvic Yes. I too want a way to determine the root_id, parent_id, ancestor_ids, and child_ancestry values in pure sql without resorting to using a CASE. The current implementation sure leans towards wanting to use ruby to parse.

I want to use a serializer to parse/create the ancestry value. If we do that, then the value of ancestry would be an array - basically the same as ancestor_ids. It would remove a lot of complexity. But I don't know how much this breaks the code that uses ancestry. I know in our code, there are some assumptions that ancestry is a string with a / delimiter.

Very tempted to breaking this functionality in the next major version of ancestry. But I'm hesitant too

@kbrock
Copy link
Collaborator

kbrock commented Jun 10, 2022

Let me know if you end up trying out the strategy of materialized_path2 and if it works alright for you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants