Skip to content

Add has_closure_tree_roots (plural) to allow has_many in related AR models #446

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

sergioisidoro
Copy link

Having has_closure_tree_root is quite useful, when there is a root instance that collects all children.

However I have the use case where a related model has multiple roots. This is not a case of multi-parent hierarchy, but instead a related object having multiple roots. The example in the tests is one of them. Post has many comments, and those comments can have children. With has_closure_tree_root we would be forced to create a root comment to each post, from which all members would be decendents from to use the query helpers.

This allows a related model to have multiple associated roots, and have the same perfomance boost in the queries as with the has_closure_tree_root..

NOTE I was assisted by an AI agent while doing this work, and some of the code here is generated by a model. It has been reviewed and iterated by me.

@kbrock
Copy link
Contributor

kbrock commented May 22, 2025

This seems like a lot of change.

Have you tried either of these ways?: STI or ignoring the post in the comment model.

STI would have Comment as a child class of Post. Not sure that I like this, but some people like STI.

Treat the Comment as a tree form, but just have a belongs_to :post?
So you just bring back all comments arranged, via:

post.comments.roots
Comment.where(:post_id => post.id).roots

Besides roots, you can call any scope/method that works on Comment

@sergioisidoro
Copy link
Author

sergioisidoro commented May 23, 2025

This seems like a lot of change.

Well, the challenging thing for this change was tests. I would say 90% of this PR is setting up the test project + specs. Otherwise we're using almost the same code with a bit of changes. But because I don't want to break any backwards compatibility there was a bit of copy paste.

About STI, I would really like to avoid it at all cost. It has cause me a lot of pain the past.

About your other solution, I'm not sure if it will provide the same benefits of n+1 query avoidance. If I have 100 roots (eg. 100 root comments on a post), it will result in 100 queries for the children of each of the root nodes? Perhaps I'm missing something from the eager loading methods, like root_comment_including_tree

If anyone would like to help me add documentation on how to handle eager loading with multi root has_many, I'm happy to participate in whatever way I can.

@sergioisidoro
Copy link
Author

sergioisidoro commented May 23, 2025

To summarize -- There are 300+ lines but the core of the change are just these lines

          # For multiple roots, fetch all descendants at once to avoid N+1 queries
          root_ids = roots.map(&:id)
          klass = roots.first.class
          hierarchy_table = klass._ct.quoted_hierarchy_table_name

          # Get all descendants of all roots including the roots themselves
          # (generations 0 = self, > 0 = descendants)
          descendant_scope = klass.
            joins("INNER JOIN #{hierarchy_table} ON #{hierarchy_table}.descendant_id = #{klass.quoted_table_name}.#{klass.primary_key}").
            where("#{hierarchy_table}.ancestor_id IN (?)", root_ids).
            includes(*assoc_map).
            distinct

          descendant_scope

@kbrock
Copy link
Contributor

kbrock commented May 23, 2025

About STI, I would really like to avoid it at all cost. It has cause me a lot of pain the past.

lol - yes. My teammates no longer listen when I say "lets get rid of the STI in model X." Different people are in different places on that debate. Seems many of the users of ancestry rely upon STI in the darndest places. (sorry, ancestry is more my point of reference - just trying to help a little over here, too.)

I'm not sure if it will provide the same benefits of n+1 query avoidance

I think we agree that the perfect case requires 2 queries: Post and Comment. And the perfect case would be to state it as a single query on Post with the Comment query being driven by an includes(:comments) or closure tree specific scope/method stating that.

Wish it were easier to say to active record: "I've fetched these records because you didn't know how to fetch them for includes(). Please put them into these objects as if it were fetched using includes()

I was thinking:

posts = Post.take(10)
comments = Comment.where(:post_id => posts.map(&:id)).hash_tree

posts.each do |post|
  puts "#{post.id}: #{post.name}"
  print_comments(comments.keys.select { |comment| comment.post_id == post.id})
end

def print_comments(comments_hash, indent: 0)
  comments_hash.each do |comment, children|
    puts " " * indent + comment.text
    print_comments(children, indent: indent + 2) unless children.blank?
  end
end

@kbrock
Copy link
Contributor

kbrock commented May 23, 2025

Yea. This is a great idea. Someone proposed something similar in ancestry. I'll try and circle back and give any feedback I can. reminder: I can not merge this stuff, so listening to me is at your own discretion. you know?

@seuros
Copy link
Member

seuros commented May 28, 2025

Hey, thanks for the PR and detailed write-up. I missed the notification for this one.

That said, we need to talk.

Because what you’ve built here is clever in development but also cursed in production (I tried this pattern before).

Let’s break it down with something familiar: Reddit.

Your Post is a Reddit thread.
Your Comments are the replies.
And each top-level comment is the start of a mini warzone, branching arguments, nested conspiracies, and one guy confidently explaining how he built 1 billion MVPs using Supabase and Replit instead of Rails.
Because Rails is dead, bro.

That’s what closure_tree is made for:
Comments replying to comments, forming one tree per post.

But what you’are doing is trying to make the Post part of that tree as if the post is a parent node and all the comments are its children and sub-children.

That’s not how it works.

What you’ve really got here is a forest.

Multiple independent root comments, each forming its own tree.
And you're trying to tell closure_tree:

"Hey, just pretend they’re all one big happy hierarchy because they all live under the same post_id, okay?"

Nope. That is like expecting closure_tree to moderate /r/AskReddit just because it can handle /r/trees.


Why has_closure_tree_root is not the correct solution:

This macro is for a single-rooted hierarchy inside one model, it boosts retrival speed by anchoring the tree.
You are trying to scope by foreign key (post_id), not by ancestry. That’s not a tree. That’s a filter.

Unless you plan to:

  • Create a fake “root comment” per post ( and make it invisible (Post has_one root_comment),

  • or assign post_id to every single comment in the tree via before_create

you’ll end up with deeply nested replies that don’t know which post they belong to. (because their position is on the other table)

Congratulations, you have orphaned your comments. That is how moderation nightmares start. And will start receiving hate mails because you deleted something or banned them without reading the context


You want to avoid N+1 queries across multiple trees. I will show you how i did it:

Post::Comment.where(post_id: 42, parent_id: nil).includes(:descendants)

Yes i nest my models, you can see that in my other repos

Or wrap it like a civilized Sangoma:

def self.comment_forest_for(post_id)
  where(post_id: post_id, parent_id: nil).includes(:descendants)
end

This keeps your model clean, the hierarchy intact, and the SQL efficient. You don’t need to mutate closure_tree to get there.

I'm currently upgrading this gem after cleaning up with_advisory_lock, removing legacy hacks (yes, the ones that still magically work), and modernizing the interface, so others like you don’t have to write tests around features deprecated back in Rails 4.2. (if defined?)

The next version will include native support for multi-DB setups and drop a lot of the patching we had to carry over the years.

I’d appreciate it if you could give the new version a spin once it’s out, especially with your use case. Fresh eyes, sharp feedback. I will only support rails 7.1+ and ruby 3.3+, so i hope your app is not running on fumes.

@sergioisidoro
Copy link
Author

First off, thanks for for taking the time for writing this, and honnestly for the one of the best replies I've had the pleasure to read :)

I understand your argument, and I also understand that the library is made with a main use case in mind, and if this doens't align or doesn't bring value to that use case, it's fair game to reject it.

My use for tree structure is flatter than most - ie don't expect a lot of depth, but we do expect to have many roots with 1 or 2 levels deep, and we do require a significant amount of joins with related models. Because we don't have too much nesting, we return the objects nested in the API response (ie we would like to preserve the tree structure on returned queries)

I think the starting point of this came from reading the secion on eager loading -- https://github.com/ClosureTree/closure_tree?tab=readme-ov-file#eager-loading -- which perhaps confused me leading me to think that .includes(:descendants) was not the way to go, because it does not preserve the tree, as mentioned in:

However, the return value is an array of Comments, and the tree structure is thus lost, which makes it difficult to walk the tree using elegant recursive algorithms.

Also there was something very "Rails like" to use something similar to has_many. So if has_closure_tree_root would be a "has_one", has_closure_tree_roots would be a has_many.

But we will find a way around these things, no worries 👍🏽

I’d appreciate it if you could give the new version a spin once it’s out, especially with your use case.

Absolutely! We do our best to keep on the lastest release of RoR, and we are on Ruby 3.3, so no issues there.
Looking forward, and let me know if there's anything we can do to help.

Thank you keeping this gem alive 💎

@seuros
Copy link
Member

seuros commented May 28, 2025

Thanks for the kind words - really appreciate your openness and how you approached the discussion 🙏

Your use case (lots of shallow trees scoped under a parent model) is actually a pretty common pattern, especially in API-backed UIs where the nesting is serialized directly. I totally get the temptation to reach for a “has_many :roots”-style DSL. i use representable gem.

But here’s the thing: ClosureTree lives at the database layer. Your use case is in the presentation layer. Mixing the two just adds maintenance burden for everyone. Imagine if this gem had built-in output helpers for XML or JSON-RPC, every change would turn into a breaking change hunt across five layers. That’s how gems die or become unmaintainable. That’s how you end up with that one developer quietly monkey-patching everything in /lib/hacks and never opening a PR upstream, because he could not make the change for postgresql or sqlite3.

I’ve been there. I had the same experience with acts_taggable_on. It became a burden, every merge broke someone's fragile app built around undocumented edge behavior. Many feature owners drifted out of Ruby entirely. Some are probably farming now. So I rewrote the core idea cleanly in a gem focused purely on the DB layer contriboss/no_fly_list

Maybe down the line, if we can formalize a solid, predictable pattern for multi-root scoped trees without muddying the core, we could offer an extension or companion module (optional require), something like acts_as_forest or scoped_closure_roots, to better match the mental model without violating closure_tree’s core contract.

But that would require:

  • A clearly defined spec
  • Probably some DSL sugar
  • And yes, docs, examples, blog posts — so LLMs can learn the Right Way™ and stop hallucinating gems that never existed.

For now though, .includes(:descendants) + grouping or nesting in your serializer layer is the most robust way especially since your trees are mostly 1–2 levels deep.

I really appreciate your commitment to staying current, and I’ll ping you once the next version lands on master. Would be awesome if you could run it against your test suite and flag any edge regressions.
Took me nearly a year of careful cleanup and planning just to avoid having my name end up on Twitter or buried in some Jira/Trello board with the ticket: "Bug: Seuros broke our app." 😅

Let’s keep pushing Rails forward, one recursive join at a time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants