Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Monte-Carlo Tree Search #329

Open
jafioti opened this issue Aug 4, 2024 · 7 comments
Open

Monte-Carlo Tree Search #329

jafioti opened this issue Aug 4, 2024 · 7 comments

Comments

@jafioti
Copy link

jafioti commented Aug 4, 2024

Has anyone applied MCTS to the saturation phase, rather than trying to achieve full saturation?

I'm dealing with large expressions like

((z/(((((((((((((w+153)/2)/2)/2)/2)/2)+-5)*4)+9)/2)/2)+-5)*((((((((((((h+153)/2)/2)/2)/2)/2)+-5)*4)+9)/2)/2)+-5)))%64)

where the optimal form is something like

(z % ((h - 95) * (w - 95) / 16))

but it's nearly impossible to reach full saturation (in under an hour) because the graph explodes.

However if most of the branches are discarded, and only the N best are searched at each step, we can instead reach this result in under a second. MCTS isn't likely good for all possible languages, but for simple math I think it would work well.

I'll start working on an implementation and do a PR if it interests anyone.

@oflatt
Copy link
Member

oflatt commented Aug 5, 2024

@ezrosent is interested in this direction!

@mwillsey
Copy link
Member

mwillsey commented Aug 5, 2024

Some folks in my class did a final project in this direction! They say mixed results, but there are likely more improvements to be seen. You may want to reach out to them about their experience.

@yihozhang
Copy link
Contributor

yihozhang commented Aug 5, 2024

@ezrosent
Copy link

ezrosent commented Aug 6, 2024

I posted some in-progress work here. Seems very similar to what Max's students were up to, though I think the emphasis is a bit different. I'm more interested in using this method for when we don't have access to good per-node costs (e.g. when no linear cost model is suitable for the domain). The final note at the bottom hits on the idea in this post, though, I think.

I think the proposal is to use MCTS to find a good set of candidates to then use to prune the search space and continue searching a la beam search. Maybe this would be better in extraction-gym? That way we could potentially use it in egglog too without having to reimplement it.

Of course @jafioti if you have a reason to make this egg-specific one of us could go back and port it to extraction-gym as needed.

@jafioti
Copy link
Author

jafioti commented Aug 6, 2024

Thanks for all the suggestions! I'm doing a very simple MCTS implementation right now which sort of combines the saturation and extraction phases together to do a form of beam search.

For context, this work is for a deep learning compiler Luminal to simplify indexing expressions in GPU kernels. The goal for Luminal is to also use a similar technique applied across the entire graph of operators, where rewrites can take place on the whole graph, and some MCTS-based equality saturation is done to search for the fastest graph, and the cost function is either an approximate runtime estimator or the actual profiled runtime.

Right now I'm just going to make it the simplest implementation I could, specific to my index-expression use case, but I think generalizing it from there will be straightforward.

@jafioti
Copy link
Author

jafioti commented Aug 19, 2024

@ezrosent I've been looking in to egglog a lot. Its super cool and amazing how it generalizes egg and runs faster. I'd like to implement MCTS right on it, but it's pretty complex and doesn't have a good rust API yet. Would you recommend I just stick with egg for now and bring it over to egglog once it's pretty polished?

@ezrosent
Copy link

Great question! Took some time to talk to other egglog devs on this, and the answer is "egglog is the future but it's not super stable." If you're writing production code, I'd recommend egg for now, but if you need something (e.g. multipatterns) that egglog handles a lot better you are welcome to use egglog and we will work on supporting your use-case.

If you do want a good set of more stable APIs I'd recommend the python bindings; they're more stable and polished than interacting with it directly in Rust.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants