[query] eliminate optimization that can blow RAM #13619
Conversation
CHANGELOG: On some pipelines, since at least 0.2.58 (commit 23813af), Hail could use essentially unbounded amounts of memory. This change removes "optimization" rules that accidentally caused that.
These optimizations are too important to just disable. I think a better quick fix is to do

```scala
case ToStream(ToArray(s), false) if s.typ.isInstanceOf[TStream] => s
case ToStream(Let(name, value, ToArray(x)), false) if x.typ.isInstanceOf[TStream] =>
  Let(name, value, x)
```
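To make the shape of that rule concrete, here is a toy model in plain Python. The classes are hypothetical stand-ins, not Hail's actual IR types; the point is only that the `ToStream(ToArray(s))` round-trip is collapsed exclusively when the consumer does not require per-element memory management.

```python
from dataclasses import dataclass

# Hypothetical miniature IR, NOT Hail's real classes: just enough
# structure to show what the guarded quick-fix rule does.
@dataclass
class ToStream:
    child: object
    requires_memory_management: bool

@dataclass
class ToArray:
    child: object

@dataclass
class Stream:
    """Stands in for any node whose type is TStream."""
    name: str

def simplify(node):
    # Collapse ToStream(ToArray(s)) => s only when the ToStream does
    # NOT require per-element memory management (the `false` guard).
    if (isinstance(node, ToStream)
            and not node.requires_memory_management
            and isinstance(node.child, ToArray)
            and isinstance(node.child.child, Stream)):
        return node.child.child
    return node

s = Stream("rows")
assert simplify(ToStream(ToArray(s), False)) is s   # safe: round-trip eliminated
assert simplify(ToStream(ToArray(s), True)) is not s  # kept: needs managed regions
```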
The right fix is probably to add a `requiresMemoryManagementPerElement` flag to `StreamMap`, which could be generally useful when the producer doesn't care about memory management but the map body allocates a lot and wants to free after each row, and then use that flag to make these rules smarter:

```scala
case ToStream(ToArray(s), false) if s.typ.isInstanceOf[TStream] => s
case ToStream(ToArray(s), true) if s.typ.isInstanceOf[TStream] =>
  StreamMap(s, uid, Ref(uid), requiresMemoryManagementPerElement = true)
case ToStream(Let(name, value, ToArray(x)), false) if x.typ.isInstanceOf[TStream] =>
  Let(name, value, x)
case ToStream(Let(name, value, ToArray(x)), true) if x.typ.isInstanceOf[TStream] =>
  Let(name, value, StreamMap(x, uid, Ref(uid), requiresMemoryManagementPerElement = true))
```
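The proposed flag can be sketched with a toy region allocator in Python (this is an illustration of the idea, not Hail's implementation): when the flag is set, each element is produced inside its own region, which is freed before the next element is requested.

```python
# Toy model (NOT Hail's implementation) of the proposed
# requiresMemoryManagementPerElement flag on StreamMap.
class Region:
    """A bump-allocator-style region: free everything at once."""
    def __init__(self):
        self.allocations = []

    def allocate(self, nbytes):
        buf = bytearray(nbytes)
        self.allocations.append(buf)
        return buf

    def clear(self):
        self.allocations.clear()

def stream_map(source, body, per_element_regions):
    region = Region()
    for element in source:
        yield body(element, region)
        if per_element_regions:
            # Free this row's garbage before producing the next row.
            region.clear()

def peak_allocations(per_element_regions, n=100):
    live_counts = []
    def body(x, region):
        region.allocate(1024)  # the map body's per-row allocation
        live_counts.append(len(region.allocations))
        return x
    list(stream_map(range(n), body, per_element_regions))
    return max(live_counts)

assert peak_allocations(per_element_regions=True) == 1     # constant per row
assert peak_allocations(per_element_regions=False) == 100  # grows with rows
```

With the flag off, the map body's allocations accumulate for the life of the stream, which is exactly the growth the reproducer below exhibits.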
And I still don't fully understand why these rules were deoptimizing the gnomAD pipeline.
FWIW, I finally found a simpler reproducer. It really takes some doing to convince the simplifier to apply this rule. This operation should use a constant amount of RAM, roughly 1 GiB (in a non-broken pipeline it actually uses closer to 8 GiB, but still a constant amount); instead, memory use grows with each row processed:

```python
import hail as hl

ht = hl.utils.range_table(1)
ht = ht.key_by()
ht = ht.select(rows=hl.range(10))
ht = ht.explode('rows')
ht = ht.annotate(garbage=hl.range(1024 ** 3))
ht.write('/tmp/foo.ht', overwrite=True)
```

The simplifier cannot apply the rule if the key is still present, so this pipeline is sufficient to restore normal memory usage:

```python
import hail as hl

ht = hl.utils.range_table(1)
ht = ht.select(rows=hl.range(10))
ht = ht.explode('rows')
ht = ht.annotate(garbage=hl.range(1024 ** 3))
ht.write('/tmp/foo.ht', overwrite=True)
```

The "bad" IR looks like this:

The "good" IR looks like this:

Notice, in particular, that the
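The failure mode in the reproducer can be sketched without Hail at all. In plain Python terms (a toy analogy, not Hail code): materializing a stream into an array keeps every row's garbage live at once, while lazy consumption keeps only one row alive at a time.

```python
# Toy sketch of why the ToStream(ToArray(s)) round-trip matters.
live = 0  # number of Row objects currently alive

class Row:
    """Stands in for one table row plus its per-row 'garbage'."""
    def __init__(self):
        global live
        live += 1

    def __del__(self):
        global live
        live -= 1

def rows(n):
    for _ in range(n):
        yield Row()

# "Bad" shape: the array forces all rows to be live simultaneously,
# so memory grows with the number of rows.
materialized = list(rows(100))
assert live == 100
del materialized

# "Good" shape: stream the rows; each row is freed as the next is bound.
seen = []
for row in rows(100):
    seen.append(live)
assert max(seen) == 1
```

(The liveness counting relies on CPython's reference counting, which frees each row deterministically as soon as the loop variable is rebound.)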
Issue for StreamMap memory management: #13623

@patrick-schultz bump for tomorrow
That's a very clear example, thanks!
Closes #13606