Describing cut-over phase via UDF wait condition #50
Comments
Known catch: what if
I've read through this once and it hurts my brain. I know this isn't helpful, but I don't want you coming in in the morning thinking no one has commented. I think I'm gonna need to read through it a few more times to get my head around it. My only concern so far is related to having multiple migrations going at once and having them end near the same time and having the UDF unlock one before it's time. But as I said, my brain, it hurts. So I might be missing the reason why it is not an issue. I'll give it another look in the morning.
I'm so glad to share the pain. If ever I get a burn-out, you can tell the psychiatrist the gh-ost cut-over phase is a major contributor.
This is actually OK. Reason:
This is where the UDF code resides, for now: https://github.com/openark/udf-ghost-wait-condition
@shlomi-noach in chat you discussed fancy charts and pretty graphs* to help with this. I'm 💯 behind this idea. I think it would be very helpful.
A few more words on how this idea came to be. As per #26, there are two connections, and the premature death of either would cause a premature cut-over:
The solution depicted here answers these two problems:
Thank you for spending the time reviewing this. I'm closing this issue as #65 came up, which I believe solves the cut-over without a UDF.
I wrote these UDFs (will cross-link once in a proper repo):
create_ghost_wait_condition()
destroy_ghost_wait_condition()
ghost_wait_on_condition()
For now, they use a singular, global wait condition (maybe in the future we will support multiple).
The wait condition is a lock which is not bound to a connection. So if I:
select create_ghost_wait_condition()
Then the lock is taken, and is kept taken even if my connection dies. Anyone can, at any time:
select destroy_ghost_wait_condition()
And release the lock.
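To make these semantics easier to reason about, here is a minimal Python sketch that models the single, global wait condition with a `threading.Event`. It is a toy model, not the UDF source: the class name `GhostWaitCondition`, its method names, and the `timeout` argument are assumptions made for illustration only. Threads stand in for MySQL connections.

```python
import threading


class GhostWaitCondition:
    """Toy model of the single, global wait condition (hypothetical names)."""

    def __init__(self):
        # Event set == condition is free; Event cleared == condition is taken.
        self._free = threading.Event()
        self._free.set()

    def create(self):
        # Models create_ghost_wait_condition(): take the lock. Nothing ties it
        # to the calling thread, so it stays taken after the caller is gone.
        self._free.clear()

    def destroy(self):
        # Models destroy_ghost_wait_condition(): any caller may release it,
        # which wakes every waiter at once.
        self._free.set()

    def wait(self, timeout=None):
        # Models ghost_wait_on_condition(): returns immediately when free,
        # blocks while taken. The timeout exists only so the demo can probe.
        return self._free.wait(timeout)


cond = GhostWaitCondition()

# "Connection" A takes the condition and then dies.
conn_a = threading.Thread(target=cond.create)
conn_a.start()
conn_a.join()

released = []


def waiter(name):
    cond.wait()
    released.append(name)


# Several concurrent waiters all block on the same condition.
waiters = [threading.Thread(target=waiter, args=(i,)) for i in range(3)]
for w in waiters:
    w.start()

# The condition is still taken even though its creator is gone.
still_blocked = not cond.wait(timeout=0.1)

# A different "connection" releases the condition; all waiters proceed.
cond.destroy()
for w in waiters:
    w.join()
```

Because the model has exactly one condition, it also makes the limitation visible: two migrations sharing it could not be released independently.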
The function ghost_wait_on_condition() returns immediately if the condition is free, or blocks if the condition is taken. Multiple ghost_wait_on_condition() calls can run concurrently and they will all wait. Once destroy_ghost_wait_condition() is called they all get released.

Cut-over via wait condition

We have these tables:
- tbl - original table
- _tbl_gst - ghost table

Sequence of events is:
- create view _tbl_gst_v as select * from _tbl_gst where ghost_wait_on_condition() is not null with check option
  Breakdown:
  - ghost_wait_on_condition() is always not null
  - with check option means every insert, delete, update on existing rows will validate that the view definition is met, i.e. the where clause is satisfied, i.e. we wait on lock.
- select create_ghost_wait_condition()
- rename table tbl to _tbl_old, _tbl_gst_v to tbl
  - The app now operates on the view instead of the original table. The view reads/writes to _tbl_gst
  - insert|delete|update queries operating on _tbl_gst_v (now named tbl) are blocked
  - The exception is update or delete that operate on non-existent rows (hence make no change, hence not visible in RBR anyhow, hence irrelevant)
- Working on backlog, applying all those last changes read from the binary log onto _tbl_gst
  - _tbl_gst itself is not blocked in any way.
- select destroy_ghost_wait_condition()
  - Blocked queries are released and operate on tbl (our view). It still reads/writes to _tbl_gst.
- rename table tbl to _tbl_gst_v_old, _tbl_gst to tbl
- drop view _tbl_gst_v_old
🍕
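For anyone who wants to step through the sequence with a different table name, the statements above can be written down as an ordered list. This is only a sketch: the helper name `cut_over_statements` is hypothetical, and the `_<table>_gst` / `_<table>_old` naming is inferred from the `tbl` example in this issue. It only builds the SQL strings and does not talk to MySQL.

```python
def cut_over_statements(table):
    """Return the cut-over statements from this issue, in order (hypothetical helper)."""
    ghost = f"_{table}_gst"    # ghost table, e.g. _tbl_gst
    view = f"{ghost}_v"        # blocking view, e.g. _tbl_gst_v
    old = f"_{table}_old"      # where the original table is parked
    return [
        # 1. View whose where clause waits on the global condition.
        f"create view {view} as select * from {ghost} "
        "where ghost_wait_on_condition() is not null with check option",
        # 2. Take the condition: writes through the view now block.
        "select create_ghost_wait_condition()",
        # 3. Swap the view in place of the original table.
        f"rename table {table} to {old}, {view} to {table}",
        # (The binlog backlog is applied directly onto the ghost table here;
        #  that step is not a single statement, so it is not in the list.)
        # 4. Release the condition: blocked writes proceed through the view.
        "select destroy_ghost_wait_condition()",
        # 5. Swap the ghost table in place of the view.
        f"rename table {table} to {view}_old, {ghost} to {table}",
        # 6. Clean up the view.
        f"drop view {view}_old",
    ]


stmts = cut_over_statements("tbl")
for s in stmts:
    print(s)
```

Keeping the steps as data makes it easy to check the ordering invariant the design relies on: the condition is created before the first rename and destroyed before the second.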
Will you please run this through your virtual interpreter in your brains? Let's assume the UDFs work perfectly well (and they're simple enough to support that assumption).