-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
osm-logical: store needs_commit in own context #12
osm-logical: store needs_commit in own context #12
Conversation
ALLOCSET_SMALL_INITSIZE, | ||
ALLOCSET_SMALL_MAXSIZE); | ||
#endif | ||
|
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
First: You should not do unrelated changes (here the function renaming) in the same PR, it just makes everything hard to review. At least put it into a separate commit. Okay for now, I have read through it, but something to remember for next time. Second: That seems all hugely complicated. Can't we just malloc/free a bool and put the pointer into |
Come to think of it, we don't even need malloc. We can store the single bit we need in the pointer itself. |
First: unrelated changes: ok, noted.
If you check out the other logical replication implementation on https://wiki.postgresql.org/wiki/Logical_Decoding_Plugins, you'll notice that at least two other implementations use exactly the same approach to store a single boolean value in the context:
To be honest, as this code is running directly in the Postgresql process, I'd be extremely reluctant to use anything which isn't exactly following those published examples. If this doesn't work for any reason, we can go back to the Postgresql folks and ask them to fix their samples or help us with debugging, etc.. If we're using some clever shortcuts, chances are that no one will be able to help us out, if something goes wrong, or changes between versions. |
Frankly, it all seems like cargo-cult to me, but if your change works, we can just merge it and be done with it. Do you think this is okay now or do you want to ask some PostgreSQL people? |
As we don't have to move this to production in the next few weeks, I wouldn't mind leaving this PR open for a bit. In the meantime I can try finding someone from the PostgreSQL community who could review the code. I know this code looks terribly complicated. At the time this feature was introduced in PostgreSQL back in 2014, about 10-15 developers implemented/reviewed the logical decoding feature (which happens to include |
Quick update on the recent code changes: I got rid of some of the code duplication by using a Come to think about wal2json, this plugin is available as Ubuntu package and would probably fulfill all of our requirements: you can filter on a list of tables, get a JSON entry for each change (with format=2) that includes all fields + field values, optionally include the LSN, etc. It's a bit unfortunate that I haven't investigated this plugin in more detail earlier on, but I guess that ship has sailed now. |
(Note: As this pull request includes multiple topics now, I will split them up again later into smaller chunks and multiple PRs). |
I have now implemented that "simple version" of this in 75b08df, storing the single bit we need in the pointer itself. This way we have something that should work. If this turns out to be not enough, we can come back to this PR. |
I did some debugging the other day and have a better understanding about the memory context now. The idea is to instantiate our own global memory manager, which manages the memory that the Postgres functions we’re calling inside our plugin need (in case they need any memory on their own). After we’re done we’re resetting the context and all potentially allocated memory is released again. I guess this is all done to avoid nasty memory leaks. Plugins typically store a pointer to that memory context in their own defined and allocated structure, which happens to double as a good location to store some other stuff, like Boolean flags. So the overall motivation looks quite different now, and it’s difficult to find some good explanation about those details. I found this blog post quite helpful: https://blog.pgaddict.com/posts/introduction-to-memory-contexts - it gives a good overview about the different concepts in place. I hope this explanation made some sense... |
@mmd-osm Have you found some PostgreSQL people to review the code and/or explain why my simpler solution doesn't work? I still think this cargo cult and will close this unless somebody convinces me this is needed. |
I'm not sure if you read my previous comment I wrote on July 14. Do you have some questions on it, or like to add some comments? The current version of code doesn't handle the requirement to manage memory for Postgresql methods, which are being called from the plugin. Did you consider this part when calling this stuff "cargo cult"? I pinged two of the plugin authors in openstreetmap/operations#154 (comment), but got no reply. Unfortunately, I don't have any time to drive this topic any further at the moment. |
Yes, I have read that comment but I don't understand how this applies. We are not allocating memory. Maybe PostgreSQL does allocate memory somewhere behind our backs, but if it does it uses whatever memory context it already has and that's not our concern. The code we currently have does work, so its not that there is no memory context setup already, otherwise we would see some kind of crash. |
Right, Postgresql would already have a "Logical decoding context" in place when calling our plugin code (see https://github.com/postgres/postgres/blob/master/src/backend/replication/logical/logical.c#L154-L158 + later calls of MemoryContextSwitchTo). I think the reason why lots of other plugins used their own memory context is to isolate the plugin code from the rest of the Postgresql core. Any (potential) memory leak in the plugin code would be contained to the plugin itself, and not impact a memory context which we don't own. It's a bit like everyone keeping their own space tidy, it reduces the risk that the whole database will have an issue. |
I am closing this for now as it isn't actionable. We can reopen this if needed. |
This commit moves the
needs_commit
flag to an own context, based on the following example provided in the Postgresql sources: https://github.com/postgres/postgres/blob/master/contrib/test_decoding/test_decoding.cBefore moving this to production next year, we should try to find a plugin expert on the Postgres dev mailing list to review this code.