Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

osm-logical: store needs_commit in own context #12

Closed
wants to merge 1 commit into from
Closed

osm-logical: store needs_commit in own context #12

wants to merge 1 commit into from

Conversation

mmd-osm
Copy link
Contributor

@mmd-osm mmd-osm commented Jul 7, 2020

This commit moves the needs_commit flag to an own context, based on the following example provided in the Postgresql sources: https://github.com/postgres/postgres/blob/master/contrib/test_decoding/test_decoding.c

  • I changed some of the function names to make it clearer that they belong to this plugin. This might come in handy, if this code is somehow involved in a memory dump (which we hope will not happen in the future).
  • Added new function calls to MemoryContextDelete(data->context); / MemoryContextSwitchTo(old); / MemoryContextReset(data->context); ... I'm not 100% happy with the level of code duplication introduced.

Before moving this to production next year, we should try to find a plugin expert on the Postgres dev mailing list to review this code.

ALLOCSET_SMALL_INITSIZE,
ALLOCSET_SMALL_MAXSIZE);
#endif

This comment was marked as outdated.

@mmd-osm mmd-osm marked this pull request as ready for review July 8, 2020 16:19
@joto
Copy link
Collaborator

joto commented Jul 8, 2020

First: You should not do unrelated changes (here the function renaming) in the same PR, it just makes everything hard to review. At least put it into a separate commit. Okay for now, I have read through it, but something to remember for next time.

Second: That seems all hugely complicated. Can't we just malloc/free a bool and put the pointer into output_plugin_private? Its not that we are storing lots of data. I grepped through the postgres source code and output_plugin_private is never touched outside any plugins, so chances are this will just work. If you can find a PostgreSQL expert great, but I don't think we want to wait for that for too long.

@joto
Copy link
Collaborator

joto commented Jul 8, 2020

Come to think of it, we don't even need malloc. We can store the single bit we need in the pointer itself.

@mmd-osm
Copy link
Contributor Author

mmd-osm commented Jul 8, 2020

First: unrelated changes: ok, noted.

Can't we just malloc/free a bool and put the pointer into output_plugin_private?

If you check out the other logical replication implementation on https://wiki.postgresql.org/wiki/Logical_Decoding_Plugins, you'll notice that at least two other implementations use exactly the same approach to store a single boolean value in the context:

To be honest, as this code is running directly in the Postgresql process, I'd be extremely reluctant to use anything which isn't exactly following those published examples. If this doesn't work for any reason, we can go back to the Postgresql folks and ask them to fix their samples or help us with debugging, etc.. If we're using some clever shortcuts, chances are that no one will be able to help us out, if something goes wrong, or changes between versions.

@joto
Copy link
Collaborator

joto commented Jul 8, 2020

Frankly, it all seems like cargo-cult to me, but if your change works, we can just merge it and be done with it. Do you think this is okay now or do you want to ask some PostgreSQL people?

@mmd-osm
Copy link
Contributor Author

mmd-osm commented Jul 8, 2020

As we don't have to move this to production in the next few weeks, I wouldn't mind leaving this PR open for a bit. In the meantime I can try finding someone from the PostgreSQL community who could review the code.

I know this code looks terribly complicated. At the time this feature was introduced in PostgreSQL back in 2014, about 10-15 developers implemented/reviewed the logical decoding feature (which happens to include test-decoding.c we're using as a baseline). I can only assume that the way it was implemented was for a good reason, and everyone involved didn't object to it.

@mmd-osm
Copy link
Contributor Author

mmd-osm commented Jul 10, 2020

Quick update on the recent code changes: I got rid of some of the code duplication by using a switch statement, similar to https://github.com/eulerto/wal2json/blob/master/wal2json.c#L1470. I added an empty REORDER_BUFFER_CHANGE_DELETE branch (we don't delete anything in those 3 tables) to avoid triggering an assert in that highly unlikely "delete" case.

Come to think about wal2json, this plugin is available as Ubuntu package and would probably fulfill all of our requirements: you can filter on a list of tables, get a JSON entry for each change (with format=2) that includes all fields + field values, optionally include the LSN, etc. It's a bit unfortunate that I haven't investigated this plugin in more detail earlier on, but I guess that ship has sailed now.

@mmd-osm
Copy link
Contributor Author

mmd-osm commented Jul 10, 2020

(Note: As this pull request includes multiple topics now, I will split them up again later into smaller chunks and multiple PRs).

@joto
Copy link
Collaborator

joto commented Jul 14, 2020

I have now implemented that "simple version" of this in 75b08df, storing the single bit we need in the pointer itself. This way we have something that should work. If this turns out to be not enough, we can come back to this PR.

@mmd-osm
Copy link
Contributor Author

mmd-osm commented Jul 14, 2020

I did some debugging the other day and have a better understanding about the memory context now. The idea is to instantiate our own global memory manager, which manages the memory that the Postgres functions we’re calling inside our plugin need (in case they need any memory on their own). After we’re done we’re resetting the context and all potentially allocated memory is released again. I guess this is all done to avoid nasty memory leaks.

Plugins typically store a pointer to that memory context in their own defined and allocated structure, which happens to double as a good location to store some other stuff, like Boolean flags. So the overall motivation looks quite different now, and it’s difficult to find some good explanation about those details.

I found this blog post quite helpful: https://blog.pgaddict.com/posts/introduction-to-memory-contexts - it gives a good overview about the different concepts in place.

I hope this explanation made some sense...

@joto
Copy link
Collaborator

joto commented Sep 23, 2020

@mmd-osm Have you found some PostgreSQL people to review the code and/or explain why my simpler solution doesn't work?

I still think this cargo cult and will close this unless somebody convinces me this is needed.

@mmd-osm
Copy link
Contributor Author

mmd-osm commented Sep 23, 2020

I'm not sure if you read my previous comment I wrote on July 14. Do you have some questions on it, or like to add some comments? The current version of code doesn't handle the requirement to manage memory for Postgresql methods, which are being called from the plugin. Did you consider this part when calling this stuff "cargo cult"?

I pinged two of the plugin authors in openstreetmap/operations#154 (comment), but got no reply.

Unfortunately, I don't have any time to drive this topic any further at the moment.

@joto
Copy link
Collaborator

joto commented Sep 23, 2020

Yes, I have read that comment but I don't understand how this applies. We are not allocating memory. Maybe PostgreSQL does allocate memory somewhere behind our backs, but if it does it uses whatever memory context it already has and that's not our concern. The code we currently have does work, so its not that there is no memory context setup already, otherwise we would see some kind of crash.

@mmd-osm
Copy link
Contributor Author

mmd-osm commented Sep 23, 2020

Right, Postgresql would already have a "Logical decoding context" in place when calling our plugin code (see https://github.com/postgres/postgres/blob/master/src/backend/replication/logical/logical.c#L154-L158 + later calls of MemoryContextSwitchTo).
The Postgresql functions we're calling from within the plugin code might allocate some memory on their own using exactly that context, hence the code doesn't crash.

I think the reason why lots of other plugins used their own memory context is to isolate the plugin code from the rest of the Postgresql core. Any (potential) memory leak in the plugin code would be contained to the plugin itself, and not impact a memory context which we don't own. It's a bit like everyone keeping their own space tidy, it reduces the risk that the whole database will have an issue.

@joto
Copy link
Collaborator

joto commented Sep 27, 2020

I am closing this for now as it isn't actionable. We can reopen this if needed.

@joto joto closed this Sep 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants