-
-
Notifications
You must be signed in to change notification settings - Fork 474
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refuse to import data when there is an osm2pgsql_properties table #2085
Conversation
Will this require manually deleting the table or will a |
@rustprooflabs As the moment there is no I realize that this is quite a change compared to earlier behaviour, so maybe it is the wrong move here. I created this because I have been bitten twice already with this recently. The reason this is a bit more of a problem now than it was before is the |
@joto, I am not sure if this is the right move. This actually kind of reverts an earlier fix. Yes, with the increasing complexity of osm2pgsql, it will inevitably be easier to clobber things up. But I am afraid with all the new stuff, including the new "Themepark" I saw announced, this is virtually unavoidable. We're way past the point of osm2pgsql being a tool you can just "hit and run" (if it ever was with amount of command line options). (Desirable) Flexibility comes at a price... If you do intend to change this, I second @rustprooflabs 's request for some |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm uncertain on this, but if we do it, we need at the least a command line option to remove the table to do a new import, and possibly a --force
option.
A lot of my transform development involves reimporting over an existing DB.
I'm not too worried about other people's habits. 😄 Seriously though, I think if there's a good error message (Error: Cannot import because of existing data. Run with
I ran into this exact problem in rustprooflabs/pgosm-flex#313, I do like the default-safe option. I think this type of protection is a good idea. However, if it's decided not to add this type of check I am also okay. I run essentially all of my OSM imports via PgOSM Flex and have already added the protection I want there. If it's added here, I just need to update to pass that new option along to osm2pgsql appropriately. |
So it looks like people would be okay with this if we have a command line option to get the old behaviour, i.e. do not do this check. But What should we name this option? I am not sure the behaviour of osm2pgsql is sufficiently clear (either with or without that option) so maybe we need a different solution? |
Why is If |
It is just much too unspecific. What is being "forced" here? I know having a |
Not sure what to do with this one. I very much agree that the current bahaviour of osm2pgsql is dangerous and it should rather be conservative about overwriting existing data. But we've had this behaviour for a long time and I'm fully aware that many users rely on it. The properties table does change the game a little bit: you cannot do two imports with different prefixes on the same database anymore without breaking something. That was not the case before. The recommended work-around is to use schemas but, again, we do have some historic baggage here. If we do change the behaviour, then I'd rather do it without introducing an additional command line. We already have far too many of them and you can simply overwrite the check by running a |
This checks stops users from clobbering an existing database.
I guess I'm not aware of the baggage with schemas. How does this interact with flex styles that save tables to schemas? A good portion of what I've found useful lately is loading a variety of There are already limitations around multiple sources within one DB with replication (e.g. osmcode/pyosmium#214), I'd prefer not to see multiple sources further limited. If I have to drop the properties tables to load into multiple schemas I can, but that seems to dilute the purpose of that table in this scenario. Would it be possible to define which schema the properties table is saved in? This seems like it would still allow supporting multiple imports to a single database w/out losing the addition of safety against accidentally doing bad things. Another possible approach could be changing the table from a tall table (2 columns, many rows) into wide structure. Either via dedicated columns or a JSON/JSONB column? This would allow adding a
Understandable, I'm okay with your suggestion to manually delete the table when needed. |
(Ups. What I wrote here before was wrong.) You can use the |
Looks like we are not 100% sure to go forward with this and the PR is out of date by now. I am closing this PR maybe we'll revisit this issue later on. |
This checks stops users from clobbering an existing database.