Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PostalCode city FK #144

Open
caullla opened this issue Oct 27, 2016 · 8 comments
Open

PostalCode city FK #144

caullla opened this issue Oct 27, 2016 · 8 comments

Comments

@caullla
Copy link

caullla commented Oct 27, 2016

Hi,

Are you sure you FK is enough? For example in Germany, cities with same postal code, can exist.
I think 2m2 will be better fit.
Thanks.

https://github.com/coderholic/django-cities/blob/master/cities/models.py#L181

@blag
Copy link
Collaborator

blag commented Oct 27, 2016

Can you give me an example of two cities with the same postal code?

And assuming that's true, then yes, a M2M relationship would be a better fit. That's not difficult to code, but that would be...involved...to transition all of our users to. Is that something you could look into creating a PR for?

@caullla
Copy link
Author

caullla commented Oct 27, 2016

I made inspection, and found cities which are present as separate instances in City model and have same postal_code.

[<City: Westerland>, <City: Sylt-Ost>] # postal_code = 25980
[<City: Klausdorf>, <City: Kiel>] # postal_code = 24147

But now I'm not sure about m2m. Yea, postal_code the same, but those PostalCode records isn't equal for 100%. And until PostalCode.code is not unique in database schema, FK could be enough.
So we can have duplicated code with different location or district_name.

What do you think about?

@blag
Copy link
Collaborator

blag commented Oct 28, 2016

I would rather have M2M models between the appropriate models than duplicate data. How far up the hierarchy do we need to switch to M2M (or, to ask it a different way: are those two cities in different Subregions or Regions?

If we really wanted to go wild with it, we would basically model the entire hierarchy as a graph instead of a tree, and then model the entire history of all places (eg: cities in the Ottoman Empire or Austro-Hungary, Berlin belonging to both East and West Germany).

All of that is possible, the real issue is the cities import command. That just gets more convoluted the more data model changes we make. 😖

@caullla
Copy link
Author

caullla commented Oct 28, 2016

m2m is not a silver bullet. I would say, best fit looks like this

City <- PostalCode -> Code

class City(models.Model):
     postal_codes = models.ManyToMany(Code, through=PostalCode)

One more question/option. Do we have a polygons for region, subregion, city. We can use geometry instead of graph. Make sense?

@blag
Copy link
Collaborator

blag commented Oct 28, 2016

What data goes in the PostalCode model and what data goes in the Code model?

Right now we import points for cities, districts, and postal codes.

Polygons exist. They are distributed in per-country zip files from Geonames, and we do nothing with them because they would presumably take forever to import and few users have requested them. There was an issue (#125) where somebody was going to add importing polygons.

@caullla
Copy link
Author

caullla commented Oct 28, 2016

Code will have only code(currently in PostalCode.code). That's why I don't see reason to add m2m.
Right now PostalCode can have duplicated value in code field, but in same time values of other fields isn't equal.

If we find a way to import polygons, we can use them to fill up any relations. Using polygons we can query all PostalCode in Region or all PostalCode in SubRegion. We also can use geometry on import, I don't know how you build relations right now, but with geometry you don't need to check anything, except Point in Polygon. You can build, City -> Region, City-> Country, PostalCode-> City, etc.. And import order doesn't matter.

ps. Assuming that Point is in the middle of Polygon(Point if far enough of Polygon border), it should works even if geometry have not 100% accuracy.

@blag
Copy link
Collaborator

blag commented Oct 28, 2016

Using polygons we can query all PostalCode in Region or all PostalCode in SubRegion

I suspect that would be incredibly slow but I'm willing to try it. Could you code up a PR that implements that?

@caullla
Copy link
Author

caullla commented Oct 28, 2016

I'll start with polygon import.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants