Parsing Canadian postal addresses is not accurate #128

Open
jsvachon2 opened this issue Sep 19, 2019 · 0 comments

Some Canadian (especially French) postal addresses are not parsed correctly. One reason is the lack of support for some French vocabulary (like 'rue' for street or 'boul' for boulevard). These keywords are quite easy to add to the vocabulary, but even then some addresses are still not parsed correctly. Here are a few examples, taken after I added support for some of the French keywords.

  • 609 boulevard des Bois Francs Sud, Victoriaville, QC
    {'word': '609', 'tag': 27}
    {'word': 'boulevard', 'tag': 30}
    {'word': 'qc', 'tag': 14}
    In this example, "des Bois Francs" is the full street name, "Sud" is the direction, "Victoriaville" is the city, and "QC" is the state/province (tagged as an abbreviation above).

  • 181 rue Principale, Gatineau, QC
    {'word': '181', 'tag': 27}
    {'word': 'street', 'tag': 30}
    {'word': 'qc', 'tag': 14}
    "Principale" is the street name, "Gatineau" the city and "qc" the state/province (tagged as abbreviation above)

  • 1016 Bouvier, Québec, QC
    {'word': 'qc', 'tag': 14}
    Here the street name "Bouvier" has no street-type prefix, and 'qc' is tagged as an abbreviation rather than a state/province.
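To illustrate the 'qc' case in all three examples, here is a minimal sketch of a province lookup. It is not the library's code: `CANADIAN_PROVINCES` and `province_from_address` are hypothetical names, and the sketch assumes the province code is the last comma-separated part of the address.

```python
# Two-letter postal codes for the Canadian provinces and territories.
# Hypothetical lookup table for illustration, not part of the library.
CANADIAN_PROVINCES = {
    "AB": "Alberta", "BC": "British Columbia", "MB": "Manitoba",
    "NB": "New Brunswick", "NL": "Newfoundland and Labrador",
    "NS": "Nova Scotia", "NT": "Northwest Territories", "NU": "Nunavut",
    "ON": "Ontario", "PE": "Prince Edward Island", "QC": "Quebec",
    "SK": "Saskatchewan", "YT": "Yukon",
}


def province_from_address(address):
    """Return (code, name) if the last comma-separated part is a province code.

    Assumes the province comes last; returns None otherwise.
    """
    last = address.rsplit(",", 1)[-1].strip().upper()
    if last in CANADIAN_PROVINCES:
        return last, CANADIAN_PROVINCES[last]
    return None


print(province_from_address("1016 Bouvier, Québec, QC"))
```

With a table like this, a token such as 'qc' could be reported as a state/province instead of a generic abbreviation.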

These are only a few examples, but two cases seem to be a problem: street names that have no street-type prefix, and street names composed of multiple words. I can help add support for French keywords in some areas if required. The difference in word order between French and English is also likely to cause issues: "Main street" becomes "rue Main" in French, and the code seems to look for the address components in a fixed order.
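As a rough sketch of what order-independent handling could look like (an assumption on my part, not how the library works), the street type can be accepted either before the name (French) or after it (English). `STREET_TYPES`, `DIRECTIONS`, and `parse_street` are hypothetical names, and both vocabularies are deliberately incomplete.

```python
# Hypothetical, incomplete vocabularies mapping French and English
# keywords to one normalized form.
STREET_TYPES = {
    "rue": "street", "street": "street", "st": "street",
    "boulevard": "boulevard", "boul": "boulevard", "blvd": "boulevard",
    "avenue": "avenue", "av": "avenue", "ave": "avenue",
}
DIRECTIONS = {
    "nord": "north", "sud": "south", "est": "east", "ouest": "west",
    "north": "north", "south": "south", "east": "east", "west": "west",
}


def _key(token):
    """Normalize a token for vocabulary lookup ('Boul.' -> 'boul')."""
    return token.lower().rstrip(".")


def parse_street(street_line):
    """Split a street line into number, type, name and direction."""
    tokens = street_line.split()
    parsed = {"number": None, "type": None, "name": None, "direction": None}
    if tokens and tokens[0].isdigit():
        parsed["number"] = tokens.pop(0)
    # French order: the street type comes before the name ("rue Main").
    if len(tokens) > 1 and _key(tokens[0]) in STREET_TYPES:
        parsed["type"] = STREET_TYPES[_key(tokens.pop(0))]
    # An optional direction follows the name ("... Sud").
    if len(tokens) > 1 and _key(tokens[-1]) in DIRECTIONS:
        parsed["direction"] = DIRECTIONS[_key(tokens.pop())]
    # English order: the street type comes after the name ("Main street").
    if parsed["type"] is None and len(tokens) > 1 and _key(tokens[-1]) in STREET_TYPES:
        parsed["type"] = STREET_TYPES[_key(tokens.pop())]
    # Whatever remains is the street name, possibly several words.
    parsed["name"] = " ".join(tokens) or None
    return parsed


print(parse_street("609 boulevard des Bois Francs Sud"))
print(parse_street("1016 Bouvier"))
```

The `len(tokens) > 1` checks keep at least one token for the street name, so an unprefixed name like "Bouvier" is never consumed as a type or direction.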

Is there anything that can be done to better support French postal addresses like the ones above?
