Using district names in Canadian Federal Electoral District OCD-IDs #323

Open
evannjw opened this issue Nov 1, 2022 · 18 comments
Comments

@evannjw
Contributor

evannjw commented Nov 1, 2022

A new set of OCD-IDs is created every time there is redistricting, which creates a problem for how we represent districts continuously across time. Rather than use the numeric federal district IDs, which only represent a district’s identity for 10 years, we should use district names, as they more closely represent a district’s identity. One concern with switching to names as identifiers is that they can change between redistricting cycles (see https://www12.statcan.gc.ca/census-recensement/2011/ref/FED_name_changes-changements_noms_CEF-eng.cfm and https://openparliament.ca/bills/42-1/C-402/). When names are changed, new IDs and aliases can be created so that identity is maintained through the canonical OCD-ID. When redistricting occurs, districts created after redistricting that don’t map to an old one (neither the same name nor a renaming of a district) will be considered new districts, and IDs will be created for them. Old districts that no longer exist will have a validThrough date to indicate they are no longer active.

@jpmckinney
Member

jpmckinney commented Nov 2, 2022

To understand the proposal:

  • When X is renamed to Y, which is the canonical ID and which is the alias?
  • If X is canonical: how far back do we go to establish the canonical ID? (1867? Easy enough with the source below.)
  • How are collisions resolved across time (same name for different districts at different times)? For example, Ahuntsic from 1968 to 1979 vs 1988 to 2015. There are 213 names that repeat, up to 6 times (Victoria).
  • Ahuntsic: 2
  • Algoma: 2
  • Argenteuil: 3
  • Argenteuil--Deux-Montagnes: 2
  • Argenteuil--Papineau--Mirabel: 2
  • Bas-Richelieu--Nicolet--Bécancour: 2
  • Battle River: 3
  • Beauharnois: 3
  • Beauharnois--Salaberry: 3
  • Beauséjour: 2
  • Berthier: 2
  • Berthier--Maskinongé: 3
  • Bourassa: 2
  • Bow River: 3
  • Bramalea--Gore--Malton: 2
  • Brampton Centre: 2
  • Brant: 2
  • Brantford: 2
  • Brome--Missisquoi: 3
  • Bruce South: 2
  • Bruce--Grey--Owen Sound: 2
  • Burin--St. George's: 2
  • Burrard: 2
  • Calgary: 2
  • Calgary Centre: 2
  • Calgary East: 3
  • Calgary West: 2
  • Cape Breton South: 2
  • Cape Breton--East Richmond: 2
  • Cape Breton--The Sydneys: 2
  • Cariboo: 2
  • Carleton: 3
  • Carleton--Gloucester: 2
  • Central Nova: 2
  • Chambly: 2
  • Charlesbourg: 2
  • Charlevoix: 2
  • Charlevoix--Montmorency: 3
  • Charlotte: 2
  • Châteauguay: 2
  • Chicoutimi: 2
  • Chicoutimi--Le Fjord: 2
  • Cochrane: 2
  • Comox--Alberni: 2
  • Compton: 2
  • Cumberland--Colchester: 2
  • Delta: 2
  • Digby--Annapolis--Kings: 2
  • Don Valley North: 2
  • Durham: 3
  • Edmonton: 2
  • Edmonton Centre: 2
  • Edmonton East: 2
  • Edmonton West: 3
  • Essex: 3
  • Fraser Valley: 2
  • Fredericton: 2
  • Frontenac: 2
  • Gatineau: 2
  • Glengarry: 2
  • Guelph: 2
  • Haldimand: 2
  • Haldimand--Norfolk: 2
  • Halton: 2
  • Hastings--Frontenac: 2
  • Hastings--Frontenac--Lennox and Addington: 2
  • Hochelaga: 2
  • Humber--St. Barbe--Baie Verte: 2
  • Humboldt: 2
  • Huron North: 2
  • Joliette: 2
  • Jonquière: 2
  • Kamloops: 2
  • Kent: 4
  • Kent--Essex: 2
  • King's: 2
  • Kingston: 2
  • Kitchener--Conestoga: 2
  • Kootenay East: 2
  • La Prairie: 2
  • Lac-Saint-Jean: 2
  • Lachine: 2
  • Lakeland: 2
  • Lambton--Kent--Middlesex: 2
  • Laprairie: 2
  • Lasalle: 2
  • Laurier: 2
  • Laurier--Sainte-Marie: 2
  • Laval: 4
  • Lethbridge: 2
  • Lincoln: 2
  • Longueuil: 2
  • Lunenburg: 2
  • Mackenzie: 2
  • Macleod: 2
  • Maisonneuve: 3
  • Maisonneuve--Rosemont: 2
  • Marc-Aurèle-Fortin: 2
  • Markham: 2
  • Matane: 2
  • Matapédia--Matane: 2
  • Mégantic: 2
  • Mercier: 2
  • Missisquoi: 3
  • Mississauga Centre: 2
  • Mississauga East--Cooksville: 2
  • Montcalm: 2
  • Montmorency: 2
  • Moose Jaw: 2
  • Moose Jaw--Lake Centre: 2
  • Mount Royal: 4
  • Muskoka: 2
  • Nanaimo--Alberni: 2
  • Nepean: 2
  • Nepean--Carleton: 2
  • New Brunswick Southwest: 2
  • New Westminster--Burnaby: 2
  • New Westminster--Coquitlam: 2
  • Niagara Centre: 2
  • Norfolk: 2
  • North Battleford: 2
  • North Island--Powell River: 2
  • North Okanagan--Shuswap: 2
  • Northumberland: 3
  • Northwest Territories: 2
  • Notre-Dame-de-Grâce: 2
  • Nunavut: 2
  • Okanagan--Shuswap: 3
  • Outremont: 2
  • Palliser: 2
  • Papineau: 2
  • Perth: 2
  • Pontiac: 3
  • Port Moody--Coquitlam: 2
  • Prince Albert: 2
  • Prince Edward: 3
  • Prince Edward--Hastings: 2
  • Prince George--Peace River: 2
  • Qu'Appelle: 3
  • Quebec West: 2
  • Queen's: 2
  • Queens--Lunenburg: 2
  • Regina East: 2
  • Regina--Qu'Appelle: 2
  • Regina--Wascana: 2
  • Renfrew--Nipissing--Pembroke: 2
  • Restigouche: 2
  • Richelieu: 3
  • Richmond: 3
  • Richmond--Wolfe: 2
  • Rimouski--Témiscouata: 2
  • Rimouski-Neigette--Témiscouata--Les Basques: 2
  • Rivière-du-Loup--Témiscouata: 2
  • Saanich--Gulf Islands: 2
  • Sackville--Eastern Shore: 2
  • Saint John: 2
  • Saint-Hyacinthe--Bagot: 2
  • Saint-Jacques: 2
  • Saint-Laurent: 2
  • Sainte-Marie: 2
  • Sarnia: 2
  • Sarnia--Lambton: 2
  • Saskatoon: 2
  • Saskatoon West: 2
  • Saskatoon--Humboldt: 2
  • Sault Ste. Marie: 2
  • Selkirk: 2
  • Selkirk--Interlake: 2
  • Shelburne--Yarmouth--Clare: 2
  • Simcoe South: 2
  • St. Ann: 3
  • St. Antoine--Westmount: 2
  • St. Boniface: 3
  • St. James: 2
  • St. John's East: 4
  • St. John's West: 3
  • St. Lawrence--St. George: 3
  • St. Mary: 2
  • St. Paul's: 3
  • Stormont: 3
  • Strathcona: 3
  • Surrey--White Rock: 2
  • Témiscamingue: 2
  • Témiscouata: 2
  • Terrebonne: 2
  • Three Rivers: 2
  • Timiskaming: 2
  • Toronto Centre: 2
  • Toronto West: 2
  • Trois-Rivières: 2
  • Vancouver Kingsway: 2
  • Vancouver South: 2
  • Vaudreuil: 2
  • Vaudreuil--Soulanges: 3
  • Verchères: 2
  • Verdun: 2
  • Victoria: 6
  • Waterloo: 3
  • Welland: 2
  • West Nova: 3
  • Westlock--St. Paul: 2
  • Windsor--St. Clair: 2
  • Winnipeg Centre: 2
  • Winnipeg North: 2
  • Winnipeg North Centre: 2
  • Winnipeg South: 2
  • Winnipeg South Centre: 3
  • Yale: 2
  • York Centre: 2
  • York East: 3
  • York West: 2
  • York--Simcoe: 3
  • Yukon: 2
  • How are collisions resolved at the same time (IDs are not scoped by province, but two provinces can have the same district name)? For example, Victoria (Ontario) and Victoria (British Columbia).
  • Edit: What happens if there is a new district that shares the same name as in a previous redistricting? (The official source considers the old one to have been abolished, and the new one created.) Lots of examples like Laval before/after 1979.
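
A minimal sketch of how repeat counts like the ones above could be computed, assuming one row per district as listed by the source (so a name that was abolished and later reused appears once per district). The `repeated_names` helper and the sample rows are hypothetical illustrations, not the script or data actually used in this thread.

```python
from collections import Counter


def repeated_names(rows, scope_by_province=False):
    """Return {key: count} for every key that occurs more than once.

    rows: iterable of (district_name, province) pairs, one per district.
    scope_by_province: if True, the key is (name, province) instead of name.
    """
    keys = [
        (name, province) if scope_by_province else name
        for name, province in rows
    ]
    counts = Counter(keys)
    return {key: n for key, n in counts.items() if n > 1}


# Illustrative rows only; the real input would be the full riding listing.
rows = [
    ("Ahuntsic", "Quebec"),
    ("Ahuntsic", "Quebec"),
    ("Victoria", "Ontario"),
    ("Victoria", "British Columbia"),
    ("Example Riding", "Ontario"),  # hypothetical name that never repeats
]

print(repeated_names(rows))
print(repeated_names(rows, scope_by_province=True))
```

Keying on the name alone reports both Ahuntsic and Victoria as repeats; keying on (name, province) separates same-time collisions such as Victoria from genuine repeats across time such as Ahuntsic.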

All ridings for all time are listed here: https://lop.parl.ca/sites/ParlInfo/default/en_CA/ElectionsRidings/Ridings There are 32 without any dates. If you enable the "Additional Information" column, you'll see why 27 don't have a start/end date - they were abolished before coming into force. The other 5, however, just lack dates (data quality issue).

Some other examples of renamings between redistrictings from one set of search terms (check for "royal assent on"): https://www.parl.ca/LegisInfo/en/bills?keywords=%22An%20Act%20to%20change%20the%20name%20of%22%20electoral%20district&parlsession=all&sortby=session-desc

@evannjw
Contributor Author

evannjw commented Nov 2, 2022

  • When X is renamed to Y, which is the canonical ID and which is the alias?

When X is renamed to Y, a new id will be created for Y and an alias will be created stating X sameAs Y making Y canonical.
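
To make the renaming rule concrete, here is a rough sketch of how a client might follow sameAs aliases to reach the canonical ID when a district has been renamed more than once. The `resolve_canonical` function, the in-memory `same_as` mapping and the placeholder IDs are assumptions for illustration; they are not part of the OCD-ID tooling.

```python
def resolve_canonical(ocd_id, same_as):
    """Follow sameAs aliases until reaching an ID that is not itself an alias.

    same_as maps an alias ID to the ID it is declared to be the same as.
    Raises ValueError if the aliases form a cycle.
    """
    seen = set()
    while ocd_id in same_as:
        if ocd_id in seen:
            raise ValueError(f"alias cycle detected at {ocd_id}")
        seen.add(ocd_id)
        ocd_id = same_as[ocd_id]
    return ocd_id


# Placeholder IDs: X was renamed to Y, and Y was later renamed to Y2.
same_as = {
    "ocd-division/country:ca/ed:x": "ocd-division/country:ca/ed:y",
    "ocd-division/country:ca/ed:y": "ocd-division/country:ca/ed:y2",
}

print(resolve_canonical("ocd-division/country:ca/ed:x", same_as))
```

Under this rule every rename moves the canonical ID to the newest name, so any client holding an older ID needs a lookup like this one.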

  • How are collisions resolved across time (same name for different districts at different times)? For example, Ahuntsic from 1968 to 1979 vs 1988 to 2015. There are 213 names that repeat, up to 6 times (Victoria).

Collisions will be resolved by removing the validThrough date and adding a validFrom date. The same id will be used for both periods that the district existed since the identity of the district is maintained by its name. https://en.wikipedia.org/wiki/Ahuntsic_(electoral_district)
There is also the case where X is renamed Y and Z is renamed X. I think in this case, the aliases between X to Y and Z to X should be removed, effectively creating new district Y and dissolving district Z.

  • How are collisions resolved at the same time (IDs are not scoped by province, but two provinces can have the same district name)? For example, Victoria (Ontario) and Victoria (British Columbia).

That’s a good point, we should include province in the id to disambiguate (ocd-division/country:ca/province:on/ed:victoria)
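
As a sketch of what province-scoped, name-based IDs might look like in practice, the snippet below builds an ID from a province code and a district name. The normalisation in `slugify` (lowercase, spaces to underscores, any other disallowed character to a tilde) is an assumption about how names would be converted, and both function names are hypothetical.

```python
import re


def slugify(name):
    """Assumed normalisation: lowercase, spaces become underscores, and any
    character other than a letter, digit, '_', '-', '.' or '~' becomes '~'."""
    lowered = name.lower().replace(" ", "_")
    return re.sub(r"[^\w.~-]", "~", lowered)


def province_scoped_id(province_code, district_name):
    """Build an ID of the form suggested above (province code is lowercase)."""
    return (
        f"ocd-division/country:ca/province:{province_code}"
        f"/ed:{slugify(district_name)}"
    )


print(province_scoped_id("on", "Victoria"))
print(province_scoped_id("bc", "Victoria"))
```

With the province in the path, Victoria (Ontario) and Victoria (British Columbia) no longer collide, though two districts with the same name in the same province at different times still would.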

  • Edit: What happens if there is a new district that shares the same name as in a previous redistricting? (The official source considers the old one to have been abolished, and the new one created.) Lots of examples like Laval before/after 1979.

I may be misunderstanding you here, but a new district that shares its name with a district from a previous redistricting should be resolved in the same way as collisions across time. https://en.wikipedia.org/wiki/Laval_(electoral_district)

I'm not certain what effect some of the renaming bills have, or whether they just haven't passed yet, but maybe we can use https://www.elections.ca/content.aspx?section=res&dir=cir/list&document=index338&lang=e as the source of truth for name changes that should be reflected in the repository?

@jpmckinney
Member

jpmckinney commented Nov 2, 2022

That’s a good point, we should include province in the id to disambiguate (ocd-division/country:ca/province:on/ed:victoria)

Okay, now instead of 213 IDs that repeat across time, we'll have 202:

  • Ahuntsic Quebec: 2
  • Algoma Ontario: 2
  • Argenteuil Quebec: 3
  • Argenteuil--Deux-Montagnes Quebec: 2
  • Argenteuil--Papineau--Mirabel Quebec: 2
  • Bas-Richelieu--Nicolet--Bécancour Quebec: 2
  • Battle River Alberta: 3
  • Beauharnois Quebec: 3
  • Beauharnois--Salaberry Quebec: 3
  • Beauséjour New Brunswick: 2
  • Berthier Quebec: 2
  • Berthier--Maskinongé Quebec: 3
  • Bourassa Quebec: 2
  • Bow River Alberta: 3
  • Bramalea--Gore--Malton Ontario: 2
  • Brampton Centre Ontario: 2
  • Brant Ontario: 2
  • Brantford Ontario: 2
  • Brome--Missisquoi Quebec: 3
  • Bruce South Ontario: 2
  • Bruce--Grey--Owen Sound Ontario: 2
  • Burin--St. George's Newfoundland and Labrador: 2
  • Burrard British Columbia: 2
  • Calgary Centre Alberta: 2
  • Calgary East Alberta: 3
  • Calgary West Alberta: 2
  • Cape Breton South Nova Scotia: 2
  • Cape Breton--East Richmond Nova Scotia: 2
  • Cape Breton--The Sydneys Nova Scotia: 2
  • Cariboo British Columbia: 2
  • Carleton Ontario: 2
  • Carleton--Gloucester Ontario: 2
  • Central Nova Nova Scotia: 2
  • Chambly Quebec: 2
  • Charlesbourg Quebec: 2
  • Charlevoix Quebec: 2
  • Charlevoix--Montmorency Quebec: 3
  • Charlotte New Brunswick: 2
  • Châteauguay Quebec: 2
  • Chicoutimi Quebec: 2
  • Chicoutimi--Le Fjord Quebec: 2
  • Cochrane Ontario: 2
  • Comox--Alberni British Columbia: 2
  • Compton Quebec: 2
  • Cumberland--Colchester Nova Scotia: 2
  • Delta British Columbia: 2
  • Digby--Annapolis--Kings Nova Scotia: 2
  • Don Valley North Ontario: 2
  • Durham Ontario: 3
  • Edmonton Centre Alberta: 2
  • Edmonton East Alberta: 2
  • Edmonton West Alberta: 3
  • Essex Ontario: 3
  • Fraser Valley British Columbia: 2
  • Fredericton New Brunswick: 2
  • Gatineau Quebec: 2
  • Glengarry Ontario: 2
  • Guelph Ontario: 2
  • Haldimand Ontario: 2
  • Haldimand--Norfolk Ontario: 2
  • Halton Ontario: 2
  • Hastings--Frontenac Ontario: 2
  • Hastings--Frontenac--Lennox and Addington Ontario: 2
  • Hochelaga Quebec: 2
  • Humber--St. Barbe--Baie Verte Newfoundland and Labrador: 2
  • Huron North Ontario: 2
  • Joliette Quebec: 2
  • Jonquière Quebec: 2
  • Kamloops British Columbia: 2
  • Kent Ontario: 3
  • Kent--Essex Ontario: 2
  • Kingston Ontario: 2
  • Kitchener--Conestoga Ontario: 2
  • Kootenay East British Columbia: 2
  • La Prairie Quebec: 2
  • Lac-Saint-Jean Quebec: 2
  • Lachine Quebec: 2
  • Lakeland Alberta: 2
  • Lambton--Kent--Middlesex Ontario: 2
  • Laprairie Quebec: 2
  • Lasalle Quebec: 2
  • Laurier Quebec: 2
  • Laurier--Sainte-Marie Quebec: 2
  • Laval Quebec: 4
  • Lethbridge Alberta: 2
  • Lincoln Ontario: 2
  • Longueuil Quebec: 2
  • Lunenburg Nova Scotia: 2
  • Macleod Alberta: 2
  • Maisonneuve Quebec: 3
  • Maisonneuve--Rosemont Quebec: 2
  • Marc-Aurèle-Fortin Quebec: 2
  • Markham Ontario: 2
  • Matane Quebec: 2
  • Matapédia--Matane Quebec: 2
  • Mégantic Quebec: 2
  • Mercier Quebec: 2
  • Missisquoi Quebec: 3
  • Mississauga Centre Ontario: 2
  • Mississauga East--Cooksville Ontario: 2
  • Montcalm Quebec: 2
  • Montmorency Quebec: 2
  • Moose Jaw Saskatchewan: 2
  • Moose Jaw--Lake Centre Saskatchewan: 2
  • Mount Royal Quebec: 4
  • Muskoka Ontario: 2
  • Nanaimo--Alberni British Columbia: 2
  • Nepean Ontario: 2
  • Nepean--Carleton Ontario: 2
  • New Brunswick Southwest New Brunswick: 2
  • New Westminster--Burnaby British Columbia: 2
  • New Westminster--Coquitlam British Columbia: 2
  • Niagara Centre Ontario: 2
  • Norfolk Ontario: 2
  • North Battleford Saskatchewan: 2
  • North Island--Powell River British Columbia: 2
  • North Okanagan--Shuswap British Columbia: 2
  • Northumberland Ontario: 2
  • Northwest Territories Northwest Territories: 2
  • Notre-Dame-de-Grâce Quebec: 2
  • Okanagan--Shuswap British Columbia: 3
  • Outremont Quebec: 2
  • Papineau Quebec: 2
  • Perth Ontario: 2
  • Pontiac Quebec: 3
  • Port Moody--Coquitlam British Columbia: 2
  • Prince Albert Saskatchewan: 2
  • Prince Edward Ontario: 3
  • Prince Edward--Hastings Ontario: 2
  • Prince George--Peace River British Columbia: 2
  • Qu'Appelle Saskatchewan: 2
  • Quebec West Quebec: 2
  • Queens--Lunenburg Nova Scotia: 2
  • Regina East Saskatchewan: 2
  • Regina--Qu'Appelle Saskatchewan: 2
  • Regina--Wascana Saskatchewan: 2
  • Renfrew--Nipissing--Pembroke Ontario: 2
  • Restigouche New Brunswick: 2
  • Richelieu Quebec: 3
  • Richmond--Wolfe Quebec: 2
  • Rimouski--Témiscouata Quebec: 2
  • Rimouski-Neigette--Témiscouata--Les Basques Quebec: 2
  • Rivière-du-Loup--Témiscouata Quebec: 2
  • Saanich--Gulf Islands British Columbia: 2
  • Sackville--Eastern Shore Nova Scotia: 2
  • Saint John New Brunswick: 2
  • Saint-Hyacinthe--Bagot Quebec: 2
  • Saint-Jacques Quebec: 2
  • Saint-Laurent Quebec: 2
  • Sainte-Marie Quebec: 2
  • Sarnia Ontario: 2
  • Sarnia--Lambton Ontario: 2
  • Saskatoon Saskatchewan: 2
  • Saskatoon West Saskatchewan: 2
  • Saskatoon--Humboldt Saskatchewan: 2
  • Sault Ste. Marie Ontario: 2
  • Selkirk Manitoba: 2
  • Selkirk--Interlake Manitoba: 2
  • Shelburne--Yarmouth--Clare Nova Scotia: 2
  • Simcoe South Ontario: 2
  • St. Ann Quebec: 3
  • St. Antoine--Westmount Quebec: 2
  • St. Boniface Manitoba: 3
  • St. James Quebec: 2
  • St. John's East Newfoundland and Labrador: 4
  • St. John's West Newfoundland and Labrador: 3
  • St. Lawrence--St. George Quebec: 3
  • St. Mary Quebec: 2
  • St. Paul's Ontario: 3
  • Stormont Ontario: 3
  • Surrey--White Rock British Columbia: 2
  • Témiscamingue Quebec: 2
  • Témiscouata Quebec: 2
  • Terrebonne Quebec: 2
  • Three Rivers Quebec: 2
  • Timiskaming Ontario: 2
  • Toronto Centre Ontario: 2
  • Toronto West Ontario: 2
  • Trois-Rivières Quebec: 2
  • Vancouver Kingsway British Columbia: 2
  • Vancouver South British Columbia: 2
  • Vaudreuil Quebec: 2
  • Vaudreuil--Soulanges Quebec: 3
  • Verchères Quebec: 2
  • Verdun Quebec: 2
  • Victoria British Columbia: 2
  • Waterloo Ontario: 3
  • Welland Ontario: 2
  • West Nova Nova Scotia: 3
  • Westlock--St. Paul Alberta: 2
  • Windsor--St. Clair Ontario: 2
  • Winnipeg Centre Manitoba: 2
  • Winnipeg North Manitoba: 2
  • Winnipeg North Centre Manitoba: 2
  • Winnipeg South Manitoba: 2
  • Winnipeg South Centre Manitoba: 3
  • Yale British Columbia: 2
  • York Centre Ontario: 2
  • York East Ontario: 3
  • York West Ontario: 2
  • York--Simcoe Ontario: 3
  • Yukon Yukon: 2

Collisions will be resolved by removing the validThrough date and adding a validFrom date.

So the old district essentially disappears from the listing? If I have a historical application, how do I refer to the old district?

The same id will be used for both periods that the district existed since the identity of the district is maintained by its name.

According to the Library of Parliament, the old district is abolished and the new district is created. In other words, according to the best authority on this subject, they are not identical.

In cases like Algoma, there's a 64-year span between when each existed.

The idea of districts being "re-established" seems to be an invention of the Wikipedia user Earl Andrew who authored most of the pages.

There is also the case where X is renamed Y and Z is renamed X. I think in this case, the aliases between X to Y and Z to X should be removed, effectively creating new district Y and dissolving district Z.

Have you found a real example, or are you hypothesizing?

When X is renamed to Y, a new id will be created for Y and an alias will be created stating X sameAs Y making Y canonical.

Isn't constantly shifting canonical IDs kind of unusual?

maybe we can use https://www.elections.ca/content.aspx?section=res&dir=cir/list&document=index338&lang=e as the source of truth for name changes that should be reflected in the repository?

https://www.elections.ca/content.aspx?section=res&dir=cir/list&document=index338&lang=e is very incomplete.

@evannjw
Contributor Author

evannjw commented Nov 3, 2022

We can also resolve collisions across time by appending the year that the new district was created in the case that an old district with the same name was abolished, the way collisions with Canadian OCD-IDs are currently resolved. However, if we do adopt the idea that names are how we identify districts, treating districts with the same name, even if there is a 64-year span between them, as having the same identifier seems most natural.

There is no perfect methodology for how we create these ids but the current way of using federal electoral district codes treats every district as being abolished and recreated after every redistricting. This doesn't work well for representing districts that aren't new. @jdmgoogle maybe you can speak more about this?

https://www.elections.ca/content.aspx?section=res&dir=cir/list&document=index338&lang=e is very incomplete.

It may be incomplete, but isn't this the set of district ids/names used in elections? Is the issue that name changes that aren't captured here can surface as new districts after redistricting?

@jdmgoogle
Contributor

Okay, now instead of 213 IDs that repeat across time, we'll have 202:

I'm curious why the new solution has to go back to 1867 but the current solution only has to go back to ~2010.

@jdmgoogle
Contributor

Isn't constantly shifting canonical IDs kind of unusual?

Renaming districts outside of a redistricting cycle is rather unusual, so this seems like something that should be taken up with the Canadian government first and foremost. :)

@jdmgoogle
Contributor

(Sorry for the multiple comments here, but)

The idea of districts being "re-established" seems to be an invention of the Wikipedia user Earl Andrew who authored most of the pages.

The district you reference is also described in the French-language Wikipedia as "reappearing" in the 1960s.

https://fr.wikipedia.org/wiki/Algoma_(ancienne_circonscription_f%C3%A9d%C3%A9rale)

The edit history to that page does not include any edits by "Earl Andrew", so this is clearly not a one-off invention of a single person.

If the goal of OCD-IDs is simply to provide a context-free catalog of THE CURRENT official names and numbers associated with locations and districts around the world, with no attempt made to provide a coherent view of the "identity" of a given district over time, then that should be made explicit.

It should also be made clear, then, that these identifiers are likely incompatible and conflict with the ones being curated on Wikidata.

The documentation should state that an OCD-ID is only valid in the context of an (identifier, startDate, endDate) tuple, and that there are zero guarantees or information available about the relationship between e.g., (ocd-division/country:ca/ed:00123, 2013, 2017) and (ocd-division/country:ca/ed:00123, 2017, present). The tooling which compiles identifiers should be updated to reflect this, and all clients which expect canonical identifiers to be unique must be updated to only use (identifier, startDate, endDate) as the canonical definition of a location's identity, since identifier alone is no longer canonically unique.
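
To illustrate what keying on an (identifier, startDate, endDate) tuple would mean for a client, here is a minimal sketch. The `DivisionKey` type, the `lookup` helper and the half-open year ranges are assumptions made for the example; the identifier and years are the ones quoted above.

```python
from typing import NamedTuple, Optional


class DivisionKey(NamedTuple):
    identifier: str
    start_year: int
    end_year: Optional[int]  # None stands for "present"


def lookup(keys, identifier, year):
    """Return the keys for `identifier` whose [start_year, end_year) range
    contains `year`."""
    return [
        k
        for k in keys
        if k.identifier == identifier
        and k.start_year <= year
        and (k.end_year is None or year < k.end_year)
    ]


ED = "ocd-division/country:ca/ed:00123"
keys = [DivisionKey(ED, 2013, 2017), DivisionKey(ED, 2017, None)]

print(lookup(keys, ED, 2015))
print(lookup(keys, ED, 2017))
```

The same identifier string resolves to different keys depending on the year, which is the situation clients would have to handle if the identifier alone were not unique.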

@jpmckinney
Member

Renaming districts outside of a redistricting cycle is rather unusual, so this seems like something that should be taken up with the Canadian government first and foremost. :)

Okay, good luck changing a foreign country's legislature, Google ;)

I'm curious why the new solution has to go back to 1867 but the current solution only has to go back to ~2010.

We're discussing how to improve the current solution.

If the goal of OCD-IDs is simply ...

I don't understand this section. What is your position? Phrasing everything as conditionals makes it hard to know what you actually want to be the case.

@jdmgoogle
Contributor

We're discussing how to improve the current solution.

If that's the case then I propose we focus on solving the problem for the modern era (e.g., make sure we work reasonably well for the date ranges already covered by the data set) and not worry about going back to 1867.

I don't understand this section. What is your position?

I'm trying to get clarity on your position. You've dismissed Wikipedia as a valid data source (which is fine) and have said we should limit ourselves to only what the government is currently publishing. My section tries to lay out the implications of that position in the broader ecosystem and what it would mean for downstream clients.

So ... what is your position here on what should be in this repository, how it should relate (or not) to other data sources, and how we should be updating our documentation and tooling to enforce that position?

@jpmckinney
Member

jpmckinney commented Nov 3, 2022

OCD-IDs should track the identity of divisions across time, including historical divisions where there is the capacity to do so. Due to data quality issues, etc. this goal is not always met. This issue is about improving the situation in Canada.

It is not a goal of OCD-IDs to be consistent with Wikidata. Wikidata is nowhere mentioned in any documentation about OCD-IDs. That said, consistency across databases/standards (ISO 3166, etc.) is preferred where appropriate.

The OCD-ID is sufficient to identify a division even in the current iteration. There is no conflict within the current dataset that requires validFrom or validThrough to disambiguate.

Wikipedia and Wikidata do not have good coverage of non-current divisions. For example, there is a page for Algoma—Manitoulin—Kapuskasing but not for Algoma or Algoma—Manitoulin, which are not the same district according to the proposed "name = identity" method or according to the Library of Parliament (and to be 100% clear, the LOP is the greater authority, not Wikidata or any other secondary sources).

Edit: For even greater clarity, it is Parliament that creates, renames and abolishes divisions. Elections Canada uses those divisions. It does not create, rename, abolish or otherwise affect their lifecycle. As they are concerned with elections, they do not care if a division is renamed 3 times in a parliamentary session. They care about the name at the time of an election. So Parliament is always the preferred source if they are publishing data of sufficient quality.

@jpmckinney
Member

We can also resolve collisions across time by appending the year that the new district was created in the case that an old district with the same name was abolished, the way collisions with Canadian OCD-IDs are currently resolved.

I agree that this is a reasonable solution.

However if we do adopt the idea that names are how we identify districts, treating districts with the same name, even if there is a 64 year span between them, as having the same identifier seems most natural.

The OCD-ID project works with a simple lifecycle model of divisions being created (validFrom), abolished (validThrough) and renamed (sameAs): https://github.com/opencivicdata/docs.opencivicdata.org/blob/master/proposals/0002.rst The model does not allow for "re-establishment" of divisions. We can of course consider changes to the model, but I think that is a separate issue. If we want to work within the current model in this issue for Canada, then the only possible events are creating, renaming and abolishing divisions. We don't have the tools to properly model "re-establishment" (if this is even a thing – the Library of Parliament doesn't think so...).
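
The lifecycle model described here can be pictured as a small record type. This is only a sketch: the `Division` class, its field names and the sample dates are hypothetical, chosen to mirror the validFrom, validThrough and sameAs attributes named above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Division:
    id: str
    name: str
    valid_from: Optional[date] = None     # created (validFrom)
    valid_through: Optional[date] = None  # abolished (validThrough)
    same_as: Optional[str] = None         # renamed (sameAs)

    def is_active(self, on: date) -> bool:
        """True if `on` falls within the division's single validity window."""
        if self.valid_from is not None and on < self.valid_from:
            return False
        if self.valid_through is not None and on > self.valid_through:
            return False
        return True


# Made-up ID and dates, for illustration only.
example = Division(
    id="ocd-division/country:ca/ed:example",
    name="Example",
    valid_from=date(1968, 1, 1),
    valid_through=date(1979, 12, 31),
)

print(example.is_active(date(1970, 6, 1)))
print(example.is_active(date(1985, 1, 1)))
```

Each record holds at most one validity window, so a division that is abolished and later "re-established" cannot be expressed as a single record; it would need either a second record or a change to the model.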

@jdmgoogle
Contributor

OCD-IDs should track the identity of divisions across time, including historical divisions where there is the capacity to do so.

I would agree with that.

It is not a goal of OCD-IDs to be consistent with Wikidata

Then the documentation should be updated to say that. We should also reach out to people at Wikidata to make it clear that these two projects may provide overlapping identifiers but are not related and not guaranteed to be compatible.

The OCD-ID is sufficient to identify a division even in the current iteration. There is no conflict within the current dataset that requires startDate or endDate to disambiguate.

I would disagree with that. For example:

Using the number to identify the districts even over a single redistricting cycle simply won't work. The names change, the shapes change, and they do so in a subtly very wrong way: the vast majority of (47012, 2003, 2013) is now (47013, 2013, present).

Given this example, what set of OCD-IDs and other metadata would you propose to represent this kind of information? What information is associated with the "canonical" OCDs ocd-division/country:ca/ed:47012 and ocd-division/country:ca/ed:47013? What are listed as aliases? In short, which entries in which files with which columns conveys the correct set of information to clients?

@jpmckinney
Member

Then the documentation should be updated to say that. We should also reach out to people at Wikidata to make it clear that these two projects may provide overlapping identifiers but are not related and not guaranteed to be compatible.

Are we to enumerate and reach out to every project with which we don't have an official policy of being consistent? That doesn't make sense. Is there some undocumented agreement or discussion with Wikidata that I'm missing?


I'm confused.

  • Souris–Moose Mountain in the 2013 Distribution Order is ocd-division/country:ca/ed:47013-2013
  • Souris–Moose Mountain in the 2003 Distribution Order is ocd-division/country:ca/ed:47012
  • ocd-division/country:ca/ed:47013-2013 is not equal to ocd-division/country:ca/ed:47012

Anyway, can we focus on the proposal that @evannjw kindly suggested rather than going on about the deficiencies with the existing approach?

@jdmgoogle
Contributor

Are we to enumerate and reach out to every project with which we don't have an official policy of being consistent with?

Wikipedia is a large enough project that we should at least attempt to clarify our relationship to them.

One last feedback here is that under the approach mentioned above it seems we'd be randomly appending years onto identifiers, which seems a bit chaotic.

Unfortunately, election-related issues are pulling me away, so I can't comment much more at the moment. Hope to get back to this mid/late next week.

@jpmckinney
Member

It's not random – when films are remade, years are appended. So, on Wikipedia, there's True Grit (1969 film) and True Grit (2010 film). Most films don't have these suffixes, because they have not been remade, or were remade under a different title.

Similarly, if a district is abolished, and then a future district uses the same name, a year is suffixed to distinguish them.
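
A small sketch of that suffixing rule, assuming the earliest district with a given name keeps the bare ID and each later district with the same name gets the year it was created appended. The `assign_ids` function and the name-based slug are hypothetical; the Ahuntsic years are the ones mentioned earlier in this thread.

```python
def assign_ids(districts):
    """Map each (slug, year_created) pair to an ID.

    The earliest district with a given slug gets the bare ID; any later
    district reusing the slug gets '-<year_created>' appended.
    """
    seen = set()
    ids = {}
    for slug, year in sorted(districts, key=lambda d: d[1]):
        base = f"ocd-division/country:ca/ed:{slug}"
        ids[(slug, year)] = base if slug not in seen else f"{base}-{year}"
        seen.add(slug)
    return ids


ids = assign_ids([("ahuntsic", 1988), ("ahuntsic", 1968)])
for key in sorted(ids):
    print(key, ids[key])
```

Only names that are reused pick up a suffix, which matches the film analogy: most IDs stay bare.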

--

Please open a new issue to discuss the OCD-ID project's relationship to other databases/projects, so that we don't overload a conversation about Canada with those separate concerns.

@jpmckinney
Member

I made a draft PR based on @evannjw's comments. See #324 for comment.

@jpmckinney
Member

jpmckinney commented Nov 9, 2022

Food for thought on appropriate sources of information for political divisions: https://mitpress.mit.edu/9780262046299/writing-the-revolution/

Book review (paywalled) https://www.ft.com/content/872b64f5-e735-4feb-9580-1e362ba85b7b

@jpmckinney
Member

I can't remember why I created these files, but uploading here in case they are relevant to history.

ParlinfoRidings.csv
ParlinfoRidings.xlsx
