This repository has been archived by the owner on Mar 15, 2024. It is now read-only.

Comparator to monitor primary tags on OpenStreetMap #96

Open
bkowshik opened this issue Mar 2, 2017 · 8 comments

@bkowshik
Contributor

bkowshik commented Mar 2, 2017

There are 26 recognized primary tags on OpenStreetMap.

aerialway, aeroway, amenity, barrier, boundary, building, craft, emergency, geological,
highway, historic, landuse, leisure, man_made, military, natural, office, place, power,
public_transport, railway, route, shop, sport, tourism, waterway

The primary-osm-tag compare function will:

  • Flag when a primary tag of a feature is deleted
    • Ex: highway
  • Flag when a new primary tag is added to an existing feature
    • Ex: waterway is added to a building feature
  • Flag features with more than one primary tag
    • Ex: A name translation is added to a feature with both natural and place tags
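
A minimal sketch of the three checks above, assuming each feature version is a GeoJSON-like dict whose tags live under `properties` (that input shape, the function name `primary_osm_tag`, and the shape of the returned flags are assumptions for illustration, not the actual compare function):

```python
# The 26 primary keys listed in this issue.
PRIMARY_KEYS = frozenset({
    "aerialway", "aeroway", "amenity", "barrier", "boundary", "building",
    "craft", "emergency", "geological", "highway", "historic", "landuse",
    "leisure", "man_made", "military", "natural", "office", "place", "power",
    "public_transport", "railway", "route", "shop", "sport", "tourism",
    "waterway",
})


def primary_keys_of(feature):
    """Return the set of primary keys present in a feature's properties."""
    if not feature:
        return set()
    return PRIMARY_KEYS & set(feature.get("properties") or {})


def primary_osm_tag(new_version, old_version):
    """Hypothetical comparator: return a dict of flags, or None if nothing is flagged."""
    old_keys = primary_keys_of(old_version)
    new_keys = primary_keys_of(new_version)
    result = {}
    # Case 1: a primary tag present on the old version is gone.
    if old_keys - new_keys:
        result["deleted"] = sorted(old_keys - new_keys)
    # Case 2: a new primary tag appears on a feature that already existed.
    if old_version and new_keys - old_keys:
        result["added"] = sorted(new_keys - old_keys)
    # Case 3: the new version carries more than one primary tag.
    if len(new_keys) > 1:
        result["multiple"] = sorted(new_keys)
    return result or None


old = {"properties": {"building": "yes"}}
new = {"properties": {"building": "yes", "waterway": "river"}}
print(primary_osm_tag(new, old))
# {'added': ['waterway'], 'multiple': ['building', 'waterway']}
```

A newly created feature with a single primary tag returns `None` here, since none of the three cases applies.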

Uncertainties

  • @amishas157 is there an existing compare function doing something similar?
  • What are the other potential interesting cases with primary feature tags?

cc: @geohacker @planemad @bsrinivasa

@amishas157
Contributor

@bkowshik

is there an existing compare function doing something similar?

We do have a few compare functions that flag the deletion of features having these primary tags. For example, this looks for deleted features that had a place tag with one of the values city, town, village, suburb, hamlet, or island.

Yes, it would be great to have a primary-osm-tag compare function. But I think we could pick a few significant tags out of this list and flag only those instead.

As discussed with @bkowshik, another interesting case could be flagging features where the primary tag value falls outside the set of valid combinations.

Ref: https://taginfo.openstreetmap.org/taginfo/apidoc#api_4_key_values

@planemad
Contributor

planemad commented Mar 7, 2017

@bkowshik this looks like a good start. The scope of this function should be limited to features having an unpopular combination of primary keys.

@amishas157 I would prefer we have a separate comparator for tag values. We should keep the focus of every function very narrow so that it's easy to test.

@geohacker
Contributor

@amishas157 @bkowshik are we going to organise this per primary tag? What's next?

@geohacker geohacker modified the milestone: cleanup Mar 27, 2017
@bkowshik bkowshik self-assigned this Mar 27, 2017
@bkowshik
Contributor Author

The scope of this function should be limited to Features having an unpopular combination of primary keys

We have this with the invalid-tag-combination compare function. A tag combination is unpopular if fewer than 1% of features with one primary tag also have the other primary tag.

The tag combination percentages downloaded from TagInfo are part of tag-combinations.csv
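
To make the 1% rule concrete, here is a rough sketch. The function name, the two count dictionaries, and the numbers are made up for illustration; the layout of tag-combinations.csv is not shown in this thread, so nothing here reads that file. The share is measured from the side of the first key, which is one possible reading of the rule.

```python
def is_unpopular(key_a, key_b, key_counts, pair_counts, threshold=0.01):
    """True if fewer than `threshold` of the features with key_a also carry key_b."""
    total = key_counts.get(key_a, 0)
    if total == 0:
        return False  # no data for key_a, so nothing to judge
    together = pair_counts.get(frozenset((key_a, key_b)), 0)
    return together / total < threshold


# Illustrative counts only, not real TagInfo figures.
key_counts = {"building": 1000, "waterway": 100}
pair_counts = {frozenset(("building", "waterway")): 5}

print(is_unpopular("building", "waterway", key_counts, pair_counts))  # 5/1000 = 0.5%
print(is_unpopular("waterway", "building", key_counts, pair_counts))  # 5/100 = 5%
```

Note that the result depends on the direction: the same pair can be unpopular from one key's side and not from the other's, so a real comparator has to decide which side (or both) it checks.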

@ImreSamu

What are the other potential interesting cases with primary feature tags?

TLDR:

  • detecting (new) relations/ways/POIs without a primary tag -> probably a tagging problem.
  • my (relation) primary tag list, including namespaces, is bigger (but not perfect yet)

Hi,
I have created a simple script for detecting strange OSM tag combinations (multipolygons/relations), for correcting old-style multipolygons.

My current, imperfect (relation) primary list for detecting strange OSM tag combinations in OSM relations:

# Primary osm keys 
osm_primary_keys1="aeroway|amenity|attraction|barrier|boundary|building|craft|emergency|emergency_service|highway|historic|indoor|landuse|leisure"
osm_primary_keys2="man_made|military|natural|network|office|place|power|railway|restriction|route|shop|sport|tourism|waterway|wetland"
osm_primary_keys3="addr:street|area:highway|building:part|building:wall|roof:edge|roof:ridge"

# Primary osm Key+values
osm_primary_keyvalue1="public_transport=platform|public_transport=stop_area"

# Osm keys start with ..   ( Primary Name Spaces )
osm_primary_namespaces1="abandoned:|proposed:|planned:|removed:|razed:|disused:|demolished:|seamark:|was:"

My algorithm is very simple.

Example:

FREQ: Analyze OSM Relations with role=outer without primary OSM keys

us-midwest-latest.osm.pbf ( 2017-03-09T21:43:02Z ) http://download.geofabrik.de/north-america/us-midwest-updates [Rv:0.1b]

count osm tag combinations
4632 type=multipolygon,
373 type=multipolygon, source=,
89 name=, type=multipolygon,
52 ref=, golf=, type=multipolygon,
32 type=multipolygon, created_by=,
29 type=multipolygon, designation=,
14 area=, type=multipolygon,
13 type=multipolygon, source=, designation=,
5 area=, name=, type=multipolygon,
5 ,
3 name=, type=multipolygon, source=,
3 golf=, type=multipolygon,
2 type=multipolygon, surface=, baseball=,
2 type=multipolygon, color=,
2 name=, type=multipolygon, wikipedia=,
1 url=, name=, type=multipolygon, source=, burned:date=, burned:natural=,
1 type=multipolygon, water=,
1 type=multipolygon, surface=,
1 type=multipolygon, operator=, designation=,
1 type=multipolygon, linncounty:objectid=,
1 type=multipolygon, layer=, source=, tunnel=,
1 type=multipolygon, layer=,
1 type=multipolygon, artist=,
1 type=building,
1 par=, name=, type=multipolygon, operator=, slope_rating:middle=, course_rating:middle=, slope_rating:forward=, course_rating:forward=, slope_rating:championship=, course_rating:championship=,
1 par=, name=, type=multipolygon, operator=, slope_rating:back=, course_rating:back=, slope_rating:middle=, course_rating:middle=, slope_rating:forward=, course_rating:forward=, slope_rating:championship=, course_rating:championship=,
1 name=, type=multipolygon, website=,
1 name=, type=multipolygon, water=,
1 name=, type=multipolygon, religion=, denomination=,
1 name=, type=multipolygon, phone=, website=,
1 name=, type=multipolygon, owner=, website=, operator=, ownership=, gnis:feature_id=,
1 name=, type=multipolygon, operator=,
1 name=, type=multipolygon, health_facility:type=, medical_system:western=,
1 name=, type=multipolygon, description=,
1 name=, type=multipolygon, admin_level=,
1 name=, type=enforcement,
1 name=, type=boundary, is_in=, source=, wikidata=, tiger:CPI=, wikipedia=, tiger:LSAD=, tiger:NAME=, admin_level=, border_type=, is_in:state=, tiger:MTFCC=, is_in:country=, tiger:CLASSFP=, tiger:PCICBSA=, tiger:PLACEFP=, tiger:PLACENS=, tiger:PLCIDFP=, tiger:STATEFP=, tiger:FUNCSTAT=, tiger:NAMELSAD=, tiger:PCINECTA=, is_in:iso_3166_2=, is_in:state_code=, is_in:country_code=,
1 name=, type=boundary, is_in=, admin_level=,
1 name=, timezone=, type=boundary,
1 name=, source=,
1 name=,
1 golf=, type=multipolygon, source=,
1 golf=, ref=, type=multipolygon,
1 golf=, note=, type=multipolygon,
1 ele=, type=multipolygon, gnis:created=, gnis:state_id=, gnis:county_id=, gnis:feature_id=,
1 ele=, note=, type=multipolygon, gnis:created=, gnis:state_id=, gnis:county_id=, gnis:feature_id=,
1 ele=, name=, type=multipolygon,
1 area=, name=, type=multipolygon, website=,
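
A report like the one above could be approximated along these lines. This is a sketch, not the original script: it assumes the relations have already been read from the .pbf and are passed in as plain tag dictionaries, and the function names are hypothetical. The key lists are copied from the shell variables earlier in this comment. Keys are sorted within each combination for stable output, which the report above does not do.

```python
from collections import Counter

# Copied from osm_primary_keys1..3 above.
RELATION_PRIMARY_KEYS = set(
    "aeroway|amenity|attraction|barrier|boundary|building|craft|emergency|"
    "emergency_service|highway|historic|indoor|landuse|leisure|"
    "man_made|military|natural|network|office|place|power|railway|"
    "restriction|route|shop|sport|tourism|waterway|wetland|"
    "addr:street|area:highway|building:part|building:wall|roof:edge|roof:ridge"
    .split("|")
)
# Copied from osm_primary_keyvalue1 above.
PRIMARY_KEY_VALUES = {"public_transport=platform", "public_transport=stop_area"}
# Copied from osm_primary_namespaces1 above; str.startswith accepts a tuple.
PRIMARY_NAMESPACES = tuple(
    "abandoned:|proposed:|planned:|removed:|razed:|disused:|demolished:|seamark:|was:"
    .split("|")
)


def has_primary_tag(tags):
    """True if any tag matches a primary key, key=value pair, or namespace prefix."""
    for key, value in tags.items():
        if key in RELATION_PRIMARY_KEYS or key.startswith(PRIMARY_NAMESPACES):
            return True
        if f"{key}={value}" in PRIMARY_KEY_VALUES:
            return True
    return False


def count_combinations(relations):
    """Count key combinations of relations lacking a primary tag.

    Values are blanked except for `type`, mirroring the report format.
    """
    counts = Counter()
    for tags in relations:
        if has_primary_tag(tags):
            continue
        parts = [f"{k}={v if k == 'type' else ''}," for k, v in sorted(tags.items())]
        counts[" ".join(parts)] += 1
    return counts


# Illustrative input, not taken from the extract above.
relations = [
    {"type": "multipolygon"},
    {"type": "multipolygon", "source": "bing"},
    {"type": "multipolygon", "landuse": "forest"},
    {"type": "multipolygon", "disused:shop": "yes"},
    {"type": "multipolygon"},
    {"type": "public_transport", "public_transport": "stop_area"},
]
for combo, n in count_combinations(relations).most_common():
    print(n, combo)
```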

@bkowshik
Contributor Author

Interesting @ImreSamu, thank you for sharing! 👍

@ImreSamu

maybe related:

The "OSM Inspector" has a 'feature keys' list and a similar report:

  • "If an object has a non-feature key but no feature key it is flagged as an error."

see blog post: https://blog.geofabrik.de/?p=425 ( 4.09.2017 | Michael Reichert )

As far as I can see, the actual list of 'feature keys' contains a lot of keys and is probably battle-tested:
https://github.com/geofabrik/osmi_simple_views/blob/master/src/tagging_view_handler.cpp#L360
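
The quoted rule can be read as the following check. This is only a sketch of the rule as worded in the quote: the function name is hypothetical, and `feature_keys` here is a tiny placeholder set, not the actual list from osmi_simple_views.

```python
def missing_feature_key(tags, feature_keys):
    """True if the object has at least one tag but none of its keys is a feature key."""
    return bool(tags) and not any(key in feature_keys for key in tags)


# Placeholder list for illustration only.
feature_keys = {"highway", "building"}

print(missing_feature_key({"name": "Main Street"}, feature_keys))
print(missing_feature_key({"highway": "residential", "name": "Main Street"}, feature_keys))
print(missing_feature_key({}, feature_keys))
```

Untagged objects are deliberately not flagged, since the rule only applies when a non-feature key is present.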

@willemarcel
Collaborator

@bkowshik @ImreSamu I started a pull request to flag objects without a primary tag: #210
