Add deprecation rules for typos in sidewalk-related tags #1278

Merged
merged 1 commit into from
Jul 15, 2024

waldyrious
Contributor

@waldyrious waldyrious commented Jul 8, 2024

p.s. - Also fix indentation errors in data/deprecated.json. Extracted to #1282

@tordans
Collaborator

tordans commented Jul 9, 2024

@waldyrious thanks for the PR and the indirect ping on #222 which I just merged.

Here are a few unsorted thoughts on this PR:

But most importantly…

  • I am not sure what our threshold or guidelines are on deprecations. How often do those typos happen? Sometimes it is better to create a MapRoulette Challenge to clean up the typos than to have a deprecation which might never trigger.
    I am also not sure if adding many small deprecations has any performance impact (in the long run).

Let's wait for input from others on this first. And let's look at some numbers as well.

@waldyrious
Contributor Author

waldyrious commented Jul 9, 2024

@tordans thanks for the review. I've extracted the indentation fixes to #1282, and rebased this PR following the merge of #222.

As for the prevalence of these values: I recently fixed a bunch of typos in the values for the sidewalk:* tags, and there was a lot of variety, but these two (none and seperated) were the only ones that iD prominently suggested in the autocomplete box when I started typing them. I don't think there's any question about the prevalence of none, as documented above (and besides, this PR just completes the entry already added in #222, to cover the associated subkeys); and as for seperate, it's a well-known common misspelling of "separate", so it's bound to keep happening.

That said, I'm happy to heed any consensus that might emerge from additional community discussion.
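
For readers unfamiliar with how such a deprecation rule behaves, here is a minimal Python sketch. It assumes rules shaped as "old"/"replace" pairs, loosely modeled on data/deprecated.json; the actual entries added by this PR are not shown in the thread. The key list, the mapping of none to no, and the helper name `upgrade_tags` are all illustrative assumptions.

```python
# Hypothetical sketch: the keys and value mappings below are assumptions,
# not the actual contents of this PR.
SIDEWALK_KEYS = ["sidewalk", "sidewalk:left", "sidewalk:right", "sidewalk:both"]
VALUE_FIXES = {"none": "no", "seperate": "separate"}

# One rule per (key, misspelled value) pair, in an "old"/"replace" shape.
RULES = [
    {"old": {key: bad}, "replace": {key: good}}
    for key in SIDEWALK_KEYS
    for bad, good in VALUE_FIXES.items()
]


def upgrade_tags(tags, rules=RULES):
    """Return a copy of `tags` with every matching rule applied."""
    upgraded = dict(tags)
    for rule in rules:
        old = rule["old"]
        # A rule matches only if every old key/value pair is present.
        if all(upgraded.get(k) == v for k, v in old.items()):
            for k in old:
                del upgraded[k]
            upgraded.update(rule["replace"])
    return upgraded


way = {"highway": "residential", "sidewalk:left": "seperate", "sidewalk:right": "none"}
print(upgrade_tags(way))
```

In an editor, a match like this would typically surface as an upgrade suggestion for the mapper to accept, rather than as a silent rewrite.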

@tordans
Collaborator

tordans commented Jul 10, 2024

Thanks for the update. I don't see any relevant reason to not merge this. Will wait for a bit for some feedback and merge in a few days.


There might be better solutions for the seperated case that would also cover bicycle*. We could even consider silently changing this, which iD does in one or two situations AFAIK. But all this would need to happen in iD, which we could still do later …

🍱 You can preview the tagging presets of this pull request here.

@tordans tordans merged commit 062459e into openstreetmap:main Jul 15, 2024
5 checks passed
@waldyrious waldyrious deleted the sidewalk-typos branch July 15, 2024 21:03
@tyrasd
Member

tyrasd commented Aug 8, 2024

I don't think the addition of sidewalks to sidewalk was a good idea. The only two values this tag currently has are 1.70 and 1,70: how is it better to have sidewalk=1,70 instead of sidewalks=1,70? This tag is not even mentioned in the list of possible tagging mistakes for the sidewalk tag. 🤷

@tyrasd
Member

tyrasd commented Aug 8, 2024

Similarly, the rule for the seperate value is also questionable: it is rather rare, occurring less than a dozen times each in OSM.

IMO, the list of deprecated tags is not meant to include every possible typo imaginable (otherwise the list would be unbearably long).

@matkoniecz
Contributor

IMO, the list of deprecated tags is not meant to include every possible typo imaginable (otherwise the list would be unbearably long).

See https://codeberg.org/matkoniecz/OpenStreetMap_cleanup_scripts/src/branch/master/recurrent_bot_edits/surface_fix_bad_values.py or https://codeberg.org/matkoniecz/OpenStreetMap_cleanup_scripts/src/branch/master/recurrent_bot_edits/shops_fix_bad_values.py for how it would end up looking :)

(approved bot edits, currently not running due to OAuth2; all these tags existed in OSM at some point)

@waldyrious
Contributor Author

IMO, the list of deprecated tags is not meant to include every possible typo imaginable (otherwise the list would be unbearably long).

Just FYI, I too don't think listing every typo imaginable is a good strategy. My motivation to mark "seperate" as a typo was to (hopefully) prevent iD from suggesting it to people, because it was precisely common enough that it was indeed being suggested in the dropdown for the sidewalk tag.

the rule for the seperate value is also questionable: it is rather rare, occurring less than a dozen times each in OSM.

I'm not sure I understand what you mean by "less than a dozen times each". Here are the top entries that I see in taginfo at the time of writing:

Count  Key               Value
148    sidewalk:right    seperate
102    ramp              seperate
46     barrier           seperate
40     sidewalk:left     seperate
32     sidewalk          seperate
32     cycleway          seperate
27     crossing          seperate
27     parking:both      seperate
21     parking:left      seperate
20     parking:right     seperate
19     sideway           seperate
15     ramp:wheelchair   seperate
15     cycleway:both     seperate
12     crossing:signals  seperate
11     cycleway:left     seperate

I suspect things might have been cleaned up shortly before you looked at the stats. If that's the case, it shows how prevalent that typo is, given that it has already sprung back up this much since your comment, and it reinforces (IMHO) the need for a deprecation rule.
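
The taginfo figures listed above can be checked against the "less than a dozen" claim with a short tally. This is only a sketch over the fifteen rows quoted in this comment; it does not query taginfo, and the variable names are illustrative.

```python
# (count, key) pairs copied from the list quoted in this comment.
TAGINFO_ROWS = [
    (148, "sidewalk:right"), (102, "ramp"), (46, "barrier"),
    (40, "sidewalk:left"), (32, "sidewalk"), (32, "cycleway"),
    (27, "crossing"), (27, "parking:both"), (21, "parking:left"),
    (20, "parking:right"), (19, "sideway"), (15, "ramp:wheelchair"),
    (15, "cycleway:both"), (12, "crossing:signals"), (11, "cycleway:left"),
]
DOZEN = 12

# Total uses of the value across all listed keys.
total = sum(count for count, _ in TAGINFO_ROWS)

# Keys where the value occurs at least a dozen times.
at_least_a_dozen = [key for count, key in TAGINFO_ROWS if count >= DOZEN]

# Uses on the sidewalk key and its subkeys only ("sideway" is excluded).
sidewalk_total = sum(
    count for count, key in TAGINFO_ROWS
    if key == "sidewalk" or key.startswith("sidewalk:")
)

print(total, len(at_least_a_dozen), sidewalk_total)  # 567 14 220
```

Of the fifteen listed keys, fourteen are at or above a dozen uses, and the three sidewalk keys alone account for 220 of the 567 listed occurrences.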

If this is not the right mechanism to at least prevent iD from encouraging popular typos to spread out even further, can you point me in the right direction to ensure certain key-value combinations are not shown in the tag dropdowns?

@tordans
Collaborator

tordans commented Sep 29, 2024

@waldyrious iD/Rapid will always show a list of the most used taginfo values. The cutoff rules for that are specified in the iD codebase AFAIK. The only way to work around this AFAIK is to set autoSuggestions:false.

But this will only work when there is a preset and field specified which we don't have for many of those cases.
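
To illustrate the effect described here, the following is a toy Python model, not iD's actual code. The function `dropdown_values`, the `min_count` cutoff, the field dictionary, and all counts except the 32 uses of sidewalk=seperate quoted earlier in this thread are hypothetical. It only shows why a popular typo keeps appearing unless a field switches suggestions off.

```python
def dropdown_values(field, taginfo_counts, min_count=10):
    """Toy model of a tag-value dropdown; the real cutoff rules live in iD."""
    # Values the field defines explicitly are always offered.
    values = list(field.get("options", []))
    # Unless the field opts out, sufficiently popular values are appended.
    if field.get("autoSuggestions", True):
        by_popularity = sorted(
            taginfo_counts.items(), key=lambda item: item[1], reverse=True
        )
        for value, count in by_popularity:
            if count >= min_count and value not in values:
                values.append(value)
    return values


# Hypothetical field definition and counts, for illustration only.
sidewalk_field = {"key": "sidewalk", "options": ["both", "left", "right", "no", "separate"]}
counts = {"no": 100000, "separate": 50000, "seperate": 32, "1,70": 1}

with_suggestions = dropdown_values(sidewalk_field, counts)
without_suggestions = dropdown_values({**sidewalk_field, "autoSuggestions": False}, counts)
print(with_suggestions)
print(without_suggestions)
```

In this model the misspelled value shows up only in the first list, which also makes the limitation visible: without a field definition for the key, there is nowhere to set the flag.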

AFAIK there is no warning system for typos in tag keys and tag values in general. I suggest opening an issue in the iD repo so we can collect cases that would work for this, like maybe color.

@matkoniecz
Contributor

I have https://codeberg.org/matkoniecz/OpenStreetMap_cleanup_scripts/src/branch/master/script_assisted_cleanup/obviously_mistagged_tags_using_trivial_tag_fixes.py and https://codeberg.org/matkoniecz/OpenStreetMap_cleanup_scripts/src/branch/master/script_assisted_cleanup/obviously_mistagged_tags_using_wrong_keys.py that detect some obvious typos

So far they do it using iD presets, but they have found more typos than I can process in my free hobby time.

If anyone would be interested in helping (and by "helping" I do not mean "blindly mass retag", rather "carefully investigate and retag or make bot edit where it is obvious") and what is in the repo does not make clear how it can be done, then feel free to ping me via codeberg issues at that repo or by https://www.openstreetmap.org/message/new/Mateusz%20Konieczny

Or just steal ideas from there :)
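
As a rough illustration of the general idea of flagging obvious typos, here is a small Python sketch using the standard library's `difflib`. It is not how the linked scripts work (they are described above only as using iD presets); the list of known values, the cutoff, and the helper name `suggest_fix` are assumptions made for the example.

```python
from difflib import get_close_matches

# Hypothetical list of accepted values for a sidewalk-like key.
KNOWN_VALUES = ["both", "left", "right", "no", "yes", "separate"]


def suggest_fix(value, known=KNOWN_VALUES, cutoff=0.8):
    """Return the closest known value for a likely typo, or None."""
    if value in known:
        return None  # already valid, nothing to fix
    matches = get_close_matches(value, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


for candidate in ["seperate", "none", "1,70", "left"]:
    print(candidate, "->", suggest_fix(candidate))
```

With this cutoff, "seperate" is matched to "separate", while "none" and "1,70" produce no suggestion. That is in keeping with the caution above: a string-similarity hit is a candidate for careful review, not grounds for a blind retag.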

@waldyrious
Contributor Author

@tordans Thanks for the context and pointers! That said, I'm still not sure why there's reluctance to use the existing mechanism of deprecation for common, recurrent typos. (Again, I'm not suggesting a full listing of every possible typo regardless of their tendency to show up organically, only those that do keep popping up, as seems to be the case with "seperate").
