Statistics about StreetComplete usage - how many elements were edited by each quest? #1749

Closed
matkoniecz opened this issue Mar 2, 2020 · 22 comments

Comments

@matkoniecz
Member

matkoniecz commented Mar 2, 2020

To start: thanks to all the mappers who together edited over 5 million elements!

I made a list of the popularity of every StreetComplete quest. What is the most popular quest? How many elements were modified by the AddOpeningHours quest? Have you added a quest and are curious how many elements it edited? Scroll down for the list, or read on for some additional detail.

I made it with a little number crunching on the metadata of all changesets, from the first changeset up to 2020-02-24 (the date of the Weekly Changeset Metadata file on the mirror that I used). Unlike the full planet file, which requires a beefy computer to load, this file containing the metadata of all changesets ever made can be processed easily.

The list below includes every quest name with more than 36 edits made by apps using the StreetComplete changeset tag format. That means it includes

  • quest names that were later changed
  • quests that were never released because they turned out to be bad ideas
  • quests currently being prepared for release
  • quests in private forks that use the StreetComplete name in changeset tags (so it excludes edits made with my private fork)

If someone is interested in code used to do that - look at https://github.com/matkoniecz/StreetComplete_usage_changeset_analysis
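
For readers who want a rough idea of what such number crunching can look like, here is a minimal Python sketch. It is not the code from the repository linked above. It assumes that the changeset metadata file is XML with one `changeset` element per changeset, that the number of modified elements is in a `num_changes` attribute, and that the quest name is stored under the tag key `StreetComplete:quest_type`. The function name and the sample data are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Assumption: the tag key under which the quest name is stored.
QUEST_TAG = "StreetComplete:quest_type"

# Made-up sample data, only for illustration.
SAMPLE = b"""<osm>
  <changeset id="1" num_changes="30"><tag k="StreetComplete:quest_type" v="AddRoadSurface"/></changeset>
  <changeset id="2" num_changes="10"><tag k="StreetComplete:quest_type" v="AddRoadSurface"/></changeset>
  <changeset id="3" num_changes="5"><tag k="StreetComplete:quest_type" v="AddWayLit"/></changeset>
  <changeset id="4" num_changes="99"><tag k="created_by" v="JOSM"/></changeset>
</osm>"""


def count_elements_per_quest(xml_stream, min_edits=36):
    """Sum num_changes per quest name, keeping quests with more than min_edits."""
    totals = Counter()
    # iterparse streams the file, so it never has to be parsed in one go.
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if elem.tag != "changeset":
            continue
        tags = {t.get("k"): t.get("v") for t in elem.findall("tag")}
        quest = tags.get(QUEST_TAG)
        if quest is not None:
            totals[quest] += int(elem.get("num_changes", "0"))
        elem.clear()  # drop the children of changesets that were already handled
    return {quest: n for quest, n in totals.items() if n > min_edits}


result = count_elements_per_quest(io.BytesIO(SAMPLE))
print(result)
```

A real, compressed dump could be passed in the same way as a file object, for example one opened with `bz2.open(path, "rb")`.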

If someone is interested in similar statistics

Note that the usefulness of, and the effort needed for, solving a quest for one element differs significantly between quests. For example, AddPlaceName being solved about 20 thousand times likely took more time and was more useful than many quests that edited more elements in total.

In some cases further research into usefulness may be interesting: how many AddParkingAccess and AddPlaygroundAccess answers confirmed access=yes, and how many helped to mark these objects as private?

QuestCode Total modified elements
AddRoadSurface 1722k
AddWayLit 683k
AddBuildingType 533k
AddBuildingLevels 412k
AddMaxSpeed 281k
AddPathSurface 216k
AddCycleway 150k
AddRoofShape 149k
AddHousenumber 120k
AddRoadName 106k
AddTactilePavingCrosswalk 105k
AddOpeningHours 95k
AddCrossingType 78k
AddParkingAccess 76k
AddBusStopShelter 65k
AddParkingType 52k
AddTactilePavingBusStop 50k
AddParkingFee 45k
AddSidewalk 38k
AddWheelchairAccessBusiness 33k
AddPlaceName 19k
AddTrafficSignalsButton 18k
AddTrafficSignalsSound 18k
AddRailwayCrossingBarrier 17k
AddBenchBackrest 16k
AddBikeParkingCover 12k
AddRecyclingType 11k
AddPlaygroundAccess 10k
AddCyclewaySegregation 10k
AddBikeParkingCapacity 10k
AddTracktype 8k
AddMaxHeight 8k
AddPowerPolesMaterial 8k
AddBikeParkingType 7k
AddHandrail 7k
AddVegetarian 6k
MarkCompletedHighwayConstruction 6k
AddSport 6k
AddForestLeafType 5k
AddPostboxCollectionTimes 5k
AddOneway 5k
AddProhibitedForPedestrians 4k
AddCyclewayPartSurface 2875
AddBusStopName 2838
AddFootwayPartSurface 2625
AddAccessibleForPedestrians 2440
AddFireHydrantType 2365
AddVegan 2322
AddBridgeStructure 1888
AddToiletsFee 1781
AddMaxWeight 1467
AddWheelChairAccessPublicTransport 1190
MarkCompletedBuildingConstruction 1163
AddToiletAvailability 1002
AddReligionToPlaceOfWorship 794
AddBabyChangingTable 743
AddOrchardProduce 740
IsBuildingUnderground 679
AddInternetAccess 634
AddIsBuildingUnderground 586
AddWheelchairAccessPublicTransport 534
AddCarWashType 532
AddWheelChairAccessToilets 528
AddWheelchairAccessToilets 524
AddRecyclingContainerMaterials 371
AddMotorcycleParkingCover 357
AddMotorcycleParkingCapacity 271
DetermineRecyclingGlass 251
AddReligionToWaysideShrine 238
AddGeneralFee 209
AddWheelchairAccessOutside 141
AddSelfServiceLaundry 141
AddPaymentGirocard 135
AddPaymentMastercard 118
AddPaymentContactless 117
AddPaymentVisa 113
AddFerryAccessMotorVehicle 75
AddPostboxRef 65
AddSidewalks 42
AddFerryAccessPedestrian 37
@Discostu36
Contributor

What is the difference between "Add Sidewalk" and "Add Sidewalks"?

@matkoniecz
Member Author

matkoniecz commented Mar 2, 2020

Looking at just usage, AddSidewalks is likely the old name, used during development (looking at the git history may confirm this).

It is also possible that this name is used by a quest in some private fork that also uses StreetComplete changeset tags.

This could be investigated by finding the specific changesets that used these tags.

@westnordost
Member

Nice!

I am balancing when to award certain achievements. For the achievement that is awarded for supplying the road surface, it makes sense to award it later (100s of quests solved) while for things like pedestrian ferry access, it almost makes sense to hand it out after the first one solved.

Another piece of information I'd need is the median (not average) of how many quests a user solved. Then I can calculate how many of each quest type a user typically solved.

@westnordost
Member

MarkCompletedHighwayConstruction

That's pretty nice! I didn't expect that much. Good job @matkoniecz, this quest is very valuable for keeping the map up-to-date!

So the aliases because of renamed quests are:

AddAccessibleForPedestrians -> AddProhibitedForPedestrians
AddWheelChairAccessPublicTransport -> AddWheelchairAccessPublicTransport
AddWheelChairAccessToilets -> AddWheelchairAccessToilets
AddSidewalks -> AddSidewalk
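
If the counts of renamed quests should be merged, the alias list above can be applied as a simple mapping. This is only a sketch: the function name is hypothetical, and the example uses the two table rows for the toilet wheelchair quest (528 and 524), which are given as exact numbers.

```python
from collections import Counter

# Old quest name -> current quest name, taken from the alias list above.
ALIASES = {
    "AddAccessibleForPedestrians": "AddProhibitedForPedestrians",
    "AddWheelChairAccessPublicTransport": "AddWheelchairAccessPublicTransport",
    "AddWheelChairAccessToilets": "AddWheelchairAccessToilets",
    "AddSidewalks": "AddSidewalk",
}


def merge_aliases(counts, aliases=ALIASES):
    """Add the counts of old quest names to their current names."""
    merged = Counter()
    for quest, n in counts.items():
        merged[aliases.get(quest, quest)] += n
    return dict(merged)


print(merge_aliases({
    "AddWheelChairAccessToilets": 528,
    "AddWheelchairAccessToilets": 524,
}))
# → {'AddWheelchairAccessToilets': 1052}
```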

@matkoniecz
Member Author

Another piece of information I'd need is the median (not average) of how many quests a user solved. Then I can calculate how many of each quest type a user typically solved.

I was thinking about what kind of pool of users should be taken. All who solved any quest? All who solved any quest of this type? All who solved >10 quests?

@westnordost
Member

"All who completed one session" would be a good measure. So perhaps >=3 or so.

@matkoniecz
Member Author

matkoniecz commented Mar 2, 2020

SC has 20430 accounts with >=3 edited elements.

For nearly all quests the median is 0, given a sample of users who made >=3 edits. See https://gist.github.com/matkoniecz/f5ebdae67317737f008335636eb7c3d3

AddRoadName - median 1
AddRoadSurface - median 16
AddWayLit - median 2

Everything else - median 0.

I suspect that showing some message like "It is not only an AddRoadSurface app" may be helpful. I suspect that a large number of users have never solved any quests except AddRoadSurface.
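
To illustrate why most medians come out as 0, here is a minimal sketch of a per-quest median where every user in the pool counts, including those who never solved a given quest. The data layout (a dict of per-user quest counts), the function name and the toy numbers are assumptions for illustration, not the script that produced the gist.

```python
from statistics import median


def median_per_quest(edits_by_user, min_total=3):
    """Median solved count per quest over all users with >= min_total edits.

    Users who never solved a quest contribute a 0 for that quest.
    """
    pool = [q for q in edits_by_user.values() if sum(q.values()) >= min_total]
    quests = sorted({name for q in pool for name in q})
    return {name: median(q.get(name, 0) for q in pool) for name in quests}


# Made-up toy data: user -> {quest name: elements edited}
users = {
    "a": {"AddRoadSurface": 10},
    "b": {"AddRoadSurface": 20, "AddWayLit": 4},
    "c": {"AddRoadSurface": 30},
    "d": {"AddWayLit": 1},  # fewer than 3 edits, so not in the pool
}

print(median_per_quest(users))
# → {'AddRoadSurface': 20, 'AddWayLit': 0}
```

In the toy data only one of three pooled users solved AddWayLit, so its median is 0 even though it was solved 4 times.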

@matkoniecz
Member Author

matkoniecz commented Mar 2, 2020

I am now running the script to get statistics for users who edited >=3 elements, not counting the road surface quest.

EDIT: turned out to have a minimal impact - see https://gist.github.com/matkoniecz/431571939bac223b0770fce02fcdef64

But note that about 3000 of 20 000 accounts solved only road surface quests.

@westnordost
Member

Oh, ok

@westnordost
Member

westnordost commented Mar 2, 2020

I guess we need another filter to only return users who are really using SC. For example, in the editor usage stats, SC appears consistently with about 7k users, not 20k.

@matkoniecz
Member Author

Even when the median is counted only among people who solved at least 1 quest of the given type, it is really low - see https://gist.github.com/matkoniecz/11ef437c8a6a4f3edc756831ee3d1bdb

https://gist.github.com/matkoniecz/231c5afce9c6b943ada4493d7aafbb28 is the median counted among the top 6000 editors

https://gist.github.com/matkoniecz/2f25f82a4362650cf48852da3dbb3f10 is the median among the top 3000 editors
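
The two alternative user pools mentioned here can be sketched as follows. This is a hedged illustration with hypothetical function names and made-up data, assuming per-user quest counts are available as a dict. It is not the script behind the gists.

```python
from statistics import median


def median_among_solvers(edits_by_user, quest):
    """Median count for one quest, only among users who solved it at least once."""
    counts = [q[quest] for q in edits_by_user.values() if q.get(quest, 0) > 0]
    return median(counts) if counts else None


def top_editors(edits_by_user, n):
    """Keep only the n users with the highest total number of edited elements."""
    ranked = sorted(
        edits_by_user,
        key=lambda user: sum(edits_by_user[user].values()),
        reverse=True,
    )
    return {user: edits_by_user[user] for user in ranked[:n]}


# Made-up toy data: user -> {quest name: elements edited}
users = {
    "a": {"AddRoadSurface": 10},
    "b": {"AddRoadSurface": 20, "AddWayLit": 4},
    "c": {"AddRoadSurface": 30},
    "d": {"AddWayLit": 1},
}

print(median_among_solvers(users, "AddWayLit"))  # → 2.5
print(sorted(top_editors(users, 2)))             # → ['b', 'c']
```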

@smichel17
Member

for quest AddOpeningHours it was done by 8129 people out of 3000 users.

How is the number of people sometimes higher than the number of users?

@matkoniecz
Member Author

matkoniecz commented Mar 2, 2020

In this case I manually overrode the number of users with a specific value (here 3000).

In other words, I set the threshold for being counted as a user higher than "solved some quests". "Active users" would fit better there, but it was just a repeated run of a hackish script anyway.

@westnordost
Member

Hmm, do you also know the median edit count (all quests summed up) for a user? (median of +top 3000, +top 6000)?

@matkoniecz
Member Author

Looking at the pivot table output - posted to https://gist.github.com/matkoniecz/73aabf3cbcc0517f82eabe9b5b1aca92

3000th user has 309 edits

6000th one has 117

10000th one has 44
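
Figures like these are the edit counts of the users at given ranks. A minimal sketch of that lookup, with a hypothetical function name and a made-up list of per-user totals:

```python
def edits_of_nth_user(total_edits, n):
    """Edit count of the user at rank n (1 = most edits), or None if out of range."""
    ranked = sorted(total_edits, reverse=True)
    return ranked[n - 1] if 0 < n <= len(ranked) else None


# Made-up per-user totals, only for illustration.
totals = [3, 500, 44, 309, 117]

print(edits_of_nth_user(totals, 2))  # → 309
print(edits_of_nth_user(totals, 9))  # → None
```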

@rugk
Contributor

rugk commented Mar 9, 2020

BTW, another bias you may have to take into account for the stats in the OP: the later a quest was introduced, the fewer users have seen it, i.e. old quests likely got more answers just because they are old. (This could be partly offset by the fact that the user base keeps growing over time.)

@Sequynth
Contributor

Is it possible to calculate the 'impact' of SC? E.g. of all the highways with surface=*, what percentage was tagged by a mapper using SC?

@smichel17
Member

It should be, although due to the quantity of data it might need to be geofenced to particular areas.

@matkoniecz
Member Author

matkoniecz commented Mar 11, 2020

Is it possible to calculate the 'impact' of SC? E.g. of all the highways with surface=*, what percentage was tagged by a mapper using SC?

I think a fairly good answer can be obtained by assuming that every SC edit added one instance of surface. This ignores way splitting (which reduces SC impact) and ignores undos (which inflates SC impact).

See https://taginfo.openstreetmap.org/keys/surface - 29 564 798 cases, 1722k SC edits, for 5% of tags added by SC - https://duckduckgo.com/?q=1722*1000%2F29564798&t=canonical&ia=calculator

See https://taginfo.openstreetmap.org/keys/surface - 29 564 798 cases, 1722k SC edits, for 17% of total tagged - see https://duckduckgo.com/?q=29564798%2F1722%2F1000&t=canonical&ia=calculator

Which is far more than I expected. Is something off by an order of magnitude somewhere?

@smichel17
Member

smichel17 commented Mar 11, 2020

Isn't 29M the total, so it should be the denominator? That is, 1.722M/29M, or ~5.8%
https://duckduckgo.com/?q=((1722*1000)%2F29564798)*100&t=canonical&ia=calculator
(could be written with fewer parentheses, but I think they make the calculations clearer)

@HolgerJeromin
Contributor

HolgerJeromin commented Mar 11, 2020

See https://taginfo.openstreetmap.org/keys/surface - 29 564 798 cases

https://taginfo.openstreetmap.org/keys/surface#combinations
We have to use 28 595 118, as SC only tags surface on highways. The others are parking areas or other stuff.
So that raises it to about 6% (see smichel17's post)
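
The arithmetic discussed in this thread can be checked with a few lines of Python. The numbers are the ones quoted above. The variable and function names are made up, and the estimate still rests on the assumption that every SC edit added exactly one surface tag.

```python
SC_SURFACE_EDITS = 1722 * 1000      # AddRoadSurface total from the table (1722k)
ALL_SURFACE_TAGS = 29_564_798       # all objects with surface=* (taginfo)
HIGHWAY_SURFACE_TAGS = 28_595_118   # surface=* combined with highway=* (taginfo)


def share_percent(part, total):
    """Share of `part` in `total`, in percent."""
    return part / total * 100


share_all = share_percent(SC_SURFACE_EDITS, ALL_SURFACE_TAGS)
share_highways = share_percent(SC_SURFACE_EDITS, HIGHWAY_SURFACE_TAGS)

# The inverted division from the second link, which is where "17" came from.
inverted = ALL_SURFACE_TAGS / 1722 / 1000

print(f"{share_all:.2f}%")       # → 5.82%
print(f"{share_highways:.2f}%")  # → 6.02%
print(f"{inverted:.2f}")         # → 17.17
```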

@matkoniecz
Member Author

denominator

Yeah, right - thanks for spotting what went wrong!
