Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upload bulk of answer in a single changeset #21

Closed
Haikuch opened this issue Mar 21, 2017 · 45 comments
Closed

Upload bulk of answer in a single changeset #21

Haikuch opened this issue Mar 21, 2017 · 45 comments

Comments

@Haikuch
Copy link

Haikuch commented Mar 21, 2017

when working offline and collecting several answers, they will be uploaded each in single changesets. I would be better to collect them in one single upload changeset.

@westnordost
Copy link
Member

westnordost commented Mar 21, 2017 via email

@westnordost
Copy link
Member

westnordost commented Mar 22, 2017

Uploading each change in a single atomic commit is a feature, actually. It allows the app to easily resolve conflicts (which it already does), a feature I wouldn't want to miss for a surveyor tool that claims to work offline.
Also, having one changeset per change is actually also easier to implement.

So, I won't change this.

@it4workflow
Copy link

it4workflow commented Mar 23, 2017

Also, having one changeset per change is actually also easier to implement.

Sorry, but this isn't an argument.

And if you just group one feature (e.g. Add building levels) to one changeset?

Your app is just fine ... for single dedicated tasks. But if you have 500 tasks about "add building levels" in your village, it doesn't really make sense to atomic commits

@westnordost
Copy link
Member

westnordost commented Mar 23, 2017 via email

@it4workflow
Copy link

I know what atomic commits are and what their advantages (conflict handling is a bit easier) are.

But a bulk changeset has it's advantages, too. E.g. metadata (changeset comment, source, created_by, etc.) only one. And in my example of 500 "add building levels" commits, where the metadata of the changeset and the data itself are approximately equal, you pump up the database 499 times with useless informations.

easy stats generation (the number next to the star)

What's the gain of having thousands of stars? This sounds like gamification, what's next, TopTen, Leaderboard and so on? Do we really want to have this in OSM?

@it4workflow
Copy link

Or otherwise, will there be a version for experienced mappers?

And i hope you don't misunterstand me: the app, especially the idea behind (getting things complete) is great, because the workflow is simple. Currently my workflow would be e.g. fieldpapers, surveying, scanning, load and edit in josm and upload the changes ... so a few more steps as with your simple (smartphone ui experienced) app. In the meantime i've also used vespucci and others, but there you have to manually type in so much informations on the small smartphone keyboard .. doesn't really make much fun

@westnordost
Copy link
Member

westnordost commented Mar 23, 2017 via email

@westnordost
Copy link
Member

westnordost commented Mar 23, 2017 via email

@Haikuch
Copy link
Author

Haikuch commented Mar 23, 2017

@westnordost fyi: https://forum.openstreetmap.org/viewtopic.php?id=57783

@westnordost westnordost reopened this Mar 23, 2017
@woodpeck
Copy link

Hm, apparently people can easily generate several changesets per minute (complaint on German forum here: https://forum.openstreetmap.org/viewtopic.php?id=57855&action=new). This impacts the ability of the community to do quality control; the idea of changesets is to combine several similar edits into one. I don't quite buy the "easier to handle conflicts" - are you aware that it is possible to open ONE changeset, then upload several diffs, then close that changeset again? If some uploads fail, that doesn't mean the whole changeset fails. That should be simple enough to implement, and then group several edits in the same region with the same task type.

It may be a little more programming effort, but you don't want a community that rejects edits from StreetComplete because the application is considered "hostile". Playing friendly can be well worth the effort.

@Nakaner
Copy link

Nakaner commented Mar 30, 2017

woodpeck wrote:

I don't quite buy the "easier to handle conflicts" - are you aware that it is possible to open ONE changeset, then upload several diffs, then close that changeset again? If some uploads fail, that doesn't mean the whole changeset fails. That should be simple enough to implement, and then group several edits in the same region with the same task type.

In addition, you can first do a HTTP GET call on /api/0.6/[node|way|relation]/#id to get the current version on the server. If you do so, the possibility that an error 409 (conflict) is returned when doing the HTTP PUT for this object is very low and conflicts will only occur if another user is modifying it at the same moment. This is a ugly workaround.

@NopMap
Copy link

NopMap commented Mar 30, 2017

What is the gain here that justifies to invest the effort to change the working implementation
away from easy conflict handling, easy stats generation (the number next to the star) and
atomic commits?

Actually, the question is rather what is the damage that your application is doing to OSM in order do avoid a little more work for StreetComplete to behave correctly - like every other editor. One could say that you are abusing the change set mechanism - at least you are not using it in the way it is intended, grouping some connected edits together.

By creating a CS for every edit you are disrupting the mechanics of OSM and all tools which are evaluating change sets, tracking changes or comparing different versions. This has been noted by the community and within 7 days your editor has already triggered two forum topics with complaints (already linked above in this issue).

When writing a tool for OSM it is wise to follow the intended patterns of OSM and avoid breaking existing tooling and mechanisms. As woodpeck said, sometimes there was a need to prohibit abusive applications from causing damage to OSM. This had to be done for edits with copyright infringements, tools with excessive accesses to the API and some mass downloaders. But I don't think that you want StreetComplete to join that list.

@westnordost
Copy link
Member

westnordost commented Mar 30, 2017

If some uploads fail, that doesn't mean the whole changeset fails.

Ah. No, that escaped my mind. Well, that makes it a lot easier to implement, actually. (It is still a bigger feature though, from the perspective of a one-man project).
It seems the notion of changeset in OSM is also a different one than in version control systems, I understand that one change uploaded within one changeset is still applied immediately?

The argument about swamping QA tools is certainly a valid one, as the (bulk of) QA tools are built around the assumption that any changeset will have a certain information content, as @NopMap explained. This doesn't hold true currently for StreetComplete.
I reckon that many people are working with the QA tools as they are now, I find it comprehensible that people who use these tools fear to be massively hindered in their work.

So, don't worry, I am on your side, guys. It just takes good arguments, not threats, to convince me.

So, regarding solving this issue, there are several things that should be discussed. I am open for input.

Currently, answered quests are "fire and forget". As soon it is answered, it is uploaded, done. The user neither has to worry about when/if his change gets applied nor somehow look after that it will be applied.
When the changes should be bulked together, it doesn't make sense to only do that for quests that have been answered offline, but it should be done for all quests. This raises the following problems:

  1. How many changes should be collected (or time pass) until the app closes the changeset? Should only quests of the same quest type be put together in one changeset or everything the user is putting in? If everything is put into one changeset, then the commit message would probably be "survey with StreetComplete" rather than "Added street surfaces". Question to the QA people: How much worth to you are commit messages that explain the nature of the changes? Worth more than having bigger changesets, or less?
  2. With this mechanic and assuming that mobile network might be very unstable (it's not only about Europe, after all), there would be a lot of open unclosed changesets around, because the app wasn't able to close the changeset due to network coverage. (Also, the user might decide to go offline without warning any time, the app cannot foresee that). I understand a changeset will be closed automatically after 24h, but in this time, no changeset comments can be added. This also affects QA, so you realize that, right?

@Nakaner
Copy link

Nakaner commented Apr 2, 2017

westnordost wrote:

  1. How many changes should be collected (or time pass) until the app closes the changeset? Should only quests of the same quest type be put together in one changeset or everything the user is putting in? If everything is put into one changeset, then the commit message would probably be "survey with StreetComplete" rather than "Added street surfaces". Question to the QA people: How much worth to you are commit messages that explain the nature of the changes? Worth more than having bigger changesets, or less?
  2. With this mechanic and assuming that mobile network might be very unstable (it's not only about Europe, after all), there would be a lot of open unclosed changesets around, because the app wasn't able to close the changeset due to network coverage. (Also, the user might decide to go offline without warning any time, the app cannot foresee that). I understand a changeset will be closed automatically after 24h, but in this time, no changeset comments can be added. This also affects QA, so you realize that, right?

There is no best answer how large changesets should be. Ask four people and you will get five different responses.

The following is my personal opinion. If the size of the changesets is similar to the size of those generated by MAPS.ME, it is ok. If the edited objects are in the same area, the number of changes per changeset might be larger. Have a look how large the bounding boxes of changeset of normal mappers using iD or JOSM are. Other people might have different opinions so, don't hard-code any thresholds, use variables set in one single source code file to make changing them easier. I currently don't expect you to expose these settings to your users.

Please fix this issue ASAP. People have started writing changeset comments to all the users of your application to stop using it because it disturbs the way quality assurance at OSM works. Here are a few examples:

You can find a full list of these comments here.

Sorry for being harsh but if QA is hindered, I am not amuse anymore. If I see the need to prevent further disturbtion by your application, I will not ask the DWG to block your users because it is not their fault. Instead, I will ask DWG to ask the OWG to block your application in a manner similar to this.

If you don't see any progress in this issue within 48 hours, I will ask the DWG/OWG to disable the access for your application to the OSM API until the release of 0.6.

You are welcome to invite the community at the German forum for doing beta testing with an improved version using the Dev API.

@ax3l
Copy link

ax3l commented Apr 2, 2017

@westnordost I know it sounds hard, but since you are still running a public beta it might be a good idea to add all releases into your blacklist until 0.6 is finished? You are well prepared from that side :)

@ax3l
Copy link

ax3l commented Apr 2, 2017

One idea could be to integrate it as a user-workflow:

  • a user opens the app, show a text "the best way of mapping in OSM is to do a ground survey, add similar regions and submit your changes at the end of a tour/day at once so the community can review it at a glance. -> >>Start a new tour<<"
  • user maps as now, stored all locally
  • new feature: you allow editing newly added, non-comitted meta tags, something that could prevent typos to get commited. just show a list of new entries for that
  • add a new button: "end mapping tour" which asks the user to write a commit message with a help message as known from the Web "iD" editor stating how to phrase a changeset message

I think it's definitely no problem to let users keep a single mapping tour open for several days and decide on their own when to submit it. since submitting takes a little time to write a sentence or two, they (we) won't spam it anymore.

for convienience, you can still add an "auto-commit every 24hours" mode with a generic commit message stating it was auto-commited from the last 24hrs.

@westnordost
Copy link
Member

There is no best answer how large changesets should be. Ask four people and you will get five different responses.

Hm, you are basically saying, however I implement it, there will always be people who scandalize how it is implemented. This is not too encouraging. :-(

Anyway.

The upload process is a critical section of the app and the implementation of the feature request adds complexity right there. It would be irresponsible to hastily bring out a new version, as unsolved bugs there can lead to much worse things than clumsy changesets. Such as the very ticket you link, which actually destroys data.

Proper implementation and testing cannot be done under time pressure, lest in two days. So, I suggest you open a ticket at DWG right away so they can discuss whether this behavior warrants a block until the feature is implemented or not. While I see that it is important that this feature is done, I do not share the same opinion as you about the criticality of it. Anyway, I leave this decision up to DWG. Perhaps, within that discussion, they could also say how they'd like the changesets to look best. It doesn't make sense that I implement one thing now, and that is still not optimal.

As @ax3l mentioned, I implemented an own mechanism to block versions, so users would get a message that is a bit more informative.

@westnordost
Copy link
Member

I think it's definitely no problem to let users keep a single mapping tour open for several days and decide on their own when to submit it

Changesets are force-closed automatically after 24h and after 1h if no further changes have been uploaded.

If there will be no further input, I think I will do it in a way like @ax3l suggested, only simpler. I do not want to expose users to the notion of changesets, changeset comments etc. if not necessary:

  • open a changeset (if none is open yet) for a change, upload it right away, but don't close it
  • collect changes in this way for a maximum duration of 30m (fixed duration can be changed later, but max is 1h, see above), then close the changeset, regardless of how many actual changes were inside
  • always put "survey with StreetComplete" into the changeset comment. Reasons:
    • With a bulk upload, different quest types cannot be distinguished anymore unless each different quest is uploaded as an own bulk, again multiplying the amount of changesets, the more different quests there are (there will be many). It can very well be, that there are then still many changesets with only one change inside. I reckon this is unwanted (, duh!)
    • The people doing the QA seem to be more interested in where a changeset is coming from (a survey, adding something from satellite etc) than what exactly has been changed, as this is the normal "format" in which JOSM/iD changesets come in. I assume they do then look into the changesets and that the QA tools can show these changes in a clear manner
    • it's less complex to implement. Less complexity in the interface between the app and OSM Api is good, because there is less potential for problems

@Nakaner
Copy link

Nakaner commented Apr 2, 2017

@westnordost wrote:

Anyway, I leave this decision up to DWG. Perhaps, within that discussion, they could also say how they'd like the changesets to look best.

I don't expect that the DWG will tell you what's the best way and how large your changesets should be (@woodpeck's posting above was not marked as a posting on behalf of DWG). That's not how they do their work. Their task is to ensure that people treat fair each other, no insults happen, edit wars are solved. They usually don't tell you how to edit data and which tagging scheme to use.

Although, I forwarded this issue to DWG to decide whether a block is reasonable or not.

@ax3l
Copy link

ax3l commented Apr 2, 2017

different quest types cannot be distinguished anymore unless each different quest is uploaded as an own bulk

be aware that from the OSM community point of view, this is not relevant anyway as you stated in the points below.

for others: one of the features of this app is to count the number completed quests of a user (number of added OSM tags). We need to find an other way to get that from the accumulated data then, so he first tries to get the according changesets (with "survey with StreetComplete" in its comment) for a user, then counts number of changes within.

I would still suggest to use the changeset's created_by attribute instead of relying on the changeset comment to contain a string.

@westnordost
Copy link
Member

for others: one of the features of this app is to count the number completed quests of a user (number of added OSM tags). We need to find an other way to get that from the accumulated data then, so he first tries to get the according changesets (with "survey with StreetComplete" in its comment) for a user, then counts number of changes within.

Sorry, I don't know what you want to say. In German, perhaps?
created_by is used, as is source (=survey)

@ax3l
Copy link

ax3l commented Apr 2, 2017

that last part is a note for others/OSM people that follow your thread so they know why you are interested in "number of edited tags" of a user (aka "quests") - this is something not of interest for them but they need to understand your reasoning.

@westnordost
Copy link
Member

Ah, misunderstanding. I meant "cannot be distinguished anymore" in the sense that people reading the history cannot see at first glance what was being changed.

@ax3l
Copy link

ax3l commented Apr 2, 2017 via email

westnordost added a commit to westnordost/osmapi that referenced this issue Apr 2, 2017
@derfasthirnlosenick
Copy link

derfasthirnlosenick commented Apr 2, 2017

If I'm not mistaken, OSMAnd caches any edits and then just opens, fills and immediately closes a changeset on upload. This might be a viable route to go here as well, esp. if network connection is a concern.

edit: once that functionality is in place, there's plenty of flexiblity on how you aggregate edits into the changesets (e.g. by time, type, locale, ...)

@ax3l
Copy link

ax3l commented Apr 2, 2017

Yes, as I described above one could go exactly the same way by exposing a list of closed quests (editable until upload) and even add a changeset summary on upload if the user wishes so.

@westnordost
Copy link
Member

@derfasthirnlosenick Now you could test it if you like. For me, that's it for today. Only did some very rough testing so far.

@derfasthirnlosenick
Copy link

@westnordost Had some crash issues with the debug build (mostly when searching for quests), release seems to work - no idea why.
Changeset creation works like a charm, incl. split by type of quest (tried it out with opening hours and street surface) Also switching between quest-types works. Will test how the auto-closing behaves. Great job!

@westnordost
Copy link
Member

westnordost commented Apr 6, 2017 via email

@derfasthirnlosenick
Copy link

derfasthirnlosenick commented Apr 6, 2017 via email

@westnordost
Copy link
Member

westnordost commented Apr 6, 2017 via email

@derfasthirnlosenick
Copy link

recompiled in debug - can't reproduce the crashes. odd (but not really bad news). will try again later.

@westnordost
Copy link
Member

https://forum.openstreetmap.org/viewtopic.php?id=57943

@ax3l
Copy link

ax3l commented Apr 7, 2017

@westnordost I don't understand why you want to push grouped by changed tags. Just for the changeset message? :) Afaik, the idea about change sets is to adjust, e.g. a building or buildings of an area with several tags at once and then upload those collectively. You could do the same by aggregating your current changeset message(s) into one.
With the number of available quest increasing in the future, your current 0.6 strategy would just result in a unnecessary high number of segregated changesets even if people do a survey in a very localized area (for let's say twice 20min a day).

@Hedaja
Copy link

Hedaja commented Apr 8, 2017

I think what @ax3l wanted to is, is the same I thought when I read your post about changeset commons.
I wouldn't group uploads by the different tasks but make one changeset for a certain time (maybe area). The comment then could start with "created by StreetComplete" and than followed by an automatically generated list of things changed
Just like you do for each change already.

For example: added 3 building levels and one road surface would result in a list with
"- added levels to building(s)

  • added surface to road(s) "

But the app seems like a really fast way to add certain things. I'm looking forward to use it ones the changeset are fixed =)

@westnordost
Copy link
Member

westnordost commented Apr 8, 2017

Phew, actually you are a bit late to the party.
The implementation is finished, I want to wrap this up now.

I spent this week implementing the solution I proposed in the linked German forum thread, after asking if anyone had a strong reason to oppose it. Now that the app is (about to) use the changeset mechanic as it is intended, everything else is a matter of taste. I do not intend to discuss and topple over the whole implementation again because there are people who prefer to have it all in one changeset now.
Obviously, there are reasons and thus (will always be) people that speak out for either way of doing it, but after much deliberation I chose to segregate the changesets along quest types.

Just so that you better understand my decision, here are some reasons for segregating changesets along quest types:
Changesets are inherently segregated and identifiable by location (bbox), by time, application and by data source already with the associated metadata and tags.

  • If the meta information mentioned above is already present, i.e. "created by SC" as comment is duplicate information. But there is no other meta data available, other than the type of the quests.
  • There is also otherwise no added information value from grouping changesets only by their time, because the time is already inherently part of the changeset. The quest (solutions) are not connected to each other in any way, meaning, it is irrelevant that they have been solved on the same survey (i.e. it is irrelevant for comprehending the changes whether I once added opening hours while otherwise adding street surfaces or not)
  • Thus, The only gain I see from grouping all solved quests within a survey together is to have less changesets. But it is not the goal to have as few changesets as possible, but to lay them out in a meaningful, descriptive way. The type of changeset that must be avoided are those that seem to hide certain (very different) changes within a larger set of harmless changes. Not saying that would be the case, my point is that the goal should be to have as meaningful changesets as possible, not as few as possible.

Ok, I talked too much about why "as few changesets as possible" is not the golden way. As promised, my reasons for segregating changesets along quest types:

  • The type of change can be told apart on first glance by people doing QA: they can focus on checking those quests that are known to sometimes be misunderstood and/or critical things such as street names
  • Possibility of programmatically and reliably reconstruct what changes (what =quest type) any user exactly did from the OSM change history alone, without the need to either add "proprietary" tags to elements (god forbid) or somehow transfer analytics data to a third party. A must for statistics and (thus) finding systematically introduced errors (don't need to be bugs, may be systematic misunderstandings).

OK?

@ax3l
Copy link

ax3l commented Apr 8, 2017

Thank you for posting the reasoning! If nobody in the forum subjected then that should be fine indeed.
I just wanted to share an alternative approach in case this workflow would still create "too many changesets" which would be similar as if I would do a manual survey on the ground. Anyway, I am really looking forward to the next release, thanks a lot! ✨

@westnordost
Copy link
Member

Thank you, Axel.

Almost forgot: 0.6 is out now, get it while it is hot!

@ax3l
Copy link

ax3l commented Apr 8, 2017 via email

@westnordost
Copy link
Member

No idea, we'll find out.

@ax3l
Copy link

ax3l commented Apr 9, 2017

I dig a bit into it and opened a PR to their repositories: https://gitlab.com/fdroid/fdroiddata/merge_requests/2170 (the metadata-bot of f-droid finds it too, but seems to visit it with quite a large interval)

@westnordost
Copy link
Member

That's cool, thank you!

@ax3l
Copy link

ax3l commented Apr 14, 2017

btw: the fdroid repo did build and distribute the 0.7 release yesterday

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants