Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use older locations to check whether user has been on site #4947

Merged
merged 14 commits into from
Aug 17, 2023

Conversation

Helium314
Copy link
Collaborator

fixes #4727

Keeps a list of recent locations (within a minute of most recent location update), and uses them for checkIsSurvey.
The most recent one is checked first, so I would not expect any noticeable performance impact for most users.
If a user has been on site recently, a slower check is still considerably faster than clicking ok in the dialog.

Some performance optimizations are possible, but likely not worth the effort. It only could make things worse for users who haven't been on site recently, but even then: the check isn't that slow.

This is not yet tested for the actual intended use case, will do if the PR is acceptable otherwise.

displayedLocation is now unused and could be removed from the listeners.

@matkoniecz matkoniecz marked this pull request as draft April 14, 2023 06:33
@matkoniecz
Copy link
Member

marked as draft per "This is not yet tested"

@matkoniecz
Copy link
Member

matkoniecz commented Apr 14, 2023

In general it looks like a nice idea. Maybe even more than a minute would be nice? But that can be changed later.


For performance: How many positions are there typically? 1000? Millions? AKA, how often onLocationChanged is called?

I think that bigger danger for performance is in constant maintaining such list of recent locations, not in check run on solving quest. But it is kept for GPS trail anyway.

That makes me ask: why separate list of locations is maintained here? Would it be possible to reuse what is kept to draw GPS trail? Or would it be more complicated.

@Helium314
Copy link
Collaborator Author

Helium314 commented Apr 14, 2023

How many positions are there typically?

My device produces at best one GPS location per second, so it would not be more than 60. But maybe more modern devices can do faster updates. Easiest improvement would be ignoring locations close to the previous one as part of isWithinSurveyDistance.

Would it be possible to reuse what is kept to draw GPS trail?

No, because the trail requires accuracy of 20m.

@matkoniecz
Copy link
Member

so it would not be more than 60

Then worrying about performance impact of checking all of them seems not needed

@westnordost
Copy link
Member

Hmm, a few comments regarding this:

  1. this list of recent locations doesn't belong into the the UI code (into the fragments). When these fragments are recreated, boom, it's gone. Saving and restoring the state would only workaround this issue. The next alternative is not to persist the data in the database, I think that would be over the top, but to have a "static" class (i.e., a singleton, we use dependency injection after all) that just keeps a list of recent locations. This very class could also include the logic described in point 2

  2. As noted earlier, sixty seconds is a bit little. It will already be difficult to exit the isWithinSurveyDistance within 60 seconds. The class I wrote about in point 1 could also contain the logic that disregards points fed to it that are too close to itself. (Now, often surveyors go in circles, or a bit back and forth etc., it would be ideal if these could be filtered out as well, but I have no idea how this could be done well, LatLonRaster only works on pre-set bounds)

(3. Location is an android specific data structure but the whole isWithinSurveyDistance is application logic that should be independent of Android. But I realize that the whole implementation here is currently very much Android dependent, so nevermind. Maybe a task for the future.)

@Helium314
Copy link
Collaborator Author

Hmm I see, so there is quite a bit more work necessary than I had expected...
Currently I'm not willing to spend the necessary time on this, so I'll continue a little later, maybe in a few days or maybe weeks or so.

(if anyone is interested, feel free to take over)

@Helium314
Copy link
Collaborator Author

Helium314 commented Apr 19, 2023

The class I wrote about in point 1 could also contain the logic that disregards points fed to it that are too close to itself. (Now, often surveyors go in circles, or a bit back and forth etc., it would be ideal if these could be filtered out as well, but I have no idea how this could be done well, LatLonRaster only works on pre-set bounds)

Current ideas (assuming we want at least 10 m between locations)

  1. when adding a location, remove the previously added location if it's within 10 m of the current one: this is not good if all locations are within 10 m of each other, as then only the most recent one is stored
  2. when fetching recent locations, provide a sequence where a location is added only if it's at least 10 m away from the previous entry: this is better, but will not detect going forth and back
  3. when fetching recent locations, provide a sequence where a location is added only if it's at least 10 m away from all previous entries: best result, but will be slow if there are many locations. May significantly profit from using the approimation https://en.wikipedia.org/wiki/Geographical_distance#Spherical_Earth_projected_to_a_plane, which I think is acceptable here
  4. something like 3., but remove discarded locations from the recent locations list for performance reasons: this might have unintended side effects similar to 1... maybe?
  5. we could (instead or additionally) require at least e.g. 5 seconds between each location, which should work well except maybe when mapping from inside a car.

@westnordost
Copy link
Member

3

Uhh, rather not. Or if at all, then everywhere. And that doesn't change that adding a new latlon would still be O(n) time complexity. Actually, what would work to make time complexity I think O(log n) is to use a quadtree to sort the coordinates (i.e. go one level deeper if more than X points are in the same leaf). But, I don't think it is worth the time it to implement this.

I think the most straightforward solution would be to not add a new latlon to the list of recent locations if it is closer than max-survey-distance/2. Going forth and back - so be it.

@Helium314
Copy link
Collaborator Author

Some performance test with the current version:
Stores locations for 10 minutes, and returns locations that are at least 40 m and 5 sec apart from each other.
Usually only the first location is checked, which is very fast anyway.
With more than 100 locations returned, and checking against a complicated multipolygon building that is too far away (so really all locations are checked), the worst case was ~100 ms.

@westnordost
Copy link
Member

westnordost commented Apr 25, 2023

100ms is definitely far in the acceptable range, since you also tested this on older hardware and this is only done on an UI interaction.

I do wonder if it makes sense to have both an upper limit for minutes and also for number of locations? Just to catch absurdly high number of locations checked when one uses the app on the motorway or something.

@Helium314
Copy link
Collaborator Author

The maximum number of locations for the survey distance check is currently 120 (locations of the last 600 seconds, and at least 5 sec between locations).

@endolith
Copy link

This confirmation dialog is the most annoying part of StreetComplete by far; I wish I could turn it off permanently.

A simpler solution for most of the false positive warnings would be to trigger the warning based on the location I clicked the quest instead of the location I have moved to when I completed the quest.

@mnalis
Copy link
Member

mnalis commented May 28, 2023

@endolith (and other people of course) if you wish to try out how well does this PR solves the issue (to give feedback), you can try this debug APK https://github.com/mnalis/StreetComplete/actions/runs/5101411056 (my fork also changes a few quests slightly to be shown in little more places, but should not affect the experience regarding this issue).

@Atrate
Copy link
Contributor

Atrate commented Jul 23, 2023

Isn't "a list of recent locations" already kind of a thing with the GPS trace that's recorded? #2209

@riQQ
Copy link
Collaborator

riQQ commented Jul 23, 2023

Any reason this is still a draft? All remarks / requests for change seem to be resolved.

@Helium314
Copy link
Collaborator Author

Isn't "a list of recent locations" already kind of a thing with the GPS trace that's recorded? #2209

There was some reason this wasn't usable, maybe because this only records high accuracy locations, iirc 15 m and better. On a train I usually have accuracy much worse than that.

Any reason this is still a draft? All remarks / requests for change seem to be resolved.

"This is not yet tested for the actual intended use case" still applies (I would only ever need that PR on a car or train ride, and still didn't test it there), and that was the reason to convert it to a draft.

@Helium314
Copy link
Collaborator Author

Though @endolith maybe you tested @mnalis' build?

Going through the changes again, I guess there might be issues if you travel faster than ~100 km/h.

@endolith
Copy link

Though @endolith maybe you tested @mnalis' build?

Yeah it seems to be working well.

@Helium314
Copy link
Collaborator Author

@matkoniecz do you consider this tested enough, so this PR can leave draft status?
@westnordost anything that needs to be changed?

Copy link
Member

@westnordost westnordost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am sorry, it seems I missed that you requested my review.

RecentLocationStore should probably rather go into data/ as it has nothing to do directly with the UI (and so does part of CheckIsSurvey but this is not up to you to clean that up 🤷 ).

@Helium314
Copy link
Collaborator Author

I only can remember that I did some performance tests, but not what exactly.
Will go through it again, once I have more time (in a week or two).

@Helium314
Copy link
Collaborator Author

RecentLocationStore should probably rather go into data/...

Just in data/, or some subfolder?

@matkoniecz
Copy link
Member

do you consider this tested enough, so this PR can leave draft status?

Yes

@matkoniecz matkoniecz marked this pull request as ready for review August 14, 2023 05:31
@westnordost
Copy link
Member

westnordost commented Aug 14, 2023

It should have its own folder in data/.

Hm, did you not read my comment?

With the current implementation, not only is the whole list of recorded position copied each time a change is made (even for positions that are basically on top of each other), but these calculations need to be done on each edit. And this on the UI thread IIRC, or at least, blocking further interaction on the UI thread, which should come down to the same.
With time the complexity being O(n²), I'd want to keep the n as small as possible.

Also, I noted another possible performance issue. Once the first locations are old enough for

        recentLocations.removeAll {
            it.elapsedRealtimeNanos <= location.elapsedRealtimeNanos - LOCATION_STORE_TIME_NANOS
        }

to do anything, it will be done on every adding of a new location (each second?). I.e. each second, the whole array is being re-created (because removing element(s) not at the end of the array does that). Re-allocation of a new array is probably cheap, but it could be avoided entirely by changing from a time-based cut-off (10 minutes) to a number-of-locations-based cut-off and using a ring buffer as the underlying data structure instead of an arraylist.
Thinking about it, it is then probably not even necessary to synchronize it on add and getting a reverse iterator and also not necessary to copy the list.

@Helium314
Copy link
Collaborator Author

Hm, did you not read my comment?

Which one? The previous commit was done to address:

Are you sure that this isn't premature optimization, i.e. actually a performance bottleneck? Also, I'd prefer if "FlatEarthMath" (if it is really necessary) wasn't scattered in the code in some private functions but be in a FlatEarthMath.kt and subsequently also tested properly (e.g. 180th longitude)

Now the whole checkIsSurvey, which is suspending because distanceToArcs is slow, should be about 10 times faster. So the worst check time I found (100 ms, and this is already acceptable) should be reduced to 10 ms (did not test on the phone). Or maybe let's say 20 ms if copying the list and other things are really slow.
So if this is the comment you're referring to, I don't understand necessity of further optimization of get. You even questioned the attempt of reducing the 100 ms by ~10% as premature optimization.

Or do you mean the testing part? Isn't it enough that for the planned use case the difference to the spherical counterpart is always less than 1 m?

Or do you mean

Couldn't you do this check in add instead? (Should be faster, too)

I could do this, but then again this strikes me as premature optimization, as you already said the measured 100 ms are acceptable (done without copying the list using reversed, but the time spent copying/reversing shouldn't nearly come close to 10 ms).

With the current implementation, not only is the whole list of recorded position copied each time a change is made (even for positions that are basically on top of each other), but these calculations need to be done on each edit. And this on the UI thread IIRC, or at least, blocking further interaction on the UI thread, which should come down to the same.
With time the complexity being O(n²), I'd want to keep the n as small as possible.

This again is something I don't really understand. Performance even for complex geometries and on old phones is acceptable, but now should be optimized more?
Also I'm not sure which part you're referring to with O(n²)

Also, I noted another possible performance issue. Once the first locations are old enough for [...]

While I agree that things could be optimized, again there is the question why, as the performance was acceptable already before optimizing the by far slowest part.
Though I changed the removeAll to only check the oldest location(s), as this is 20 times faster according to tests, and at the same time add is called much more often than get.

So in summary, I'm quite confused by the reason for demanding further optimizations. Do you plan to make changes like increasing LOCATION_STORE_TIME_NANOS, or other things where those optimizations actually become relevant?

Anyway, all this feels like it's going to to eat more time, and like it's not going to end anytime soon... maybe @mnalis or someone else who is actually interested in this feature wants to take over?

@westnordost
Copy link
Member

Sorry, it looks like you poured much more thought and time into it than I. I wasn't aware that you already tested the different approaches in terms of time. Also, it seems it came across as demanding you change it. That was not my intention, I was just not sure whether you saw the comment I linked to because I answered after you collapsed the thread as resolved.

With 100 positions you came out with the 100ms, so for a session that runs 10 minutes or more, it will be ~120ms. This would be a noticeable delay for the user but I understand the move to the flat earth math made something like 10 times faster, so this is completely fine, then.

Also I'm not sure which part you're referring to with O(n²)

More like O(n*m), Checking if a way of n segments is within survey distance of last m positions.

Will review now.

@Helium314
Copy link
Collaborator Author

More like O(n*m), Checking if a way of n segments is within survey distance of last m positions.

Thanks for clarifying, I had the impression you meant n² with n positions. Large number of segments definitely is what has most potential to slow down things, but it should be much better now.

With 100 positions you came out with the 100ms

Actually this is (of course) on my S4 Mini, and happens only for complex geometries (many arcs for distanceToArcs) and if all checks result in "too far away". I don't have the logs any more, but single digit ms was the common case.
I'll do another "real life" test soon, including logging details like number of segments.

Also, it seems it came across as demanding you change it

That was my feeling, thank you for clarifying.

Assuming the test doesn't reveal any issues, the current approach should be fine for the planned 10 minute storage. But in case you want it extended for longer times / only limit to a number of locations, your suggestion is definitely preferable.

Helium314 and others added 2 commits August 16, 2023 15:36
test updated to also cover this case
Co-authored-by: Tobias Zwick <newton@westnordost.de>
@Helium314
Copy link
Collaborator Author

The survey distance check for 100 locations and a building with 70 segments takes around 35-50 ms. Not as much improvement as I had expected, but I don't know which building I had tested on previously. And maybe the debug version is slower than the release version.

@westnordost
Copy link
Member

Yeah, the debug version is definitely slower. Does sound very good though, I assume only a tiny amount of elements will have that many segments.

@westnordost westnordost merged commit 4c95e64 into streetcomplete:master Aug 17, 2023
@westnordost westnordost deleted the on_site_check branch August 17, 2023 21:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Less zealous "checked on-site" dialogue
7 participants