Slow /map call #135

Closed
Komzpa opened this issue Dec 16, 2016 · 15 comments

@Komzpa

Komzpa commented Dec 16, 2016

OpenStreetMap /map call is slow.

This causes hatred when editing something: it doesn't load at the speed of the rest of the internet. :)

This call takes 22 seconds and transfer speed is capped at 73 kB/s:

% time curl -H "Accept-Encoding: gzip" "http://www.openstreetmap.org/api/0.6/map?bbox=27.61649608612061,53.85379229563698,27.671985626220707,53.886459293813054" > /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1142k    0 1142k    0     0  52213      0 --:--:--  0:00:22 --:--:-- 73027
curl -H "Accept-Encoding: gzip"  > /dev/null  0,01s user 0,01s system 0% cpu 22,404 total

I'm on a 100 Mbit/s optical connection and have checked this several times.

@tomhughes
Member

Seriously. It's not like we make that call deliberately slow or rate limit it; it just has a lot of work to do to find the data you want.

I'm sorry, but unless you have some magic bullet we're not aware of, there's nothing we can do here.

@bhousel
Member

bhousel commented Dec 16, 2016

@Komzpa That is a fairly large area to request for an editor. It looks like about 3 km on a side, or about a z13 tile.

For comparison, iD uses z16 tiles (roughly 350 m on a side) when we perform map calls. Even those are starting to get slow in dense urban areas. You'll surely get better performance if you request smaller map tiles in parallel.
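
As a rough illustration of the tiling suggestion, here is a minimal Python sketch that splits the bounding box from the report into z16 tiles using the standard slippy-map tile formulas and builds one /map URL per tile. The function names (`lonlat_to_tile`, `tile_bbox`, `tile_urls`, `fetch_all`) and the pool size are assumptions made for this example; this is not iD's actual code. `fetch_all` is defined but not called here, since it would hit the live API.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

# Endpoint taken from the report above.
API = "https://www.openstreetmap.org/api/0.6/map?bbox="


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) slippy-map tile containing a WGS84 point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y


def tile_bbox(x, y, zoom):
    """Return (min_lon, min_lat, max_lon, max_lat) of a tile."""
    n = 2 ** zoom

    def lon(tx):
        return tx / n * 360.0 - 180.0

    def lat(ty):
        return math.degrees(math.atan(math.sinh(math.pi * (1.0 - 2.0 * ty / n))))

    return lon(x), lat(y + 1), lon(x + 1), lat(y)


def tile_urls(bbox, zoom=16):
    """Build one /map URL per tile touching bbox (min_lon, min_lat, max_lon, max_lat)."""
    min_lon, min_lat, max_lon, max_lat = bbox
    x0, y0 = lonlat_to_tile(min_lon, max_lat, zoom)  # top-left tile
    x1, y1 = lonlat_to_tile(max_lon, min_lat, zoom)  # bottom-right tile
    urls = []
    for x in range(x0, x1 + 1):
        for y in range(y0, y1 + 1):
            urls.append(API + ",".join(f"{v:.7f}" for v in tile_bbox(x, y, zoom)))
    return urls


def fetch_all(urls, workers=4):
    """Download all URLs with a small thread pool (performs real network requests)."""
    def fetch(url):
        with urlopen(url, timeout=60) as response:
            return response.read()

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, urls))


bbox = (27.61649608612061, 53.85379229563698,
        27.671985626220707, 53.886459293813054)
urls = tile_urls(bbox)
print(len(urls), "tile requests")
```

The tiles are snapped to the grid, so together they cover slightly more than the original box. Whether the total time drops, or only the time to first data, depends on where the server spends its time.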

@Komzpa
Author

Komzpa commented Dec 16, 2016

@bhousel "even those are starting to be slow" is the real issue here. Are there statistics on the call's median/p85/p90 response time?

I'm mapping mid-scale features like neighborhoods, and it's becoming really painful at that scale.

The URL in the ticket is for demonstration: this call is still within API limits.

Will splitting the query into 10 make it faster overall? If yes, there's the magic bullet; if not, it's not an improvement in speed, only in latency.

Where exactly are those 22 seconds being spent?

Is it Postgres? (then I'd like to get the slowlog published)
Is it network latency between frontend, backend, and database? (then something could be done about it)
Is it a badly rolled loop that requests each node/way/relation in a separate query, or something similar?

@tomhughes
Member

Why not just read the code if you want to know how it works?

But really, we're not complete idiots, so how about starting from the assumption that, given @zerebubuth wrote a carefully optimised C++ implementation, we might just possibly have given this some thought?

@bhousel
Member

bhousel commented Dec 16, 2016

@bhousel "even those are starting to be slow" is the real issue here. Are there statistics on the call's median/p85/p90 response time?

Some analysis was done years ago during early development of iD. See the thread about adjusting iD's edit zoom level: openstreetmap/iD#1520

@tyrasd and @woodpeck probably know more, but anyway I would not expect fetching z13 tiles from the editing api to be performant.

@pnorman
Collaborator

pnorman commented Dec 16, 2016

there statistics on the call's median/p85/p90 response time?

These aren't useful: when I studied the statistics, about a third of queries returned no data, which skews the results.

@zerebubuth
Collaborator

zerebubuth commented Dec 16, 2016 via email

@Komzpa
Author

Komzpa commented Dec 16, 2016

After looking into the code as @tomhughes suggested, I found the following clues:

  • there are two implementations of the /map call in cgimap: readonly and readwrite;
  • simonpoole on IRC suggested that the readonly one is used in the API;
  • the readonly implementation currently runs a query per object to get that object's tags, making the number of queries per map call almost equal to the number of objects;
  • @tomhughes shared the ping between the database and cgimap servers: 0.2 ms;
  • the API returns 48749 objects in 22 seconds, suggesting a latency of 0.45 ms per object, which is ~2x the ping. That suggests two round trips per object: one to get the object metadata, a second to get the object tags.
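
The arithmetic in the last bullet can be checked with a few lines of Python. The numbers are the ones quoted above; the variable names are only for this sketch, and the "round trips" figure is an inference from timing, not a measurement of what cgimap does.

```python
objects = 48749        # objects returned by the /map call
total_seconds = 22.0   # wall-clock time reported by curl
ping_ms = 0.2          # ping between the cgimap and database servers

# Average time spent per returned object, in milliseconds.
per_object_ms = total_seconds / objects * 1000

# How many database round trips that time would correspond to,
# if round-trip latency were the only cost.
round_trips = per_object_ms / ping_ms

print(f"{per_object_ms:.2f} ms per object, about {round_trips:.1f} round trips")
```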

@pnorman
Collaborator

pnorman commented Dec 16, 2016

If there are issues with the cgimap implementation, tickets should be opened on https://github.com/zerebubuth/openstreetmap-cgimap/issues. There doesn't seem to be anything within the scope of operations here. 0.2ms ping is reasonable, and it doesn't sound like bandwidth is a limitation.

@Komzpa
Author

Komzpa commented Dec 16, 2016

@pnorman it is within the scope of operations to know this limitation of cgimap and plan infrastructure accordingly.

For instance, these round trips can be made much cheaper by moving cgimap onto the database server or replica machine, even without any changes to the cgimap code.

Also, to get even a basic idea of where to dig, you have to start somewhere.

@tomhughes
Member

Well cgimap runs on the replica already - that's why it has to run in read only mode.

In normal operation it is running against a database that is on the same LAN in the same data centre. From an operations point of view we can't do much more than that.

@tyrasd
Member

tyrasd commented Dec 17, 2016

readonly implementation currently does a query per object to get object's tags, making number of queries per map call almost equal to number of objects;

Forgive my possibly stupid question (haven't actually read the code yet), but why does cgimap have to do a query to fetch the tags for each object in the map call? That sounds quite inefficient. Couldn't it be done also by a single query with a join e.g. between the nodes and node_tags tables? Or at least load the tags in bunches of a fixed number of objects at once? 🤔
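
To make the difference concrete, here is a minimal sketch using Python's built-in `sqlite3` with an in-memory database, not cgimap's C++ or the real PostgreSQL schema. The table names `nodes` and `node_tags` come from the question above; the columns, the sample data, and the function names are assumptions for illustration. An in-memory database has no network latency, so the sketch only counts queries; with a 0.2 ms round trip per query, the per-object variant is the one that would add up.

```python
import sqlite3


def make_db(n_nodes=1000):
    """Create a toy database: every third node gets a single tag."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE nodes (id INTEGER PRIMARY KEY, lat REAL, lon REAL);
        CREATE TABLE node_tags (node_id INTEGER, k TEXT, v TEXT);
        CREATE INDEX node_tags_node_id ON node_tags (node_id);
    """)
    db.executemany(
        "INSERT INTO nodes VALUES (?, ?, ?)",
        [(i, 53.85 + i * 1e-5, 27.61 + i * 1e-5) for i in range(n_nodes)],
    )
    db.executemany(
        "INSERT INTO node_tags VALUES (?, ?, ?)",
        [(i, "name", f"node {i}") for i in range(0, n_nodes, 3)],
    )
    return db


def tags_per_object(db):
    """One query for the node ids, then one extra query per node."""
    queries = 1
    result = {}
    for (node_id,) in db.execute("SELECT id FROM nodes").fetchall():
        rows = db.execute(
            "SELECT k, v FROM node_tags WHERE node_id = ?", (node_id,)
        ).fetchall()
        queries += 1
        result[node_id] = dict(rows)
    return result, queries


def tags_joined(db):
    """Fetch all nodes and their tags with a single LEFT JOIN."""
    result = {}
    rows = db.execute("""
        SELECT n.id, t.k, t.v
        FROM nodes n LEFT JOIN node_tags t ON t.node_id = n.id
        ORDER BY n.id
    """)
    for node_id, k, v in rows:
        tags = result.setdefault(node_id, {})
        if k is not None:  # untagged nodes produce a row with NULL k and v
            tags[k] = v
    return result, 1


db = make_db()
slow_result, slow_queries = tags_per_object(db)
fast_result, fast_queries = tags_joined(db)
print(slow_queries, "queries vs", fast_queries)
```

Both functions return the same mapping from node id to tags; only the number of queries differs.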

@Komzpa
Author

Komzpa commented Dec 18, 2016

Couldn't it be done also by a single query with a join e.g. between the nodes and node_tags tables? Or at least load the tags in bunches of a fixed number of objects at once? 🤔

No particular reason this can't be done in a single query. I've prepared a proof-of-concept: https://github.com/Komzpa/fastmap - the /map call in question is executed within 1.3 seconds instead of 22 in that implementation.

@gravitystorm
Collaborator

Well cgimap runs on the replica already - that's why it has to run in read only mode.

FWIW, I think @Komzpa meant to run cgimap on the database replica machine itself, not to run it against the replica from a web-backend.

I don't think that's a good idea though, since it would involve running large processes on already-rather-large-spec database machines, makes it harder to scale out certain components if everything is co-located, and the problems with waiting for multiple backend<->db rtts can be solved in other ways.

Although this issue is closed here, I note that zerebubuth/openstreetmap-cgimap#122 contains followup discussion on the cgimap code, and thanks to @Komzpa for taking the lead on this topic.
