Adding API key support for tile.osm.org? #342
The fundamental problem is that it requires a distributed database accessible and updatable by 40+ tile caches. Then there's the question of how much extra CPU and I/O overhead will be involved in validating each request and in updating access counters, and how much that will reduce our total tile serving capacity by. I know you're going to say that is not what you are proposing, but I don't think that a scheme that relies on log processing is workable - it would take in excess of 24 hours to block and unblock keys as things stand, and it's not clear to me that there is an easy path to reducing that.

There is also the problem of dealing with the inevitable support load that it will produce - we already effectively ignore the vast majority of enquiries related to tile serving, and API keys will only increase that workload.

After all of that, will it even work? I don't have any experience of running these things, but I would assume that the API keys leak like crazy... Especially the OSM one, which will be the obvious one to "borrow" and which will presumably be unlimited.
I'd like to see API keys, but we need plans for handling the engineering, operations, and support load from it.
If we go for API keys, then the Nominatim API should get them as well. Personally, like @tomhughes, I dread the additional engineering and support workload. Happy to be proven wrong though.
API keys could be limited to a certain referer or user agent?
Speaking as someone who runs a service with key-based authentication, I can tell you there are many people who will gladly register 10, 20, 200 accounts to get more keys and not have to spend money on a commercial service. People will write software to get the keys for them (i.e. register fake accounts), or get large groups of people (an entire university class, etc.) to register. Requiring keys is certainly more of a barrier than nothing, and it at least can prevent accidental high usage from well-intended users, but it alone does not solve the problem of abusive users.
@freyfogle definitely an issue, but I wonder if we can estimate how much of an issue deliberate abuse is for OSM. My vague impression from chatting about it before is that there are several high-volume users who are well-intended. Also, if there is abuse, that can lead to domain-level blocks. @grischard API keys could be limited like that; it all depends on implementation.
@tomhughes makes a key point. I'm asking around about how this is managed.
Free tiles are like banner ads for OSM. I’d be much more interested to see capacity raised and access preserved than limits put in place. Is there any background reading on needs for tile auth that you could link here to give this idea a little more context?
Yep, there will need to be some way of aggregating the access counts. If we don't want to do log aggregation and count from access logs, we could add the counting to the blocking service. It could aggregate and periodically write to the database.
This is a valid concern, but it seems that there's plenty of spare CPU on the edge caches right now. The API key check would boil down to parsing the HTTP request text (to get query string + headers) and an in-memory map lookup to check if the API key is blocked. If we aggregate on the edge cache, we'd have to store several integers per API key seen in the last N minutes by this edge, but unless we have millions of different API keys per edge I doubt it'll be a problem. If it takes up too much memory we can increase the frequency of the aggregate + dump step.
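To make the memory/CPU trade-off concrete, here is a minimal sketch of the edge-side piece described above: an in-memory set of blocked keys checked per request, plus per-key counters that are flushed on a timer. All names here are hypothetical, and the upload step is left as a stub.

```python
import threading
import time
from collections import Counter

blocked_keys = set()   # would be refreshed from a central service
counters = Counter()   # api_key -> requests seen since the last flush
lock = threading.Lock()

def allow_request(api_key):
    """Per-request work: one set lookup plus one counter increment."""
    if not api_key or api_key in blocked_keys:
        return False
    with lock:
        counters[api_key] += 1
    return True

def flush_counters(period_s=300):
    """Dump per-key counts every few minutes; shortening the period
    bounds the memory used by `counters`, as noted above."""
    while True:
        time.sleep(period_s)
        with lock:
            snapshot = dict(counters)
            counters.clear()
        # send_to_aggregator(snapshot)  # hypothetical upload to the central DB

threading.Thread(target=flush_counters, daemon=True).start()
```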
I think you could continue to ignore the vast majority of support requests. In fact an API key self-service page might reduce support requests because there will be a process for how to use the tiles. Perhaps when you create an account or an API key you have to agree to the tile usage policy, helping clear up some questions? Additionally, we could give more people the ability to adjust limits through the API Key tool to alleviate support request burden from just a few people.
The OSM one would be limited to OSM referrers and could be rotated on a periodic/automatic basis. Tile scrapers and other bad actors will surely find a way around it, but my assumption is that the majority of unwanted traffic comes from people setting up Leaflet and pointing to tile.osm.org for lack of a free alternative.
Yea, good point. The API key check could sit in front of any HTTP service and act as a reverse proxy to anything that handles HTTP requests. It could be configured at startup time with the name of the service it's limiting, so that aggregation in the central DB happens on a per-service basis.
Yep, the referrer, origin, and user-agent headers are all sent during the request and could be checked at request time.
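As a rough illustration of both points, here's a hypothetical sketch of such a reverse proxy: it is started with the name of the service it protects (for per-service aggregation), checks the key and the Referer header at request time, and forwards allowed requests upstream. The `apikey` query parameter, the endpoints, and the per-key allowlist are all assumptions, not a committed design.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse
from urllib.request import Request, urlopen

SERVICE_NAME = "tile"               # reported with aggregated counts
UPSTREAM = "http://localhost:8080"  # the HTTP service being protected
ALLOWED_REFERRERS = {"example-key": {"openstreetmap.org"}}  # per-key config

class KeyCheckProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        key = parse_qs(urlparse(self.path).query).get("apikey", [None])[0]
        referer = self.headers.get("Referer", "")
        allowed = ALLOWED_REFERRERS.get(key)
        if allowed is None or not any(host in referer for host in allowed):
            self.send_error(403, "missing, blocked, or misused API key")
            return
        # Forward the unmodified request to the protected service.
        with urlopen(Request(UPSTREAM + self.path)) as upstream:
            body = upstream.read()
            self.send_response(upstream.status)
            self.send_header("Content-Type",
                             upstream.headers.get("Content-Type", "text/plain"))
            self.end_headers()
            self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8000), KeyCheckProxy).serve_forever()
```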
Yea, this is something to think about. If this becomes a widespread problem, there are various levers we could pull to make it a little harder: switching to login with an account that takes more work to create (Google, GitHub, etc.), requiring phone number verification, etc.
Well, I assume in this case it's that @iandees, as an American, gets poor service because we have very limited capacity in North America, especially since we lost one of our caches there a few weeks ago.
😄 Actually it's been noticeably faster after Grant switched over to Nginx caching.
I agree that free tiles are a great advertisement for OSM. Maybe I'll open another ticket about increasing capacity (especially in the US), but my feeling is that we're at the limit of what we can support in a purely volunteer-time/donated-hardware scenario. I'm not aware of any prior reading on this topic. I think we're creating it right now! 🎉
If you think there's lots of spare CPU capacity then you're clearly being selective in which caches you look at - it varies massively. It's also changed recently: in the last week we have gone multithreaded on squid, which has most likely increased CPU usage.
My concern is that there will be a whole new class of requests, namely begging emails asking for capacity increases, expedited releases of blocks, etc, etc. It will all be "you're ruining my mapping party" or "people might die" or...
I'm not sure referer checks are very useful - everybody fakes osm.org as the referer already.
Many of the requests have no, or very poor, referer and/or user agent information, even though that's already against policy. In particular, mobile app traffic will rarely have any referer and often has very poor user agent information.
I don't think requiring a third-party account would go down very well, and I have no idea how you expect us to do phone number verification even if we thought it was a reasonable thing to require. In any case, gmail accounts are like the one thing every spammer has by the thousand...
Sigh, broken record mode: first the OSMF needs to decide what the intended audience for the tiles is and whether that needs expanding or reducing (hint: the load is not caused by people editing), what kind of service level it wants to provide, and how much it's allowed to cost. Then we can decide if API keys are a suitable and efficient way to achieve whatever the goal is.

PS: @zerebubuth had some in-depth stats on tile usage way back; IIRC the conclusion was that while getting rid of the large users was relatively easy and would lead to short-term relief, in the end the long tail is becoming more and more of a burden (and that was before Google's recent price hike).
Instead of an exasperated response, could you point out where previous discussion on this topic has happened? I don't really want to get into the politics, but I disagree that the OSMF needs to decide anything new. There is already a tile usage policy, and this would be a way to enforce it.
There are discussions on the topic spread everywhere; for example, Darafei Praliaskouski (@Kompza) had expanding tile server capacity as part of his election platform when he stood for the board.
As a provider of 2 cache servers and 1 render server (the OSMF provided the SSD drives), I have mixed feelings about implementing API keys.
My recollection is that, while there are usually a few heavy (ab)users to ban at any one moment in time, the list changes day-to-day and week-to-week. Keeping on top of that is a considerable effort. Basically, it's a game of whack-a-mole (see #113). It seems to me that there are two separate issues here:
While I think the latter would help, my reckoning is that it'd be at best a constant-factor improvement, and we'd be back here a little later when usage from deserving users eventually exceeds capacity.

My personal view is that we need to be able to add capacity in regions where there aren't enough generous donors forthcoming, which means spending OSMF donors' money in a way that we haven't in the past (currently, all tile caches are donated). Since this is a cost that could potentially grow without limit, I don't think it's unreasonable to ask the board what it thinks (although it might be worth constraining the options, so that we don't get something unworkable in response).

Having said that, I also think that API keys would be a good idea, if only to be able to more easily track and attribute usage to accounts, rather than the pile of hacky

(As an aside, I'd be really interested in whether people who've run large services protected by API keys, and perhaps referrers, see a lot of API key cloning and referrer spoofing? People certainly spoof the User-Agent and Referer headers for requests to

Finally, if anyone is reading this and would like to donate a tile cache in North America -- thank you! And please get in touch with
...
The tile usage policy essentially only outlaws an undefined (for good reason) "heavy use". As I pointed out, @zerebubuth showed that the longer-term issue (and this was 2-3 years back) is the long tail, where none of the users is remarkable. So either we start lowering the volume that in our understanding is "heavy use" (something that a number of people don't want for marketing reasons), or we try to do something else. But having a crisis every 6 months doesn't make any sense at all.
...
There are also other considerations, for example competing with commercial providers, which will become much more critical once we start providing vector tiles. I can't see how muddling along without a clear plan has any life left. (I'm naturally not suggesting that the board should decide this in isolation; that's just a way of saying that the project needs to make its mind up, facilitated by the board.)
One of the issues I have at the moment is that there is no realistic method for me to contact a heavy user. E.g.: let's say I find example.com is a heavy user; who do I contact? At the moment the best I can do is try to find a reasonable contact at example.com and email them. Apps are even more difficult, especially if they have a poor or faked user-agent.
Hi, I can only speak for our geocoding service, which is obviously a different use case than tiles - not least as it is usually not run as a publicly-visible service. We have seen stealing of keys, but the bigger issue is people registering many (at times hundreds of) accounts to get the free usage tier on each. These efforts range from basic and easily detected (people manually registering

It is an endless arms race without simple solutions, and I have to admit at times the never-ending nature of the battle becomes a bit demoralizing.
Conflict of interest disclaimer: I am an employee of a company selling tiles using almost the same map style as tile.openstreetmap.org and tile.openstreetmap.de (tile.openstreetmap.de prohibits almost all commercial use).

@simonpoole wrote:
A long discussion of how a new tile server usage policy should look can be found at #113, without any result/changes but with at least two fundamentally opposing opinions.

@Firefishy wrote:
The Tile Usage Policy could require contact info on any website using the tiles. In Germany, almost all websites are required to provide an "imprint" (German "Impressum") at an easy-to-find location (i.e. any subpage has to link to it using "Impressum" as the link text, but the font size might be small and the link hidden in the footer). The imprint has to provide a postal address and email. A postal address would not really help us, but requiring an easy-to-find email address could be a sensible requirement of the guideline (and banning if it is missing or difficult to find could be an option).

@mmd-osm wrote:
There are two ways of using tiles for advertisement: overlaying text on a tile, or serving a replacement tile like the black German anti-EU-upload-filter map tiles this year. The first requires additional resources (calling an image library such as libgd); the second fully blocks the view of the map. The second could reduce the load on the server because people migrate towards other sources (i.e. we move the load onto other free sources :-(). If we consider any of these options, we should discuss them in a separate ticket.
I presume that both overlays and pure advertisement tiles are easier to implement than an API key solution which interacts with all CDN nodes.
..
Thanks Michael, that's the thread with the numbers @zerebubuth and I were referring to.
I see. However, that is low-hanging fruit, I guess.
Adding an overlay to tiles seems a bit far afield given the current situation. Scanning the thread again, there seems to be support from sysadmins and ops for API keys, but concern over increased support load.
I think API keys have several advantages, like "usage fairness", and also that users think twice before they implement a heavy-requesting bot or something. But API keys introduce a lot of complexity. You do not necessarily need a distributed database, but at least a well-performing database that is not required without API keys. Additionally, you need an in-memory cache that syncs frequently from this database, and an async queue that feeds the database, to avoid hitting the database for every request. Of course, this depends a bit on which performance requirements you have and how you want to scale your database cluster. Another problem is the required email registration and validation cycle, but I guess this is done via the existing OSM account, so no additional burden.
This is a valid argument: what do you do if you have many users, a certain limit enforced for everyone, but still not enough capacity? But, e.g. from our own experience, this "little later" is far in the future.
If this is a problem, then API keys are not the solution. One solution is to introduce a main logging service (collecting logs from all servers), or at least a cron job that produces some stats per server and per day.
Why? Free resources will always be misused and should be limited so that people try hard to avoid hitting the limits. Or one could even introduce some capacity raise for an OSMF donation ;)
When they don't care about heavy requests, why should you invest time to find them? It should work the opposite way: to get unblocked, they should come to you. After all, it is a nice & free service they are abusing.
We fight those people via a blacklist of temporary-email providers, and by blocking the creation of X new accounts if they come from the same IP within Y hours (see the sketch below). Stealing keys is a smaller issue and can mostly be solved by allowing additional limits on the API key regarding IP, or by enforcing the HTTP referer. My preference is still pro-API keys, but be aware of the additional work and problems.
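For what it's worth, the "X new accounts from the same IP within Y hours" check can be very small; here's a hypothetical sliding-window sketch (the limits are made-up numbers):

```python
import time
from collections import defaultdict, deque

MAX_SIGNUPS = 3        # X: accounts allowed per IP per window (made up)
WINDOW_S = 24 * 3600   # Y: window length in seconds (made up)

signups = defaultdict(deque)  # ip -> timestamps of recent signups

def signup_allowed(ip):
    now = time.time()
    recent = signups[ip]
    while recent and now - recent[0] > WINDOW_S:
        recent.popleft()          # drop signups outside the window
    if len(recent) >= MAX_SIGNUPS:
        return False              # over the limit: refuse this signup
    recent.append(now)
    return True
```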
We do collect logs from all the servers. However, the logs can only contain the information that's available to the servers; in this case, source IP, User-Agent, and Referer. The latter two are easily faked by non-browsers (i.e. scrapers & mobile apps), and the former is easily bypassed by using Tor, VPNs, farms of cloud machines, etc. (For an example of the lengths some developers will go to, please see openstreetmap/chef#78.) Therefore, while we do collect this information, the logs are mostly useless for the purposes of identifying blockable requests.

The caches apply automatic rate limits to clients from the same IP, but this arguably causes as many problems (cf. mapping parties behind a NAT) as it solves.

API keys give us a fourth piece of information, which can be tied directly to a contact or account. That simply pushes the problem "upstream" to the account level (where, again, it's very simple to hide the source IP using Tor, VPNs, or just wardriving open WiFi).
I agree - the existing workflow of "read logs → identify bad actors → block them → repeat" hasn't been working, especially as the "long tail" of large numbers of smaller sites/users continues to grow. Any new workflow would have to be more automated, or fully automatic, to be practical.
I've been wondering if we should write up a doc and look for a contractor for this. The lack of any meaningful by-user breakdown makes it very hard to plan and manage the service, and this is increasingly becoming an issue.
An automatically cycling API token (each logged-in user receives a .html with the same current API key, the key changing every X min) works well to give full speed to contributors and limited speed to others (and also gives people a reason to sign up, which could then make it easier for them to make their first contribution).
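One common way to implement such a cycling shared token without any database is to derive it from a server-side secret and the current time window, e.g. with an HMAC. The following is only a sketch of that idea, with made-up parameters:

```python
import hashlib
import hmac
import time

SECRET = b"server-side secret"  # assumption: shared only among the servers
WINDOW_S = 600                  # X: rotate the key every 10 minutes

def current_token(offset=0):
    window = int(time.time()) // WINDOW_S + offset
    return hmac.new(SECRET, str(window).encode(), hashlib.sha256).hexdigest()[:16]

def token_valid(token):
    # Accept the current and previous windows to tolerate page loads
    # that straddle a rotation boundary.
    return token in (current_token(0), current_token(-1))
```

Every edge can then validate the token locally, as long as it knows the secret and has a reasonably accurate clock.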
Like we already have, you mean (it's just in a cookie, not the URL).
The ops team discussed this a couple of meetings ago. Although we are interested in the idea in principle, the tile CDN changes have changed a few things since this issue was written.
Given this is not causing any particular pain points right now, we don't feel it would be worth the effort. If someone still wants to write this themselves to scratch a technical itch, they can get in touch. |
Hi all!
I'm wondering if you'd be open to the idea of gradually requiring an API key to use tile.osm.org tiles.
I've built a couple of simple API key systems (Nextzen tiles being a good public example), and it's not too hard, given access to logs (for counting accesses per API key) and a spot somewhere along the serving chain to block keys. I'd be willing to spend some time working on this, but wanted to get general ideas squared away before I started thinking more about it.
If this seems interesting, please read on. If you think API keys are a non-starter, let me know why.
API Key Website
I'd start by building a service that does simple CRUD of API keys, users, etc. This would be separate from openstreetmap-website, but users would log in with their OSM credentials (probably via OAuth). Each user could generate as many API keys as they wanted, give them names, limit the referrers/origins for each key, and disable or delete API keys. This website might also be where metrics on key usage would be displayed.
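For illustration, the data model could be as small as two tables; this is a hypothetical sketch (table names, columns, and limits are placeholders, not a committed design):

```python
import sqlite3

conn = sqlite3.connect("apikeys.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS api_keys (
    key               TEXT PRIMARY KEY,   -- random opaque identifier
    osm_user_id       INTEGER NOT NULL,   -- from the OAuth login
    name              TEXT,               -- user-chosen label
    allowed_referrers TEXT,               -- e.g. comma-separated hostnames
    daily_limit       INTEGER DEFAULT 100000,
    disabled          INTEGER DEFAULT 0
);
CREATE TABLE IF NOT EXISTS usage_counts (
    key      TEXT REFERENCES api_keys(key),
    day      TEXT,                        -- e.g. '2019-06-01'
    requests INTEGER,
    PRIMARY KEY (key, day)
);
""")
```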
Access Counter
Once API keys are being generated, a separate piece would grep the logs at an interval and count the number of requests (and probably response codes) per API key. It would write the aggregated stats into the database above.
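A sketch of what that counting step might look like, assuming a conventional access-log format and an `apikey` query parameter (both assumptions):

```python
import re
from collections import Counter

# Matches e.g.: "GET /12/2047/1362.png?apikey=abc123 HTTP/1.1" 200
LINE = re.compile(
    r'"[A-Z]+ \S*[?&]apikey=(?P<key>\w+)\S* HTTP/[\d.]+" (?P<status>\d{3})')

def aggregate(log_path):
    counts = Counter()
    with open(log_path) as f:
        for line in f:
            m = LINE.search(line)
            if m:
                counts[(m["key"], m["status"])] += 1
    return counts  # e.g. rows to upsert into the central stats database
```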
Blocking Cron
A separate periodic cron would run to pick out API keys that are over their limits and add them to a list of blocked API keys. Normally I'd serve this from a static file store like S3, but perhaps it could be a well-cached API endpoint.
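A hypothetical sketch of that cron, continuing the illustrative schema above: it sums yesterday's counts per key, compares them against each key's limit, and writes the blocklist out as a flat JSON file (which could then be pushed to S3 or served from a cached endpoint):

```python
import json
import sqlite3

def rebuild_blocklist(db_path="apikeys.db", out_path="blocked.json"):
    conn = sqlite3.connect(db_path)
    rows = conn.execute("""
        SELECT u.key
          FROM usage_counts u
          JOIN api_keys k ON k.key = u.key
         WHERE u.day = date('now', '-1 day')
         GROUP BY u.key, k.daily_limit
        HAVING SUM(u.requests) > k.daily_limit
    """).fetchall()
    with open(out_path, "w") as f:
        json.dump(sorted(key for (key,) in rows), f)
```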
Access Denied
Finally, the edge caches would need some piece, as close to the beginning of the request cycle as possible, that would block the request if the API key didn't exist, was on the list of blocked API keys, or if the request's Referer header wasn't on the allowed list for the API key. This piece would keep the list of blocked API keys in memory and update itself every few minutes from the above service.
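To tie the pieces together, a hypothetical sketch of that edge gate: a background thread refreshes the blocked-key set from a central endpoint every few minutes, so the per-request check itself stays cheap. The URL and refresh period are assumptions.

```python
import json
import threading
import time
from urllib.request import urlopen

BLOCKLIST_URL = "https://example.org/blocked.json"  # assumed endpoint
blocked = frozenset()

def refresh_blocklist(period_s=300):
    global blocked
    while True:
        try:
            with urlopen(BLOCKLIST_URL) as resp:
                blocked = frozenset(json.load(resp))
        except OSError:
            pass  # keep the last known list if a fetch fails
        time.sleep(period_s)

def gate(api_key, referer, allowed_referrers):
    """True if the request may proceed to the cache/render stack."""
    if not api_key or api_key in blocked:
        return False
    return any(host in (referer or "") for host in allowed_referrers)

threading.Thread(target=refresh_blocklist, daemon=True).start()
```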
We'd start by not requiring an API key. Once this system is up and we're happy with it (maybe by testing privately and then with a key for OSM.org) we could announce the change, work with major (friendly) users to get them transitioned, and then several months later turn off anonymous tile access.