
Use of https://photon.komoot.io #614

Open

lonvia opened this issue Jan 2, 2025 · 14 comments


lonvia commented Jan 2, 2025

I'm the administrator for https://photon.komoot.io. While reviewing our server logs recently, I've noticed that dawarich is making rather intense use of the service. Looking at the logs from the past couple of days, I see about 60% of all requests coming from your application. This is slightly beyond what can be called 'fair use'.

I see that you already provide your own Photon instance for Patreon supporters. That's great. I've also seen that you rate limit access to photon.komoot.io. Awesome. Sadly, it doesn't look like it is enough. The server behind https://photon.komoot.io is really just one tiny server and won't be able to keep up with this kind of traffic forever. Any chance to actively discourage use of photon.komoot.io and instead provide people with the instructions to set up their own server?

One observation: it looks like you really only need city/country data from the reverse geocode. That means you could work with a geocoding database that is orders of magnitude smaller than the full Photon database. I've recently experimented with an SQLite-based Nominatim database. This application looks like it could be a good use case for it. Simply run a tiny Nominatim server with an admin SQLite DB in your Docker image and use that.
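To illustrate the observation that only coarse fields are needed, here is a Ruby sketch. The response shape follows Nominatim's jsonv2 address output; the sample values and the city/town/village fallback chain are illustrative assumptions, not real data:

```ruby
require "json"

# A reverse-geocode response in Nominatim's "jsonv2" shape.
# Values are illustrative, not real data.
sample = <<~JSON
  {
    "display_name": "Alexanderplatz, Mitte, Berlin, 10178, Germany",
    "address": {
      "road": "Alexanderplatz",
      "suburb": "Mitte",
      "city": "Berlin",
      "state": "Berlin",
      "postcode": "10178",
      "country": "Germany",
      "country_code": "de"
    }
  }
JSON

# Keep only the coarse fields; the rest of a full geocoding
# database would go unused for this purpose.
def coarse_location(response_json)
  address = JSON.parse(response_json).fetch("address", {})
  {
    city:    address["city"] || address["town"] || address["village"],
    state:   address["state"],
    country: address["country"]
  }
end

p coarse_location(sample)
```

Everything a full Photon planet database stores beyond these three fields would be dead weight for this use case.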


solderdot72 commented Jan 2, 2025

Hi lonvia,

I'm one of the users of DaWarIch and therefore I feel guilty for imposing part of that load :-(

I'd love to switch to a self-hosted server, but I'm just a hobbyist running a Raspberry Pi. All the instructions I've found so far indicate that running a reverse geocoding server requires a lot of storage and computing power, way more than a Raspberry Pi can handle. I also found instructions for limiting the database to a single country, which tremendously reduces the storage required.

Limiting to a single country is not an option for me, but I could narrow the list down to 10-20 countries. Unfortunately, I have not found a tutorial describing how to set up a server for a hand-picked list of countries.

Since you state that only part of the information provided by the reverse geocoding server is actually needed, only a small part of the database is required, which should further reduce the demands.

If I knew how to set up such a low-profile self-hosted server, I would gladly give it a try. Can you show me how, or point me to a place on the web where I can educate myself, preferably for a Docker/Portainer environment?

Author

lonvia commented Jan 3, 2025

This needs a bit of a longer explanation. I'm working under the assumption here that dawarich indeed only uses city/state/country information from the reverse geocoding result. If it needs more fine-grained information then the picture changes slightly.

Creating your own database is not entirely trivial, and you do need a machine with a bit more computing power. Running the geocoder on a ready-made database, however, only needs a bit of storage space. It is not very expensive in terms of CPU power and will happily run on a Raspberry Pi, at least for the number of requests that dawarich will produce. Luckily for us, city/state/country information doesn't really change a lot, so you don't really need to update your geodatabase. To get your own private geocoder, you need to find a powerful machine once and produce the database. Then copy it over to your Pi and use it forever.

To create a geocoding database from OpenStreetMap data, you need to use Nominatim. Photon doesn't do that itself. It only exports data from a Nominatim database. You can install Nominatim easily via Docker. Given that we are only interested in city-level data, you can configure Nominatim to use the admin style. It's so little data, it will be done in a couple of hours and you can probably run it on your laptop if you happen to have 500GB of space left. You can further reduce time and size by running the import with --reverse-only and --no-updates.

As a next step, you could create a Photon database from the Nominatim database. However, I wouldn't recommend that. Photon only imports point geometries, it doesn't know about the whole area of towns and cities. That is really bad for reverse geocoding because it has to make educated guesses about the closest city. Nominatim is much better here. It keeps the whole area geometries. That's why I was suggesting SQLite. Dump Nominatim's database into an sqlite file by following these instructions. Then:

  • copy that file over to your Pi
  • install nominatim with virtualenv nominatim-venv; ./nominatim-venv/bin/pip install nominatim-api falcon uvicorn
  • point Nominatim to your sqlite database: echo NOMINATIM_DATABASE_DSN=sqlite:dbname=mydb.sqlite > .env
  • fire up uvicorn to create a Nominatim server: ./nominatim-venv/bin/uvicorn --host 127.0.0.1 --port 8080 --factory nominatim_api.server.falcon.server:run_wsgi
  • point dawarich to the internal service

(Disclaimer: I haven't tested these instructions, so please consult the Nominatim and uvicorn documentation for details.)

The SQLite file will be about 12GB. That should be manageable for a Raspberry Pi. If it is still too large, then Nominatim's style can be streamlined even more.
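The query side of the steps above can be sketched as a tiny Ruby client. Like the instructions themselves, this is untested: the host and port match the uvicorn command above, and zoom=10 (city-level results, per Nominatim's reverse API documentation) is an assumption about the granularity dawarich needs:

```ruby
require "uri"
require "net/http"
require "json"

# Build the /reverse URL for the local Nominatim server started
# with uvicorn above. zoom=10 requests city-level results.
def reverse_url(lat, lon, host: "127.0.0.1", port: 8080)
  URI::HTTP.build(
    host: host,
    port: port,
    path: "/reverse",
    query: URI.encode_www_form(
      lat: lat, lon: lon, format: "jsonv2", zoom: 10, addressdetails: 1
    )
  )
end

url = reverse_url(52.5219, 13.4132)
puts url

# With the uvicorn service running, the actual lookup would be:
#   address = JSON.parse(Net::HTTP.get(url))["address"]
#   puts address.values_at("city", "state", "country").join(", ")
```

The network call is left commented out because it requires the service from the steps above to be running.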

Owner

Freika commented Jan 3, 2025

@lonvia thank you for reaching out about the increased usage of photon.komoot.io. First of all, I would like to say huge thanks for it. It's been a really important service for Dawarich to rely on.

I'd be happy to somehow reduce the usage of the service in Dawarich, but I don't really know how to do that if I'm going to keep the reverse geocoding feature. I'm working on some features to utilize more than country and city information, so data on addresses and organizations is also useful for Dawarich.

What's already done:

What can be done:

  • I have some ideas on how to reduce the number of reverse geocoding requests from Dawarich
  • I'll make an announcement in release notes of one of the next releases to highlight the problem and encourage people to set up their own photon instances
  • I'll update the page on reverse geocoding on the website, to highlight the problem there too
  • I'll also research if there are any alternatives to Photon that could be integrated into Dawarich

If you have any other ideas, please let me know. I'd be happy to help reduce the load on photon.komoot.io from Dawarich.

@hopeseekr

Is this why it took 35 days to import my Records.json? If so, I'd gladly host or donate or do whatever to have one running locally...

Also, this is probably a one-time bump because of the Google Timeline shutdown.

Author

lonvia commented Jan 5, 2025

If you have a Linux machine with 16GB RAM(*) and 200GB SSD to spare, running your own Photon is super-easy:

  • download and unpack the Photon planet dump
  • download the Photon jar
  • install OpenJDK: sudo apt install openjdk-17-jdk-headless (replace '17' with whatever your OS has to offer)
  • run it: java -jar photon-0.6.1.jar
  • point Dawarich to http://localhost:2322 (or wherever your machine is located)

(*) The usual recommendation for a planet-size DB is 128GB RAM but that's in order to get reasonable throughput. When running a private instance that is only supposed to do a bit of reverse geocoding, you can get away with significantly less memory.
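For completeness, here is what consuming that local Photon instance looks like from code. This is a sketch: Photon's /reverse endpoint returns GeoJSON, and the property names below (city, state, country, countrycode) reflect typical Photon output but should be verified against your own instance. The sample values are made up:

```ruby
require "json"

# Photon answers GET /reverse?lat=..&lon=.. on port 2322 with a
# GeoJSON FeatureCollection. Sample response with made-up values.
sample = <<~JSON
  {
    "type": "FeatureCollection",
    "features": [
      {
        "type": "Feature",
        "properties": {
          "name": "Alexanderplatz",
          "city": "Berlin",
          "state": "Berlin",
          "country": "Germany",
          "countrycode": "DE"
        }
      }
    ]
  }
JSON

# Take the first (closest) feature and keep only the coarse fields.
props = JSON.parse(sample).dig("features", 0, "properties") || {}
place = [props["city"], props["state"], props["country"]].compact.join(", ")
puts place
```

Note that, as lonvia mentions above, Photon only stores point geometries, so the "closest" feature is an educated guess rather than a true area lookup.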


raphpa commented Jan 5, 2025

There is also a dockerized version:
https://github.com/rtuszik/photon-docker

Works fine for me; it takes about 1.4GB of RAM and pretty much no CPU load.


cosmindv commented Jan 5, 2025

Hello. First of all, excellent work and documentation. My guess is that the huge usage might be related to more people having free time this holiday to try this alternative to Google Timeline. I would like to run a separate Docker instance of https://github.com/rtuszik/photon-docker in order not to contribute to the high load.

I am using Synology to host Dawarich and don't have much experience with it. I can't find any steps on how to load the modified docker-compose after I add my own PHOTON_API_HOST without losing all my progress and data.


joaoferreira-git commented Jan 6, 2025

> There is also a dockerized version: https://github.com/rtuszik/photon-docker
>
> Works fine for me, takes about 1.4GB of RAM and pretty much no CPU load

I just finished installing this and switching Dawarich to use the local Photon, and it's zooming through requests, even running on spinning rust in my Unraid server.

If someone is going to do this, just make sure to add PHOTON_API_USE_HTTPS=false to the Docker envs of the dawarich container, otherwise it will try to use HTTPS on an HTTP endpoint.
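To make that concrete, here is a hedged docker-compose excerpt. PHOTON_API_HOST and PHOTON_API_USE_HTTPS are the variables mentioned in this thread; the service name `photon` and the shared Docker network are assumptions about your setup, and port 2322 is Photon's default as noted earlier in the thread:

```yaml
# Sketch, untested: environment entries for the dawarich app service,
# assuming a Photon sidecar service named "photon" on the same network.
services:
  dawarich_app:
    environment:
      - PHOTON_API_HOST=photon:2322
      - PHOTON_API_USE_HTTPS=false  # plain HTTP endpoint, avoid HTTPS errors
```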


I've spun up a Photon sidecar container to handle my lookups and help reduce the influx, but I would like to be confident it's actually using the local Photon and not komoot.io. Can this be easily determined? Should I see it in my Sidekiq container's logs?

Owner

Freika commented Jan 6, 2025

@ragnarkarlsson In the console (https://dawarich.app/docs/FAQ#how-to-enter-dawarich-console), run Geocoder.config to see the configuration. It should look something like this:

{:timeout=>5,
 :lookup=>:photon,
 :ip_lookup=>:ipinfo_io,
 :language=>:en,
 :http_headers=>{"X-Api-Key"=>"xxx"},
 :use_https=>true,
 :http_proxy=>nil,
 :https_proxy=>nil,
 :api_key=>nil,
 :basic_auth=>{},
 :logger=>:kernel,
 :kernel_logger_level=>2,
 :always_raise=>:all,
 :units=>:km,
 :distances=>:linear,
 :cache=>#<Redis client v5.3.0 for redis://dawarich_redis:6379>,
 :cache_prefix=>nil,
 :cache_options=>{:expiration=>1 day},
 :photon=>{:use_https=>true, :host=>"photon.dawarich.app"}}

Then, to test if it is working, you can do the following:

point = Point.last

Geocoder.search([point.latitude, point.longitude])

The response should contain some geocoding data and should not throw an error.


alternativesurfer commented Jan 6, 2025

> If you have a Linux machine with 16GB RAM(*) and 200GB SSD to spare, running your own Photon is super-easy: [...]

No need to even put it on an SSD.
I have mine running in Docker following these instructions: https://github.com/rtuszik/photon-docker, with the data directory pointed at a spinning-rust file server mount. The Docker container uses only approx. 7GB on my SSD datastores while the data sits on my cheap storage.

The uncompressed planet-size DB is 169GB, for those interested.

Throughput seems decent enough (I don't know how to actually gauge that).


> @ragnarkarlsson In the console (https://dawarich.app/docs/FAQ#how-to-enter-dawarich-console), run Geocoder.config to see the configuration. [...]
>
> Then, to test if it is working, you can do the following: [...]

Wonderful, thank you. I wasn't confident I was running 100% locally 👍🏻

Owner

Freika commented Jan 7, 2025

@lonvia what's done so far:

  • Announcement posted in release notes, explaining the situation and encouraging the use of self-hosted Photon
  • In-app notification created with the same text
  • Support implemented for Geoapify as an alternative reverse geocoding service
  • Instructions updated on the Reverse Geocoding page on Dawarich website
  • Photon-specific env vars removed from default docker-compose.yml to encourage users to make their own decision

Hopefully, this will help, and I'll work on support for more reverse geocoding providers in the future.

@makanimike

I also think this is probably a one-time rush: Christmas break plus Dawarich users importing years' worth of Google Maps data. I have been importing 13 years of Google Maps data over the last couple of weeks.

Regardless, I will aim to fire up my own instance on the weekend to do my part.

Thank you to both parties for the solution-focused communication.
