Possibly stop 'store'-ing admin strings #349

Open
missinglink opened this issue Feb 28, 2019 · 2 comments

@missinglink
Member

missinglink commented Feb 28, 2019

Since we enabled the translation service we don't really need to 'store' admin strings in ES any longer.
We still need to 'index' the values so they can be used for searching, but we shouldn't need to retrieve their value when a language is requested.
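
To make the index/store distinction concrete, here is a minimal sketch. It assumes 'store' refers to Elasticsearch's field-level `store` mapping parameter; the helper `adminFieldMapping` and the `parent.country` field are illustrative only and not copied from the actual Pelias schema.

```typescript
// `index` and `store` are standard Elasticsearch mapping parameters.
// Everything else here (helper name, field names) is hypothetical.
interface FieldMapping {
  type: "text";
  index: boolean; // true: the value is searchable via the inverted index
  store: boolean; // true: the value is kept as a separately retrievable stored field
}

// Build a mapping for one admin field that is always searchable,
// with storage switched on or off.
function adminFieldMapping(store: boolean): FieldMapping {
  return { type: "text", index: true, store };
}

// Illustrative mapping fragment: searchable, but not stored.
const mapping = {
  properties: {
    parent: {
      properties: {
        country: adminFieldMapping(false),
      },
    },
  },
};
```

Note that a field with `store: false` can still be returned through `_source` unless it is also excluded there, so the real change may involve more than this one flag.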

Translation calls are very common for browser requests but fairly uncommon for programmatic requests.
If we did away with storing the fields, we'd need to be cognizant that many queries which didn't previously hit the translation service would now hit it.

It might also be the case that we are happy to have some default tokens 'store'-d for each admin record and pay the HDD overhead in order to avoid the additional requests to the translation service.

[edit] On reflection, I'm not so sure about what I said. I think that if no language code is specified then we fall back to English; if that's the case, then we are already calling the translation service on every request.

@orangejulius
Member

orangejulius commented Mar 20, 2019

Yeah, I remember we looked at this as an outcome of pelias/api#1098.

As it stands, we do call the language service on every request. That means we have the following paths available to us:

  • Stop storing admin fields in Elasticsearch, which would save disk space and possibly improve performance slightly.
  • Ensure the value in Elasticsearch for admin fields is the same as the 'default', and avoid calls to the language service in those cases. This would almost certainly cut down latency by a fair bit.
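
The second path could look roughly like the sketch below. It assumes the values kept in Elasticsearch are in a single default language; `DEFAULT_LANG`, its value, and `needsLanguageService` are hypothetical names, not part of the existing pelias/api middleware.

```typescript
// Assumption: the language code of the admin values stored in Elasticsearch.
const DEFAULT_LANG = "eng";

// Decide whether a request has to go to the language service at all.
// If no language was requested, or the default was requested, the values
// already present in the Elasticsearch record can be returned as-is.
function needsLanguageService(requestedLang: string | undefined): boolean {
  return requestedLang !== undefined && requestedLang !== DEFAULT_LANG;
}
```

This only helps if the stored values really do match what the language service would return for the default language, which is the precondition named in the second bullet.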

@orangejulius
Member

Another consideration: if we stop storing admin values, it would be extremely important that the Placeholder database and Elasticsearch database are in sync and were created from upstream data at the same time.

As it stands, if a translation is not found in Placeholder, we have the values in Elasticsearch to fall back on. This means that, for example, a record which is in the Elasticsearch index but not Placeholder will still work reasonably well, since the language middleware will use the names in the Elasticsearch record if it can't find any with a call to Placeholder.

Without that data to fall back on, it would be easy for queries to return results with no admin values because the record only existed in Elasticsearch, not Placeholder.
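
A simplified sketch of that fallback, under the assumption that translations arrive as a map keyed by record id. The `Translations` type and `resolveAdminName` are hypothetical and do not reproduce the real Placeholder response or the language middleware.

```typescript
// Hypothetical shape: record id -> translated name, if Placeholder has one.
type Translations = Record<string, string | undefined>;

// Prefer the translated name; otherwise fall back to the name stored in
// the Elasticsearch record. If neither exists, the result is undefined,
// which is the "no admin values" case described above.
function resolveAdminName(
  id: string,
  translations: Translations,
  storedName: string | undefined,
): string | undefined {
  const translated = translations[id];
  if (translated !== undefined) {
    return translated;
  }
  return storedName;
}
```

If admin fields were no longer stored, `storedName` would always be `undefined`, so any record missing from Placeholder would come back without an admin name.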
