Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve sql querying of tags, fixes #221 #224

Merged
merged 2 commits into from
Apr 7, 2020
Merged

improve sql querying of tags, fixes #221 #224

merged 2 commits into from
Apr 7, 2020

Conversation

tyrasd
Copy link
Member

@tyrasd tyrasd commented Apr 6, 2020

Reopened #222 (because of revert #223)

This solves the issues in #222 by spicing up the PR with some thread synchronization. Since one TagTranslator only works on a single jdbc connection, and synchronization is only used around the anyway slow DB queries, this should not add much of a performance penalty, I believe. An alternative would be to use thread-local connections, and/or a db connection pool, to speed up tag translation to a maximum, but that would involve significant rewriting of the class and not be downwards compatible.

tyrasd added 2 commits April 4, 2020 11:24
* prepare statements in constructor and reuse them
* make the class autoclosable
* close resultsets after they are used
* fix checkstyle warnings
@tyrasd tyrasd requested review from rtroilo and SlowMo24 April 7, 2020 09:26
Copy link
Contributor

@SlowMo24 SlowMo24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you on the speed-estimate of the synchronisation of the anyway slow db-queries. Do you think opening an 'enhancement'-ticket with your idea for v. 0.6 would be a good idea?
Another minor speedupgrade might be possible through #191 by prefetching the most common, say, 100 key with their respective 100 values. That might potentially save us 10000 db queries for each oshdb query

roleTxtQuery = conn
.prepareStatement("select TXT from " + TableNames.E_ROLE.toString() + " where ID = ?;");
} catch (SQLException e) {
throw new OSHDBKeytablesNotFoundException();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We tested for the existence of the tables above, right? As a nitpicker I would prefer a different error or at least some explanatory string passed to the exception?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is the TagTranslator used with the ohsome-api, does it create a new Translator for every request or does it reuse an existing one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We tested for the existence of the tables above, right? As a nitpicker I would prefer a different error or at least some explanatory string passed to the exception?

True, this would trigger most likely in the case where the keytables database and tables are present, but would not have the correct columns. In which they could be called "corrupted" maybe…. It's a very unlikely error case in normal use, though.

Anyway, it's not really related to the issue this PR is trying to resolve. Can you perhaps open a separate ticket?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is the TagTranslator used with the ohsome-api, does it create a new Translator for every request or does it reuse an existing one?

I'm not sure how this relates to the comment by @SlowMo24… Did you mean to post it as a separate question, @rtroilo ?

In any case: If I'm not mistaken, the ohsome-api ends up having one TagTranslator for "normal" use (i.e. translating incoming tags from request parameters), and many TagTranslators for data extraction (translating tags for the GeoJSON output): one per thread per request. The "normal" tag translator is reused for all requests, if I remember correctly, while the ones used for data extraction requests are thrown away after the respective "parent" request is completed.

Does that matter for this PR?

@tyrasd
Copy link
Member Author

tyrasd commented Apr 7, 2020

your idea for v0.6

Well. Really, my preferred way to solve this problem would be to get rid of this whole concept of separate keytables altogether. I would prefer a solution with small local string tables, as it is solved in other state of the art data schemas. For example like the StringTable in an osm-pbf data block, or like the attributes in MVT. But let's discuss this not in this pull request. ➡️ #225

@tyrasd tyrasd merged commit 2e28443 into master Apr 7, 2020
@tyrasd tyrasd deleted the issue-221 branch April 7, 2020 17:26
@tyrasd tyrasd added the performance Performance optimizations, bottlenecks of the current pipeline, etc. label May 11, 2020
tyrasd added a commit that referenced this pull request May 29, 2020
Such as count(), sum(), …, reduce(), or stream().close(). This avoids dangling prepared statements to lie around after a query was performed. This is a regression/consequence of #224.

This also fixes the fact that in the current implementation, because of a typo/bug the tagInterpreter objects themselves were never persisted when calling `osmTag` methods. :-/

Note that this solution passes all tests, but doesn't work in the following simple example!

  @test
  public void testFoo() throws Exception {
    MapReducer<OSMContribution> bar = createMapReducerOSMContribution()
        .timestamps(timestamps72)
        .osmEntityFilter(entity -> entity.getId() == 617308093)
        .osmTag("highway");
    assertTrue(bar.count() > -1);
    assertTrue(bar.osmTag("highway", "traffic_signals").count() > -1);
  }

Here, a runtime exception will be thrown, because when calling the first `bar.count()`, the internal tragTranslator of `bar` is closed and further tag translation requests cannot be answered anymore (as performed in the last line).
joker234 pushed a commit that referenced this pull request Oct 29, 2020
Such as count(), sum(), …, reduce(), or stream().close(). This avoids dangling prepared statements to lie around after a query was performed. This is a regression/consequence of #224.

This also fixes the fact that in the current implementation, because of a typo/bug the tagInterpreter objects themselves were never persisted when calling `osmTag` methods. :-/

Note that this solution passes all tests, but doesn't work in the following simple example!

  @test
  public void testFoo() throws Exception {
    MapReducer<OSMContribution> bar = createMapReducerOSMContribution()
        .timestamps(timestamps72)
        .osmEntityFilter(entity -> entity.getId() == 617308093)
        .osmTag("highway");
    assertTrue(bar.count() > -1);
    assertTrue(bar.osmTag("highway", "traffic_signals").count() > -1);
  }

Here, a runtime exception will be thrown, because when calling the first `bar.count()`, the internal tragTranslator of `bar` is closed and further tag translation requests cannot be answered anymore (as performed in the last line).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance optimizations, bottlenecks of the current pipeline, etc.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants