improve sql querying of tags, fixes #221 #224

tyrasd · 2020-04-06T10:56:48Z

Reopened #222 (because of revert #223)

This solves the issues in #222 by spicing up the PR with some thread synchronization. Since one TagTranslator only works on a single jdbc connection, and synchronization is only used around the anyway slow DB queries, this should not add much of a performance penalty, I believe. An alternative would be to use thread-local connections, and/or a db connection pool, to speed up tag translation to a maximum, but that would involve significant rewriting of the class and not be downwards compatible.

* prepare statements in constructor and reuse them * make the class autoclosable * close resultsets after they are used * fix checkstyle warnings

SlowMo24

I agree with you on the speed-estimate of the synchronisation of the anyway slow db-queries. Do you think opening an 'enhancement'-ticket with your idea for v. 0.6 would be a good idea?
Another minor speedupgrade might be possible through #191 by prefetching the most common, say, 100 key with their respective 100 values. That might potentially save us 10000 db queries for each oshdb query

SlowMo24 · 2020-04-07T09:50:09Z

oshdb-util/src/main/java/org/heigit/bigspatialdata/oshdb/util/tagtranslator/TagTranslator.java

+      roleTxtQuery = conn
+          .prepareStatement("select TXT from " + TableNames.E_ROLE.toString() + " where ID = ?;");
+    } catch (SQLException e) {
+      throw new OSHDBKeytablesNotFoundException();


We tested for the existence of the tables above, right? As a nitpicker I would prefer a different error or at least some explanatory string passed to the exception?

How is the TagTranslator used with the ohsome-api, does it create a new Translator for every request or does it reuse an existing one?

We tested for the existence of the tables above, right? As a nitpicker I would prefer a different error or at least some explanatory string passed to the exception?

True, this would trigger most likely in the case where the keytables database and tables are present, but would not have the correct columns. In which they could be called "corrupted" maybe…. It's a very unlikely error case in normal use, though.

Anyway, it's not really related to the issue this PR is trying to resolve. Can you perhaps open a separate ticket?

How is the TagTranslator used with the ohsome-api, does it create a new Translator for every request or does it reuse an existing one?

I'm not sure how this relates to the comment by @SlowMo24… Did you mean to post it as a separate question, @rtroilo ?

In any case: If I'm not mistaken, the ohsome-api ends up having one TagTranslator for "normal" use (i.e. translating incoming tags from request parameters), and many TagTranslators for data extraction (translating tags for the GeoJSON output): one per thread per request. The "normal" tag translator is reused for all requests, if I remember correctly, while the ones used for data extraction requests are thrown away after the respective "parent" request is completed.

Does that matter for this PR?

tyrasd · 2020-04-07T17:10:09Z

your idea for v0.6

Well. Really, my preferred way to solve this problem would be to get rid of this whole concept of separate keytables altogether. I would prefer a solution with small local string tables, as it is solved in other state of the art data schemas. For example like the StringTable in an osm-pbf data block, or like the attributes in MVT. But let's discuss this not in this pull request. ➡️ #225

@test

Such as count(), sum(), …, reduce(), or stream().close(). This avoids dangling prepared statements to lie around after a query was performed. This is a regression/consequence of #224. This also fixes the fact that in the current implementation, because of a typo/bug the tagInterpreter objects themselves were never persisted when calling `osmTag` methods. :-/ Note that this solution passes all tests, but doesn't work in the following simple example! @test public void testFoo() throws Exception { MapReducer<OSMContribution> bar = createMapReducerOSMContribution() .timestamps(timestamps72) .osmEntityFilter(entity -> entity.getId() == 617308093) .osmTag("highway"); assertTrue(bar.count() > -1); assertTrue(bar.osmTag("highway", "traffic_signals").count() > -1); } Here, a runtime exception will be thrown, because when calling the first `bar.count()`, the internal tragTranslator of `bar` is closed and further tag translation requests cannot be answered anymore (as performed in the last line).

@test

Such as count(), sum(), …, reduce(), or stream().close(). This avoids dangling prepared statements to lie around after a query was performed. This is a regression/consequence of #224. This also fixes the fact that in the current implementation, because of a typo/bug the tagInterpreter objects themselves were never persisted when calling `osmTag` methods. :-/ Note that this solution passes all tests, but doesn't work in the following simple example! @test public void testFoo() throws Exception { MapReducer<OSMContribution> bar = createMapReducerOSMContribution() .timestamps(timestamps72) .osmEntityFilter(entity -> entity.getId() == 617308093) .osmTag("highway"); assertTrue(bar.count() > -1); assertTrue(bar.osmTag("highway", "traffic_signals").count() > -1); } Here, a runtime exception will be thrown, because when calling the first `bar.count()`, the internal tragTranslator of `bar` is closed and further tag translation requests cannot be answered anymore (as performed in the last line).

tyrasd added 2 commits April 4, 2020 11:24

improve sql querying of tags, fixes #221

66d0b9c

* prepare statements in constructor and reuse them * make the class autoclosable * close resultsets after they are used * fix checkstyle warnings

synchronize on prepared statement usage

cc14920

tyrasd requested review from rtroilo and SlowMo24 April 7, 2020 09:26

SlowMo24 approved these changes Apr 7, 2020

View reviewed changes

tyrasd merged commit 2e28443 into master Apr 7, 2020

tyrasd deleted the issue-221 branch April 7, 2020 17:26

SlowMo24 mentioned this pull request Apr 8, 2020

TagTranslator: add custom error message for corrupt keytables schema #226

Closed

tyrasd added the performance Performance optimizations, bottlenecks of the current pipeline, etc. label May 11, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

improve sql querying of tags, fixes #221 #224

improve sql querying of tags, fixes #221 #224

tyrasd commented Apr 6, 2020 •

edited

Loading

SlowMo24 left a comment

SlowMo24 Apr 7, 2020

rtroilo Apr 7, 2020

tyrasd Apr 7, 2020

tyrasd Apr 7, 2020

tyrasd commented Apr 7, 2020 •

edited

Loading

improve sql querying of tags, fixes #221 #224

improve sql querying of tags, fixes #221 #224

Conversation

tyrasd commented Apr 6, 2020 • edited Loading

SlowMo24 left a comment

Choose a reason for hiding this comment

SlowMo24 Apr 7, 2020

Choose a reason for hiding this comment

rtroilo Apr 7, 2020

Choose a reason for hiding this comment

tyrasd Apr 7, 2020

Choose a reason for hiding this comment

tyrasd Apr 7, 2020

Choose a reason for hiding this comment

tyrasd commented Apr 7, 2020 • edited Loading

tyrasd commented Apr 6, 2020 •

edited

Loading

tyrasd commented Apr 7, 2020 •

edited

Loading