-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
improve sql querying of tags, fixes #221 #224
Conversation
* prepare statements in constructor and reuse them * make the class autoclosable * close resultsets after they are used * fix checkstyle warnings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you on the speed-estimate of the synchronisation of the anyway slow db-queries. Do you think opening an 'enhancement'-ticket with your idea for v. 0.6 would be a good idea?
Another minor speedupgrade might be possible through #191 by prefetching the most common, say, 100 key with their respective 100 values. That might potentially save us 10000 db queries for each oshdb query
roleTxtQuery = conn | ||
.prepareStatement("select TXT from " + TableNames.E_ROLE.toString() + " where ID = ?;"); | ||
} catch (SQLException e) { | ||
throw new OSHDBKeytablesNotFoundException(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We tested for the existence of the tables above, right? As a nitpicker I would prefer a different error or at least some explanatory string passed to the exception?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is the TagTranslator used with the ohsome-api, does it create a new Translator for every request or does it reuse an existing one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We tested for the existence of the tables above, right? As a nitpicker I would prefer a different error or at least some explanatory string passed to the exception?
True, this would trigger most likely in the case where the keytables database and tables are present, but would not have the correct columns. In which they could be called "corrupted" maybe…. It's a very unlikely error case in normal use, though.
Anyway, it's not really related to the issue this PR is trying to resolve. Can you perhaps open a separate ticket?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is the TagTranslator used with the ohsome-api, does it create a new Translator for every request or does it reuse an existing one?
I'm not sure how this relates to the comment by @SlowMo24… Did you mean to post it as a separate question, @rtroilo ?
In any case: If I'm not mistaken, the ohsome-api ends up having one TagTranslator for "normal" use (i.e. translating incoming tags from request parameters), and many TagTranslators for data extraction (translating tags for the GeoJSON output): one per thread per request. The "normal" tag translator is reused for all requests, if I remember correctly, while the ones used for data extraction requests are thrown away after the respective "parent" request is completed.
Does that matter for this PR?
Well. Really, my preferred way to solve this problem would be to get rid of this whole concept of separate keytables altogether. I would prefer a solution with small local string tables, as it is solved in other state of the art data schemas. For example like the |
Such as count(), sum(), …, reduce(), or stream().close(). This avoids dangling prepared statements to lie around after a query was performed. This is a regression/consequence of #224. This also fixes the fact that in the current implementation, because of a typo/bug the tagInterpreter objects themselves were never persisted when calling `osmTag` methods. :-/ Note that this solution passes all tests, but doesn't work in the following simple example! @test public void testFoo() throws Exception { MapReducer<OSMContribution> bar = createMapReducerOSMContribution() .timestamps(timestamps72) .osmEntityFilter(entity -> entity.getId() == 617308093) .osmTag("highway"); assertTrue(bar.count() > -1); assertTrue(bar.osmTag("highway", "traffic_signals").count() > -1); } Here, a runtime exception will be thrown, because when calling the first `bar.count()`, the internal tragTranslator of `bar` is closed and further tag translation requests cannot be answered anymore (as performed in the last line).
Such as count(), sum(), …, reduce(), or stream().close(). This avoids dangling prepared statements to lie around after a query was performed. This is a regression/consequence of #224. This also fixes the fact that in the current implementation, because of a typo/bug the tagInterpreter objects themselves were never persisted when calling `osmTag` methods. :-/ Note that this solution passes all tests, but doesn't work in the following simple example! @test public void testFoo() throws Exception { MapReducer<OSMContribution> bar = createMapReducerOSMContribution() .timestamps(timestamps72) .osmEntityFilter(entity -> entity.getId() == 617308093) .osmTag("highway"); assertTrue(bar.count() > -1); assertTrue(bar.osmTag("highway", "traffic_signals").count() > -1); } Here, a runtime exception will be thrown, because when calling the first `bar.count()`, the internal tragTranslator of `bar` is closed and further tag translation requests cannot be answered anymore (as performed in the last line).
Reopened #222 (because of revert #223)
This solves the issues in #222 by spicing up the PR with some thread synchronization. Since one TagTranslator only works on a single jdbc connection, and synchronization is only used around the anyway slow DB queries, this should not add much of a performance penalty, I believe. An alternative would be to use thread-local connections, and/or a db connection pool, to speed up tag translation to a maximum, but that would involve significant rewriting of the class and not be downwards compatible.