-
Notifications
You must be signed in to change notification settings - Fork 165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Contribute Rdf4j Spring Integration #3269
Contribute Rdf4j Spring Integration #3269
Conversation
@jeenbroekstra not sure how you want to proceed. Adding the tests could take some time, maybe you want to start reviewing the rest of the code? |
Thanks for putting this up! I will try and find time to look at it soon (it's mostly weekends for me at the moment though). Before you continue adding tests etc, can you make sure that you've signed the ECA and that your git commit author address is the same address that you used to sign? The verification step of the build is currently not finding your record. Have a look at the contributor guidelines, in particular the points on how to sign the Eclipse ECA. Also, while you're fixing up your commits anyway, can you prefix each commit message with the issue number? Something like |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a very quick first look through things, looking very good I think. Couple of inline comments, some editorial stuff. I think the biggest thing might be what to do with the SparqlBuilder extensions. They look very useful, but it's not really related to the spring stuff and getting it all tested and integrated might take some time. I'd suggest taking it out of this PR and submitting it in a separate PR.
spring-components/rdf4j-spring/src/main/java/org/eclipse/rdf4j/model/impl/MultiIRI.java
Outdated
Show resolved
Hide resolved
import org.eclipse.rdf4j.sparqlbuilder.graphpattern.GraphPattern; | ||
import org.eclipse.rdf4j.sparqlbuilder.util.SparqlBuilderUtils; | ||
|
||
public class Bind implements GraphPattern { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, interesting, there is actually a separate issue for that: GH-2767. This should probably be moved to the sparqlbuilder module though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're keeping this as part of this PR (which I'm fine with) we should move it to the sparqlbuilder module I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll gladly move the code. However, there is much more in the constraint
package in this module, including the whole propertypath
subpackage. I suggest we move all of that to the sparqlbuilder module, keeping the package structure as is. Ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, I should have been clearer sorry, I meant this comment for the whole package, not just this particular class. Thanks!
.../main/java/org/eclipse/rdf4j/sparqlbuilder/constraint/propertypath/InversePredicatePath.java
Outdated
Show resolved
Hide resolved
.../rdf4j-spring/src/main/java/org/eclipse/rdf4j/spring/dao/exception/Rdf4JSpringException.java
Outdated
Show resolved
Hide resolved
...nts/rdf4j-spring/src/main/java/org/eclipse/rdf4j/spring/dao/exception/Rdf4JDaoException.java
Outdated
Show resolved
Hide resolved
One more thing I almost forgot, kinda important: I will likely need to pass this by the Eclipse IP team as well, as it's quite a large contribution. I will take care of that, but just be aware that at some point there may be some questions coming your way regarding code ownership etc. |
Thanks for the heads-up. I have notified our legal people that this is coming when starting the whole process. |
I thought we'd squash everything in the process of merging, no? If not, are you ok with the commits made so far being amended? |
We don't squash-merge anymore, no - we used to do that for a while but there are technical issues with the IP attribution on the squashed commit that is created that way. So we're back to regular merge-commits for now unfortunately.
Absolutely! Or if you prefer you can also manually squash and fix it that way, that's fine too. |
Due to the way this spring integration library was extracted from a bigger project, there are virtually no tests, as this library was tested via domain-specific integration tests. ### Acknowledgements * The initial work on this was done by Gabriel Pickl (later taken up and continued by me). * This code has been developed by members of the Research Studio Smart Applications Technologies (https://www.researchstudio.at/studio/smart-applications-technologies/) in the project 'BIM - interoperables Merkmalservice' led by the Österreichische Bautechnik Vereinigung (ÖBV, https://bautechnik.pro). * This work was supported by Basisprogramm of the Austrian Research Promotian Agency (FFG, https://ffg.at).
* Adds license header * Uses `RDF4J` instead of `Rdf4J` for class names * Uses `RDF4JException` as the base class of all custom exceptions
BTW, I don't know what is up with the Eclipse ECA. I signed the agreement using my committer email address, but now when I try to sign it again, just to be sure, I get |
Looks like there is a known issue with the ECA database at the moment - see https://bugs.eclipse.org/bugs/show_bug.cgi?id=566859 . Fwiw I can manually check the ECA database and the address on your commits is not currently on file, but if I read that bug report correctly that might be a sync issue. Let's keep an eye on that bugzilla ticket and once that's resolved give it another go. |
* fixes minor documentation bugs * tests reveal that result caching did not work as expected because queries do not report identical hashCodes for identical queries. Fixed, but not perfectly
I have submitted a zip of the current state of this PR as the initial contribution for IP review. |
ECA check now succeeds! |
also renamed and refactored of TransactionData (now TransactionObject)
I think I am done with the first round of the tests now. A few notes: I feel like the The 'reference project' mentioned in the PR description is more like a set of integration tests, but they fulfill the purpose. A shortcoming of the current implementation is that I haven't done any work with quads using this framework - we may need to refactor for everything to work with quads smoothly. |
spring-components/rdf4j-spring/src/main/java/org/eclipse/rdf4j/model/impl/MultiIRI.java
Outdated
Show resolved
Hide resolved
import org.eclipse.rdf4j.sparqlbuilder.graphpattern.GraphPattern; | ||
import org.eclipse.rdf4j.sparqlbuilder.util.SparqlBuilderUtils; | ||
|
||
public class Bind implements GraphPattern { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're keeping this as part of this PR (which I'm fine with) we should move it to the sparqlbuilder module I think.
BindingSet currentBindings = getDelegate().getBindings(); | ||
// TODO: this might be pretty slow due to the toString() call. Is there a better way to get | ||
// a hash for a query with minmal risk of collision ? | ||
Integer cacheKey = currentBindings.hashCode() + getDelegate().toString().hashCode(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jeenbroekstra maybe you can help me here. I am trying to generate cacheKey
in such a way that it is always the same if the query and the bindings are identical. Until today I thought that getDelegate().hashCode()
was stable across multiple query instances, but it seems it isn't. (When running the unit tests, the delegate is a SailTupleQuery
. getDelegate().toString()
is stable, but I looked into the implementation and I think it could be quite slow. Is there a good way to do this? If not, could we devise one, like adding a getQueryHash
method in the Query interface?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fkleedorfer I must not have been paying attention before, as I wasn't really aware this PR introduced query result caching in this fashion. Sorry about that, my personal circumstances at the moment make it hard for me to spend as much time on RDF4J as I'd like.
Before we dive into the cache key mechanics: can you elaborate a little on what problem this cache mechanism solves? What exactly are we caching (and why), and what is the scope/duration of that cache? How is this cache controlled (e.g. how do we take care it does not get out of sync with the underlying repository)? Is any of this specific to Spring, or should we consider making this a general, separate component?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to explain what it does in org/eclipse/rdf4j/spring/resultcache/package-info.java
. The problem it solves is answering multiple identical queries issued by the application over time. The cache is connection-specific and is cleared upon connection close or any writes. If global caching is enabled, a global cache is also kept that is cleared whenever the application (any connection) writes.
The cache is configured via org.eclipse.rdf4j.spring.resultcache.ResultCacheProperties
, which provides a bit more insight as to how to control it.
To follow what is happening, look at org.eclipse.rdf4j.spring.resultcache.CachingRepositoryConnection.prepareTupleQuery()
.
It is not strongly spring-related, but to extract it to another package might uproot other parts of the rdf4j-spring module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The items that are cached are org.eclipse.rdf4j.spring.resultcache.CachedGraphQueryResult
and org.eclipse.rdf4j.spring.resultcache.CachedTupleQueryResult
. The whole result, basically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move code to sparqlbuilder module?
... See code comments in Bind.java
import org.eclipse.rdf4j.sparqlbuilder.graphpattern.GraphPattern; | ||
import org.eclipse.rdf4j.sparqlbuilder.util.SparqlBuilderUtils; | ||
|
||
public class Bind implements GraphPattern { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll gladly move the code. However, there is much more in the constraint
package in this module, including the whole propertypath
subpackage. I suggest we move all of that to the sparqlbuilder module, keeping the package structure as is. Ok?
Removes * `MultiIRI.java` * `ExtendedPrefix.java`
Adds * Bind * NotIn * PSequence * PZeroOrMore * propertypath package with Property path builder * Utility/Generator methods to Expressions class * ExtendedVariable to core * some tests for the above
Just checking in to see if there is anything I can do at this point to help push this PR over the line... |
Unfortunately several of us (including myself) have had less time to spend on RDF4J than we'd like, for various reasons - so this PR has probably not got the attention it deserves. I'll try and get some eyes on your latest changes this weekend. Generally speaking: I'm anxious to get this merged, assuming you've addressed the feedback sofar and we can get the build checks green we should be good. I'm not sure why it's currently reporting the build checks as canceled/failed, and for some odd reason I seem to have no option to retry the build on my end. Could you push another commit (perhaps do a rebase on develop, or just an empty commit) to trigger another build and ping me with a comment as soon as you've done it? I may need to "unlock" the github action as github still considers you a "new' contributor. |
Alternatively, if you give me write access to your fork, I can do a push and take care of this myself as well. |
@jeenbroekstra I just gave you write access to the fork. Thanks for taking the time, anyway! |
Looks like we have some compilation failures in the Sparqlbuilder code. |
Oh? So sorry, will deal with it ASAP! |
..forgot to delete their unnecessary usages
The build should work now. |
All good - that's what we have the build checks for :) Bit annoying that I still have to approve the build checks manually when it's your commit. As soon as we have something of yours merged in, that should hopefully become a bit easier. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy with this being merged in, great work @fkleedorfer !
There's probably some followup iterations that need to happen with some further code shifting/cleaning, and we'll need to tackle some user documentation as well - but those things can be taken care of in followup PRs.
Thanks, happy to contribute! |
Addresses #3059
Note that this is work in progress, a number of things need to be done (help is greatly appreciated!)
org.eclipse.rdf4j.model
andorg.eclipse.rdf4.sparqlbuilder.*
; and possibly extract them to other rdf4j modules (may be done after the PR is merged)spring-components/rdf4j-spring/src/main/java/org/eclipse/rdf4j/spring/package-info.java
)