Iron-out edge cases for library use-case, adding extensive real-world test assertions #2309
Results are logs of what we get with old toplevel-based version of lib_machine. These are also congruent with what our tests logged out based on SWI.
discontiguous
@mthom after fixing the ordering of the first assertions, I'm now stuck with this: Assertion 50 fails with only one of 3 expected matches. Reproducing this assertion in its own test case by running all previous `consult`s and then running the query does not show the same behaviour. It seems the history of all the queries that ran before it in the integration test is relevant to replicating this. The debugger jumps to the first line of src/lib.rs when exiting the loop (and then continues after the loop). I have tried increasing and removing the |
Though if removing it doesn't break the build, then it appears to have become superfluous.
I fixed the Assertion 50 error and in the process I believe I exposed a redundant fact in the integration test text file causing a solution to be reported twice instead of once as expected. Please check its diff to see it. Now Assertion #57 fails because the contents of the solution are out of order again.
Alright! We got our ad4m integration test suite passing with commit 53028a9. I've fixed all the orderings in the integration assertions and have all scryer tests pass locally :)
...almost.
As these are supposed to be integration tests, based on https://doc.rust-lang.org/cargo/guide/tests.html they should go into the |
Sorry for the confusion, I called the test "integration" because these are assertions coming from our (ad4m) integration tests where we run scryer as a library integrated in ad4m. The tests in lib_machine.rs are testing |
Huh, interesting that the last mentioned problem does not occur on wasm and i686 architectures, as seen in CI runs... (but all other archs)
🥳
src/rcu.rs
@@ -22,7 +22,7 @@ thread_local! {
     // odd value means the current thread is about to access the active_epoch of an Rcu
     // a thread has a single epoch counter for all Rcu it accesses,
     // as a thread can only access one Rcu at a time
-    static THREAD_EPOCH_COUNTER: OnceCell<Arc<AtomicU8>> = OnceCell::new();
+    static THREAD_EPOCH_COUNTER: OnceCell<Arc<AtomicU8>> = const { OnceCell::new() };
Ah, I just noticed in the `thread_local!` docs:
"This macro supports a special `const {}` syntax that can be used when the initialization expression can be evaluated as a constant. This can enable a more efficient thread local implementation that can avoid lazy initialization."
So, I think this change makes sense.
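For illustration, here is a minimal sketch of the two initializer forms side by side. It assumes `std::cell::OnceCell` (the `OnceCell` used in src/rcu.rs may come from elsewhere), and the static names and `bump_*` helpers are hypothetical, not part of scryer-prolog.

```rust
use std::cell::OnceCell;
use std::sync::atomic::{AtomicU8, Ordering};
use std::sync::Arc;

thread_local! {
    // Plain form: the initializer expression is run lazily on first access.
    static LAZY_COUNTER: OnceCell<Arc<AtomicU8>> = OnceCell::new();

    // `const { ... }` form: allowed because `OnceCell::new()` is a `const fn`,
    // so the initial value can be computed at compile time.
    static CONST_COUNTER: OnceCell<Arc<AtomicU8>> = const { OnceCell::new() };
}

/// Hypothetical helper: increments this thread's counter (creating it on
/// first use) and returns the new value.
fn bump_const_counter() -> u8 {
    CONST_COUNTER.with(|cell| {
        let counter = cell.get_or_init(|| Arc::new(AtomicU8::new(0)));
        // `fetch_add` returns the previous value, so add 1 to get the new one.
        counter.fetch_add(1, Ordering::SeqCst).wrapping_add(1)
    })
}

/// Same as above, but backed by the lazily initialized static.
fn bump_lazy_counter() -> u8 {
    LAZY_COUNTER.with(|cell| {
        let counter = cell.get_or_init(|| Arc::new(AtomicU8::new(0)));
        counter.fetch_add(1, Ordering::SeqCst).wrapping_add(1)
    })
}
```

Both statics behave the same from the caller's point of view; the difference is only in how the thread-local slot itself is initialized. Each thread still gets its own independent counter.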
@@ -524,4 +531,82 @@ mod tests {
 ),]))
 );
 }
+
+#[test]
Based on the PR description these tests are the result of attempting to use scryer-prolog as a library. As such I would expect these to be integration tests that do just that.
At a quick glance these tests also appear to use only the publicly accessible interface of scryer as a library, so I think these could just be moved to be integration tests in `tests/`.
Based on your comment at #2309 (comment) you appear to disagree, could you explain why?
I would say this PR (and the ping-pong between me and Mark) demonstrates that the tests here are, first and foremost, covering the function `run_query()` in this same file. So I would suggest keeping them here, because having the tests located near `run_query()` provides important context, independent of the visibility of that function.
That said, I'm happy to move them over to `tests/` if that's where you want them to be :)
(just explaining my reasoning)
You are talking about all tests in this file?
I have invited you to our forked repo so you can push changes to this branch if you like, @Skgland!
I think I had some misconception/misunderstanding while writing that comment.
While I think most of these would make sense as integration tests, as they appear to only use the public interface, most are small enough to be fine as is, especially as most already existed here before this PR.
Only `integration_test` appears too large, due to the now huge `lib_integration_test_commands.txt` (12k+ lines).
Maybe the txt file could be split into multiple smaller txt files, so that the files can still be viewed here on GitHub?
Not sure how intertwined everything in there is.
`integration_test` could then be split into multiple test functions, each including an individual txt file and calling a helper function that is basically the current `integration_test` but taking `code` as an argument.
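A rough sketch of that shape, under loud assumptions: `FakeMachine`, `run_commands`, `PART_1` and `PART_2` are all hypothetical stand-ins so the example is self-contained, and the one-command-per-line format is invented for illustration. In the real test module each constant would presumably be an `include_str!` of one of the smaller txt files, and the helper would drive the actual library machine instead of the fake.

```rust
/// Hypothetical stand-in for the engine under test. It only records the
/// commands it receives; it does not interpret them.
#[derive(Default)]
struct FakeMachine {
    history: Vec<String>,
}

impl FakeMachine {
    fn execute(&mut self, command: &str) {
        self.history.push(command.to_string());
    }
}

/// Shared helper: runs every non-empty line of `code` against a fresh
/// machine and returns how many commands were executed.
fn run_commands(code: &str) -> usize {
    let mut machine = FakeMachine::default();
    for line in code.lines().map(str::trim).filter(|l| !l.is_empty()) {
        machine.execute(line);
    }
    machine.history.len()
}

// Hypothetical inline chunks; stand-ins for the split command files.
const PART_1: &str = "consult one\nquery one\n";
const PART_2: &str = "consult two\n\nquery two\nquery three\n";

#[test]
fn integration_part_1() {
    assert_eq!(run_commands(PART_1), 2);
}

#[test]
fn integration_part_2() {
    assert_eq!(run_commands(PART_2), 3);
}
```

Note that in this sketch every test function starts from a fresh machine, so behaviour that depends on the accumulated history of earlier commands would not be exercised across chunks.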
I think it's fair to question whether this test should be here at all. It was a pragmatic way for us to pin down where the problems were stemming from, as we did some refactoring in our project at the same time we switched from SWI to Scryer, but had our integration tests from before.
Running this big txt file here is a bit opaque, which is also why I tried to extract the failing assertions into the other, more understandable test cases. But it did already uncover a problem that you only see if you have multiple consecutive calls to `consult` followed by queries. This is actually close to the reason that made us add this test mechanism in the first place: we saw scryer slowing down query by query when used in our ad4m integration tests (that's why we had the previous version without assertions on the results; it was just making sure Scryer would make it to the end). So I would argue against splitting it up.
But yeah, that does make it feel more like an integration test.
I'll leave it completely up to you guys to decide if you want to keep this test at all, and if so where it should stay.
I think it's not worth blocking the whole PR over this, as it can always be moved later.
Co-authored-by: Bennet Bleßmann <bennet.blessmann+github@googlemail.com>
32a2ff3 to b43f097
The style check workflow uses nightly Rust, which even fails the regular tests.
The problem is apparently that the |
I think updating |
Yeah, I mean these things can happen "over night" ;) so my latest commit is a suggestion to not use nightly in CI if not absolutely necessary. The wasm target build surprisingly works despite nightly. I guess ahash is not built there? (?)
Yeah, I think that is good for the style and report CI. Maybe the nightly CIs can be marked |
Ah, I see. So the build and test actually failed for the wasm target (https://github.com/mthom/scryer-prolog/actions/runs/7843777400/job/21404819971) but because of continue-on-error it showed up as a green tick anyway... hm, what is the benefit of running these jobs in CI if they won't trigger a failure?
Right, I forgot that GitHub Actions doesn't differentiate between success and failed but allowed to fail. This way the original failing CI would be better as the error is at least visible.
- fix nightly build - not bumping to latest, aka 0.8.8, as that has an MSRV of 1.72.0 and we are only at 1.70.0
- currently `continue-on-error` is not shown in a useful way on failure, see <https://github.com/orgs/community/discussions/15452>
I bumped ahash to 0.8.7 to fix nightly and undid the |
Looks like the transitive
|
Is a failing nightly build even worth delaying PRs? It seems that issues with nightly may also be resolved by completely unrelated changes in other crates that will be made according to their own schedule.
The only one that's still failing is wasm32, and that test failure is already present on master, just ignored.
What is needed for this PR to get merged? Please let me know if there is something left I can do.
Get the nightly tests to pass, I suppose? I'm not sure how important they are ultimately.
They already pass
It's a good question. Rust should never break compatibility, but this rule doesn't apply to nightly, so sometimes it can fail due to mistakes on the Rust side.
Additions
Context
After merging #1880 we tried to jump to upstream scryer commits/versions but were forced to stay at an old commit that was still using our custom toplevel. Using the fully Rust-based version that got merged, we couldn't get our full integration tests to pass.
After double-checking our code, I went ahead and completed the integration tests here with the results that our test code expects to get from scryer (these have not really changed from our old SWI integration through the toplevel-based scryer lib_machine).
1. problems with `discontiguous`
Interpreting the new test results, I think it's clear that the problem only occurs when declaring a predicate with `discontiguous`. In the new test `dont_return_partial_matches` we load a program that is all we need from our integration tests to show the faulty behaviour: running the query (wrongly) yields a match for C: "c".
Changing the order of the predicates in the query gives a different result: false.
The second new test case `dont_return_partial_matches_without_discountiguous` shows that a similar case works fine when `discontiguous` is not used.
fixed in: 7de693e
2. Assertion 50 leaves `run_query` loop
Assertion 50 fails with only one of 3 expected matches.
Debugging through this case shows strange behaviour: after the first execution of the loop in `run_query` (ln. 111), which adds the first match, the loop is exited at the end without calling `dispatch_loop()` again. This seems to happen without hitting a `break`.
Reproducing this assertion in its own test case by running all previous `consult`s and then running the query and assertion does not yield this strange behaviour. It seems the history of all the queries that happened before in the integration test is relevant to replicating this.
The debugger jumps to the first line of src/lib.rs when exiting the loop (and then continues after the loop). I have tried increasing and removing the `#![recursion_limit = "4112"]` there, to no avail.