TestQueryOneToOneRelations flaky in CI #1702

Closed
fredcarle opened this issue Jul 24, 2023 · 5 comments · Fixed by #1740
Assignees
Labels
area/cli Related to the CLI binary bug Something isn't working

Comments

@fredcarle
Collaborator

fredcarle commented Jul 24, 2023

TestQueryOneToOneRelations often fails in the change detector on CI. It's not yet clear what is causing this.

It looks to be #1672 - only now flaky instead of consistently failing (and on a different, but similar test).

Possibly significant is that both this and #1672 involve tests with one-to-one relations.

  • Executing only TestQueryOneToOneRelations does not resolve the issue (other tests do not appear to be a factor)
  • Removing the create statements for either Publishers or Authors appears to remove the issue
  • Reducing the query to just Book {name} appears to resolve the issue. This is unlike #1672 (AssertFail in badger file on db close), which persisted on a simplified query
    • Reducing the query to Book {name author {name}} (removing the publisher join) also resolves the issue
    • Reducing the query to Book {name publisher {name}} (removing the author join) does not resolve the issue. Interestingly, this is the one-to-one join (same as in #1672)
  • Setting n.subType.Spans() does not resolve the issue (this is a difference between the primary and secondary)
  • No errors appear to be generated in valuesSecondary, despite #1711 (typeJoinOne silently discards errors in Next)
  • Returning early from valuesSecondary immediately after n.subType.Init() resolves the issue; this is before n.subType.Next()
  • Returning early from valuesSecondary immediately after n.subType.Next() does not resolve the issue, suggesting Next is the trigger (same as in #1672)
  • Removing n.subType.Init() from typeJoinOne.Init() does not resolve the issue (although it is a duplicate, pointless call I think)
  • Updating to badger v4 still has no effect
  • Removing the 'extra' df.kvResultsIter.Close() call added to the fetcher in #1672 (AssertFail in badger file on db close) appeared to resolve the issue for a while, but it then broke again following a git push
@fredcarle fredcarle added bug Something isn't working area/cli Related to the CLI binary labels Jul 24, 2023
@AndrewSisley
Contributor

AndrewSisley commented Jul 24, 2023

> It looks to be #1672 - only now flaky instead of consistently failing (and on a different, but similar test).

The fact that it is now flaky is pulling me closer to believing this is something GC related within the badger code.

@AndrewSisley
Contributor

AndrewSisley commented Jul 24, 2023

Oddly, whilst this has been problematic in the CI today, I have no trouble running the change detector on my laptop atm (3 out of 3) (unlike #1672 which was failing pretty consistently locally)

EDIT: 2 out of 3 runs failed locally this morning (same test as in desc).

@fredcarle
Collaborator Author

It fails every third run for me locally. Like clockwork

@shahzadlone shahzadlone changed the title TestQueryOneToOneRelations facky in CI TestQueryOneToOneRelations flaky in CI Jul 25, 2023
@AndrewSisley
Contributor

AndrewSisley commented Jul 25, 2023

> It fails every third run for me locally. Like clockwork

I wonder if that is significant. How many times have you seen this set-of-three pattern?

@fredcarle
Collaborator Author

> > It fails every third run for me locally. Like clockwork
>
> I wonder if that is significant. How many times have you seen this set-of-three pattern?

Pretty consistently, so at least 20 times thus far. Sometimes I get 3 or 4 successful runs, but it's usually 2 good ones and then the third fails. A couple of consecutive failures too.

fredcarle added a commit to fredcarle/defradb that referenced this issue Aug 1, 2023
@fredcarle fredcarle added this to the DefraDB v0.7 milestone Aug 1, 2023
@fredcarle fredcarle self-assigned this Aug 1, 2023
fredcarle added a commit that referenced this issue Aug 1, 2023
## Relevant issue(s)

Resolves #1702

## Description

This PR updates the Badger version to v4.

It includes a temporary fix to our badger `os.Exit` issue that is
costing us time when managing PRs. This should be reverted or properly
fixed before releasing v0.7.
shahzadlone pushed a commit to shahzadlone/defradb that referenced this issue Feb 23, 2024