Invalid access or accessor meta-issue #4032

Closed
finnschiermer opened this issue Nov 2, 2020 · 3 comments
@finnschiermer
Contributor

finnschiermer commented Nov 2, 2020

This meta-issue tracks other issues that indicate use of a stale accessor.

There may be a single or multiple underlying root causes. We don't know.

We have 4 different ways such an error can manifest itself:

  • A no such table exception (all of these are now considered fixed - but leaving metadata here for now)
  • A stale object reference exception (this is the same as above, just a name change)
  • A failure that appears to be related to encryption or the ref translation machinery, but encryption is reported to not be used.
  • A segfault or access error at a transaction boundary

Guesses re root cause:

  • Abuse of the Core API by ObjectStore. A bug which produces (or merely allows) use of a stale accessor
  • A bug in the way we track ref-translation changes and force accessor updates.
  • A bug in the management of memory mappings causing a mapping to be removed too early.
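
To make "stale accessor" concrete, here is a minimal toy sketch. It is not Realm Core's API: `Store`, `Accessor`, `advance_transaction` and the version-stamp scheme are all hypothetical names chosen for illustration. It assumes an accessor is only valid for the version it was created or last refreshed against, and shows how a version check can turn silent use of a stale accessor into an explicit exception.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical toy model, NOT Realm Core's API.
// The store bumps its version at every transaction boundary.
class Store {
public:
    std::uint64_t version() const { return m_version; }
    void advance_transaction() { ++m_version; }
    void push(int value) { m_data.push_back(value); }
    int read(std::size_t ndx) const { return m_data.at(ndx); }

private:
    std::uint64_t m_version = 1;
    std::vector<int> m_data;
};

// Hypothetical accessor: remembers the store version it was last synced with.
class Accessor {
public:
    explicit Accessor(const Store& store)
        : m_store(&store), m_seen_version(store.version()) {}

    // Stale means the store has crossed a transaction boundary since the last sync.
    bool is_stale() const { return m_seen_version != m_store->version(); }

    // Re-sync with the store's current version.
    void refresh() { m_seen_version = m_store->version(); }

    // Refuse to read through a stale accessor instead of returning garbage.
    int get(std::size_t ndx) const {
        if (is_stale())
            throw std::logic_error("stale accessor");
        return m_store->read(ndx);
    }

private:
    const Store* m_store;
    std::uint64_t m_seen_version;
};
```

In this sketch, a caller that forgets `refresh()` after `advance_transaction()` gets a `std::logic_error` from `get()`. Without such a check the same mistake could plausibly surface as any of the four symptoms listed above, depending on what the stale state happens to point at.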

Links (to be added):

Historic note:

Prior to Core-6, we had a trickle of error reports pertaining to invalid access to the accessor tracking machinery. We never nailed the root cause of this, and the whole machinery was designed away for Core-6. If the root cause actually resided outside (above) Core in the software stack, it may still be there, just showing itself differently.

@finnschiermer
Contributor Author

I no longer believe it can be a bug in the memory mapping management. To stress this part of the machinery, a special version of Core was built in which the section size was lowered to 16 kilobytes. This increases the turnover of memory mappings by several orders of magnitude, but requires special filtering of tests, since many of our tests cannot run with such a change. Tests which failed as expected were filtered out. No tests failed in unexpected ways (including anything mimicking the reported bugs). Multi-process and multi-threaded tests were then expanded to run for hours. We were still unable to trigger a crash.

@sync-by-unito

sync-by-unito bot commented Jun 2, 2021

➤ Finn Andersen commented:

The ref-translation bugs are all explained by bugs in selecting the proper allocator and thus indirectly getting to a stale ref-translation table.

This was fixed in multiple rounds, last in #4417. No ref-translation error has been reported since.

@sync-by-unito sync-by-unito bot closed this as completed Jun 2, 2021
@sync-by-unito

sync-by-unito bot commented Jun 2, 2021

➤ Finn Andersen commented:

All errors in this meta-issue are now considered fixed.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 21, 2024