Crash output after ct tests #462

Closed
martinsumner opened this issue Dec 2, 2024 · 0 comments

On completion of the ct tests, crash output is sometimes seen for some leveled_cdb files.

These are leveled_cdb files from the end of the tictac_SUITE basic_headonly test - where they have been compacted, and then the inker is immediately closed:

Closing actual store <0.24249.137>

2024-12-02T20:51:28.887 log_level=info log_ref=i0013 db_id=65536 pid=<0.24251.137> File testHO/journal/journal_files/5179_6ca1e6b8-62e0-4395-ac79-0e849e34f1f0 to be removed from manifest

2024-12-02T20:51:28.887 log_level=info log_ref=i0013 db_id=65536 pid=<0.24251.137> File testHO/journal/journal_files/4532_e157acd0-4928-441d-8db8-d92605cc6fb9 to be removed from manifest

2024-12-02T20:51:28.888 log_level=info log_ref=i0016 db_id=65536 pid=<0.24251.137> Writing new version of manifest for manifestSQN=13

2024-12-02T20:51:28.898 log_level=info log_ref=i0005 db_id=65536 pid=<0.24251.137> Inker closing journal for reason close

2024-12-02T20:51:28.898 log_level=info log_ref=i0006 db_id=65536 pid=<0.24251.137> Close triggered with journal_sqn=6250 and manifest_sqn=13

2024-12-02T20:51:28.898 log_level=info log_ref=i0007 db_id=65536 pid=<0.24251.137> Inker manifest when closing is:

2024-12-02T20:51:28.899 log_level=info log_ref=pc005 db_id=65536 pid=<0.24255.137> Penciller's Clerk <0.24255.137> shutdown now complete for reason normal

2024-12-02T20:51:28.899 log_level=info log_ref=p0008 db_id=65536 pid=<0.24254.137> Penciller closing for reason close

2024-12-02T20:51:28.899 log_level=info log_ref=p0010 db_id=65536 pid=<0.24254.137> level zero discarded_count=0 on close of Penciller

2024-12-02T20:51:28.901 log_level=info log_ref=b0003 db_id=65536 pid=<0.24249.137> Bookie closing for reason normal

Trim has reduced journal count from 12 to 5 and 5 after restart

The issue here is that close prompts a close of the manifest; however, delete_pending files will linger until their next delete_confirmed check. At that point they will recognise that the inker is not alive, and then close:

https://github.com/martinsumner/leveled/blob/aaeac7ba36bf7561b4496c31d9fd37b6a7cfa825/src/leveled_cdb.erl#L832-L836

However, this check occurs ten seconds later, by which time the VM has already been stopped because the tests have finished.
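The timing problem can be illustrated with a small simulation. This is only a sketch written in Python for convenience, not the leveled code: the `PendingFile` class, the `run_suite` function and the 100 ms step are hypothetical, the ten-second figure is taken from the description above, the 11 ms gap between the files being removed from the manifest and the inker closing is read from the log timestamps, and the file names are just the prefixes of the two journal files in that log. It also assumes the ten-second timer starts when the file is removed from the manifest.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: the delete_confirmed check runs ten seconds after the file
# enters delete_pending (the issue only says "ten seconds later").
DELETE_TIMEOUT_MS = 10_000


@dataclass
class PendingFile:
    """Hypothetical model of a journal file in the delete_pending state."""
    name: str
    pending_since_ms: int
    closed_at_ms: Optional[int] = None

    def tick(self, now_ms: int, inker_alive: bool) -> None:
        # The file only notices that the inker has gone once its own
        # timeout has elapsed; before then it stays open.
        if self.closed_at_ms is not None:
            return
        timed_out = now_ms - self.pending_since_ms >= DELETE_TIMEOUT_MS
        if timed_out and not inker_alive:
            self.closed_at_ms = now_ms


def run_suite(extra_wait_ms: int, step_ms: int = 100) -> List[str]:
    """Return the names of files still open when the VM stops."""
    files = [PendingFile("5179", 0), PendingFile("4532", 0)]
    inker_closed_at_ms = 11  # from the log: 28.887 -> 28.898
    vm_stops_at_ms = inker_closed_at_ms + extra_wait_ms
    now_ms = 0
    while now_ms <= vm_stops_at_ms:
        inker_alive = now_ms < inker_closed_at_ms
        for f in files:
            f.tick(now_ms, inker_alive)
        now_ms += step_ms
    return [f.name for f in files if f.closed_at_ms is None]


if __name__ == "__main__":
    print(run_suite(extra_wait_ms=0))       # VM stops straight after close
    print(run_suite(extra_wait_ms=10_000))  # suite waits out the timeout
```

With no extra wait, both files are still open when the simulated VM stops, which corresponds to the crash output seen here. Waiting for at least the timeout at the end of the suite lets both files reach their check and close normally.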

martinsumner added a commit to OpenRiak/leveled that referenced this issue Jan 15, 2025
* Test and fix - issue with folding beyond JournalSQN

The test previously failed, as even on a fast machine the fold went on for 5s beyond the last object found.

With the change to reduce the batch size, and to stop when a batch goes beyond JournalSQN, the test succeeds with << 100ms spent folding after the last object is discovered

* Wait after suite for delete_pending to close

martinsumner#462

* Avoid processing key changes in object fold runner

As the key changes are going to be discarded