Duplicate entries created in Merge table #920

Closed
samuelbray32 opened this issue Apr 10, 2024 · 1 comment · Fixed by #922
Assignees: samuelbray32
Labels: bug (Something isn't working), merge (To do with merge tables), spike sorting

Comments

@samuelbray32
Collaborator

Describe the bug

  • Some entries in SpikeSortingOutput.CuratedSpikeSorting are duplicated under different merge_ids. The UUID hashing in Merge._merge_insert() should prevent this.
  • Possibly these were caused by the change in how we hash, introduced in "Fault-permit insert and remove mutual exclusivity protections on Merge" (#824).
    • If so, should _merge_insert check the part table for an entry matching a given source primary key before generating a UUID and inserting? This would help prevent overlapping entries going forward if the hash method changes again.
    • Side note: this is essentially the inverse of the problem in "Merge key conflict in PositionOutput" (#915).

To Reproduce
Example duplicate key:

part_key = {
    "curation_id": 1,
    "nwb_file_name": "Winnie20220717_.nwb",
    "sort_group_id": 13,
    "sort_interval_name": "12_lineartrack",
    "preproc_params_name": "franklab_tetrode_hippocampus",
    "team_name": "ms_stim",
    "sorter": "mountainsort4",
    "sorter_params_name": "franklab_tetrode_hippocampus_30KHz_tmp",
    "artifact_removed_interval_list_name": "Winnie20220717_.nwb_12_lineartrack_13_franklab_tetrode_hippocampus_ampl_2000_prop_75_artifact_removed_valid_times",
}

SpikeSortingOutput().CuratedSpikeSorting() & part_key
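
To look for other affected keys, the rows of the part table can be grouped by their source primary key, and any group holding more than one merge_id is a duplicate. The following is only a rough sketch on plain dictionaries: the rows, the placeholder merge_id strings, and the helper name `find_duplicates` are made up for illustration and are not part of Spyglass.

```python
from collections import defaultdict

# Hypothetical rows, shaped like part-table entries fetched as dictionaries.
# The merge_id values are placeholders, not real UUIDs.
rows = [
    {"merge_id": "aaa", "nwb_file_name": "Winnie20220717_.nwb", "sort_group_id": 13, "curation_id": 1},
    {"merge_id": "bbb", "nwb_file_name": "Winnie20220717_.nwb", "sort_group_id": 13, "curation_id": 1},
    {"merge_id": "ccc", "nwb_file_name": "Winnie20220717_.nwb", "sort_group_id": 14, "curation_id": 1},
]


def find_duplicates(rows, id_field="merge_id"):
    """Return {source key: [merge_ids]} for source keys that appear under more than one id."""
    groups = defaultdict(list)
    for row in rows:
        # Everything except the merge id is treated as the source primary key.
        source_key = tuple(sorted((k, v) for k, v in row.items() if k != id_field))
        groups[source_key].append(row[id_field])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


duplicates = find_duplicates(rows)
for source_key, merge_ids in duplicates.items():
    print(dict(source_key), merge_ids)
```
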
samuelbray32 added the bug, spike sorting, and merge labels on Apr 10, 2024
samuelbray32 self-assigned this on Apr 10, 2024
@samuelbray32
Collaborator Author

Can confirm the difference is due to the inclusion of the source table in the hash generation. For each duplicated entry, one UUID matches the hash computed with the source table and the other matches the hash computed without it.
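
A minimal sketch of why this produces two ids for the same source key. The `key_hash` function here is a hypothetical stand-in (MD5 over the values, ordered by field name), the `"source"` field name is an assumption, and the key is shortened for readability; the real hashing in Merge._merge_insert() may differ in detail.

```python
import hashlib
import uuid


def key_hash(mapping):
    """Hypothetical stand-in: MD5 over the values, ordered by field name."""
    hashed = hashlib.md5()
    for _, value in sorted(mapping.items()):
        hashed.update(str(value).encode())
    return uuid.UUID(hashed.hexdigest())


# Shortened version of the example key, for illustration only.
part_key = {
    "curation_id": 1,
    "nwb_file_name": "Winnie20220717_.nwb",
    "sort_group_id": 13,
}

# Old scheme (assumed): hash the source primary key alone.
id_without_source = key_hash(part_key)

# New scheme (assumed): the source table name is hashed along with the key.
id_with_source = key_hash({**part_key, "source": "CuratedSpikeSorting"})

# Same source entry, two different merge ids.
print(id_without_source != id_with_source)
```

Because each scheme is deterministic on its own, re-inserting under one scheme is harmless; the duplicate only appears when an entry inserted under one scheme is inserted again under the other.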

Solution: Check for an existing key in the part table before generating the UUID and inserting.
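
The check could look roughly like the sketch below. It uses an in-memory toy class rather than DataJoint tables; `ToyMergeTable`, `find_existing`, `merge_insert`, and the stand-in `key_hash` are all hypothetical names for illustration and do not reflect the actual change made in #922.

```python
import hashlib
import uuid


def key_hash(mapping):
    """Hypothetical stand-in: MD5 over the values, ordered by field name."""
    hashed = hashlib.md5()
    for _, value in sorted(mapping.items()):
        hashed.update(str(value).encode())
    return uuid.UUID(hashed.hexdigest())


class ToyMergeTable:
    """In-memory stand-in for a merge table and its parts; not the Spyglass class."""

    def __init__(self):
        # merge_id -> (source table name, source primary key as a sorted tuple)
        self.rows = {}

    def find_existing(self, source, part_key):
        """Return the merge_id already stored for this source key, if any."""
        target = (source, tuple(sorted(part_key.items())))
        for merge_id, entry in self.rows.items():
            if entry == target:
                return merge_id
        return None

    def merge_insert(self, source, part_key):
        # Look the key up first, so a change in hashing cannot create a second id.
        existing = self.find_existing(source, part_key)
        if existing is not None:
            return existing
        merge_id = key_hash({**part_key, "source": source})
        self.rows[merge_id] = (source, tuple(sorted(part_key.items())))
        return merge_id


part_key = {"curation_id": 1, "nwb_file_name": "Winnie20220717_.nwb", "sort_group_id": 13}

table = ToyMergeTable()
# Simulate an entry inserted under the old scheme (hash without the source table).
legacy_id = key_hash(part_key)
table.rows[legacy_id] = ("CuratedSpikeSorting", tuple(sorted(part_key.items())))

# Re-inserting the same source key returns the legacy id instead of adding a row.
reused_id = table.merge_insert("CuratedSpikeSorting", part_key)
print(reused_id == legacy_id, len(table.rows))
```

The lookup is keyed on the source primary key rather than on the UUID, so it stays valid regardless of which hash scheme produced the stored id.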
