Correct _format_refs to only include unique documents #31

Merged
merged 1 commit into from
Nov 4, 2024

Conversation

spodgorny9
Collaborator

No description provided.

@spodgorny9 spodgorny9 force-pushed the sp/ref_corrections branch 2 times, most recently from 8d19801 to 5d4d8b3 Compare September 27, 2024 18:28
for ref_dict in ref_list:
    if any(ref_dict == d for d in unique_ref_list):
    if ref_dict['nrel_id'] in unique_nrel_ids:
Collaborator Author

What do you think about this? We need the chunk ID column, id, to order the reference list, and the document ID column, nrel_id, to make sure each item in the list is unique. All of our DBs currently use nrel_id, but I could change it to document_id to make it more public-friendly; that would just require re-creating the meta files and a little additional work. Or if you have another method in mind, I am all ears.
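
The idea in this comment can be sketched roughly as below. This is a minimal illustration, not the merged implementation: the helper name unique_refs_ordered, its keyword arguments, the list-of-dicts input shape, and the sample data are all assumptions. Only the column names id and nrel_id come from the comment. Comparing whole dictionaries would presumably treat two chunks of the same document as different entries, since their chunk IDs differ, which is why the sketch tracks document IDs in a set instead.

```python
def unique_refs_ordered(ref_list, order_key="id", doc_key="nrel_id"):
    """Return one reference per document, ordered by chunk ID.

    Hypothetical helper: the name, signature, and input shape are
    assumptions for illustration, not the project's actual API.
    """
    unique_ref_list = []
    seen_doc_ids = set()
    # Sort by chunk ID so the lowest-ID chunk of each document is kept.
    for ref_dict in sorted(ref_list, key=lambda d: d[order_key]):
        doc_id = ref_dict[doc_key]
        if doc_id in seen_doc_ids:
            continue  # another chunk from a document that is already listed
        seen_doc_ids.add(doc_id)
        unique_ref_list.append(ref_dict)
    return unique_ref_list


# Made-up sample data: two chunks from document "A", one from document "B".
refs = [
    {"id": 2, "nrel_id": "A", "title": "Doc A"},
    {"id": 0, "nrel_id": "B", "title": "Doc B"},
    {"id": 1, "nrel_id": "A", "title": "Doc A"},
]
unique = unique_refs_ordered(refs)
print([r["nrel_id"] for r in unique])
```

If the column were renamed as suggested, a helper shaped like this would only need doc_key="document_id" passed in, assuming the meta files had been re-created with that column.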

@spodgorny9 spodgorny9 requested a review from grantbuster October 1, 2024 03:16
@spodgorny9 spodgorny9 merged commit df5d12d into main Nov 4, 2024
12 of 15 checks passed
@spodgorny9 spodgorny9 deleted the sp/ref_corrections branch November 4, 2024 17:44
github-actions bot pushed a commit that referenced this pull request Nov 4, 2024
Correct _format_refs to only include unique documents
2 participants