Add b+ tree secondary indexing for fast search and retrieval #22

Open
wants to merge 20 commits into
base: develop

@bkal01 bkal01 commented Aug 6, 2020

This feature adds a secondary indexing option backed by a B+ tree, closely mirroring the existing primary indexing option. I added Node and BPlusTree classes for the tree structure, along with tests. I also added a secondary_indexing option for appending and reading records. For example, to index by 'val', the user passes secondary_indexing={'val': val} to the append_record method. The tree is stored at .aimrecords_storage/<insert_artifact_name>/val.tree and is saved and loaded with pickle.
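
For readers who want a feel for the idea without reading the diff, here is a minimal sketch of a secondary index persisted with pickle. It is not the code in this PR: `SortedIndex`, `save_index`, and `load_index` are hypothetical names, and a sorted list searched with `bisect` stands in for the B+ tree (same ordered-lookup behaviour, none of the node splitting). The in-memory `records` list is likewise an assumption in place of the real record storage.

```python
import bisect
import os
import pickle
import tempfile


class SortedIndex:
    """Hypothetical stand-in for a B+ tree: sorted keys mapped to record positions."""

    def __init__(self):
        self.keys = []       # sorted, unique secondary keys
        self.positions = []  # positions[i] holds the record positions for keys[i]

    def insert(self, key, position):
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            self.positions[i].append(position)  # duplicate key: extend its bucket
        else:
            self.keys.insert(i, key)
            self.positions.insert(i, [position])

    def search(self, key):
        """Return the positions of all records stored under `key`."""
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return list(self.positions[i])
        return []

    def range(self, low, high):
        """Return positions of records with low <= key <= high, in key order."""
        lo = bisect.bisect_left(self.keys, low)
        hi = bisect.bisect_right(self.keys, high)
        return [p for bucket in self.positions[lo:hi] for p in bucket]


def save_index(index, path):
    with open(path, "wb") as f:
        pickle.dump(index, f)


def load_index(path):
    with open(path, "rb") as f:
        return pickle.load(f)


# Append 1000 records and index each one by its 'val' field.
records = []
index = SortedIndex()
for val in range(1, 1001):
    records.append({"val": val})
    index.insert(val, len(records) - 1)

# Round-trip the index through a file (temporary directory, assumed layout).
path = os.path.join(tempfile.mkdtemp(), "val.tree")
save_index(index, path)
restored = load_index(path)

hit = [records[p] for p in restored.search(42)]
print(hit)
```

A real B+ tree keeps inserts logarithmic, whereas `list.insert` here is linear, so the sketch only illustrates the lookup and persistence flow.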

There are issues with consecutive runs of secondary_indexing_example. For example, if we save 1000 records with val=1,2,...,1000 and then run the example again, 2000 records are remembered, but there are still only 1000 keys, for the 1000 new records. This causes some issues when trying to access records using negative indexing.
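
The mismatch can be made concrete with a small consistency check. This is only a sketch of the reported state, not a diagnosis of its cause: `unindexed_positions` is a hypothetical helper, the index is modelled as a plain dict from key to record positions, and I am assuming the second run's records occupy positions 1000 to 1999.

```python
def unindexed_positions(num_records, key_to_positions):
    """Return record positions that no secondary key points to.

    key_to_positions is a hypothetical dict: secondary key -> list of positions.
    """
    indexed = {p for positions in key_to_positions.values() for p in positions}
    return [p for p in range(num_records) if p not in indexed]


# Reported state after two runs: 2000 stored records, but keys only for
# the 1000 new ones (assumed here to sit at positions 1000..1999).
num_records = 2000
val_index = {val: [1000 + val - 1] for val in range(1, 1001)}

missing = unindexed_positions(num_records, val_index)
print(len(val_index), len(missing))
```

A check like this, run after loading the tree, would flag the first run's records as unreachable through the secondary index.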

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

3 participants