Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

len should take constant-time #45

Open
Nemo157 opened this issue Jan 17, 2018 · 4 comments · May be fixed by #182 or #105
Open

len should take constant-time #45

Nemo157 opened this issue Jan 17, 2018 · 4 comments · May be fixed by #182 or #105

Comments

@Nemo157
Copy link
Collaborator

Nemo157 commented Jan 17, 2018

rust-lang/api-guidelines#149

@josephglanville
Copy link
Contributor

josephglanville commented Mar 23, 2020

Contant time len() will probably require additional storage. Atleast for my usecase that would be quite unfortunate as my program has hundreds of thousands of bitmaps at any time.

Perhaps instead the name of len() could be changed to cardinality() or another name so it doesn't conflict with the API guidelines and doesn't bloat RoaringBitmap with extra bytes?

@saik0
Copy link
Contributor

saik0 commented Jan 10, 2022

#127 removes 8 bytes from each array container, so maybe we can afford a few extra bytes at the root now ;)

@saik0
Copy link
Contributor

saik0 commented Feb 6, 2022

This is relevant to #167.

Cardinality of all ops can all be trivially implemented in terms of the cardinality of the intersection |A ∩ B|.

|A \ B| = |A| − |A ∩ B|
|A ∪ B| = |A| + |B| − |A ∩ B|
|A △ B| = |A| + |B| − 2|A ∩ B|

Intersection cardinality is cheaper to compute, so it's both simpler to implement and has better runtime performance if we have the len cached.

This was referenced Feb 6, 2022
@Dr-Emann
Copy link
Member

Dr-Emann commented Mar 21, 2022

Crazy, terrible, stupid idea to avoid increasing the size of RoaringBitmap but still store the cardinality:

The max cardinality of the bitmap is 0x1_0000_0000 (33 bits). but the len and capacity of the root vec should only ever reach 0x1_0000 (17 bits). On 64 bit systems, that gives us plenty of space to store the cardinality in the top bits. On 32 bit systems, that gives us 30 bits (15 bits for capacity/len each). Additionally, because BitmapStore contains a u64, the alignment of Container is at least 8, so we could store the remaining 3 bits of cardinality in the low bits of the vec data pointer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants