Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for nvCOMP batch API #249

Merged
merged 4 commits into from
Jul 3, 2023

Conversation

Alexey-Kamenev
Copy link
Contributor

See #248 for more details.

@Alexey-Kamenev Alexey-Kamenev requested review from a team as code owners June 27, 2023 21:59
@rapids-bot
Copy link

rapids-bot bot commented Jun 27, 2023

Pull requests from external contributors require approval from a rapidsai organization member with write permissions or greater before CI can begin.

@jakirkham
Copy link
Member

cc @thomcom @madsbk (for awareness)

@jakirkham jakirkham added improvement Improves an existing functionality non-breaking Introduces a non-breaking change c++ Affects the C++ API of KvikIO python Affects the Python API of KvikIO labels Jun 27, 2023
@jakirkham
Copy link
Member

/ok to test

Copy link
Member

@madsbk madsbk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @Alexey-Kamenev !
Have some minor suggestion for my first review pass.

python/kvikio/nvcomp_codec.py Outdated Show resolved Hide resolved
"""
return self.encode_batch([buf])[0]

def encode_batch(self, bufs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def encode_batch(self, bufs):
def encode_batch(self, bufs : List[Any]) -> List[Any]:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I did not add type hints since numcodecs Codec does not use them, so I decided to do the same (but I still prefer to use type hints).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

numcodecs Codec does not use them

Could you please raise an upstream issue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

max_chunk_size,
num_chunks,
temp_buf,
comp_chunks,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think comp_chunks is used afterwards, I guess it should be part of the returned result?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct - comp_chunks is used only as a container that stores pointers to actual chunks. nvCOMP requires this container to be on GPU as well i.e. it's a pointer to pointers and it has to be in GPU memory, same as actual chunk pointers. Once compress returns, this container is not needed/used anymore.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh got it, could you add some comments describing the nature of comp_chunks and comp_chunks_header in more detail?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done - also added similar comments to decode_batch.

python/kvikio/nvcomp_codec.py Outdated Show resolved Hide resolved
python/kvikio/nvcomp_codec.py Outdated Show resolved Hide resolved
python/kvikio/nvcomp_codec.py Outdated Show resolved Hide resolved
* Addressed review feedback.
* Added sample Jupyter notebook.
@Alexey-Kamenev
Copy link
Contributor Author

It looks like there are 3 pipeline failures for this PR but I don't think they are related to the PR itself, since the errors look like this:

Error: OpenIDConnect provider's HTTPS certificate doesn't match configured thumbprint

@jakirkham jakirkham requested a review from madsbk June 29, 2023 01:00
@jakirkham
Copy link
Member

/ok to test

Copy link
Member

@madsbk madsbk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only have some minor comments.

I think an important follow-up PR would be to support GPU memory output of encode_batch and decode_batch: #251

max_chunk_size,
num_chunks,
temp_buf,
comp_chunks,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh got it, could you add some comments describing the nature of comp_chunks and comp_chunks_header in more detail?

python/tests/test_nvcomp_codec.py Outdated Show resolved Hide resolved
Copy link
Member

@madsbk madsbk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks @Alexey-Kamenev !

@jakirkham
Copy link
Member

/ok to test

@madsbk
Copy link
Member

madsbk commented Jul 3, 2023

/merge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Affects the C++ API of KvikIO improvement Improves an existing functionality non-breaking Introduces a non-breaking change python Affects the Python API of KvikIO
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants