[DOC] low-memory reader options not very discoverable #16443

wence- · 2024-07-31T09:47:37Z

Recently, we added chunked (low-memory) readers in cudf-python for parquet and json formats.

The only place these features are documented is in the descriptions of the options that globally select whether to use the chunked reader. These options are io.parquet.low_memory and io.json.low_memory, respectively.

These are shown (unformatted) as the output of describe_option in the options section of the user documentation: https://docs.rapids.ai/api/cudf/nightly/user_guide/api_docs/options/#api-options
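For anyone landing on this issue, here is a minimal sketch of how the two options can be switched on globally. It assumes a working cudf installation with a GPU. The helper name `enable_low_memory_readers` is hypothetical (it is not part of cudf), and cudf is imported lazily inside it so the function can be defined on a machine without a GPU.

```python
LOW_MEMORY_OPTIONS = ("io.parquet.low_memory", "io.json.low_memory")


def enable_low_memory_readers():
    """Hypothetical helper: switch both chunked readers on for the whole process."""
    import cudf  # imported lazily; calling this function needs a CUDA-capable GPU

    for name in LOW_MEMORY_OPTIONS:
        cudf.set_option(name, True)
        # Prints the option's description, the same text the options docs page shows.
        cudf.describe_option(name)

    # Report the values now in effect.
    return {name: cudf.get_option(name) for name in LOW_MEMORY_OPTIONS}
```

After calling it, subsequent `cudf.read_parquet` and `cudf.read_json` calls in the same process pick up the option values; nothing is passed to the readers themselves.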

If I were looking for information about how to control IO memory usage, I do not think it would occur to me to look here.

I would suggest that:

- Chunked reader control is mentioned in the relevant read_parquet and read_json docstrings. This is especially important because there is no keyword argument to control the behaviour; it is only controlled through the option.
- These settings are mentioned in the I/O overview documentation (somewhere here: https://docs.rapids.ai/api/cudf/nightly/user_guide/io/).
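Because the reader has no keyword argument, one way to limit the setting to a single call is to wrap the read in `cudf.option_context`. The following is a sketch under that assumption, not something taken from the cudf docs. The wrapper name `read_parquet_low_memory` is hypothetical, and cudf is imported lazily so the function can be defined without a GPU.

```python
def read_parquet_low_memory(path, **kwargs):
    """Hypothetical wrapper: read one parquet file with the chunked reader enabled."""
    import cudf  # imported lazily; calling this function needs a CUDA-capable GPU

    # The option is set only inside the with-block and restored afterwards,
    # so other reads in the process keep the default behaviour.
    with cudf.option_context("io.parquet.low_memory", True):
        return cudf.read_parquet(path, **kwargs)
```

The same pattern should apply to `read_json` with the `io.json.low_memory` option.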

bdice · 2024-07-31T12:04:48Z

@galipremsagar Could you take this on?

galipremsagar · 2024-07-31T12:07:10Z

> @galipremsagar Could you take this on?

Sure

vyasr · 2024-08-16T19:31:24Z

It might also be good to have a high-level user guide indicating how to use cudf in low-memory situations. That would include the I/O options as well as things like switching to a managed memory allocator, or tips and tricks for cleaning up intermediate objects to reduce how many allocations stick around.
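As a rough illustration of the managed-memory suggestion, the sketch below switches RMM (the allocator cudf uses) to CUDA managed memory via `rmm.reinitialize`. It assumes rmm is installed alongside cudf. The helper name `use_managed_memory` is hypothetical, and rmm is imported lazily so the function can be defined without a GPU.

```python
def use_managed_memory():
    """Hypothetical helper: make RMM allocate CUDA managed (unified) memory."""
    import rmm  # imported lazily; calling this function needs a CUDA-capable GPU

    # Best called at the start of a session, before any GPU data is created,
    # since reinitializing replaces the allocator that existing objects came from.
    rmm.reinitialize(managed_memory=True)
```

With managed memory, allocations can spill past the GPU's physical memory into host memory, typically trading speed for the ability to finish at all.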

Closes #16443 Authors: - Brian Tepera (https://github.com/btepera) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #17314

wence- added the doc Documentation label Jul 31, 2024

galipremsagar self-assigned this Jul 31, 2024

vyasr added this to cuDF Python Oct 10, 2024

github-project-automation bot moved this to Todo in cuDF Python Oct 10, 2024

vyasr added the Python Affects Python cuDF API. label Oct 10, 2024

vyasr assigned btepera Oct 29, 2024

btepera mentioned this issue Nov 13, 2024

Add documentation for low memory readers #17314

Merged

3 tasks

GPUtester moved this from Todo to In Progress in cuDF Python Nov 13, 2024

rapids-bot bot closed this as completed in #17314 Nov 14, 2024

rapids-bot bot pushed a commit that referenced this issue Nov 14, 2024

Add documentation for low memory readers (#17314)

9da8eb2

github-project-automation bot moved this from In Progress to Done in cuDF Python Nov 14, 2024
