
Issues: Lightning-AI/litData


Issues list

Support for Pathlib in litdata.map (labels: enhancement)
#581 opened May 5, 2025 by SkafteNicki
litdata.optimize() function returns without raising error, prior to all processes finishing work (labels: bug, help wanted, waiting on author)
#575 opened Apr 29, 2025 by JacobARose
Add example for hugging face dataset optimization and benchmarks reports (labels: documentation)
#558 opened Apr 17, 2025 by bhimrazy
OutOfBoundsError when streaming parquet files with low_memory=True (labels: bug, help wanted, waiting on author)
#553 opened Apr 13, 2025 by kyoungrok0517
Duplicate UserWarning Logs for lightning-sdk Version Check (labels: bug, help wanted)
#527 opened Mar 24, 2025 by deependujha
Cycle option for StreamingDataLoader (labels: enhancement, waiting on author)
#524 opened Mar 24, 2025 by Aceticia
Local cache dir not fully clearing in DDP multi-node training. (labels: bug, help wanted)
#512 opened Mar 12, 2025 by JackUrb
How to optimimize dataset for pretraining from HuggingFace (labels: bug, question)
#482 opened Feb 21, 2025 by TheLukaDragar
Add pytest fixture to limit max time a test can take (labels: bug, help wanted)
#475 opened Feb 17, 2025 by deependujha
CI error: All chunks should've been deleted keeps coming back (labels: bug, help wanted, won't fix)
#437 opened Dec 20, 2024 by deependujha
Restart training with new data, mid-epoch (labels: enhancement, won't fix)
#436 opened Dec 17, 2024 by schopra8
incorrect dataloader length when drop_last=False (labels: bug, help wanted, won't fix)
#402 opened Oct 28, 2024 by grez72
The config isn't consistent between chunks (labels: bug, help wanted, waiting on author)
#370 opened Sep 17, 2024 by gluonfield
StreamingDataset causes NCCL timeout when using multiple nodes (labels: bug, help wanted)
#340 opened Aug 26, 2024 by hubenjm
StreamingDataset intermittently fails due to lack of index.json (labels: bug, help wanted, won't fix)
#337 opened Aug 20, 2024 by plra
Use different batch sizes in CombinedStreamingDataset (labels: enhancement, help wanted, won't fix)
#327 opened Aug 10, 2024 by schopra8