Issues: Lightning-AI/pytorch-lightning
Issues list
lr_scheduler does not work when "interval": "step"
bug
Something isn't working
needs triage
Waiting to be triaged by maintainers
ver: 2.4.x
#20436
opened Nov 21, 2024 by
lucl13
Multi-gpu training with slurm times out
bug
Something isn't working
needs triage
Waiting to be triaged by maintainers
ver: 2.3.x
#20434
opened Nov 19, 2024 by
nightingal3
Make save_hyperparameters consistent for CLI and hardcoded training for custom python objects
feature
Is an improvement or enhancement
needs triage
Waiting to be triaged by maintainers
#20432
opened Nov 19, 2024 by
cgebbe
When interrupting a run with Ctrl+C, sometimes the WandbLogger does not upload a checkpoint artifact
bug
Something isn't working
needs triage
Waiting to be triaged by maintainers
ver: 2.4.x
#20425
opened Nov 16, 2024 by
edmcman
Why only one GPU is getting used in the kaggle kernel
waiting on author
Waiting on user action, correction, or update
#20424
opened Nov 16, 2024 by
KeesariVigneshwarReddy
Weird error while training a model with tabular data!!!! Some problem related self.log_dict
bug
Something isn't working
ver: 2.4.x
#20423
opened Nov 16, 2024 by
KeesariVigneshwarReddy
PyTorch Lightning uses deprecated torch_xla function, causing compatibility issues with latest torch_xla versions
3rd party
Related to a 3rd-party
ver: 2.5.x
#20419
opened Nov 13, 2024 by
aahila-aws
Log default metrics
feature
Is an improvement or enhancement
logger
Related to the Loggers
#20418
opened Nov 13, 2024 by
ierezell
seed_everything(..., workers=True) causes the Dataloader to apply exactly the same augmentations each epoch if they sample values from torch.distributions
bug
#20412
opened Nov 12, 2024 by
nan-dre
update dataset at "on_train_epoch_start", but "training_step" still get old data
bug
Something isn't working
loops
Related to the Loop API
waiting on author
Waiting on user action, correction, or update
#20407
opened Nov 8, 2024 by
Yak1m4Sg
FSDP full state dict mangles fsspec path
bug
Something isn't working
ver: 2.4.x
ver: 2.5.x
#20406
opened Nov 8, 2024 by
oceanusxiv
How to deal with uneven inputs in DDP with sharded data without hanging
discussion
In a discussion stage
#20404
opened Nov 7, 2024 by
ssharpe42
Proposal(CLI): after_instantiate_classes hook
design
Includes a design discussion
feature
Is an improvement or enhancement
lightningcli
pl.cli.LightningCLI
#20400
opened Nov 6, 2024 by
AlessandroW
PytorchStreamReader failed reading zip archive: not a ZIP archive
bug
Something isn't working
checkpointing
Related to checkpointing
strategy: deepspeed
ver: 2.4.x
#20398
opened Nov 6, 2024 by
Crazy-LittleBoy
put the monitor metric into default filename for ModelCheckpoint
feature
Is an improvement or enhancement
#20397
opened Nov 5, 2024 by
VDFaller
Light / dark mode for documentation
bug
Something isn't working
docs
Documentation related
ver: 2.5.x
#20396
opened Nov 5, 2024 by
nbrosse
Gradient checkpointing and ddp do not work together
bug
Something isn't working
repro needed
The issue is missing a reproducible example
ver: 2.4.x
#20395
opened Nov 4, 2024 by
rubenweitzman
Error if SLURM_NTASKS != SLURM_NTASKS_PER_NODE
ver: 2.4.x
working as intended
Working as intended
#20391
opened Nov 4, 2024 by
guarin
Major performance degradation when multiple metrics/losses
bug
Something isn't working
ver: 2.4.x
ver: 2.5.x
#20388
opened Nov 3, 2024 by
EtayLivne
FSDP with HYBRID_SHARD loss doesn't improve with more nodes
repro needed
The issue is missing a reproducible example
ver: 2.4.x
#20385
opened Nov 2, 2024 by
zaptrem
Custom TQDMProgressBar changes not reflected
bug
Something isn't working
needs triage
Waiting to be triaged by maintainers
ver: 2.4.x
#20384
opened Nov 1, 2024 by
oseymour
Optimize fit_loop() to reduce train_dataloader()'s memory footprint
feature
Is an improvement or enhancement
repro needed
The issue is missing a reproducible example
#20382
opened Nov 1, 2024 by
guillaume-rochette-oxb
Fabric and FFCV?
repro needed
The issue is missing a reproducible example
#20380
opened Nov 1, 2024 by
richardrl
Deepspeed ZERO MiCS support
feature
Is an improvement or enhancement
strategy: deepspeed
waiting on author
Waiting on user action, correction, or update
#20378
opened Oct 31, 2024 by
hehepig4
FSDP checkpoint loading fails
bug
Something isn't working
checkpointing
Related to checkpointing
strategy: fsdp
Fully Sharded Data Parallel
ver: 2.4.x
waiting on author
Waiting on user action, correction, or update
#20373
opened Oct 29, 2024 by
Nilabhra