How to get access to DDP functions before initializing trainer? #20001

seermer · 2024-06-20T21:37:33Z

I am currently writing a pipeline that runs the following steps: update settings (e.g., add a FileHandler to the command-line logging) -> initialize the config from a user input file -> dynamically set some parameters accordingly (e.g., set a rank-specific seed) -> initialize the Lightning Trainer and LightningModule and start training. The steps must run in this order, because many of them depend on earlier ones, and the processes must be synchronized across GPUs after each step. Otherwise, the results of operations that run only on rank 0 (via the rank_zero_only decorator) may not be ready yet when the other ranks need them. How can I achieve this?
All the documentation I can find uses trainer.global_rank to get the global rank and trainer.strategy.barrier to synchronize at a given line, and I cannot find examples that perform communication (e.g., gather, broadcast) outside a LightningModule. What should I use before the Trainer and LightningModule are initialized?
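For illustration, here is a minimal sketch of one possible approach using plain torch.distributed rather than any Lightning API. It assumes the script is started with torchrun (or a similar launcher), so that every process already exists before the Trainer is created and the RANK, WORLD_SIZE, MASTER_ADDR and MASTER_PORT environment variables are set. The helper names (global_rank, rank_specific_seed, init_distributed, barrier, broadcast_from_rank_zero) are hypothetical, and the "gloo" backend is an assumption chosen only so the sketch also runs on CPU. How the Trainer later behaves when a process group already exists is not covered here and should be checked against the Lightning documentation.

```python
import os


def global_rank() -> int:
    """Global rank read from the RANK variable (0 when it is not set)."""
    return int(os.environ.get("RANK", "0"))


def world_size() -> int:
    """Number of processes read from WORLD_SIZE (1 when it is not set)."""
    return int(os.environ.get("WORLD_SIZE", "1"))


def rank_specific_seed(base_seed: int) -> int:
    """Hypothetical seeding rule: offset the base seed by the global rank."""
    return base_seed + global_rank()


def init_distributed(backend: str = "gloo") -> bool:
    """Join the default process group; returns False in a single-process run."""
    if world_size() == 1:
        return False
    # Imported lazily so single-process runs do not need torch.
    import torch.distributed as dist

    if not dist.is_initialized():
        # The default env:// init reads MASTER_ADDR, MASTER_PORT, RANK, WORLD_SIZE.
        dist.init_process_group(backend=backend)
    return True


def barrier() -> None:
    """Block until every rank reaches this line (no-op with one process)."""
    if world_size() > 1:
        import torch.distributed as dist

        dist.barrier()


def broadcast_from_rank_zero(obj):
    """Return rank 0's picklable object on every rank."""
    if world_size() == 1:
        return obj
    import torch.distributed as dist

    holder = [obj if global_rank() == 0 else None]
    dist.broadcast_object_list(holder, src=0)
    return holder[0]
```

Under these assumptions, the pipeline could call init_distributed() first, run each rank-0-only step guarded by `if global_rank() == 0:`, call barrier() after it, share the loaded config with broadcast_from_rank_zero(config), and only then construct the Trainer and LightningModule.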

Replies: 0 comments
