Packing & training on 4k cubes from Zarija with zoom=8
Packing has 2 steps:
salloc -C cpu -q interactive -t4:00:00 -A m3363 -N 1
module load pytorch
A– harvest LR+HR 4k flux cubes from Zarija, 1:1 content
Source: /pscratch/sd/z/zarija/MLHydro/
Destination: ls -lh /pscratch/sd/b/balewski/tmp_Nyx2023-flux/twopack_4Kcubes
Execute:
sim_Nyx4K_Zarija> time ./make_twopack_Nyx4K.py
B– slice each cube into smaller 2D planes: 128 → 1k (zoom 8)
Source : /pscratch/sd/b/balewski/tmp_Nyx2023-flux/twopack_4Kcubes
Destination : /pscratch/sd/b/balewski/tmp_Nyx2023-flux/data
and permanent: /global/cfs/cdirs/m3363/balewski/superRes-???
Execute :
sim_Nyx4K_Zarija> time ./format_twopack_2_srgan2D_1lr8hr.py
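A minimal sketch of what step B could look like. This is not the contents of format_twopack_2_srgan2D_1lr8hr.py. It assumes the LR and HR cubes are NumPy arrays covering the same volume, with the HR cube `zoom` times larger along every axis, and it reads "1LR8HR" as one LR plane paired with the 8 HR planes spanning the same depth. The function name iter_lr_hr_tiles and its parameters are hypothetical.

```python
import numpy as np

def iter_lr_hr_tiles(lr_cube, hr_cube, iz, lr_size=128, zoom=8):
    """Yield matching (lr_tile, hr_tiles) pairs cut from LR plane `iz`.

    lr_tile  has shape (lr_size, lr_size)
    hr_tiles has shape (zoom, lr_size*zoom, lr_size*zoom)
    Assumption: hr_cube is exactly `zoom` times larger than lr_cube per axis.
    """
    hr_size = lr_size * zoom
    if hr_cube.shape != tuple(s * zoom for s in lr_cube.shape):
        raise ValueError("HR cube must be zoom x LR cube along every axis")
    n_x = lr_cube.shape[1] // lr_size  # number of tiles along the 2nd axis
    n_y = lr_cube.shape[2] // lr_size  # number of tiles along the 3rd axis
    for ix in range(n_x):
        for iy in range(n_y):
            lr = lr_cube[iz,
                         ix * lr_size:(ix + 1) * lr_size,
                         iy * lr_size:(iy + 1) * lr_size]
            hr = hr_cube[iz * zoom:(iz + 1) * zoom,
                         ix * hr_size:(ix + 1) * hr_size,
                         iy * hr_size:(iy + 1) * hr_size]
            yield lr, hr

# Tiny stand-in cubes (zoom=2, tile=2) so the sketch runs without 4k data.
lr_demo = np.zeros((2, 4, 4), dtype=np.float32)
hr_demo = np.zeros((4, 8, 8), dtype=np.float32)
demo_pairs = list(iter_lr_hr_tiles(lr_demo, hr_demo, iz=0, lr_size=2, zoom=2))
print(len(demo_pairs), demo_pairs[0][0].shape, demo_pairs[0][1].shape)
```

With the defaults (lr_size=128, zoom=8) each LR tile is 128x128 and each HR tile is 1024x1024, matching the "128 → 1k" note above.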
salloc -C gpu -q interactive -t4:00:00 --gpus-per-task=1 --image=nersc/pytorch:ngc-21.08-v2 -A m3363_g --ntasks-per-node=4 -N 1
shifter ./test_dataloader.py -g 1 -v2 --design benchmk_flux4k --dataName flux_L80_s12-1LR8HR-Nyx4k-r2c32 --numSamp 300
==========
ML - testing components
export MASTER_ADDR=`hostname`
srun -n 1 shifter python -u ./train_dist.py -v2 --numGlobSamp 256 --expName exp2 --dataName flux_L80_s12-1LR8HR-Nyx4k-r2c32 --basePath /pscratch/sd/b/balewski/tmp_Nyx2022a-flux/jobs/inter --epochs 4 --design benchmk_flux4k
export SLURM_ARRAY_JOB_ID=556
export SLURM_ARRAY_TASK_ID=49
./batchShifter.slr
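The two exports above set by hand the variables Slurm normally provides for array jobs, so the batch script can be run interactively. A minimal sketch of how a script might read them, assuming the experiment name is built as JOBID_TASKID. That naming is only suggested by exp=556_4 further down and is not confirmed by this README. The helper exp_name_from_env is hypothetical.

```python
import os

def exp_name_from_env(env=None):
    """Build an experiment name from Slurm array-job variables.

    Assumption: the name is '<SLURM_ARRAY_JOB_ID>_<SLURM_ARRAY_TASK_ID>'.
    """
    env = os.environ if env is None else env
    try:
        job_id = env["SLURM_ARRAY_JOB_ID"]
        task_id = env["SLURM_ARRAY_TASK_ID"]
    except KeyError as err:
        # Outside a Slurm array job these must be exported by hand.
        raise RuntimeError(f"missing Slurm variable: {err.args[0]}") from err
    return f"{job_id}_{task_id}"

# Same values as the manual exports above, passed as a plain dict.
print(exp_name_from_env({"SLURM_ARRAY_JOB_ID": "556",
                         "SLURM_ARRAY_TASK_ID": "49"}))
```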
Predictions need to be run on a separate node, otherwise the task is killed (not enough RAM in the cgroup?)
exp=556_4
sol=last
dat=/pscratch/sd/b/balewski/tmp_Nyx2022a-flux/jobs
./predict.py --basePath . --expName .
./ana_sr2d.py --expName $exp --genSol $sol --dataPath $dat -p a b d c
./ana_flux.py --expName $exp --genSol $sol --dataPath $dat -p abc