Stable-Diffusion-for-Remote.../scripts/slurm
2023-05-06 17:04:03 +08:00
resume_512
resume_512_improvedaesthetic
resume_768_hr
v1_edgeinpainting
v1_iahr_torch111
v1_iahr_torch111_ucg
v1_improvedaesthetics
v1_improvedaesthetics_torch111
v1_inpainting_aesthetics-larger-masks
v1_inpainting_aesthetics-larger-masks-ucg
v1_inpainting_improvedaesthetics_torch111
v1_laionhr_torch111
v1-upscaling-f16-pretraining-512-aesthetics
v2_laionhr1024
v2_laionhr1024_2
v2_pretraining
v3_pretraining
README.md

Example

Resume training of the f8 model at resolution 512 on LAION-HR:

sbatch scripts/slurm/resume_512/sbatch.sh

Reuse

To reuse this as a template, copy sbatch.sh and launcher.sh to a location of your choice. In your copy of sbatch.sh, adjust the job name and node count in the lines

#SBATCH --job-name=stable-diffusion-512cont
#SBATCH --nodes=24

and update the path to your copy of launcher.sh in the last line,

srun bash /fsx/stable-diffusion/stable-diffusion/scripts/slurm/resume_512/launcher.sh
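
The copy-and-adjust steps above could be scripted roughly as follows. This is only a sketch: make_job and its four arguments are hypothetical names introduced here, and it assumes sbatch.sh contains exactly the three lines quoted above in that form. Pass the destination as an absolute path, since it is written into the srun line.

```shell
# Sketch only: make_job is a hypothetical helper, not part of this repository.
# Usage: make_job <template_dir> <absolute_dest_dir> <job_name> <nodes>
make_job() {
    src="$1"
    dst="$2"
    name="$3"
    nodes="$4"

    # Copy both template files into the destination directory.
    mkdir -p "$dst"
    cp "$src/sbatch.sh" "$src/launcher.sh" "$dst/"

    # Rewrite the job name, the node count, and the launcher path in the copy.
    # Assumes the values contain no '|' or '&' characters (they are special to sed).
    # -i.bak is used because it is accepted by both GNU and BSD sed.
    sed -i.bak \
        -e "s|^#SBATCH --job-name=.*|#SBATCH --job-name=${name}|" \
        -e "s|^#SBATCH --nodes=.*|#SBATCH --nodes=${nodes}|" \
        -e "s|^srun bash .*|srun bash ${dst}/launcher.sh|" \
        "$dst/sbatch.sh"
    rm -f "$dst/sbatch.sh.bak"
}
```

For example, `make_job scripts/slurm/resume_512 "$HOME/jobs/my_run" my-run 4` would leave an adjusted sbatch.sh and an unmodified launcher.sh in `$HOME/jobs/my_run` (an illustrative path).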

In launcher.sh, adjust CONFIG and EXTRA. Before launching the full job, consider a test run with the debug flags uncommented and a reduced number of nodes.
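
One way to do a reduced-node test run without editing sbatch.sh again is to override the node count on the command line, since options passed to sbatch take precedence over the #SBATCH lines in the script. The helper below is a sketch: submit_test_run, the DRY_RUN variable, and the job name sd-test-run are hypothetical names chosen for illustration. Uncommenting the debug flags in launcher.sh remains a manual step.

```shell
# Sketch only: submit_test_run and DRY_RUN are hypothetical, not part of this repository.
# Usage: submit_test_run <path/to/sbatch.sh> [nodes]   (nodes defaults to 1)
submit_test_run() {
    script="$1"
    nodes="${2:-1}"

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of submitting it.
        echo "sbatch --nodes=$nodes --job-name=sd-test-run $script"
    else
        # Command-line options override the #SBATCH directives in the script.
        sbatch --nodes="$nodes" --job-name=sd-test-run "$script"
    fi
}
```

Running `DRY_RUN=1 submit_test_run my_run/sbatch.sh 2` prints the sbatch command that would be submitted, which is a cheap way to check the arguments first.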