nemo-mbridge-multi-node-slurm
Convert single-node scripts to multi-node Slurm sbatch jobs and debug common multi-node failures. Covers srun-native vs. uv run torch.distributed approaches, container setup, NCCL timeouts, OOM sizing for MoE models, and interactive allocation.
Runtime dependencies
No special dependencies
Install command
Official: npx clawhub@latest install nemo-mbridge-multi-node-slurm
Mirror (accelerated): npx clawhub@latest install nemo-mbridge-multi-node-slurm --registry https://cn.longxiaskill.com