Add Multi-GPU Support #62

Merged
LovelyBuggies merged 21 commits into main from codex/trun
Feb 17, 2026

Conversation

@LovelyBuggies

No description provided.

LovelyBuggies commented Feb 27, 2026

Version 1.3.7 Results

Tests on TLDR

python train_magrpo.py --config configs/magrpo_tldr_config.yaml --override magrpo.parallel_training=none wandb.project=homo_tldr wandb.name=magrpo_tldr_1p7b_1p7b_1gpu

CUDA_VISIBLE_DEVICES=0,1 python train_magrpo.py --config configs/magrpo_tldr_config.yaml --override magrpo.parallel_training=mp magrpo.agent_devices='[\"cuda:0\",\"cuda:1\"]' wandb.project=homo_tldr wandb.name=magrpo_tldr_1p7b_1p7b_2gpu

python train_maac.py --config configs/maac_tldr_config.yaml --override maac.parallel_training=None wandb.project=homo_tldr wandb.name=maac_tldr_1p7b_1p7b_1gpu

CUDA_VISIBLE_DEVICES=0,1 python train_maac.py --config configs/maac_tldr_config.yaml --override maac.parallel_training=mp maac.agent_devices='[\"cuda:0\",\"cuda:1\"]' maac.critic_devices='[\"cuda:0\"]' wandb.project=homo_tldr wandb.name=maac_tldr_1p7b_1p7b_2gpu

python train_iac.py --config configs/iac_tldr_config.yaml --override iac.parallel_training=none wandb.project=homo_tldr wandb.name=iac_tldr_1p7b_1p7b_1gpu

CUDA_VISIBLE_DEVICES=0,1 python train_iac.py --config configs/iac_tldr_config.yaml --override iac.parallel_training=mp iac.agent_devices='[\"cuda:0\",\"cuda:1\"]' iac.critic_devices='[\"cuda:0\",\"cuda:1\"]' wandb.project=homo_tldr wandb.name=iac_tldr_1p7b_1p7b_2gpu
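
The commands above pass settings as dotted `key=value` pairs after `--override`. As a rough illustration of how such pairs can be folded into a nested config, here is a minimal sketch. The names `parse_value` and `apply_overrides` and the parsing rules (JSON where possible, `none`/`None` mapped to `None`, otherwise a plain string) are assumptions made for illustration; the repository's actual parser is not shown in this thread and may behave differently.

```python
import json


def parse_value(raw):
    """Hypothetical rule: 'none'/'null' (any case) -> None, valid JSON -> parsed, else raw string."""
    if raw.lower() in ("none", "null"):
        return None
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return raw


def apply_overrides(config, overrides):
    """Apply dotted key=value overrides to a nested dict in place and return it."""
    for item in overrides:
        key, _, raw = item.partition("=")
        *parents, leaf = key.split(".")
        node = config
        for part in parents:
            # Create intermediate sections that the base config lacks.
            node = node.setdefault(part, {})
        node[leaf] = parse_value(raw)
    return config


# Arguments as they would arrive in sys.argv after the shell strips the outer quotes.
cfg = apply_overrides(
    {"magrpo": {"parallel_training": "none"}},
    [
        "magrpo.parallel_training=mp",
        'magrpo.agent_devices=["cuda:0","cuda:1"]',
        "wandb.project=homo_tldr",
    ],
)
print(cfg)
```

Under these assumed rules, `parallel_training=None` and `parallel_training=none` would be equivalent, which is one way the mixed spellings in the commands could both work.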

Tests on CHE

python train_magrpo.py --config configs/magrpo_che_config.yaml --override magrpo.parallel_training=None agent_model.name=None agents='[\"Qwen/Qwen2.5-Coder-3B\",\"Qwen/Qwen3-4B-Instruct-2507\"]' magrpo.agent_devices='[\"cuda:0\",\"cuda:1\"]' critic_model.name=None critics=None wandb.project=hetero_che wandb.name=magrpo_che_3b_4b_1gpu

CUDA_VISIBLE_DEVICES=0,1 python train_magrpo.py --config configs/magrpo_che_config.yaml --override magrpo.parallel_training=mp agent_model.name=None agents='[\"Qwen/Qwen2.5-Coder-3B\",\"Qwen/Qwen3-4B-Instruct-2507\"]' magrpo.agent_devices='[\"cuda:0\",\"cuda:1\"]' critic_model.name=None critics=None wandb.project=hetero_che wandb.name=magrpo_che_3b_4b_2gpu

python train_maac.py --config configs/maac_che_config.yaml --override maac.parallel_training=none agent_model.name=None agents='[\"Qwen/Qwen2.5-Coder-3B\",\"Qwen/Qwen3-4B-Instruct-2507\"]' critic_model.name=None critics='[\"Qwen/Qwen2.5-Coder-3B\"]' wandb.project=hetero_che wandb.name=maac_che_3b_4b_1gpu

CUDA_VISIBLE_DEVICES=0,1 python train_maac.py --config configs/maac_che_config.yaml --override maac.parallel_training=mp agent_model.name=None agents='[\"Qwen/Qwen2.5-Coder-3B\",\"Qwen/Qwen3-4B-Instruct-2507\"]' maac.agent_devices='[\"cuda:0\",\"cuda:1\"]' maac.critic_devices='[\"cuda:0\"]' critic_model.name=None critics='[\"Qwen/Qwen2.5-Coder-3B\"]' wandb.project=hetero_che wandb.name=maac_che_3b_4b_2gpu

python train_iac.py --config configs/iac_che_config.yaml --override iac.use_separate_critic=false iac.parallel_training=none agent_model.name=None agents='[\"Qwen/Qwen2.5-Coder-3B\",\"Qwen/Qwen3-4B-Instruct-2507\"]' critic_model.name=None critics=None wandb.project=hetero_che wandb.name=iac_che_3b_4b_share_1gpu

CUDA_VISIBLE_DEVICES=0,1 python train_iac.py --config configs/iac_che_config.yaml --override iac.use_separate_critic=false iac.parallel_training=mp agent_model.name=None agents='[\"Qwen/Qwen2.5-Coder-3B\",\"Qwen/Qwen3-4B-Instruct-2507\"]' iac.agent_devices='[\"cuda:0\",\"cuda:1\"]' critic_model.name=None critics=None wandb.project=hetero_che wandb.name=iac_che_3b_4b_share_2gpu
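
The list-valued overrides above are wrapped in single quotes while the inner double quotes are also backslash-escaped. If a command is pasted directly into a POSIX shell, backslashes inside single quotes are kept literally, so the script receives them as part of the argument. The check below uses Python's `shlex` to show the tokenization; whether the training scripts tolerate the extra backslashes depends on their override parser, which this thread does not show.

```python
import shlex

# One override as written in the commands above (backslash-escaped quotes inside single quotes).
escaped = r"""magrpo.agent_devices='[\"cuda:0\",\"cuda:1\"]'"""
# The same override without the backslashes.
plain = """magrpo.agent_devices='["cuda:0","cuda:1"]'"""

# shlex.split follows POSIX quoting rules: inside single quotes nothing is an escape.
escaped_arg = shlex.split(escaped)[0]
plain_arg = shlex.split(plain)[0]

print(escaped_arg)  # backslashes survive into the argument
print(plain_arg)    # a plain JSON-style list after the '='
```

If the commands are instead embedded in a double-quoted string (for example, inside a launcher script), the `\"` escapes are needed there and are consumed by that outer layer before the script sees the argument.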

Tests on Minecraft

python house_build/train/train_magrpo.py --config house_build/configs/house_build_magrpo_config.yaml --override agents='[\"Qwen/Qwen2.5-3B-Instruct\",\"Qwen/Qwen3-4B-Instruct-2507\"]' magrpo.parallel_training=None agent_model.name=None critics=None critic_model.name=None wandb.project=hetero-mc wandb.name='magrpo_house_3B_4B_1gpu'

python house_build/train/train_maac.py --config house_build/configs/house_build_maac_config.yaml --override agents='[\"Qwen/Qwen2.5-3B-Instruct\",\"Qwen/Qwen3-4B-Instruct-2507\"]' maac.parallel_training=None agent_model.name=None critics='[\"Qwen/Qwen3-4B-Instruct-2507\"]' critic_model.name=None wandb.project=hetero-mc wandb.name='maac_house_3B_4B_1gpu'
