First off, thanks for open-sourcing this amazing work!
I'm looking to scale `run_ce.sh` training across multiple nodes because my 4xA100 node runs out of memory.
Any pointers on how to get implicit PRM training with `run_ce.sh` to work on a multi-node setup?
Are the examples in the OpenRLHF documentation immediately applicable?
Any insights appreciated!
Best