This is the official repository of the NeurIPS 2023 paper Game Solving with Online Fine-Tuning.
If you use this work for research, please consider citing our paper as follows:
```bibtex
@inproceedings{
    wu2023game,
    title={Game Solving with Online Fine-Tuning},
    author={Wu, Ti-Rong and Guei, Hung and Wei, Ting Han and Shih, Chung-Chin and Chin, Jui-Te and Wu, I-Chen},
    booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
    year={2023},
    url={https://openreview.net/forum?id=hN4qpvGzWn}
}
```
The following instructions are prepared for reproducing the main experiments (baseline, online-cp, online-sp, online-sp+cp) in the NeurIPS paper.
The game solver program requires a Linux operating system and at least one NVIDIA GPU to operate.
Nevertheless, to reproduce the main experiments in the NeurIPS paper, it is assumed that there are three machines, HOST1, HOST2, and HOST3, each of which meets the following requirements:
- At least 16 CPU threads and 192 GB of RAM.
- Four NVIDIA GPU cards (GTX 1080 Ti or above).
- Properly installed NVIDIA drivers, `podman` (or `docker`), and `tmux`.
Note that these are not strict requirements of the program.
You may run the solver on a single host with only one GPU installed; simply use the same machine for HOST1, HOST2, and HOST3 in the instructions below.
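For a single-host setup, the three hostnames can simply coincide. A minimal sketch (`myhost` is a placeholder; substitute your machine's actual hostname):

```shell
# Single-host sketch: point all three roles at the same machine.
# "myhost" is a placeholder, not a name used by the repository.
HOST1=myhost
HOST2=myhost
HOST3=myhost
echo "$HOST1 $HOST2 $HOST3"
```

Wherever the instructions below mention HOST1, HOST2, or HOST3, use that single hostname.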
Clone this repository with the required submodules:
```shell
git clone --recursive --branch neurips2023 [email protected]:rlglab/online-fine-tuning-solver.git
cd online-fine-tuning-solver
```

Enter the container to build the required executables:

```shell
scripts/start-container.sh
# run the build commands below inside the container
scripts/build.sh killallgo # build the main game solver program
cd chat && make && cd .. # build the message service for distributed computing
exit # exit the container
```

Now we will set up three computing nodes for the distributed game solver.
In the instructions below, remember to change HOST1, HOST2, and HOST3 to your actual machine host names.
Launch a new tmux window and start the chat service on HOST1:
```shell
scripts/chat/start-service.sh 8888 # use port 8888
```

Hint: Once the script has started successfully, it outputs a single line showing the executed command. Just detach the relevant `tmux` window and let it run in the background.
Deploy the distributed computing nodes on HOST2 and HOST3:
```shell
# run this script on both HOST2 and HOST3
scripts/setup-solver-tmux-session.sh CHAT=HOST1:8888
```

Hint: The script launches a new `tmux` session named `solver`. Simply detach it as before.
Configure the required workers for either the baseline or the online fine-tuning solver on HOST1 as follows:
```shell
export CHAT=HOST1:8888
# run this for the baseline solver:
yes | scripts/run-workers.sh killallgo "8B" HOST2 HOST3
# or, run this for the online fine-tuning solver:
yes | scripts/run-workers.sh killallgo "8O" HOST2 HOST3
```

Hint: After a successful configuration, the script prints the broker name required for subsequent steps. For example, `b2 has been configured ...` indicates a broker named `b2`.
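Since later steps need this broker name, it can also be captured programmatically. A sketch that assumes the documented output format (`b2 has been configured ...`, with the broker name as the first word of the line):

```shell
# Sketch: parse the broker name from a saved run-workers.sh output line.
# The example line below mirrors the documented format; in practice you
# would capture the script's actual output instead.
line="b2 has been configured ..."
BROKER=$(printf '%s\n' "$line" | awk '{print $1}')
echo "$BROKER"
```

The extracted value can then be passed as `BROKER=$BROKER` in the commands below.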
Finally, start the container to solve opening problems on HOST1 as follows.

```shell
scripts/start-container.sh
# run the commands below inside the container
# to solve JA using the baseline solver (remember to set CHAT and BROKER accordingly)
scripts/run-manager.sh killallgo JA CHAT=HOST1:8888 BROKER=b2 CUDA_VISIBLE_DEVICES=0 TIMEOUT=173000 actor_num_simulation=10000000 use_online_fine_tuning=false
# to solve JA using the online fine-tuning solver (online-cp)
scripts/run-manager.sh killallgo JA CHAT=HOST1:8888 BROKER=b2 CUDA_VISIBLE_DEVICES=0 TIMEOUT=173000 INFO=cp actor_num_simulation=10000000 use_online_fine_tuning=true use_critical_positions=true use_solved_positions=false
# to run the online-sp or online-sp+cp solver, set INFO, use_critical_positions, and use_solved_positions accordingly
```

Hint: To solve an opening using the baseline (or online fine-tuning) solver, the computing nodes must first be configured by running `run-workers.sh` with `8B` (or `8O`) before running `run-manager.sh`.
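For reference, a plausible reading of the flag combinations for the three online variants. Only the online-cp line is taken from the commands above; the `INFO` labels and flag values for online-sp and online-sp+cp are assumptions inferred from the flag names, so verify them against the paper's settings before use:

```shell
# online-cp    : INFO=cp    use_critical_positions=true   use_solved_positions=false
# online-sp    : INFO=sp    use_critical_positions=false  use_solved_positions=true   (assumed)
# online-sp+cp : INFO=sp+cp use_critical_positions=true   use_solved_positions=true   (assumed)
```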
Results are stored after a successful run. For example, the results of solving JA with the online fine-tuning solver (where `$NAME` is `JA-online-10000000-16.5-gpcn_n32_m16_empty_op-100000-7g384w+1sp1op-cp`):
- `$NAME.log` stores the log output by the manager.
- `$NAME.sgf` stores the search tree. Use GoGui to open the file.
- `$NAME.stat` stores the main result and statistics. Use `scripts/extract-main-stat.sh $NAME.stat` to extract the fields shown in the paper.
- `$NAME.jobs` stores all jobs generated by the manager.
- `training/online_$NAME` stores the network models generated during the online fine-tuning.
- Development Environment covers the basics of developing with containers and building programs.
- Train PCN Models for the Solver introduces the training of Proof Cost Networks.
- Launch the Standalone Solver explains how to solve problems using the worker.
- Launch the Distributed Solver explains how to solve problems using the manager, the workers, and the learner.
- Tools describes the usage of miscellaneous tools related to the game solvers.