ray start --head --redis-port=54504 (run on master)
ray stop
ray start --address=152.71.172.184:54504 --load-code-from-local (run on each worker; the address is the master's IP and redis port)
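The head and worker commands above must agree on the same IP and port. A minimal sketch that keeps both in variables so they cannot drift apart; HEAD_IP, REDIS_PORT, head_cmd and worker_cmd are illustrative names, not part of the original notes, and the functions only print the commands:

```shell
#!/bin/sh
# Hypothetical variable names; the values are taken from the commands above.
HEAD_IP="152.71.172.184"
REDIS_PORT="54504"

# Print the command to run on the master (head) node.
head_cmd() {
    echo "ray start --head --redis-port=${REDIS_PORT}"
}

# Print the command to run on each worker node.
worker_cmd() {
    echo "ray start --address=${HEAD_IP}:${REDIS_PORT} --load-code-from-local"
}

head_cmd
worker_cmd
```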
sudo apt install sshpass (only required on master)
DO THIS FOR EACH NEW WORKER NODE
1. sudo apt-get install openssh-server openssh-client (install on workers)
2. Manually run scp NAS-0.1.tar.gz user@ip:/tmp once and accept the host key prompt (required only once per worker, so the host is saved as trusted and sshpass can connect non-interactively)
3. Install conda on node
4. pip install numpy psutil ray tensorflow==2.0.0-rc1 gym box2d-py
5. conda install pytorch torchvision cpuonly -c pytorch
6. (roboschool environments) pip install roboschool; sudo apt-get install libpcre3-dev libgl1-mesa-dev
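Before moving on, it may help to confirm that the tools from the steps above are actually on the node's PATH. A small sketch, assuming the tool list below (it is a guess based on the steps, not something the notes specify); missing_tools is a hypothetical helper name:

```shell
#!/bin/sh
# Print the name of every command in "$@" that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed tool list, derived from the setup steps above.
missing=$(missing_tools ssh scp conda pip python)
if [ -n "$missing" ]; then
    echo "missing on this node:" $missing
else
    echo "all tools found"
fi
```

This checks only that the commands exist, not that the Python packages from steps 4 to 6 import correctly.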
python setup.py sdist
sshpass -p "hpcisrg1" scp NAS-0.1.tar.gz hpc1@152.71.172.127:/tmp
sshpass -p "hpcisrg1" ssh hpc1@152.71.172.127 -t "bash -l -c -i 'pip install /tmp/NAS-0.1.tar.gz'"
sshpass -p "hpcisrg1" ssh hpc1@152.71.172.127 -t "bash -l -c -i 'ray stop'"
sshpass -p "hpcisrg1" ssh hpc1@152.71.172.127 -t "bash -l -c -i 'ray start --address=152.71.172.184:54504 --load-code-from-local'"
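The four sshpass commands above can be repeated for several workers with a loop. The following is a sketch under assumptions, not the original workflow: WORKERS, HEAD_ADDR, PKG and DRY_RUN are hypothetical names, and it uses sshpass -e (which reads the password from the SSHPASS environment variable) in place of -p so the password does not appear on the command line. By default it only prints the commands:

```shell
#!/bin/sh
# Hypothetical settings; defaults mirror the values used in the notes above.
WORKERS="${WORKERS:-hpc1@152.71.172.127}"       # space-separated user@ip list
HEAD_ADDR="${HEAD_ADDR:-152.71.172.184:54504}"  # master's IP and redis port
PKG="${PKG:-NAS-0.1.tar.gz}"
DRY_RUN="${DRY_RUN:-1}"                         # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run one command on a worker inside a login, interactive bash, as the notes do.
remote() {
    host=$1
    shift
    run sshpass -e ssh "$host" -t "bash -l -c -i '$*'"
}

# Copy the package to every worker, install it, and restart Ray there.
deploy() {
    for host in $WORKERS; do
        run sshpass -e scp "$PKG" "$host:/tmp"
        remote "$host" "pip install /tmp/$PKG"
        remote "$host" "ray stop"
        remote "$host" "ray start --address=$HEAD_ADDR --load-code-from-local"
    done
}

deploy
```

To execute for real, export SSHPASS with the worker password and set DRY_RUN=0. The dry-run output drops the outer double quotes around the remote command, so it is meant for inspection rather than copy and paste.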