Torque
Revision as of 13:34, 26 September 2019
A hub for notes about torque.
Installing torque
- (references: https://www.webmo.net/support/torque.html, http://docs.adaptivecomputing.com/torque/6-0-1/help.htm#topics/hpcSuiteInstall/manual/1-installing/installingTorque.htm)
- stop and disable the local firewalld (systemctl stop firewalld and systemctl disable firewalld)
- download the tarball from Adaptive Computing (v6.0.0.1), unpack it, and cd into the source directory
- ./configure && make && make install
- check that /var/spool/torque/server_name contains the hostname of the torque server
- add trqauthd to system startup and start
- ./torque.setup root
- set up /var/spool/torque/server_priv/nodes # see http://docs.adaptivecomputing.com/torque/6-0-1/help.htm#topics/torque/1-installConfig/specifyComputeNodes.htm
- add pbs_server to system startup and start
- add pbs_sched to system startup and start
- build the client packages and install # both mom and clients as per the webmo link
- qmgr -c "set server scheduling = True"
- qmgr -c "set queue batch resources_max.walltime = 1000:00:00" # should this be "set server" also?
- test with qsub -I, qsub --version, and pbsnodes -a (every compute node in the last command's output should be up, i.e. state = free rather than down or offline)
- reboot server and test again
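The pbsnodes -a check in the test step can be scripted so it is easy to repeat after the reboot. This is only a sketch: unhealthy_nodes is a made-up helper name, and it assumes the usual pbsnodes -a layout (an unindented node name followed by indented "attribute = value" lines).

```shell
#!/bin/sh
# Hypothetical helper (not part of torque): read `pbsnodes -a` style output
# on stdin and print every node whose state mentions down, offline or unknown.
unhealthy_nodes() {
    awk '
        /^[^ \t]/ { node = $1 }              # unindented line: start of a node record
        $1 == "state" && $2 == "=" {         # indented "state = ..." attribute line
            if ($3 ~ /down|offline|unknown/) print node, $3
        }
    '
}
```

Run it as pbsnodes -a | unhealthy_nodes; no output means no node reported one of those states.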
Uninstalling Torque from modules
Torque should always be a system program, not a module. These are the steps for pollock, as done by Charlie. Substitute appropriate server names, and note that different systems have different ways of configuring software to launch on system start.
- uninstall torque from modules
  - make a tarball of /mounts/pollock/software/torque and save it to the root or sysadmin directory (a backup makes it easy to roll back if something goes wrong)
  - rm -rf /mounts/pollock/software/Modules/MODVER/modulefiles/torque
  - rm -rf /mounts/pollock/software/torque
  - check with "module avail" (torque should no longer be listed)
- uninstall boost from modules
  - as above with s/torque/boost/
- install torque dependencies
  - yum install libxml2-devel openssl-devel gcc gcc-c++ boost boost-devel
  - (aside: install locate with yum install mlocate, then run updatedb as root)
  - (aside: fix the modules install by cleaning up /etc/bashrc, profile, and profile.d/modules.sh; this fixed the MANPATH problem too)
  - (aside: fix the python 3.5 install: s/bronte/pollock/ in setup files, change to the correct version (3.6), reset paths)
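Since the torque and boost removals are the same steps with a different name, they can be wrapped in one function. A minimal sketch, not what was actually run on pollock: remove_from_modules is a made-up name, the backup filename is an assumption, and SOFTWARE_ROOT, MODVER and BACKUP_DIR are placeholders to set for your system (MODVER is left as the literal placeholder from the steps above).

```shell
#!/bin/sh
# Placeholders: override these before calling the function.
SOFTWARE_ROOT=${SOFTWARE_ROOT:-/mounts/pollock/software}
MODVER=${MODVER:-MODVER}           # Modules version directory, not filled in here
BACKUP_DIR=${BACKUP_DIR:-/root}

# Hypothetical helper: back up a module-managed package, then remove its
# modulefile directory and its install tree.
remove_from_modules() {
    pkg=$1
    [ -n "$pkg" ] || { echo "usage: remove_from_modules PACKAGE" >&2; return 2; }
    # Archive first; bail out if the backup fails so nothing is deleted without one.
    tar -czf "$BACKUP_DIR/$pkg-backup.tar.gz" -C "$SOFTWARE_ROOT" "$pkg" || return 1
    rm -rf "$SOFTWARE_ROOT/Modules/$MODVER/modulefiles/$pkg"
    rm -rf "$SOFTWARE_ROOT/$pkg"
}
```

With the variables set, remove_from_modules torque followed by remove_from_modules boost covers both uninstall steps; confirm with module avail afterwards.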