Difference between revisions of "Torque"
Latest revision as of 07:54, 3 June 2020
If you're a sysadmin, you want the information about the Slurm scheduler instead of Torque. This page is kept only for reference and to satisfy idle curiosity about how we did this in days of yore.
Archival Torque notes
Installing Torque
- (references: https://www.webmo.net/support/torque.html, http://docs.adaptivecomputing.com/torque/6-0-1/help.htm#topics/hpcSuiteInstall/manual/1-installing/installingTorque.htm)
- stop and disable the local firewalld (systemctl stop firewalld and systemctl disable firewalld)
- download the tarball from Adaptive Computing (v6.0.0.1), unpack it, and cd into the source directory
- ./configure && make && make install
- check file contents of /var/spool/torque/server_name
- add trqauthd to system startup and start
- ./torque.setup root
- set up /var/spool/torque/server_priv/nodes # see http://docs.adaptivecomputing.com/torque/6-0-1/help.htm#topics/torque/1-installConfig/specifyComputeNodes.htm
- add pbs_server to system startup and start
- add pbs_sched to system startup and start
- build and install the client packages # both mom and clients, as per the webmo link
- qmgr -c "set server scheduling = True"
- qmgr -c "set queue batch resources_max.walltime = 1000:00:00" # should this be "set server" also?
- test with qsub -I, qsub --version, and pbsnodes -a (all compute nodes in the last command should be listed as "up")
- reboot server and test again
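The server-side steps above can be sketched as a small dry-run script. This is only an illustration, not the procedure as it was actually scripted: the helper names (run, torque_nodes_lines, enable_torque_daemons, configure_batch_queue) and the host names in the usage note are hypothetical, and it assumes systemd units named trqauthd, pbs_server, and pbs_sched have already been installed. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only. With DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Print a command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Emit one nodes-file line per host in the form "<hostname> np=<cores>".
# Usage: torque_nodes_lines CORES HOST [HOST...]
torque_nodes_lines() {
    np="$1"
    shift
    for host in "$@"; do
        printf '%s np=%s\n' "$host" "$np"
    done
}

# Enable and start the server-side daemons.
# Note: in the notes, ./torque.setup and the nodes file come between
# trqauthd and pbs_server; this helper only shows the service commands.
enable_torque_daemons() {
    for svc in trqauthd pbs_server pbs_sched; do
        run systemctl enable "$svc"
        run systemctl start "$svc"
    done
}

# Apply the two qmgr settings from the notes.
configure_batch_queue() {
    run qmgr -c "set server scheduling = True"
    run qmgr -c "set queue batch resources_max.walltime = 1000:00:00"
}
```

For example, torque_nodes_lines 4 node01 node02 prints two lines that could be reviewed and then written to /var/spool/torque/server_priv/nodes. The host names and the core count are placeholders for whatever the cluster actually has.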
Uninstalling Torque from modules
Torque should always be a system program, not a module. These are the steps for pollock, as done by Charlie. Substitute appropriate server names, and note that different systems have different ways of configuring software to launch on system start.
- uninstall torque from modules
- Make a tarball of /mounts/pollock/software/torque and save it to the root or sysadmin directory (this will save you problems)
- rm -rf /mounts/pollock/software/Modules/MODVER/modulefiles/torque
- rm -rf /mounts/pollock/software/torque
- check with "module avail"
- uninstall boost from modules
- as above with s/torque/boost/
- install torque dependencies
- yum install libxml2-devel openssl-devel gcc gcc-c++ boost boost-devel
- (aside: install locate with yum install mlocate, then run updatedb as root)
- (aside: fix the modules install; clean up /etc/bashrc, profile, and profile.d/modules.sh; this fixed the MANPATH problem too)
- (aside: fix the Python 3.5 install; s/bronte/pollock/ in setup files, change to the correct version (3.6), reset paths)
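The backup-then-remove steps for torque and boost follow the same pattern, so they can be sketched as one dry-run helper. This is a sketch under assumptions, not the commands as Charlie ran them: the function name remove_module_software, the backup file name, and /root as the backup directory are all hypothetical, and MODVER is left as the placeholder used in the notes. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch only. With DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
SOFTWARE_ROOT="${SOFTWARE_ROOT:-/mounts/pollock/software}"
MODVER="${MODVER:-MODVER}"          # placeholder from the notes; set to the real Modules version
BACKUP_DIR="${BACKUP_DIR:-/root}"   # assumption: "the root or sysadmin directory"

# Print a command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Back up a package's software directory, then remove its modulefile and the directory.
# Usage: remove_module_software NAME   (e.g. torque or boost)
remove_module_software() {
    name="$1"
    # Refuse an empty name so rm -rf can never target SOFTWARE_ROOT itself.
    if [ -z "$name" ]; then
        echo "usage: remove_module_software NAME" >&2
        return 1
    fi
    run tar -czf "$BACKUP_DIR/$name-backup.tar.gz" -C "$SOFTWARE_ROOT" "$name"
    run rm -rf "$SOFTWARE_ROOT/Modules/$MODVER/modulefiles/$name"
    run rm -rf "$SOFTWARE_ROOT/$name"
}
```

Calling remove_module_software torque and then remove_module_software boost prints the three commands for each package. Checking with module avail afterwards is still a manual step.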