Difference between revisions of "Cluster: New BobSCEd Install Log"
Revision as of 16:19, 8 October 2009
Scratch Space
Log
Green color indicates something that still needs to be done.
Cloning
- Download the udpcast rpm from http://udpcast.linux.lu/source.html
  - Install with
    yum --nogpgcheck localinstall udpcast-20081213-1.i386.rpm
- On hopper, installed the syslinux and tftpd-hpa ports
  - Enable tftpd in /etc/inetd.conf by removing the comments, restart inetd with
    /etc/rc.d/inetd restart
    and then also run the command listed on that line to start tftpd
  - The following lines were already in /usr/local/etc/dhcpd.conf:
    allow booting; allow bootp;
    Put the filename in the particular group (see Debian Clusters)
  - Copy the PXE loader (the /tftpboot directory needs to be created):
    cp /usr/local/share/syslinux/pxelinux.0 /tftpboot/
  - Download linux, initrd, and default from the udpcast site into /tftpboot
  - Move default into /tftpboot/pxelinux.cfg
  - Restart dhcpd:
    killall -KILL dhcpd
    /usr/local/sbin/dhcpd -q -cf /usr/local/etc/dhcpd.conf -lf /var/db/dhcpd/dhcpd.leases -pf /var/run/dhcpd/dhcpd.pid -user dhcpd -group dhcpd
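The file staging on hopper can be sketched as a small shell function. This is a minimal sketch, not the commands that were actually run: the function name stage_tftpboot and the separate download directory are assumptions for illustration. Only /tftpboot, pxelinux.cfg and /usr/local/share/syslinux/pxelinux.0 come from the log.

```shell
#!/bin/sh
# stage_tftpboot SRC_DIR TFTP_ROOT [PXELINUX]   (hypothetical helper)
#   SRC_DIR    directory holding linux, initrd and default from the udpcast site
#   TFTP_ROOT  TFTP root directory, /tftpboot in this log
#   PXELINUX   path to pxelinux.0 (defaults to the syslinux port's copy)
stage_tftpboot() {
    src=$1
    root=$2
    pxelinux=${3:-/usr/local/share/syslinux/pxelinux.0}

    # The TFTP root does not exist by default; create it with its config dir.
    mkdir -p "$root/pxelinux.cfg" || return 1

    # Boot loader, kernel and initrd sit directly in the TFTP root.
    cp "$pxelinux" "$root/" || return 1
    cp "$src/linux" "$src/initrd" "$root/" || return 1

    # pxelinux reads its boot configuration from pxelinux.cfg/default.
    cp "$src/default" "$root/pxelinux.cfg/default" || return 1
}
```

With the three files downloaded into a hypothetical ~/udpcast directory, the call would be stage_tftpboot ~/udpcast /tftpboot.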
Head Node
Yum installed:
- gcc.x86_64, gcc-c++.x86_64
- for Ganglia:
- apr.x86_64 and apr-devel.x86_64
- libconfuse-2.5-4.el5.x86_64.rpm, libconfuse-devel-2.5-4.el5.x86_64.rpm (from Fedora repositories)
- expat-devel.x86_64
- for Intel updates:
- compat-libstdc++-33.i386
- blas.x86_64 (on all nodes)
Install C3 tools from http://www.csm.ornl.gov/torc/C3/C3softwarepage.shtml
- Downloaded full install rpm on bs0, installed with
yum --nogpgcheck localinstall c3-4.0.1-1.noarch.rpm
- See C3 Tools README and C3 Tools INSTALL
- Put root's keys in root's home directory, authorized them for root itself, then copied that to the worker node image
Ganglia
- On hopper, added the data_source line for bs0 to /usr/local/etc/gmetad.conf and restarted it with
  /usr/local/etc/rc.d/gmetad restart
- Downloaded tar ball from http://sourceforge.net/projects/ganglia/
- See Ganglia README
./configure --prefix=/cluster
- The head node uses a different Ganglia gmond.conf in /etc/ganglia/gmond.conf and the workers just have theirs symlinked to /cluster/etc/gmond.conf
- By default, iptables is running on the CentOS install and blocks hopper's Ganglia requests
  - Turned off by clearing it and then running
    /sbin/service iptables save
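A sketch of the clearing step, assuming that "clearing" means flushing the filter table with iptables -F (the log does not record the exact command). The run helper and the DRY_RUN variable are hypothetical, added only so the commands can be previewed before running them as root.

```shell
#!/bin/sh
# run CMD...: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Flush the rules, then persist the now-empty rule set so it survives a
# restart of the iptables service. Assumption: -F is what "clearing" meant.
clear_iptables() {
    run /sbin/iptables -F &&
    run /sbin/service iptables save
}
```

Setting DRY_RUN=1 before calling clear_iptables only prints the two commands. Note that this opens the head node to all traffic, not just Ganglia's.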
Networking
- Shorewall, see /etc/shorewall/params for almost all of the important definitions
  - NAT is done through /etc/shorewall/masq
- DHCP relay (installed as part of the dhcp yum package), set to relay to hopper and added to boot with
  chkconfig on
  - See /etc/sysconfig/dhcrelay
  - This means that a dhcp server is also installed, but it is neither set to run nor configured
  - Hopper needs a static route so that the responses can return; it is defined in /etc/rc.conf:
    static_routes="bs0"
    route_bs0="192.168.0.1 159.28.234.200"
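To see what those two rc.conf lines amount to: FreeBSD's routing startup script hands each route_<name> value to route add, so the entry above should correspond to route add 192.168.0.1 159.28.234.200. A minimal sketch that prints the expansion for any rc.conf-style file follows; print_static_routes is a hypothetical helper, not part of FreeBSD.

```shell
#!/bin/sh
# print_static_routes RC_CONF   (hypothetical helper)
# Print the `route add` command for every name listed in static_routes.
# RC_CONF must be a path containing a slash, since `.` otherwise searches PATH.
print_static_routes() {
    (
        # Subshell: variables from the sourced file do not leak to the caller.
        static_routes=
        . "$1" || exit 1
        for name in $static_routes; do
            # Look up route_<name> indirectly, as the rc script does.
            eval "args=\$route_$name"
            echo "route add $args"
        done
    )
}
```

Running print_static_routes /etc/rc.conf on hopper only prints the commands; it does not change the routing table.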
Modules
- Installed environment-modules from http://download.fedora.redhat.com/pub/epel/5/x86_64/repoview/environment-modules.html
  - Important directories: /usr/share/Modules/
Torque
- Installed from source with
  ./configure --with-default-server=bs0.bobsced.loc --with-rcp=scp --disable-mom --with-server-home=/var/spool/pbs
  ("clients" is what installs qmgr)
- Installs to /usr/local/
- Set up according to Debian Clusters setup
- Reran the ./configure but without --disable-mom, then ran
  make packages
  and copied the resulting package to the worker node
Maui
- Installed Maui according to the same Debian Clusters setup as above
Intel Firmware Updates
- Configured sendmail by adding bs0.bobsced.loc to /etc/mail/local-host-names
NFS
- The actual filesystem on bs0-new is at /mounts/bobsced. The nodes all mount this in the same place.
- It's mounted on hopper at /mounts/bobsced, with the symlink (currently) at /cluster/bobscednew
WebMO
- yum installed httpd
- Installed on bs0 with the following params:
  Path to perl: /usr/bin/perl
  Webserver name: bs0-new.cluster.earlham.edu
  HTML directory: /var/www/webmo
  HTML URL: /webmo
  CGI script directory: /var/www/cgi-bin
  CGI script URL: /cgi-bin
  User files directory: /mounts/bobsced/WebMO
- Get this error when authenticating with LDAP:
Can't locate Authen/Simple/LDAP.pm
- yum installed perl-LDAP.noarch, didn't work, so used CPAN to install Authen::Simple::LDAP
- edited /var/www/cgi-bin/interfaces/authen.conf for our LDAP settings
- Before externally authenticated users can use it, you have to go in as administrator and check the box to allow them in the Webmo group (or whatever other group)
- Gamess:
  - yum install compat-gcc-34-g77.x86_64 and gfortran
  - Followed directions from Webmo site (http://www.webmo.net/support/gamess_linux.html)
- Added the following line to httpd.conf:
  SuexecUserGroup bob users
- Gaussian 09 not supported, though it's installed in /mounts/bobsced/usr/local/g09
- Installed g03, except get errors:
  Erroneous write during file extend. write 160 instead of 4096
  Probably out of disk space.
  Write error in NtrExt1: No such file or directory
  or
  Write error in NtrExt1: Bad address
  - To fix this, do
    echo 0 > /proc/sys/kernel/randomize_va_space
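The echo only lasts until the next reboot. Below is a small sketch for checking the current setting; aslr_state is a hypothetical helper name, and its optional file argument exists only so it can be tried against a test file. To keep the setting across reboots, the usual approach is a kernel.randomize_va_space = 0 line in /etc/sysctl.conf, though the log does not say whether that was done here.

```shell
#!/bin/sh
# aslr_state [FILE]   (hypothetical helper)
# Report the address-space randomization setting stored in FILE
# (default: /proc/sys/kernel/randomize_va_space).
aslr_state() {
    file=${1:-/proc/sys/kernel/randomize_va_space}
    value=$(cat "$file") || return 1
    case $value in
        0)   echo "disabled" ;;            # the state g03 needs per this log
        1|2) echo "enabled ($value)" ;;    # randomization is active
        *)   echo "unknown ($value)"; return 1 ;;
    esac
}
```

Calling aslr_state with no argument after the echo above should report disabled.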