Difference between revisions of "ShutdownProcedure"

From Earlham CS Department
Revision as of 04:19, 11 June 2018

These are the shutdown and boot-up instructions for the CS and Cluster servers. This page also has the reboot procedure.

Originally published: 2017-04-18 Updated: 2018-04-08

Back up critical wiki pages

There is a script in sysadmin@home.cs.earlham.edu:~/wiki_critical/ called send_wiki.sh that specifies which pages to pull down and send out via email. This is important to do before anyone starts shutting down the machines because the wiki will go offline.


General Info

  • babbage should be the very last machine brought down
  • To get to sysadmin@control, first ssh into home or hopper.
  • Make sure all virtual machines are shut down before restarting the bare metal hardware

CS virtual machines

The different VMs mount from each other, so just be patient and hopefully everything will work out.

Tools

We may have to restart nginx, jupyter, and sage by hand. Using history | grep <command> is helpful here (make sure to grab the entire command, including the trailing ampersand).

Jupyter

eccs-tools# nohup su -c "/mnt/lovelace/software/anaconda/envs/py35/bin/jupyterhub -f /etc/jupyterhub/jupyterhub_config.py --no-ssl" &

Sage

eccs-tools# nohup /home/sage/sage-6.8/sage --notebook=sagenb accounts=False automatic_login=False interface= port=8080 &

Hadoop

Hadoop runs on whedon and might also need to be restarted manually.

sysadmin@hopper$ ssh w0
sysadmin@w0$ sudo su hadoop
hadoop@w0$ cd $HADOOP_HOME
hadoop@w0$ ./sbin/start-all.sh

Cluster

order of shutdowns

  1. all compute nodes: (layout, alsalam, whedon) and t-voc, bigfe, elwood
  2. all head nodes: (layout, alsalam, whedon)
  3. pollock
  4. bronte + disk array
  5. wait until everything up to this point has shut down
  6. dali
  7. kahlo
  8. wait until everything up to this point has shut down
  9. hopper
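
The "wait until everything up to this point has shut down" steps can be checked from another machine rather than by eye. The following is only a sketch, not part of the original procedure: it assumes the hosts answer ping while they are up and that the Linux ping options -c and -W are available. The function name wait_until_down and the PROBE override are hypothetical names introduced here.

```shell
# Poll a host until it stops answering pings, then return.
# Usage: wait_until_down <host> [interval_seconds]
# PROBE is a hypothetical override so the probe command can be swapped out
# (for example for a dry run); by default one ping with a 2 second timeout is sent.
wait_until_down() {
    host=$1
    interval=${2:-10}
    probe=${PROBE:-ping -c 1 -W 2}
    # $probe is intentionally unquoted so that its options are split into words.
    while $probe "$host" >/dev/null 2>&1; do
        echo "$host is still up; checking again in ${interval}s"
        sleep "$interval"
    done
    echo "$host is down"
}
```

For example, running wait_until_down pollock and then wait_until_down bronte before moving on to dali would cover step 5, assuming those short hostnames resolve from wherever the function is run.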

order to bring up

The reverse of the shutdown order. Again, make sure to wait before proceeding at the appropriate steps.

CS

order of shutdowns

If hopper is back online, ssh to sysadmin@cluster.cs.earlham.edu and then to sysadmin@control.cs.earlham.edu. This way we can shut down all the VMs directly without being knocked offline or having to be in the machine room.

Example for shutting down a machine on control.

ssh sysadmin@home.cs.earlham.edu
ssh sysadmin@control.cs.earlham.edu
sudo su -
control-# xm destroy <hostname>.cs.earlham.edu

List running VMs

control-# xm list
  1. proto (lives separately, ssh admin@proto.cs.earlham.edu)
  2. murphy (ssh admin@home.cs.earlham.edu, then ssh admin@murphy.cs.earlham.edu)
  3. tools
  4. web
  5. home
  6. net
  7. control (where xen runs home and tools as VMs)
  8. smiley (web and net)
  9. babbage (firewall)

order of bring up

Bring up control first.

  • Make sure all virtual machines are shut down before restarting the bare metal hardware

Mounting Logical Volumes

When you reboot, the LVM volume groups and logical volumes may not be automatically enabled. To bring them back, run:

console-# lvscan
console-# vgscan
console-# vgchange -a y

This should be done at boot using /etc/init.d/rc.sysinit but there still might be some subtleties there.
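
To confirm that vgchange -a y actually activated everything, the lvscan output can be filtered for volumes still marked inactive. This is a sketch only: it assumes lvscan prints its usual one-line-per-volume format with ACTIVE or inactive in the first column and the quoted device path in the second. The helper name inactive_lvs is hypothetical.

```shell
# Print the device path of every logical volume reported as inactive.
# Reads lvscan-style output on stdin, e.g.:
#   inactive          '/dev/vg0/home' [100.00 GiB] inherit
inactive_lvs() {
    # q holds a single quote so the quotes around the path can be stripped.
    awk -v q="'" '$1 == "inactive" { gsub(q, "", $2); print $2 }'
}
```

Running lvscan | inactive_lvs as root should print nothing once all volumes are active.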

Starting VMs

The VMs on control should be brought up in the reverse of the order in which they were shut down. home and tools are on control.

It is very important to bring up net first, which is on smiley.

control-# xm create -c /home/sysadmin/xen/configs/eccs-<hostname>.cfg
# To exit to the hypervisor shell you can press Ctrl + ]
# To start a VM without attaching to its console for boot messages, omit the -c
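
When several VMs need to be started, it can help to print the commands first and review them before running anything. The sketch below only echoes the xm create lines, without -c, for the names it is given; it assumes the config files follow the eccs-<hostname>.cfg naming shown above. The helper name start_cmds is hypothetical.

```shell
# Print (do not run) the xm create command for each VM name given, in order.
# Usage: start_cmds home tools
start_cmds() {
    for vm in "$@"; do
        echo "xm create /home/sysadmin/xen/configs/eccs-${vm}.cfg"
    done
}
```

On control, start_cmds home tools lists the two VMs in the reverse of their shutdown order; the printed lines can then be run by hand as root.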

Connect to VM console

console-# xm console <hostname>.cs.earlham.edu

You'll need to make sure that their DNS resolver settings are correctly set. I've had trouble with them using the incorrect DNS server settings sometimes and I'm not sure if the issues are resolved or not.

-Eamon
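
One quick way to check the resolver settings Eamon mentions is to list the nameserver entries on each VM. This is a sketch that assumes the VMs use a standard /etc/resolv.conf; the helper name nameservers is hypothetical, and which addresses are the correct ones is not stated on this page.

```shell
# List the nameserver addresses configured in a resolv.conf-style file.
# Usage: nameservers [file]   (defaults to /etc/resolv.conf)
nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}
```

Run nameservers inside each VM after it boots and compare the output against the DNS servers the department expects.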


To shut down: Ideally the VMs should be shut down from inside (by ssh'ing into them). After that, run "xl list" to see if they're still listed as domains, then run the "xl destroy" commands below if needed.

# xl destroy net.cs.earlham.edu
# xl destroy web.cs.earlham.edu
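
The "check xl list, then destroy only if needed" step can be sketched as a small filter over the xl list output. It assumes the usual xl list layout, with a header row followed by one row per domain and the domain name in the first column. The helper name domain_listed is hypothetical.

```shell
# Succeed (exit 0) if the named domain appears in xl-list-style output on stdin.
# The first line is skipped because xl list prints a header row.
domain_listed() {
    awk -v name="$1" 'NR > 1 && $1 == name { found = 1 } END { exit !found }'
}
```

For example, xl list | domain_listed net.cs.earlham.edu && xl destroy net.cs.earlham.edu only destroys the domain if it is still listed after the in-guest shutdown.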


To start them up:

# xl create ~sysadmin/xen-configs/eccs-net.cfg
# xl create ~sysadmin/xen-configs/eccs-web.cfg