Shut down one server

From Earlham CS Department
Revision as of 16:05, 23 January 2019 by Craigje (talk | contribs) (Machines where this is okay)

If you need to shut down a single server and bring it back up, keep the following in mind. Most of these points are non-technical.

Machines where this is okay

You may restart any of these machines relatively quickly and non-disruptively most of the time.

On the CS subdomain:

  • tools
  • web

On the cluster subdomain:

  • compute nodes
  • in rare cases, individual clusters, one at a time (note that this excludes hopper, which should rarely be shut down)

For other machines, email the admin mailing list or talk to a faculty supervisor first. (This is good practice in any case, but *especially* do it for machines not listed here.)

Reminders

The process to restart one of our servers is as follows:

  1. unmount file systems if applicable
  2. check on backups if applicable
  3. run sudo shutdown -h now or sudo reboot when you're ready; either way, you will immediately lose your ssh connection
    1. shutdown -h halts the machine, so someone will have to start it again physically
    2. reboot *should* automatically bring everything back up within a few minutes, though that may depend on the problem that prompted the restart
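
The steps above can be sketched as a small shell helper. This is only a sketch, not the department's actual tooling: the function name restart_plan and the mount point /mnt/example are hypothetical, and the function only prints the commands so that you can review them before running anything by hand.

```shell
#!/bin/sh
# Hypothetical helper: prints a restart plan without executing anything.
# Usage: restart_plan reboot|halt [mount_point ...]
restart_plan() {
    action="$1"
    [ $# -gt 0 ] && shift

    # Pick the final command from the two options in the list above.
    case "$action" in
        reboot) final="sudo reboot" ;;
        halt)   final="sudo shutdown -h now" ;;
        *)      echo "usage: restart_plan reboot|halt [mount_point ...]" >&2
                return 1 ;;
    esac

    # Step 1: unmount any file systems passed as arguments (if applicable).
    for mp in "$@"; do
        echo "sudo umount $mp"
    done

    # Step 2 (check on backups) is site-specific and left as a manual step.
    echo "# check on backups before continuing"

    # Step 3: the shutdown or reboot itself; ssh sessions drop immediately.
    echo "$final"
}

# /mnt/example is a placeholder mount point, not a real path on our servers.
restart_plan reboot /mnt/example
```

Printing the plan instead of running it keeps the decision with the person at the keyboard, which matters because a halted machine has to be started again in person.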

Also remember these guidelines:

  1. In an emergency (e.g. no services seem available, no one can log in to things), you can just restart the machine.
  2. That said, give at least a few hours' notice if possible. These servers aren't in use 100% of the time but when they *are* in use it's important that people can continue to use them.
  3. Be prepared to go to Noyes basement if there are problems restarting the server remotely. In other words, be on campus and preferably in the science complex when you do this (or be in communication with an admin who is).
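
As a complement to emailing the list (not a replacement for it), a reboot can also be scheduled ahead of time so that logged-in users see a warning. The sketch below assumes a Linux shutdown command that accepts a +MINUTES delay and a broadcast message. The helper name notice_cmd and the message text are hypothetical, and the function only prints the command.

```shell
#!/bin/sh
# Hypothetical helper: prints a delayed-reboot command for a given notice period.
# Usage: notice_cmd HOURS
notice_cmd() {
    hours="$1"
    # Accept only a positive whole number with no leading zeros.
    case "$hours" in
        ''|0*|*[!0-9]*) echo "usage: notice_cmd HOURS" >&2; return 1 ;;
    esac
    minutes=$((hours * 60))
    # shutdown takes a +MINUTES delay and an optional message for logged-in users.
    echo "sudo shutdown -r +$minutes \"Rebooting in $hours hour(s) for maintenance\""
}

notice_cmd 3
```

A scheduled shutdown can be cancelled with sudo shutdown -c if plans change.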

The point is to be courteous to the community for whom we run these servers.