Sysadmin
This is the hub for the CS sysadmins on the wiki.
Overview
If you're visually inclined, we have a colorful and easy-to-edit map of our servers here!
Server room
Our servers are in Noyes, the science building that predates the CST. For general information about the server room and how to use it, check out this page.
Columns: machine name, IP addresses (1Gb and 10G), type (metal or virtual), and purpose; dies, cores, and RAM are still to be added.
Compute (servers and clusters)
Machine name | 1Gb IP address (159.28.x) | 10G IP address (10.10.10.x) | Metal or virtual | Description
---|---|---|---|---
Bowie | 159.28.22.5 | 10.10.10.15 | Metal | Hosts and exports user files; JupyterHub; landing server
Smiley | fill in | fill in | Metal | VM host, not accessible to regular users
Web | 159.28.22.2 | 10.10.10.200 | Virtual | Website host
Auth | 159.28.22.39 | No 10G connection | Virtual | Host of the LDAP user database
Code | 159.28.22.42 | 10.10.10.42 | Virtual | GitLab host
Net | 159.28.22.1 | 10.10.10.100 | Virtual | Network administration host for CS
Lovelace | 159.28.23.35 | 10.10.10.35 | Metal | Example
Hopper | 159.28.23.1 | 10.10.10.1 | Metal | Landing server, NFS host for cluster
Sakurai | 159.28.23.3 | 10.10.10.3 | Metal | Example
HopperPrime | 159.28.23.142 | 10.10.10.142 | Metal | Runs backups
Monitor | fill in | fill in | Metal | Server monitoring
Bronte | 159.28.23.140 | No 10G connection | Metal | Example
Layout 0 | 159.28.23.2 | 10.10.10.2 | Metal | Example
Layout 3 | fill in | fill in | Metal | Example
Layout 1 | fill in | fill in | Metal | Example
Layout 2 | fill in | fill in | Metal | Example
Whedon | 159.28.23.4 | No 10G connection | Metal | Example
Pollock | 159.28.23.8 | 10.10.10.8 | Metal | Example
CS machines: bowie.cs.earlham.edu web.cs.earlham.edu auth.cs.earlham.edu code.cs.earlham.edu net.cs.earlham.edu
Cluster Machines: lovelace.cluster.earlham.edu hopper.cluster.earlham.edu hopperprime.cluster.earlham.edu sakurai.cluster.earlham.edu bronte.cluster.earlham.edu whedon.cluster.earlham.edu pollock.cluster.earlham.edu layout.cluster.earlham.edu monitor.cluster.earlham.edu
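For a quick liveness check across these hosts, a short script like the following can try the SSH port on each. This is a minimal sketch, not an official tool: it assumes SSH listens on the standard port 22 and that you are on a network that can reach these machines.

    # Reachability sketch: try TCP port 22 (SSH) on each public hostname.
    # Assumes the standard SSH port and network access to the hosts.
    import socket

    HOSTS = [
        "bowie.cs.earlham.edu", "web.cs.earlham.edu", "auth.cs.earlham.edu",
        "code.cs.earlham.edu", "net.cs.earlham.edu",
        "hopper.cluster.earlham.edu", "lovelace.cluster.earlham.edu",
    ]

    for host in HOSTS:
        try:
            with socket.create_connection((host, 22), timeout=3):
                print(f"{host}: SSH port reachable")
        except OSError as err:
            print(f"{host}: unreachable ({err})")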
There are six machines currently not in use in the six spaces above Monitor on the Equatorial Guinea rack.
Specialized resources
Specialized computing applications are supported on the following machines:
- GPUs for AI/ML/data science: layout cluster (see the sketch after this list)
- virtualization: smiley
- containers: bowie
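To confirm the GPUs are actually visible on a layout node, something like the following works. It is a sketch that assumes the NVIDIA driver (which ships nvidia-smi) is installed on the node.

    # List GPUs visible to the NVIDIA driver on the current node.
    # Assumes nvidia-smi is on the PATH (it ships with the NVIDIA driver).
    import subprocess

    result = subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True)
    print(result.stdout or result.stderr)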
Network
We have two network fabrics linking the machines together, with three subnets across them.
10 Gb
We have a 10Gb fabric used to mount files over NFS. Machines with 10Gb support have an IP address in the class C range 10.10.10.0/24; we want to add DNS records for these addresses.
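One way to check whether a machine's NFS mounts are actually coming over the 10G fabric is to look at the server address of each mount. This is a minimal sketch, Linux-only since it reads /proc/mounts; the 10.10.10.0/24 range is the one from the table above.

    # Flag NFS mounts whose server address falls in the 10G fabric's range.
    # Linux-only: reads /proc/mounts. Mounts by hostname are reported separately,
    # since 10G addresses do not yet have DNS names.
    import ipaddress

    TEN_G = ipaddress.ip_network("10.10.10.0/24")

    with open("/proc/mounts") as mounts:
        for line in mounts:
            device, mountpoint, fstype = line.split()[:3]
            if fstype.startswith("nfs") and ":" in device:
                host = device.split(":", 1)[0]
                try:
                    on_10g = ipaddress.ip_address(host) in TEN_G
                except ValueError:  # server given by hostname, not a literal IP
                    on_10g = False
                print(f"{mountpoint} from {host} ({'10G' if on_10g else 'not 10G / by name'})")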
1 Gb (cluster, cs)
We have two class C subnets on the 1Gb fabric: 159.28.22.0/24 (CS) and 159.28.23.0/24 (cluster). This means we have twice as many IP addresses on the 1Gb fabric as on the 10Gb fabric.
Any user accessing *.cluster.earlham.edu or *.cs.earlham.edu hostnames is making calls over the 1Gb network.
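Since the public names resolve to 1Gb addresses, a quick way to see which subnet (and therefore which fabric) a hostname lands on is the standard library. This sketch needs working DNS for the names it looks up; the subnets are the three described above.

    # Classify a hostname by the subnet its DNS A record falls in.
    # Needs working DNS for these names; subnets are from this section.
    import socket
    import ipaddress

    SUBNETS = {
        "CS 1Gb": ipaddress.ip_network("159.28.22.0/24"),
        "cluster 1Gb": ipaddress.ip_network("159.28.23.0/24"),
        "10G NFS": ipaddress.ip_network("10.10.10.0/24"),
    }

    for host in ("web.cs.earlham.edu", "hopper.cluster.earlham.edu"):
        addr = ipaddress.ip_address(socket.gethostbyname(host))
        fabric = next((name for name, net in SUBNETS.items() if addr in net), "unknown")
        print(f"{host} -> {addr} ({fabric})")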
Intra-cluster fabrics
The layout cluster has an InfiniBand interconnect. Wachowski has only a 1Gb interconnect.
Power
We have a backup power supply, with batteries last upgraded in 2019 (?). We’ve had a few outages since then and power has held up well.
HVAC
HVAC systems are static and are largely managed by Facilities.
See full topology diagrams here.
A word about what's happening between files and the drives they live on.
New sysadmins
These pages will be helpful for you if you're just starting in the group:
- Welcoming a new sysadmin
- General troubleshooting tips for admins
- Sandbox Notes
- Password managers
- Server safety
- Ticket tracking for current projects
Note: you'll need to log in with wiki credentials to see most Sysadmin pages.
Additional information
These pages contain a lot of the most important information about our systems and how we operate.
Technical docs
- Ticket tracking for current projects
- Server safety
- Backup
- Monitoring
- SSH info relevant to admins
- User Management and LDAP generally
- JupyterHub and nbgrader
- Email service
- Xen Server
- Network File System (NFS)
- Web Servers and Websites
- Databases
- DNS and DHCP
- AWS
- Bash startup scripts
- VirtualBox
- X Applications
- Cluster Overview and additional details
- Firewall running on babbage.cs.earlham.edu
Common tasks
- Recurring tasks - e.g. software updates, hardware replacements
- Contacting all users
- Reset password
- Software installation
- Installing software under modules
- Add a computer to CS or cluster domains
- Supporting senior projects
- How to do a planned shutdown and reboot of the system
- Testing services (after a reboot, upgrade, change in the phase of the moon, etc.)
- Upgrading SSL Certificates
- Launch a process at startup