This is the hub for the CS sysadmins on the wiki.

== Overview ==

If you're visually inclined, we have a colorful and easy-to-edit map of our servers here!

== Server room ==

Our servers are in Noyes, the science building that predates the CST. For general information about the server room and how to use it, check out [[Sysadmin:Server Room|this page]].
 
== Compute (servers and clusters) ==
 
We have CS and cluster machines. The table below lists each machine's name, its 1Gb and 10Gb IP addresses, operating system, whether it is metal or virtual, its purpose, and its RAM; per-machine die and core counts are not yet listed.
 
  
{| class="wikitable"
|+ CS machines and cluster machines
|-
! Machine name !! 1Gb IP address !! 10Gb IP address !! Operating system !! Metal or virtual !! Description !! RAM
|-
| Bowie || 159.28.22.5 || 10.10.10.15 || Debian 9 || Metal || Hosts and exports user files; JupyterHub; landing server || 198 GB
|-
| Smiley || 159.28.22.251 || 10.10.10.252 || Ubuntu 18.04 || Metal || VM host, not accessible to regular users ||
|-
| Web || 159.28.22.2 || 10.10.10.200 || Ubuntu 18.04 || Virtual || Website host || 8 GB
|-
| Auth || 159.28.22.39 || None || CentOS 7 || Virtual || Host of the LDAP user database || 4 GB
|-
| Code || 159.28.22.42 || 10.10.10.42 || Ubuntu 18.04 || Virtual || GitLab host || 4 GB
|-
| Net || 159.28.22.1 || 10.10.10.100 || Ubuntu 18.04 || Virtual || Network administration host for CS || 4 GB
|-
| Lovelace || 159.28.23.35 || 10.10.10.35 || CentOS 7 || Metal || Large compute server || 96 GB
|-
| Hopper || 159.28.23.1 || 10.10.10.1 || Debian 10 || Metal || Landing server, NFS host for cluster || 64 GB
|-
| Sakurai || 159.28.23.3 || 10.10.10.3 || Debian 10 || Metal || Backup server || 12 GB
|-
| Miyamoto || 159.28.23.45 || None (currently) || Debian 10 || Metal || Backup server || 16 GB
|-
| HopperPrime || 159.28.23.142 || 10.10.10.142 || Debian 10 || Metal || Backup server || 16 GB
|-
| Monitor || 159.28.23.250 || None || CentOS 7 || Metal || Server monitoring || 16 GB
|-
| Bronte || 159.28.23.140 || None || CentOS 7 || Metal || Large compute server || 115 GB
|-
| Layout 0 || 159.28.23.2 || 10.10.10.2 || CentOS 7 || Metal || Head node || 32 GB
|-
| Layout 1 || None || None || CentOS 7 || Metal || Compute node || 32 GB
|-
| Layout 2 || None || None || CentOS 7 || Metal || Compute node || 32 GB
|-
| Layout 3 || None || None || CentOS 7 || Metal || Compute node || 32 GB
|-
| Layout 4 || None || None || CentOS 7 || Metal || Compute node || 32 GB
|-
| Whedon || 159.28.23.4 || None || CentOS 7 || Metal || Head node || 230 GB
|-
| Pollock || 159.28.23.8 || 10.10.10.8 || CentOS 7 || Metal || Large compute server || 131 GB
|}

CS machines: bowie.cs.earlham.edu, web.cs.earlham.edu, auth.cs.earlham.edu, code.cs.earlham.edu, net.cs.earlham.edu

Cluster machines: lovelace.cluster.earlham.edu, hopper.cluster.earlham.edu, hopperprime.cluster.earlham.edu, sakurai.cluster.earlham.edu, bronte.cluster.earlham.edu, whedon.cluster.earlham.edu, pollock.cluster.earlham.edu, layout.cluster.earlham.edu, monitor.cluster.earlham.edu, miyamoto.cluster.earlham.edu
  
We have spare nodes on the old al-salam cluster’s rack. These should be used for services that can handle minutes to hours of downtime, as they only have one power supply.
 
  
There are six machines currently not in use in the six spaces above Monitor on the Equatorial Guinea rack.
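
As a quick sanity check of this inventory, here is a minimal sketch (Python, standard library only) that resolves a handful of the public hostnames and confirms each lands in its expected 1Gb subnet. The host list is a subset of the table above, and it assumes you run it somewhere that can reach Earlham's DNS.

<syntaxhighlight lang="python">
#!/usr/bin/env python3
"""Sanity-check the machine inventory: resolve each public hostname
and confirm it lands in the expected 1Gb subnet."""
import ipaddress
import socket

# Expected 1Gb subnets per subdomain (see the Network section below).
SUBNETS = {
    "cs.earlham.edu": ipaddress.ip_network("159.28.22.0/24"),
    "cluster.earlham.edu": ipaddress.ip_network("159.28.23.0/24"),
}

# A subset of the hosts from the lists above; extend as needed.
HOSTS = [
    "bowie.cs.earlham.edu",
    "web.cs.earlham.edu",
    "net.cs.earlham.edu",
    "hopper.cluster.earlham.edu",
    "lovelace.cluster.earlham.edu",
    "monitor.cluster.earlham.edu",
]

for host in HOSTS:
    domain = host.split(".", 1)[1]  # e.g. "cs.earlham.edu"
    try:
        addr = ipaddress.ip_address(socket.gethostbyname(host))
    except socket.gaierror:
        print(f"{host}: DNS lookup failed")
        continue
    ok = addr in SUBNETS[domain]
    print(f"{host}: {addr} {'OK' if ok else 'unexpected subnet'}")
</syntaxhighlight>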
 
=== Specialized resources ===
 
Specialized computing applications are supported on the following machines:
 
== Network ==

We have two network fabrics linking the machines together. There are three subdomains.

=== 10 Gb ===

We have a 10Gb fabric used to mount files over NFS. Machines with 10Gb support have an IP address in the class C range 10.10.10.0/24, and we want to add DNS records for these addresses.
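
Until those DNS records exist, one possible stopgap is an <code>/etc/hosts</code> block generated from the table above. The sketch below is illustrative only: the addresses are copied from the table, and the <code>-10g</code> hostname suffix is a hypothetical convention, not something currently deployed.

<syntaxhighlight lang="python">
#!/usr/bin/env python3
"""Generate /etc/hosts lines for the 10Gb fabric from the machine
table, as a stopgap until real DNS records exist. The "-10g" name
suffix is just an illustration, not an agreed-upon convention."""
import ipaddress

# (machine, 10Gb address) pairs copied from the table; machines with
# no 10Gb interface are omitted.
TEN_GB = {
    "hopper": "10.10.10.1",
    "layout0": "10.10.10.2",
    "sakurai": "10.10.10.3",
    "pollock": "10.10.10.8",
    "bowie": "10.10.10.15",
    "lovelace": "10.10.10.35",
    "code": "10.10.10.42",
    "net": "10.10.10.100",
    "hopperprime": "10.10.10.142",
    "web": "10.10.10.200",
    "smiley": "10.10.10.252",
}

# Sort numerically by address so the output reads like a hosts file.
for name, addr in sorted(TEN_GB.items(),
                         key=lambda kv: ipaddress.ip_address(kv[1])):
    print(f"{addr}\t{name}-10g")
</syntaxhighlight>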

=== 1 Gb (cluster, cs) ===

We have two class C subnets on the 1Gb fabric: 159.28.22.0/24 (CS) and 159.28.23.0/24 (cluster). This means we have double the IP addresses on the 1Gb fabric that we have on the 10Gb fabric (see the sketch below).

Any user accessing *.cluster.earlham.edu or *.cs.earlham.edu is making calls on the 1Gb network.
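
To make the arithmetic concrete, a small standard-library sketch:

<syntaxhighlight lang="python">
#!/usr/bin/env python3
"""Why the 1Gb fabric has double the addresses of the 10Gb fabric:
two /24 subnets versus one."""
import ipaddress

one_gb = [ipaddress.ip_network("159.28.22.0/24"),   # CS
          ipaddress.ip_network("159.28.23.0/24")]   # cluster
ten_gb = [ipaddress.ip_network("10.10.10.0/24")]

print(sum(n.num_addresses for n in one_gb))  # 512
print(sum(n.num_addresses for n in ten_gb))  # 256
</syntaxhighlight>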

=== Intra-cluster fabrics ===

The layout cluster has an InfiniBand interconnect. Wachowski has only a 1Gb interconnect.

== Power ==

We have a backup power supply, with batteries last upgraded in 2019 (?). We've had a few outages since then and power has held up well.

== HVAC ==

HVAC systems are static and are largely managed by Facilities.

See full topology diagrams here.

A word about what's happening between files and the drives they live on.


== New sysadmins ==

These pages will be helpful for you if you're just starting in the group:

Note: you'll need to log in with wiki credentials to see most Sysadmin pages.

== Additional information ==

These pages contain a lot of the most important information about our systems and how we operate.

=== Technical docs ===

* [[X Applications]]
* [[Sysadmin:Services:ClusterOverview|Cluster Overview]] and [[Sysadmin:Ccg-admin|additional details]]
* [[Sysadmin:Firewall|Firewall]] running on babbage.cs.earlham.edu

=== Common tasks ===

=== Group and institution information ===