Difference between revisions of "Sysadmin"

From Earlham CS Department
m (Current Projects)
(Compute (servers and clusters))
 
(77 intermediate revisions by 5 users not shown)
__NOTOC__
+
This is the hub for the CS sysadmins on the wiki.
  
= Machines and Brief Descriptions of Services =
+
= Overview =
== CS Machines ==
 
[[File:Server_layout_summer2017.jpg|thumb|200px|right|Server layout as of May 2017]]
 
  
{| style="float:left; margin-right:2px;"
+
[https://docs.google.com/drawings/d/1XaULz5IxXV_BZQjrko3QJ8wV5aXsSTYcSWxxT49OyZk/edit If you're visually inclined, we have a colorful and easy-to-edit map of our servers here!]
| style="height:40px; width:150px; text-align:center; background-color:#54C571; border-left:solid 5px #54C571; border-top:solid 5px #54C571; border-bottom:solid 1px white; border-right:solid 5px #54C571; font-size:120%;" | NET <br> (vm1)
+
 
 +
== Server room ==
 +
 
 +
Our servers are in Noyes, the science building that predates the CST. For general information about the server room and how to use it, check out [[Sysadmin:Server Room|this page]].
 +
 
 +
Columns: machine name, IPs, type (virtual, metal), purpose, dies, cores, RAM
 +
 
 +
== Compute (servers and clusters) ==
 +
 
 +
 
 +
{| class="wikitable"
 +
|+ CS machines and cluster machines
 +
|-
 +
! Machine name !! 159 IP address !! 10Gb IP address !! Operating System !! Metal or Virtual !! Description !! RAM
 +
|-
 +
| Bowie || 159.28.22.5 || 10.10.10.15 || Debian 9 || Metal || hosts and exports user files; Jupyterhub; landing server || 198 GB
 +
|-
 +
| Smiley || 159.28.22.251 || 10.10.10.252 || Ubuntu 18.04 || Metal || VM host, not accessible to regular users
 +
|-
 +
| Web || 159.28.22.2 || 10.10.10.200 || Ubuntu 18.04 || Virtual || Website host || 8 GB
 +
|-
 +
| Auth || 159.28.22.39 || No 10Gb internet|| CentOS 7 || Virtual || host of LDAP user database || 4 GB
 
|-
 
|-
| style="height:210px; width:150px; background-color:#54C571; border-left:solid 5px #54C571; border-bottom:solid 5px #54C571; border-right:solid 5px #54C571;" | [[LDAP Server]] <br> [[Sysadmin:DNS & DHCP | DNS]] <br> [[Sysadmin:DNS & DHCP | DHCP]] <br><br> Backup to Dali: etc, var
+
| Code || 159.28.22.42 || 10.10.10.42 || Ubuntu 18.04 || Virtual || Gitlab host || 4 GB
|}
+
|-
 
+
| Net || 159.28.22.1 || 10.10.10.100 || Ubuntu 18.04 || Virtual || network administration host for CS || 4 GB
{| style="float:left; margin-right:2px;"
+
|-
| style="height:40px; width:150px; text-align:center; background-color:#E77471; border-left:solid 5px #E77471; border-top:solid 5px #E77471; border-bottom:solid 1px white; border-right:solid 5px #E77471; font-size:120%;" | WEB <br> (vm2)
+
| Lovelace || 159.28.23.35 || 10.10.10.35 || CentOS 7 || Metal || Large compute server || 96 GB
 +
|-
 +
| Hopper || 159.28.23.1 || 10.10.10.1 || Debian 10 || Metal || landing server, NFS host for cluster || 64 GB
 
|-
 
|-
| style="height:210px; width:150px; background-color:#E77471; border-left:solid 5px #E77471; border-bottom:solid 5px #E77471; border-right:solid 5px #E77471;" | [[Sysadmin:Email:Mailman | Mailman]] <br> [[Sysadmin:Mail Stack | Mail Stack]]<br> Apache2 <br> PostgreSQL <br> MySQL <br> Wiki <br><br> Backup to Dali: etc, var
+
| Sakurai || 159.28.23.3 || 10.10.10.3 || Debian 10 || Metal || Runs Backup || 12 GB
|}
 
 
 
{| style="float:left; margin-right:2px;"
 
| style="height:40px; width:150px; text-align:center; background-color:#C38EC7; border-left:solid 5px #C38EC7; border-top:solid 5px #C38EC7; border-bottom:solid 1px white; border-right:solid 5px #C38EC7; font-size:120%;" | TOOLS <br> (vm3)
 
 
|-
 
|-
| style="height:210px; width:150px; background-color:#C38EC7; border-left:solid 5px #C38EC7; border-bottom:solid 5px #C38EC7; border-right:solid 5px #C38EC7;" | [[SageNB Server | SageNB Server]] <br> [[Jupyterhub notebook server | Jupyterhub Server]] <br> [[Sysadmin:Software Modules | Software Modules]] <br> NginX  <br>SSH<br>Users<br><br> Backup to Dali: etc, var, mnts, sage
+
| Miyamoto || 159.28.23.45 || No 10Gb currently || Debian 10 || Metal || Runs Backup || 16 GB
|}
 
 
 
{| style="float:left; margin-right:2px;"
 
| style="height:55px; width:150px; text-align:center; background-color:#E3A869; border-left:solid 5px #E3A869; border-top:solid 5px #E3A869; border-bottom:solid 1px white; border-right:solid 5px #E3A869; font-size:120%;" | BABBAGE
 
 
|-
 
|-
| style="height:210px; width:150px; background-color: #E3A869; border-left:solid 5px #E3A869; border-bottom:solid 5px #E3A869; border-right:solid 5px #E3A869;" | [[Sysadmin:Firewall | Firewall]]
+
|HopperPrime || 159.28.23.142 || 10.10.10.142 || Debian 10 || Metal || Runs Backup || 16 GB
|}
 
 
 
{|  
 
| style="height:55px; width:150px; text-align:center; background-color:#EEDC82; border-left:solid 5px #EEDC82; border-top:solid 5px #EEDC82; border-bottom:solid 1px white; border-right:solid 5px #EEDC82; font-size:120%;" | [[Sysadmin:Servers:Proto | PROTO]]
 
 
|-
 
|-
| style="height:210px; width:150px; background-color: #EEDC82; border-left:solid 5px #EEDC82; border-bottom:solid 5px #EEDC82; border-right:solid 5px #EEDC82;" | Weather Monitoring <br> GPS/NTP <br> Energy Monitoring <br><br> Backup to Dali: etc, var
+
| Monitor || 159.28.23.250 || No 10Gb internet || CentOS 7 || Metal || Server Monitoring || 16 GB
|}
+
|-  
 
+
| Bronte || 159.28.23.140 || No 10Gb internet || CentOS 7 || Metal || Large compute server || 115 GB
{| style="float:left; margin-right:2px;"
 
| style="height:40px; width:150px; text-align:center; background-color:#FF7E6D; border-left:solid 5px #FF7E6D; border-top:solid 5px #FF7E6D; border-bottom:solid 1px white; border-right:solid 5px      #FF7E6D; font-size:120%;" | BOWIE
 
 
|-
 
|-
| style="height:210px; width:150px; background-color:#FF7E6D; border-left:solid 5px #FF7E6D; border-bottom:solid 5px #FF7E6D; border-right:solid 5px #FF7E6D;" | PostgreSQL <br> Docker <br><br> Backup to Dali: etc, var
+
| Layout 0 || 159.28.23.2 || 10.10.10.2 || CentOS 7 || Metal || Head Node || 32 GB
|}
 
 
 
{| style="float:left; margin-right:2px;"
 
| style="height:40px; width:150px; text-align:center; background-color:#54C571; border-left:solid 5px #54C571; border-top:solid 5px #54C571; border-bottom:solid 1px white; border-right:solid 5px      #54C571; font-size:120%;" | SMILEY
 
 
|-
 
|-
| style="height:210px; width:150px; background-color:#54C571; border-left:solid 5px #54C571; border-bottom:solid 5px #54C571; border-right:solid 5px #54C571;" | [[XenDocs]] <br> NET <br> WEB <br>[[NFS]]<br><br> Backup to Dali: etc, var
+
| Layout 1 || None || None || CentOS 7 || Metal || Compute Node || 32 GB
|}
 
 
 
{| style="float:left; margin-right:2px;"
 
| style="height:40px; width:150px; text-align:center; background-color:#E77471; border-left:solid 5px #E77471; border-top:solid 5px #E77471; border-bottom:solid 1px white; border-right:solid 5px      #E77471; font-size:120%;" | SHINKEN
 
 
|-
 
|-
| style="height:210px; width:150px; background-color:#E77471; border-left:solid 5px #E77471; border-bottom:solid 5px #E77471; border-right:solid 5px #E77471;" | Users <br> SSH <br> Add machines
+
| Layout 2 || None || None || CentOS 7 || Metal || Compute Node || 32 GB
|}
 
 
 
 
 
 
 
<br> <br> <br> <br> <br> <br><br> <br> <br> <br> <br> <br>
 
 
 
== Cluster Machines ==
 
 
 
{| style="float:left; margin-right:2px;"
 
| style="height:55px; width:150px; text-align:center; background-color:#0099cc; border-left:solid 5px #0099cc; border-top:solid 5px #0099cc; border-bottom:solid 1px white; border-right:solid 5px      #0099cc; font-size:120%;" | HOPPER
 
 
|-
 
|-
| style="height:300px; width:150px; background-color:#0099cc; border-left:solid 5px #0099cc; border-bottom:solid 5px #0099cc; border-right:solid 5px #0099cc;" | Users <br> SSH <br> NFS server <br> LDAP server <br> [[Sysadmin:Software Modules | Software Modules]] <br> PostgreSQL <br> Wiki <br> Apache2 <br> [[Sysadmin:DNS & DHCP | DNS]] <br> [[Sysadmin:DNS & DHCP | DHCP]]  <br><br> Backup to Indiana: etc, var, cluster
+
| Layout 3 || None || None || CentOS 7 || Metal || Compute Node || 32 GB
|}
 
 
 
{| style="float:left; margin-right:2px;"
 
| style="height:55px; width:150px; text-align:center; background-color:#ffdb4d; border-left:solid 5px #ffdb4d; border-top:solid 5px #ffdb4d; border-bottom:solid 1px white; border-right:solid 5px #ffdb4d; font-size:120%;" | INDIANA
 
 
|-
 
|-
| style="height:300px; width:150px; background-color:#ffdb4d; border-left:solid 5px #ffdb4d; border-bottom:solid 5px #ffdb4d; border-right:solid 5px #ffdb4d;" | [[Indiana Storage Server|New Storage Server]]
+
| Layout 4 || None || None || CentOS 7 || Metal || Compute Node || 32 GB
|}
 
 
 
{| style="float:left; margin-right:2px;"
 
| style="height:55px; width:150px; text-align:center; background-color:#ffdb4d; border-left:solid 5px #ffdb4d; border-top:solid 5px #ffdb4d; border-bottom:solid 1px white; border-right:solid 5px #ffdb4d; font-size:120%;" | DALI
 
 
|-
 
|-
| style="height:300px; width:150px; background-color:#ffdb4d; border-left:solid 5px #ffdb4d; border-bottom:solid 5px #ffdb4d; border-right:solid 5px #ffdb4d;" | Storage Server <br>[[Sysadmin:Gitlab | Gitlab]] <br> Backups <br> NginX <br><br> Backup to Indiana (/media/r10_vol/backups/): etc, var/opt/gitlab/backups
+
| Whedon || 159.28.23.4 || No 10Gb internet|| CentOS 7 || Metal || Head Node || 230 GB
|}
 
 
 
{| style="float:left; margin-right:2px;"
 
| style="height:55px; width:150px; text-align:center; background-color:#ff4d94; border-left:solid 5px #ff4d94; border-top:solid 5px #ff4d94; border-bottom:solid 1px white; border-right:solid 5px #ff4d94; font-size:120%;" | AL-SALAM
 
 
|-
 
|-
| style="height:300px; width:150px; background-color:#ff4d94; border-left:solid 5px #ff4d94; border-bottom:solid 5px #ff4d94; border-right:solid 5px #ff4d94;" | [[WebMO]] <br> [[Sysadmin:Software Modules | Software Modules]] <br> Apache2 <br><br> Backup to Indiana: etc, var
+
| Pollock || 159.28.23.8 || 10.10.10.8 || CentOS 7 || Metal || Large compute server || 131 GB
 
|}
 
|}
 +
CS machines:
 +
bowie.cs.earlham.edu web.cs.earlham.edu auth.cs.earlham.edu code.cs.earlham.edu net.cs.earlham.edu
  
{| style="float:left; margin-right:2px;"
+
Cluster Machines: lovelace.cluster.earlham.edu hopper.cluster.earlham.edu hopperprime.cluster.earlham.edu sakurai.cluster.earlham.edu bronte.cluster.earlham.edu whedon.cluster.earlham.edu pollock.cluster.earlham.edu layout.cluster.earlham.edu monitor.cluster.earlham.edu miyamoto.cluster.earlham.edu
| style="height:55px; width:150px; text-align:center; background-color:#ff4d94; border-left:solid 5px #ff4d94; border-top:solid 5px #ff4d94; border-bottom:solid 1px white; border-right:solid 5px #ff4d94; font-size:120%;" | WHEDON
 
|-
 
| style="height:300px; width:150px; background-color:#ff4d94; border-left:solid 5px #ff4d94; border-bottom:solid 5px #ff4d94; border-right:solid 5px #ff4d94;" | [[Sysadmin:Software Modules | Software Modules]] <br><br> Backups to Indiana: etc, var
 
|}
 
  
{| style="float:left; margin-right:2px;"
 
| style="height:55px; width:150px; text-align:center; background-color:#39ad39; border-left:solid 5px #39ad39; border-top:solid 5px #39ad39; border-bottom:solid 1px white; border-right:solid 5px #39ad39; font-size:120%;" | LAYOUT
 
|-
 
| style="height:300px; width:150px; background-color:#39ad39; border-left:solid 5px #39ad39; border-bottom:solid 5px #39ad39; border-right:solid 5px #39ad39;" | [[Sysadmin:Jupyterhub Notebook Server | Jupyterhub Server]] <br> [[Sysadmin:Software Modules | Software Modules]] <br> NginX <br> Apache2 <br> [[WebMO]] <br><br> Backup to Indiana: etc, var
 
|}
 
  
{| style="float:left; margin-right:2px;"
+
There are 6 machines currently not in use in the 6 spaces above Monitor on the Equatorial Guinea rack.
| style="height:55px; width:150px; text-align:center; background-color:#0099cc; border-left:solid 5px #0099cc; border-top:solid 5px #0099cc; border-bottom:solid 1px white; border-right:solid 5px #0099cc; font-size:120%;" | BRONTE
+
=== Specialized resources ===
|-
 
| style="height:300px; width:150px; background-color:#0099cc; border-left:solid 5px #0099cc; border-bottom:solid 5px #0099cc; border-right:solid 5px #0099cc;" |  [[Sysadmin:Software Modules | Software Modules]] <br><br> Backup to Indiana: etc, var, nbserver
 
|}
 
  
{| style="float:left; margin-right:2px;"
+
Specialized computing applications are supported on the following machines:
| style="height:55px; width:150px; text-align:center; background-color:#0099cc; border-left:solid 5px #0099cc; border-top:solid 5px #0099cc; border-bottom:solid 1px white; border-right:solid 5px #0099cc; font-size:120%;" | POLLOCK
 
|-
 
| style="height:300px; width:150px; background-color:#0099cc; border-left:solid 5px #0099cc; border-bottom:solid 5px #0099cc; border-right:solid 5px #0099cc;" |  [[Sysadmin:Software Modules | Software Modules]] <br> [[WebMO]] <br> NginX <br><br> Backup to Indiana: etc, var
 
|}
 
  
{| style="float:left; margin-right:2px;"
+
* [[Sysadmin:GPGPU|GPUs for AI/ML/data science]]: layout cluster
| style="height:55px; width:150px; text-align:center; background-color:#ffdb4d; border-left:solid 5px #ffdb4d; border-top:solid 5px #ffdb4d; border-bottom:solid 1px white; border-right:solid 5px #ffdb4d; font-size:120%;" | KAHLO
+
* virtualization: smiley
|-
+
* containers: bowie
| style="height:300px; width:150px; background-color:#ffdb4d; border-left:solid 5px #ffdb4d; border-bottom:solid 5px #ffdb4d; border-right:solid 5px #ffdb4d;" | Storage Server <br>Backups <br> NginX <br><br> Backup to Indiana: etc, var
 
|}
 
  
{| style="float:left; margin-right:2px;"
+
== Network ==
| style="height:55px; width:150px; text-align:center; background-color:#0099cc; border-left:solid 5px #0099cc; border-top:solid 5px #0099cc; border-bottom:solid 1px white; border-right:solid 5px #0099cc; font-size:120%;" | BIGFE
 
|-
 
| style="height:300px; width:150px; background-color:#0099cc; border-left:solid 5px #0099cc; border-bottom:solid 5px #0099cc; border-right:solid 5px #0099cc;" |  [[Sysadmin:Software Modules | Software Modules]] <br><br> Hosts BCCD related repositories and distributions.
 
|}
 
  
{| style="float:left; margin-right:2px;"
+
We have two network fabrics linking the machines together. There are three subdomains.
| style="height:55px; width:150px; text-align:center; background-color:#0099cc; border-left:solid 5px #0099cc; border-top:solid 5px #0099cc; border-bottom:solid 1px white; border-right:solid 5px #0099cc; font-size:120%;" | T-VOC
 
|-
 
| style="height:300px; width:150px; background-color:#0099cc; border-left:solid 5px #0099cc; border-bottom:solid 5px #0099cc; border-right:solid 5px #0099cc;" |  [[Sysadmin:Software Modules | Software Modules]]
 
|}
 
  
{| style="float:left; margin-right:2px;"
+
=== 10 Gb ===
| style="height:55px; width:150px; text-align:center; background-color:#0099cc; border-left:solid 5px #0099cc; border-top:solid 5px #0099cc; border-bottom:solid 1px white; border-right:solid 5px #0099cc; font-size:120%;" | ELWOOD
 
|-
 
| style="height:300px; width:150px; background-color:#0099cc; border-left:solid 5px #0099cc; border-bottom:solid 5px #0099cc; border-right:solid 5px #0099cc;" |  [[Sysadmin:Software Modules | Software Modules]] <br> <br> Used by BCCD to host www.bccd.net and www.littlefe.net. Will be deprecated when BCCD project offloads their sites onto cloud-based hosting platforms.
 
|}
 
  
{| style="float:left; margin-right:2px;"
+
We have a 10Gb fabric for mounting files over NFS. Machines with 10Gb support have an IP address in the /24 subnet 10.10.10.0/24, and we would like to add DNS entries for these addresses.
| style="height:55px; width:150px; text-align:center; background-color:#ff4d94; border-left:solid 5px #ff4d94; border-top:solid 5px #ff4d94; border-bottom:solid 1px white; border-right:solid 5px #ff4d94; font-size:120%;" | krasner
 
|-
 
| style="height:300px; width:150px; background-color:#ff4d94; border-left:solid 5px #ff4d94; border-bottom:solid 5px #ff4d94; border-right:solid 5px #ff4d94;" | [[Docker]] platform on an old lovelace machine upgraded to have 16GB of RAM.
 
|}
 
  
 +
=== 1 Gb (cluster, cs) ===
  
<br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br>
+
We have two /24 subnets on the 1Gb fabric: 159.28.22.0/24 (CS) and 159.28.23.0/24 (cluster). This means we have twice as many IP addresses on the 1Gb fabric as on the 10Gb fabric.
  
== Switches ==
+
Any user accessing *.cluster.earlham.edu or *.cs.earlham.edu is making calls on the 1Gb network.
  
 +
=== Intra-cluster fabrics ===
  
 +
The layout cluster has an InfiniBand infrastructure. Wachowski has only a 1Gb infrastructure.
  
{| style="float:left; margin-right:2px;"
+
== Power ==
| style="height:55px; width:175px; text-align:center; background-color:#0099cc; border-left:solid 5px #0099cc; border-top:solid 5px #0099cc; border-bottom:solid 1px white; border-right:solid 5px      #0099cc; font-size:120%;" | SG538SF02J
 
|-
 
| style="height:200px; width:175px; background-color:#0099cc; border-left:solid 5px #0099cc; border-bottom:solid 5px #0099cc; border-right:solid 5px #0099cc; font-size:80%;" |
 
*Model: HP Procurve 3400cl
 
*Ports: 24
 
*Backplane bandwidth:
 
**88 Gbps
 
**64 million pps
 
*Memory:
 
**2MB packet buffer
 
**16 MB dual flash
 
**128 MB SDRAM
 
*Cut-through switching: No
 
*Unused as of May 12, 2017
 
|}
 
  
{| style="float:left; margin-right:2px;"
+
We have a backup power supply, with batteries last upgraded in 2019 (?). We’ve had a few outages since then and power has held up well.
| style="height:55px; width:175px; text-align:center; background-color:#ffdb4d; border-left:solid 5px #ffdb4d; border-top:solid 5px #ffdb4d; border-bottom:solid 1px white; border-right:solid 5px #ffdb4d; font-size:120%;" | CN63FP762S
 
|-
 
| style="height:200px; width:175px; background-color:#ffdb4d; border-left:solid 5px #ffdb4d; border-bottom:solid 5px #ffdb4d; border-right:solid 5px #ffdb4d;font-size:80%;" |
 
*Model: HP 2530-24G
 
*Ports: 24
 
*Switching Capacity:
 
**56 Gbps
 
**41.6 million pps
 
*Memory:
 
**1.5 MB packet buffer
 
**256 MB  flash
 
**128 MB DDR3 DIMM
 
*Cut-through switching: No
 
*Connected to Al-Salam as of May 12, 2017
 
|}
 
  
{| style="float:left; margin-right:2px;"
+
== HVAC ==
| style="height:55px; width:175px; text-align:center; background-color:#ff4d94; border-left:solid 5px #ff4d94; border-top:solid 5px #ff4d94; border-bottom:solid 1px white; border-right:solid 5px #ff4d94; font-size:120%;" | SG525SG025
 
|-
 
| style="height:200px; width:175px; background-color:#ff4d94; border-left:solid 5px #ff4d94; border-bottom:solid 5px #ff4d94; border-right:solid 5px #ff4d94;font-size:80%;" |
 
*Model: HP Procurve 3400cl
 
*Ports: 24
 
*Backplane bandwidth:
 
**88 Gbps
 
**64 million pps
 
*Memory:
 
**2MB packet buffer
 
**16 MB dual flash
 
**128 MB SDRAM
 
*Cut-through switching: No
 
*Connected to layout and whedon as of May 12, 2017
 
|}
 
  
{| style="float:left; margin-right:2px;"
+
HVAC systems are static and are largely managed by Facilities.
| style="height:55px; width:175px; text-align:center; background-color:#39ad39; border-left:solid 5px #39ad39; border-top:solid 5px #39ad39; border-bottom:solid 1px white; border-right:solid 5px #39ad39; font-size:120%;" | Netgear JGS524
 
|-
 
| style="height:200px; width:175px; background-color:#39ad39; border-left:solid 5px #39ad39; border-bottom:solid 5px #39ad39; border-right:solid 5px #39ad39;font-size:80%;" |
 
*Current cluster head-node
 
*Unmanaged (no console/configuration)
 
*Ports: 24
 
*Switching bandwidth:
 
**48 Gbps
 
**1.5 million pps
 
*Memory:
 
**2MB packet buffer
 
*Cut-through switching: No
 
*Connected to Al-Salam, Hopper, Pollock, Nagios, Dali, Kahlo, Bronte as of May 12, 2017
 
|}
 
  
{| style="float:left; margin-right:2px;"
+
[[Topology|See full topology diagrams here.]]
| style="height:55px; width:175px; text-align:center; background-color:#E77471; border-left:solid 5px #E77471; border-top:solid 5px #E77471; border-bottom:solid 1px white; border-right:solid 5px #E77471; font-size:120%;" | cs-main
 
|-
 
| style="height:200px; width:175px; background-color:#E77471; border-left:solid 5px #E77471; border-bottom:solid 5px #E77471; border-right:solid 5px #E77471;font-size:80%;" |
 
*Model: HP 5920AF-24XG
 
*Ports: 24
 
*Backplane bandwidth:
 
**480 Gbps
 
**367 million pps
 
*Memory:
 
**3.6 GB packet buffer
 
**256 MB dual flash
 
**2 GB SDRAM
 
*Cut-through switching: Yes
 
*IP Address: 159.28.31.66
 
*Connected to layout, kahlo, and dali as of May 12, 2017
 
|}
 
  
{| style="float:left; margin-right:2px;"
+
[[Sysadmin:Layers of abstraction for filesystems|A word about what's happening between files and the drives they live on.]]
| style="height:55px; width:175px; text-align:center; background-color:#ADDFFF; border-left:solid 5px #ADDFFF; border-top:solid 5px #ADDFFF; border-bottom:solid 1px white; border-right:solid 5px #ADDFFF; font-size:120%;" | 5500denniscs-sw1
 
|-
 
| style="height:200px; width:175px; background-color:#ADDFFF; border-left:solid 5px #ADDFFF; border-bottom:solid 5px #ADDFFF; border-right:solid 5px #ADDFFF;font-size:80%;" |
 
*Model: HP 5500 JG542A
 
*Ports: 24
 
*Backplane bandwidth:
 
**224 Gbps
 
**166.6 million pps
 
*Memory:
 
**6 MB packet buffer
 
**512 MB dual flash
 
**1 GB SDRAM
 
*Cut-through switching: No
 
*IP Address: 159.28.31.67
 
*Connected to Babbage, Bowie, Nagios, and the cluster's netgear switch (via port 14) as of May 12, 2017
 
|}
 
  
<br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br><br>
 
  
= Systems Administration Documentation =
+
= New sysadmins =
For old documentation, see: [[Sysadmin:Old | Old Wiki Information]]
 
  
{|
+
These pages will be helpful for you if you're just starting in the group:
|- valign:"top"
 
|
 
<div style="border:10px solid #E0EAF8; padding:5px; width:230px; height:500px">
 
<div style="background-color:#CEDEF4; padding:5px;">
 
  
=== Admin Tasks ===
+
* [[Sysadmin:New Sysadmins | Welcoming a new sysadmin ]]
</div>
+
* [[Sysadmin:Troubleshooting|General troubleshooting tips for admins]]
* [[Sysadmin:Monitoring | Monitoring ]]
+
* [[Sandbox Notes|Sandbox Notes]]
* [[Sysadmin:Upgrading SSL Certificate | Upgrading SSL Certificates ]]
 
* [[Sysadmin:User Management | User Management]]
 
* [[Modules | Installing software under modules ]]
 
* [[Sysadmin:Backup|Backup]]
 
* [[Sysadmin:Contacting all users|Contacting all users]]
 
* [[Sysadmin:New Sysadmins | Welcoming a new sysadmin to the fold]]
 
* [[Sysadmin:AddComputer|Add a computer]]
 
* [[Sysadmin:Setting up Lovelace Lab Machines | Setting up Lovelace Lab Machines]]
 
* [[Reset password]]
 
* [[Senior projects]]
 
* [[ShutdownProcedure| Shutdown and Boot up]]
 
* [[Sysadmin:ImportantInfo:SSLcerts| Generating SSL Certificates]]
 
 
* [[Password managers]]
 
 +
* [[Server safety]]
 +
* [https://code.cs.earlham.edu/sysadmin/ticket-tracker Ticket tracking for current projects]
  
 +
Note: you'll need to log in with wiki credentials to see most Sysadmin pages.
  
<!-- This has to stay as part of the formatting -->
+
= Additional information =
</div>
 
| style="float:left;" |
 
|
 
<div style="border:10px solid #FFDFFF; padding:5px; width:230px; height:500px;">
 
<div style="background-color:#FFCEFF; padding:5px;">
 
  
=== Services ===
+
These pages contain a lot of the most important information about our systems and how we operate.
</div>
+
 
* [[Sysadmin:Services:ClusterOverview|Cluster Overview]]
+
===Technical docs===
* [[Sysadmin:Services:Apache2|Apache2]]
+
 
 +
* [https://code.cs.earlham.edu/sysadmin/ticket-tracker Ticket tracking for current projects]
 +
* [[Server safety]]
 +
* [[Sysadmin:Backup|Backup]]
 +
* [[Sysadmin:Monitoring | Monitoring ]]
 +
* [[Sysadmin:SSH|SSH info relevant to admins]]
 +
* [[Sysadmin:User Management | User Management]] and [[Sysadmin:LDAP|LDAP]] generally
 +
* [[Sysadmin:Jupyterhub Notebook Server|Jupyterhub]] and [[Nbgrader notes|NBGrader]]
 +
* [[Sysadmin:MailStack|Email service]]
 +
* [[Sysadmin:XenDocs | Xen Server]]
 +
* [[Sysadmin:NFS|Network File System (NFS)]]
 +
* [[Sysadmin:Web Servers|Web Servers and Websites]]
 
* [[Sysadmin:Services:Databases|Databases]]
 
 
* [[Sysadmin:DNS & DHCP|DNS and DHCP]]
 
* [[Sysadmin:Services:Virtualization | Virtualization]]
+
* [[Sysadmin:AWS|AWS]]
* [[Sysadmin:Services:XenServerSetup | Xen Server]]
+
* [[Bash_start_up_script|Bash startup scripts]]
 +
* [[Sysadmin:VirtualBox | VirtualBox]]
 +
* [[X Applications]]
 +
* [[Sysadmin:Services:ClusterOverview|Cluster Overview]] and [[Sysadmin:Ccg-admin|additional details]]
 +
* [[Sysadmin:Firewall|Firewall]] running on babbage.cs.e.e
  
|}
+
===Common tasks===
 
+
* [[Sysadmin:Recurring Tasks | Recurring tasks - e.g. software updates, hardware replacements]]
== Current Projects ==
+
* [[Sysadmin:Contacting all users|Contacting all users]]
This is the list we will work from in addition to service requests.
+
* [[Reset password]]
 
+
* [[Sysadmin:Software installation | Software installation]]
Some important procedural pages:
+
* [[Modules | Installing software under modules ]]  
* Use the [[Sysadmin/New_Task_Template|Sysadmin task template]] if you're starting a new project. Copy and paste the wiki source page and populate the basic fields.
+
* [[Sysadmin:AddComputer|Add a computer to CS or cluster domains]]
* This is a new way of [[Sysadmin/Task_Process|task processing]]. It's subject to change.
+
* [[Senior projects|Supporting senior projects]]
* You can see our [[:Category:Open Tasks|Open Tasks here]].
+
* [[ShutdownProcedure|How to do a planned shutdown and reboot of the system]]
* We will also start filling up the [[:Category:Closed Tasks|Closed Tasks category]].
+
** [[Sysadmin:TestingServices | Testing services]] (after a reboot, upgrade, change in the phase of the moon, etc.)
 
+
* [[Sysadmin:Upgrading SSL Certificate | Upgrading SSL Certificates ]]
Please update specific projects at their own page.
+
* [[Sysadmin:Launch at startup|Launch a process at startup]]
* [[Documentation]] - a meta-project - please click here to update projects you've worked on but haven't documented yet
 
* [[Web logins]]
 
* [[Password management]]
 
* [[Docker and WebODM on Bronte]]
 
* [[Fix shinken server access]]
 
* Backup in Lilly basement
 
* [[Backup on all machines]] - includes backup.cs.e.e (indiana?)
 
* [[Post shutdown]]
 
* power map additions and updates - update the power map and get as much of our power data (how much we use, how much we could theoretically be using, etc.) all together
 
* clean up system variables, e.g. verifying that all our systems have Python 2 as a default not Python 3, using Charlie’s earlier email about cexecs as a starting point (in your inbox or the mailing list, search for “A bit of cleaning on the cluster side”)
 
* install and configure RT
 
* install and configure bioinformatics software (a focus for Laurence but one that others may want to look at) - includes making sure qsub works on both bronte and pollock
 
  
Smaller projects
+
===Group and institution information===
* update our DNS files (and thus our store of reserved IP addresses) based on which machines are currently in use (a good chance to learn DNS, both generally and how we run it here) - cf. [[Verify Lovelace DNS]]
+
* [[Sysadmin:CS-ITS Interoperability|Working with ITS]]
* [[Layout Layout]]
+
* [[Sysadmin:Recurring spending | Recurring spending ]]
* [[Fix Lovelace machines]]
+
* [[Sysadmin:SlackAndGitLab | Slack and GitLab integration]]
* Fix man pages (on each machine, check that man pages come up as expected - e.g. run `man ls` - and fix them if not)
 
* Double-check our time servers.
 

Latest revision as of 15:22, 13 December 2021

This is the hub for the CS sysadmins on the wiki.

Overview

If you're visually inclined, we have a colorful and easy-to-edit map of our servers here!

Server room

Our servers are in Noyes, the science building that predates the CST. For general information about the server room and how to use it, check out this page.

Columns: machine name, IPs, type (virtual, metal), purpose, dies, cores, RAM

Compute (servers and clusters)

{| class="wikitable"
|+ CS machines and cluster machines
|-
! Machine name !! 159 IP address !! 10Gb IP address !! Operating System !! Metal or Virtual !! Description !! RAM
|-
| Bowie || 159.28.22.5 || 10.10.10.15 || Debian 9 || Metal || hosts and exports user files; Jupyterhub; landing server || 198 GB
|-
| Smiley || 159.28.22.251 || 10.10.10.252 || Ubuntu 18.04 || Metal || VM host, not accessible to regular users
|-
| Web || 159.28.22.2 || 10.10.10.200 || Ubuntu 18.04 || Virtual || Website host || 8 GB
|-
| Auth || 159.28.22.39 || No 10Gb internet || CentOS 7 || Virtual || host of LDAP user database || 4 GB
|-
| Code || 159.28.22.42 || 10.10.10.42 || Ubuntu 18.04 || Virtual || Gitlab host || 4 GB
|-
| Net || 159.28.22.1 || 10.10.10.100 || Ubuntu 18.04 || Virtual || network administration host for CS || 4 GB
|-
| Lovelace || 159.28.23.35 || 10.10.10.35 || CentOS 7 || Metal || Large compute server || 96 GB
|-
| Hopper || 159.28.23.1 || 10.10.10.1 || Debian 10 || Metal || landing server, NFS host for cluster || 64 GB
|-
| Sakurai || 159.28.23.3 || 10.10.10.3 || Debian 10 || Metal || Runs Backup || 12 GB
|-
| Miyamoto || 159.28.23.45 || No 10Gb currently || Debian 10 || Metal || Runs Backup || 16 GB
|-
| HopperPrime || 159.28.23.142 || 10.10.10.142 || Debian 10 || Metal || Runs Backup || 16 GB
|-
| Monitor || 159.28.23.250 || No 10Gb internet || CentOS 7 || Metal || Server Monitoring || 16 GB
|-
| Bronte || 159.28.23.140 || No 10Gb internet || CentOS 7 || Metal || Large compute server || 115 GB
|-
| Layout 0 || 159.28.23.2 || 10.10.10.2 || CentOS 7 || Metal || Head Node || 32 GB
|-
| Layout 1 || None || None || CentOS 7 || Metal || Compute Node || 32 GB
|-
| Layout 2 || None || None || CentOS 7 || Metal || Compute Node || 32 GB
|-
| Layout 3 || None || None || CentOS 7 || Metal || Compute Node || 32 GB
|-
| Layout 4 || None || None || CentOS 7 || Metal || Compute Node || 32 GB
|-
| Whedon || 159.28.23.4 || No 10Gb internet || CentOS 7 || Metal || Head Node || 230 GB
|-
| Pollock || 159.28.23.8 || 10.10.10.8 || CentOS 7 || Metal || Large compute server || 131 GB
|}

CS machines: bowie.cs.earlham.edu web.cs.earlham.edu auth.cs.earlham.edu code.cs.earlham.edu net.cs.earlham.edu

Cluster Machines: lovelace.cluster.earlham.edu hopper.cluster.earlham.edu hopperprime.cluster.earlham.edu sakurai.cluster.earlham.edu bronte.cluster.earlham.edu whedon.cluster.earlham.edu pollock.cluster.earlham.edu layout.cluster.earlham.edu monitor.cluster.earlham.edu miyamoto.cluster.earlham.edu
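If you want to work with the machine list programmatically, here is a minimal sketch of one way to model it. It only encodes a handful of rows copied from the table above; the `Machine` class, the `MACHINES` list, and the `without_10gb` helper are hypothetical names made up for this illustration, not something that exists on our systems.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Machine:
    """One row of the machine table (illustrative model, not an existing tool)."""
    name: str
    domain: str              # "cs" or "cluster"
    ip_1gb: str
    ip_10gb: Optional[str]   # None when the table lists no 10Gb address

    @property
    def fqdn(self) -> str:
        # Hostnames follow the <name>.<domain>.earlham.edu pattern listed above.
        return f"{self.name.lower()}.{self.domain}.earlham.edu"


# A few rows taken from the table; extend as needed.
MACHINES = [
    Machine("Bowie", "cs", "159.28.22.5", "10.10.10.15"),
    Machine("Auth", "cs", "159.28.22.39", None),
    Machine("Hopper", "cluster", "159.28.23.1", "10.10.10.1"),
    Machine("Miyamoto", "cluster", "159.28.23.45", None),
    Machine("Pollock", "cluster", "159.28.23.8", "10.10.10.8"),
]


def without_10gb(machines: List[Machine]) -> List[str]:
    """Return the hostnames of machines that have no 10Gb address."""
    return [m.fqdn for m in machines if m.ip_10gb is None]


if __name__ == "__main__":
    for host in without_10gb(MACHINES):
        print(host)
```

Keeping the 10Gb address optional makes it easy to list which machines would need work before they could use the 10Gb fabric.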


There are 6 machines currently not in use in the 6 spaces above Monitor on the Equatorial Guinea rack.

Specialized resources

Specialized computing applications are supported on the following machines:

* GPUs for AI/ML/data science: layout cluster
* virtualization: smiley
* containers: bowie

Network

We have two network fabrics linking the machines together. There are three subdomains.

10 Gb

We have a 10Gb fabric for mounting files over NFS. Machines with 10Gb support have an IP address in the /24 subnet 10.10.10.0/24, and we would like to add DNS entries for these addresses.

1 Gb (cluster, cs)

We have two /24 subnets on the 1Gb fabric: 159.28.22.0/24 (CS) and 159.28.23.0/24 (cluster). This means we have twice as many IP addresses on the 1Gb fabric as on the 10Gb fabric.

Any user accessing *.cluster.earlham.edu or *.cs.earlham.edu is making calls on the 1Gb network.
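To make the address arithmetic concrete, here is a small sketch using Python's standard `ipaddress` module. The three subnets are the ones given in this section; the function name `fabric_for` and the labels it returns are assumptions made up for this example.

```python
import ipaddress

# Subnets as stated in the Network section.
TEN_GB = ipaddress.ip_network("10.10.10.0/24")
CS_1GB = ipaddress.ip_network("159.28.22.0/24")
CLUSTER_1GB = ipaddress.ip_network("159.28.23.0/24")


def fabric_for(address: str) -> str:
    """Return a label (hypothetical naming) for the fabric an address is on."""
    ip = ipaddress.ip_address(address)
    if ip in TEN_GB:
        return "10Gb"
    if ip in CS_1GB:
        return "1Gb (cs)"
    if ip in CLUSTER_1GB:
        return "1Gb (cluster)"
    return "unknown"


# Each /24 holds 256 addresses, so the two 1Gb subnets together
# hold twice as many addresses as the single 10Gb subnet.
one_gb_total = CS_1GB.num_addresses + CLUSTER_1GB.num_addresses
ten_gb_total = TEN_GB.num_addresses

if __name__ == "__main__":
    print(fabric_for("159.28.22.5"))   # Bowie's 1Gb address
    print(fabric_for("10.10.10.1"))    # Hopper's 10Gb address
    print(one_gb_total, ten_gb_total)
```

A check like this can also flag table entries whose address falls outside all three subnets, which usually points to a typo.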

Intra-cluster fabrics

The layout cluster has an InfiniBand infrastructure. Wachowski has only a 1Gb infrastructure.

Power

We have a backup power supply, with batteries last upgraded in 2019 (?). We’ve had a few outages since then and power has held up well.

HVAC

HVAC systems are static and are largely managed by Facilities.

See full topology diagrams here.

A word about what's happening between files and the drives they live on.


New sysadmins

These pages will be helpful for you if you're just starting in the group:

Note: you'll need to log in with wiki credentials to see most Sysadmin pages.

Additional information

These pages contain a lot of the most important information about our systems and how we operate.

Technical docs

Common tasks

Group and institution information