Difference between revisions of "Sysadmin"

From Earlham CS Department
(Sysadmin 2014 to do list:)
 
(621 intermediate revisions by 21 users not shown)
Line 1: Line 1:
__NOTOC__
+
This is the hub for the CS sysadmins on the wiki.
  
== Sysadmin Responsibilities ==
+
= Overview =
This is the basic list of tasks that Earlham CS system administrators are in charge of.
 
  
{| class="wikitable"
+
[https://docs.google.com/drawings/d/1XaULz5IxXV_BZQjrko3QJ8wV5aXsSTYcSWxxT49OyZk/edit If you're visually inclined, we have a colorful and easy-to-edit map of our servers here!]
|-
 
! Responsibilities !! Wilson !! Eamon
 
|-
 
| Install software on Debian (ACL) || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div> ||  <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div>
 
|-
 
| Install software on FreeBSD (servers) || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div> ||  <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div>
 
|-
 
| Make a CS user account || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div> || 
 
|-
 
| Change a user's CS password || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div> ||
 
|-
 
| Add DNS & DHCP entry || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div> ||
 
|-
 
| Being able to edit wiki || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div> || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div>
 
|-
 
| Make a CS wiki account || || 
 
|-
 
| Add people to different groups (ldap) || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div> || 
 
|-
 
| Modification and maintenance of Nagios ||  ||
 
|-
 
| DD a new ACL image || || 
 
|-
 
| Set up a new ACL || ||
 
|-
 
| Shut down / start up of the entire machine room || ||
 
|-
 
| Creating and configuring mailing lists (electron) || ||
 
|-
 
| Admin list moderating || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div> || 
 
|-
 
| Backups and restore (bacula) || ||
 
|-
 
| Create and configure jails  || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div> ||
 
|-
 
| VMware ||  ||
 
|}
 
  
== Sysadmin basic Training ==
+
== Server room ==
This is the list of skills that our system administrators are trained in during their orientation.
 
  
{| class="wikitable"
+
Our servers are in Noyes, the science building that predates the CST. For general information about the server room and how to use it, check out [[Sysadmin:Server Room|this page]].
|-
 
! Training Sections !! Wilson !! Eamon
 
|-
 
| | || 
 
|-
 
| Installing operating systems (Debian and FreeBSD), including single-user mode || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div> || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div>
 
|-
 
| Installing packages || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div> || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div>
 
|-
 
| *nix Filesystem layout ||  <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div> || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div>
 
|-
 
| Command line tools including I/O redirections and pipes ||  <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div> || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div>
 
|-
 
| TCP,  UDP and ICMP packets, including 3-way handshake || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div> || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div>
 
|-
 
| Ports ||  <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div> || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div>
 
|-
 
| DNS || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div> || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div>
 
|-
 
| DHCP  || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div> || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div>
 
|-
 
| Network debugging tools (tcpdump, ping, traceroute, netstat) || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div> || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div>
 
|-
 
| Simple shell scripting || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div> ||  <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div>
 
|-
 
| Jails || <div style="text-align: center;"> [[File:StarIconBronze.png|20px]] </div> || 
 
|}
 
  
== Sysadmin 2014 to do list: ==
+
Columns: machine name, IPs, type (virtual, metal), purpose, dies, cores, RAM
  
* () Spam Filter (CP, JR)
+
== Compute Resources ==
* () new-proto from the outside world
 
* (I) check hydra with Charlie
 
* Getting rid of Quark <br />
 
* (?) On ACLs login disappears after pressing cancel (Reported JR)
 
* ACL screensaver, leave only lightweight? (JR)
 
* Removing mailman from quark <br />
 
* Mailman Heather (not all of them accepted the changes electron to cs.earlham.edu) <br />
 
* (W) Script for changing users password
 
* (W) Script for changing and adding groups
 
* (W) Brushing the add a user script
 
* (W) Check machines chooser can choose from
 
* (E) Script that will send an e-mail to all people
 
* (E) Improve nagios settings
 
* (W) Can we change Kristin Muterspaw's CS username (from kmmuter11 to buzzlightyear)?
 
  
** Done
+
[https://wiki.cs.earlham.edu/index.php/Sysadmin:Computer_Resources Machine- and VM-related information here!]
* Wireshark should be run only by 410 students (Reported JR) (Fixed)
 
* Re-imaging ENI machine (ACL21) <br />
 
* () DNS troubles (Reported CP)
 
* (I) fab lab list (HL)
 
* (I) Hassan, SSH Trouble to Electron (Hassan, JR)
 
  
 +
== Network ==
  
 +
We have two network fabrics linking the machines together. There are three subdomains.
  
 +
=== 10 Gb ===
  
 +
We have a 10Gb fabric for mounting files over NFS. Machines with 10Gb support have an IP address in the private range 10.10.10.0/24 (a class C-sized block), and we want to add DNS entries for these addresses.
  
 +
=== 1 Gb (cluster, cs) ===
  
 +
We have two /24 (class C-sized) subnets on the 1Gb fabric: 159.28.22.0/24 (CS) and 159.28.23.0/24 (cluster). This means we have twice as many IP addresses on the 1Gb fabric as on the 10Gb fabric.
  
 +
Any user accessing *.cluster.earlham.edu and *.cs.earlham.edu is making calls on a 1Gb network.
  
 +
=== Intra-cluster fabrics ===
  
'''Documentation:'''
+
The layout cluster has an InfiniBand infrastructure. Wachowski has only a 1Gb infrastructure.
  
Wilson:
+
== Power ==
* DNS & DHCP (done)
 
* Sage (done)
 
* Add User (done)
 
* Add/change group
 
* Password change
 
* Firewall
 
  
Eamon:
+
We have a backup power supply, with batteries last upgraded in 2019 (?). We’ve had a few outages since then and power has held up well.
* Cups
 
* PSSH
 
  
Ivan
+
== HVAC ==
* Cloning ACL box
 
  
== Systems Administration Documentation ==
+
HVAC systems are static and are largely managed by Facilities.
  
{|
+
[[Topology|See full topology diagrams here.]]
|- valign="top"
 
|
 
<div style="border:10px solid #E3E0FA; padding:5px">
 
<div style="background-color:#D7D1F8; padding:5px;">
 
=== Works in Progress ===
 
</div>
 
  
* [[Sysadmin:todo13|To do before Fall 13 starts]]
+
[[Sysadmin:Layers of abstraction for filesystems|A word about what's happening between files and the drives they live on.]]
* [[Sysadmin:handbook|Handbook (WIP)]]
 
* [[Sysadmin:Temporary Page | Temporary Page for Wiki Adjustment]]
 
* [[Sysadmin: Upgrading FreeBSD | Upgrading FreeBSD]]
 
* [[Sysadmin:Fail2Ban on FreeBSD | Fail2Ban on FreeBSD]]
 
* [[Sysadmin:Running Nessus | Running Nessus]]
 
* [[Sysadmin:SrvcCheck|Things to check when things go down]]
 
* [[Sysadmin:AaronsHowTo| Aaron's How-To Pages]]
 
* [[Sysadmin:Sonresources| Son's "Cook" Pages]]
 
* [[Sysadmin:Installing ACLs]]
 
  
<!-- This has to stay as part of the formatting -->
+
= New sysadmins =
</div>
 
| style="width:50px;" |
 
|
 
<div style="border:10px solid #E0EAF8; padding:5px;">
 
<div style="background-color:#CEDEF4; padding:5px;">
 
  
=== Admin Tasks ===
+
These pages will be helpful for you if you're just starting in the group:
</div>
 
  
* [[Sysadmin:NEWVirtualbox|NEW Virtualbox]]
+
* [[Sysadmin:New Sysadmins | Welcoming a new sysadmin ]]
* [[Sysadmin:NEWUser Management|NEW User Management]]
+
* [[Sysadmin:Troubleshooting|General troubleshooting tips for admins]]
* [[Sysadmin: NEWcupssetup|NEW CUPS/Printer Administration]]
+
* [[Sandbox Notes|Sandbox Notes]]
* [[Sysadmin: NEWAddComputer|NEW Add a computer]]
+
* [[Password managers]]
* [[Sysadmin:NEWStart/Shutdown|NEW Shutdown/Start]]
+
* [[Server safety]]
* [[Sysadmin:NEWMailman|NEW Mailman]]
+
* [https://code.cs.earlham.edu/sysadmin/ticket-tracker Ticket tracking for current projects]
* [[Sysadmin:NEWNagios|NEW Nagios]]
 
* [[Sysadmin:Backup|Backup]] (needs to be updated after new setup)
 
* [[Sysadmin:Contacting all users|Contacting all users]]
 
* [[Sysadmin:New Sysadmins|Welcoming a new sysadmin to the fold]]
 
* [[Sysadmin:RT Ticketing|RT Ticketing]]
 
* [[Sysadmin:AddComputer|Add a computer]]
 
  
 +
Note: you'll need to log in with wiki credentials to see most Sysadmin pages.
  
<!-- This has to stay as part of the formatting -->
+
= Additional information =
</div>
 
|}
 
  
 +
These pages contain a lot of the most important information about our systems and how we operate.
  
{|
+
===Handy Tools===
|- valign="top"
+
* [http://monitor.cluster.earlham.edu:8088/packages Porter's Package Explorer]
|
 
  
<div style="border:10px solid #FFDFFF; padding:5px;">
+
===Technical docs===
<div style="background-color:#FFCEFF; padding:5px;">
 
  
=== Services ===
+
* [https://code.cs.earlham.edu/sysadmin/ticket-tracker Ticket tracking for current projects]
</div>
+
* [[Server safety]]
* [[Sysadmin:Services:Apache2|Apache2]]
+
* [[Sysadmin:Backup|Backup]]
 +
* [[Sysadmin:Monitoring | Monitoring ]]
 +
* [[Sysadmin:SSH|SSH info relevant to admins]]
 +
* [[Sysadmin:User Management | User Management]] and [[Sysadmin:LDAP|LDAP]] generally
 +
* [[Sysadmin:Jupyterhub Notebook Server|Jupyterhub]] and [[Nbgrader notes|NBGrader]]
 +
* [[Sysadmin:MailStack|Email service]]
 +
* [[Sysadmin:XenDocs | Xen Server]]
 +
* [[Sysadmin:NFS|Network File System (NFS)]]
 +
* [[Sysadmin:Web Servers|Web Servers and Websites]]
 
* [[Sysadmin:Services:Databases|Databases]]
 
* [[Sysadmin:Services:Databases|Databases]]
* [[Sysadmin:Services:DNS and DHCP|NEW DNS and DHCP]]
+
* [[Sysadmin:DNS & DHCP|DNS and DHCP]]
* [[Sysadmin:Services:Email|Email]]
+
* [[Sysadmin:AWS|AWS]]
* [[Sysadmin:Services:LVM|LVM]]
+
* [[Bash_start_up_script|Bash startup scripts]]
* [[Sysadmin:User Management|User Management]]
+
* [[Sysadmin:VirtualBox | VirtualBox]]
* [[Sysadmin:positron|NFS]]
+
* [[X Applications]]
* [[Sysadmin:Services:Printers|Printers]]
+
* [[Sysadmin:Services:ClusterOverview|Cluster Overview]] and [[Sysadmin:Ccg-admin|additional details]]
* [[Sysadmin:services:Sage|NEW Sage]]
+
* [[Sysadmin:Firewall|Firewall]] running on babbage.cs.e.e
* [[Sysadmin:Services:SystemImager|System Imager]]
+
* [[Sysadmin:Setting_up_Lovelace_Lab_Machines|Setting up Lab Machines]]
* [[Sysadmin:Services:TracSVN|Trac + svn]]
 
* [[Sysadmin:Services:Virtualization | Virtualization]]
 
* [[Sysadmin:Services:ZFS | ZFS]]
 
  
<!-- This has to stay as part of the formatting -->
+
===Common tasks===
</div>
+
* [[Sysadmin:Recurring Tasks | Recurring tasks - e.g. software updates, hardware replacements]]
| style="width:50px;" |
+
* [[Sysadmin:Contacting all users|Contacting all users]]
|
+
* [[Reset password]]
 
+
* [[Sysadmin:Software installation | Software installation]]
<div style="border:10px solid #DBF0F7; padding:5px;">
+
* [[Modules | Installing software under modules ]]  
<div style="background-color:#C9EAF3; padding:5px;">
+
* [[Sysadmin:AddComputer|Add a computer to CS or cluster domains]]
 
+
* [[Senior projects|Supporting senior projects]]
=== Servers ===
+
* [[ShutdownProcedure|How to do a planned shutdown and reboot of the system]]
</div>
+
** [[Sysadmin:TestingServices | Testing services]] (after a reboot, upgrade, change in the phase of the moon, etc.)
* [[Sysadmin:PhysicalServers | Physical Servers]]
+
* [[Sysadmin:Upgrading SSL Certificate | Upgrading SSL Certificates ]]
* [[Sysadmin:VirtualServersAndJails | Virtual Servers and Jails]]
+
* [[Sysadmin:Launch at startup|Launch a process at startup]]
* [[Sysadmin:SvcChart|Service Chart]]
+
* [[Sysadmin:Psql-setup | Set up psql for CS430 students]]
* [[Sysadmin:Monitoring|Monitoring]]
 
* [[Sysadmin:Quark | Quark]]
 
* [[Sysadmin:Forty-Two | Forty-two]]
 
* [[Sysadmin:Lovelace | Lovelace]]
 
* [[Sysadmin:Proto | Proto]]
 
* [[Sysadmin:RetiredServers | Retired Servers]]
 
 
 
<!-- This has to stay as part of the formatting -->
 
</div>
 
| style="width:50px;" |
 
|
 
<div style="border:10px solid #FFFFC8; padding:5px;">
 
<div style="background-color:#FFFFB5; padding:5px;">
 
 
 
=== ACL Workstations ===
 
</div>
 
* [[Sysadmin:ACL:Installation|ACL Installation procedure]]
 
* [[Sysadmin:AclImage|ACL Package Information]]
 
* [[Sysadmin:Acl Locations|ACL Locations]]
 
* [[Sysadmin:Software for Chemistry ACLs|Software for Chemistry ACLs]]
 
* [[Sysadmin:ACL:UpProp|Proposed ACL Update policy]]
 
* [[Sysadmin:ACLParallelCommands|Run a command on all ACLs]]
 
 
 
 
 
<!-- This has to stay as part of the formatting -->
 
</div>
 
|}
 
 
 
 
 
{|
 
|- valign="top"
 
|
 
<div style="border:10px solid #D6F8DE; padding:5px;">
 
<div style="background-color:#BDF4CB; padding:5px;">
 
 
 
=== Networking ===
 
</div>
 
* [[Sysadmin:Networking:NetworkLayout|Network Layout (as of 08/2006)]]
 
* [[Sysadmin:Networking:D224 cable plant|D224 cable plant]]
 
* [[Sysadmin:Networking:Fiber plans|Fiber plans]]
 
* [[Sysadmin:Networking:Switches|Switches]]
 
* [[Sysadmin:Networking:Rack notes|Rack notes]]
 
* [[Sysadmin:Networking:Public|Public Network]]
 
* [[Sysadmin:Networking:NetworkTopo|Old Network Topo Figures]]
 
* [[Sysadmin:Networking:NetworkDiagram|Network layout (May 2007)]]
 
* [[Sysadmin:Networking:Alternate Network Path|Alt Network path]]
 
* [[Sysadmin:UPS Setup]]
 
 
 
<!-- This has to stay as part of the formatting -->
 
</div>
 
| style="width:50px;" |
 
|
 
<div style="border:10px solid #F0DDD5; padding:5px;">
 
<div style="background-color:#E4C0B1; padding:5px;">
 
 
 
=== Miscellaneous ===
 
</div>
 
* [[SysadminContactInfo|Contact Information]]
 
* [[Sysadmin:ImportantInfo:PhoneNumbers|Phone Numbers]]
 
* [[Sysadmin:ImportantInfo:WebSites|Web Sites]]
 
* [[Sysadmin:ImportantInfo:AuthenticationInfo|Authentication Information]]
 
* [[Sysadmin:ImportantInfo:PowerFailure|Power Failure]]
 
* [[Sysadmin:ImportantInfo:UPS|UPS]]
 
* [[Sysadmin:ImportantInfo:SSLcerts|Generating SSL Certificates]]
 
* [[Sysadmin:Power draws|Power draws]]
 
* [[Sysadmin:ImportantInfo:SunHardware|Working with Sun Hardware]]
 
* [[Sysadmin:Passwords]]
 
* Patching
 
** [[LinuxKernelPatching|Linux Kernel Patching]]
 
** [[FreeBSDKernelPatching|FreeBSD Kernel Patching]]
 
* [[Sysadmin:SerialConsoleCableEnds|Cable Ends]]
 
 
 
<!-- This has to stay as part of the formatting -->
 
</div>
 
|}
 
 
 
 
 
 
 
 
 
 
 
=== Old ===
 
 
 
Important Notes:
 
* '''''ALL of the admin '''''  '''CVS/SVN stuff has been centralized to trac.cs.earlham.edu/admin'''.  You'll need to create a username/password for yourself by running (from quark):
 
:<code>htpasswd /usr/local/trac/adminontrac.htpasswd <username></code>
 
* To check out the repository, run (from quark):
 
:<code>svn checkout file:///clients/users/svn/admin</code>
 
* [[Sysadmin:IRC|Chatting on IRC]]
 
  
'''Current Sysadmins 2013:'''
+
===Group and institution information===
{| class="wikitable"
+
* [[Sysadmin:CS-ITS Interoperability|Working with ITS]]
|-
+
* [[Sysadmin:Recurring spending | Recurring spending ]]
! SysAdmin Name !! Year !! Working time !! Progress notes
+
* [[Sysadmin:SlackAndGitLab | Slack and GitLab integration]]
|-
 
| Wilson || SO || 100% || link to notes
 
|-
 
| Demise || SR || 100% || link to notes
 
|-
 
| Craig || FR || 100% || link to notes
 
|-
 
| Zane || SO || 100% || link to notes
 
|-
 
| Jordan || SO || 100% || link to notes
 
|-
 
| Sonny || JU || 100% || link to notes
 
|-
 
| Elena || SR || 40% || link to notes
 
|-
 
| Kristin || JU || 40% || link to notes
 
|-
 
| Aaron || SR || 20% || link to notes
 
|-
 
| Michael || SR || 0% || link to notes
 
|}
 

Latest revision as of 08:32, 20 March 2024

This is the hub for the CS sysadmins on the wiki.

Overview

If you're visually inclined, we have a colorful and easy-to-edit map of our servers here!

Server room

Our servers are in Noyes, the science building that predates the CST. For general information about the server room and how to use it, check out this page.

Columns: machine name, IPs, type (virtual, metal), purpose, dies, cores, RAM

Compute Resources

Machine- and VM-related information here!

Network

We have two network fabrics linking the machines together. There are three subdomains.

10 Gb

We have a 10Gb fabric for mounting files over NFS. Machines with 10Gb support have an IP address in the private range 10.10.10.0/24 (a class C-sized block), and we want to add DNS entries for these addresses.

1 Gb (cluster, cs)

We have two /24 (class C-sized) subnets on the 1Gb fabric: 159.28.22.0/24 (CS) and 159.28.23.0/24 (cluster). This means we have twice as many IP addresses on the 1Gb fabric as on the 10Gb fabric.

Any user accessing *.cluster.earlham.edu and *.cs.earlham.edu is making calls on a 1Gb network.
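
As a rough illustration of the address arithmetic in this section, the sketch below uses Python's standard-library ipaddress module to count the addresses on each fabric and to work out which fabric a given IP belongs to. The subnets are the ones listed on this page; the fabric labels and the names FABRICS, address_count and fabric_for are made up for this example and are not part of any script we actually run.

```python
import ipaddress

# Subnets as listed on this page. The dictionary keys are illustrative labels only.
FABRICS = {
    "10Gb (NFS)": [ipaddress.ip_network("10.10.10.0/24")],
    "1Gb (cs, cluster)": [
        ipaddress.ip_network("159.28.22.0/24"),  # cs.earlham.edu
        ipaddress.ip_network("159.28.23.0/24"),  # cluster.earlham.edu
    ],
}


def address_count(networks):
    """Total number of addresses across a list of networks (256 per /24)."""
    return sum(net.num_addresses for net in networks)


def fabric_for(address):
    """Return the label of the fabric whose subnet contains address, or None."""
    ip = ipaddress.ip_address(address)
    for label, networks in FABRICS.items():
        if any(ip in net for net in networks):
            return label
    return None


if __name__ == "__main__":
    for label, networks in FABRICS.items():
        print(label, address_count(networks))
    print(fabric_for("159.28.23.17"))
```

Note that num_addresses counts every address in a block, including the network and broadcast addresses, so the number of usable host addresses per /24 is slightly lower.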

Intra-cluster fabrics

The layout cluster has an InfiniBand infrastructure. Wachowski has only a 1Gb infrastructure.

Power

We have a backup power supply, with batteries last upgraded in 2019 (?). We’ve had a few outages since then and power has held up well.

HVAC

HVAC systems are static and are largely managed by Facilities.

See full topology diagrams here.

A word about what's happening between files and the drives they live on.

New sysadmins

These pages will be helpful for you if you're just starting in the group:

Note: you'll need to log in with wiki credentials to see most Sysadmin pages.

Additional information

These pages contain a lot of the most important information about our systems and how we operate.

Handy Tools

Technical docs

Common tasks

Group and institution information