Sysadmin

From Earlham CS Department
Revision as of 14:29, 3 February 2017 by Broosa


Machines and Brief Descriptions of Services

HOME (vm0)
  • Users
  • SSH
  • NFS

NET (vm1)
  • LDAP server
  • DNS
  • DHCP

WEB (vm2)
  • Mailman
  • Mail Stack
  • Apache2
  • PostgreSQL
  • MySQL
  • Wiki

TOOLS (vm3)
  • SageNB Server
  • JupyterHub Server
  • Software Modules
  • Nginx

BABBAGE
  • Firewall

PROTO
  • Weather Monitoring
  • GPS/NTP
  • Energy Monitoring

HOPPER
  • Users
  • SSH
  • NFS
  • Software Modules
  • PostgreSQL
  • Wiki
  • Apache2
  • DNS
  • DHCP

DALI
  • GitLab
  • Backups
  • Nginx

AL-SALAM
  • WebMO
  • Software Modules
  • Apache2

LAYOUT
  • JupyterHub Server
  • Software Modules
  • Nginx
  • Apache2
  • WebMO

BRONTE
  • Software Modules

POLLOCK
  • Software Modules
  • WebMO
  • Nginx

Systems Administration Documentation

For old documentation, see: Old Wiki Information

Current Projects (updated 15 Jan 2017)

TODO

  • Document tools: startup / shutdown - Charlie
  • Use Sysadmin namespace for all our pages - All
    • Testing usefulness of documentation - Dave
  • Al Salam: configure switch, re-rack. - Vitalii
    • Check specs on the Summit switch (purple, with yellow writing), see whether a better switch is available, and reset its configuration
  • LDAP cleanup of system users / old groups - James
  • Layout - Nirdesh
    • lo0 RAID (mdadm)
    • 10Gb from Dali to lo0
    • BIOS reset
  • 10Gb, perfSONAR, ...
  • Monitoring: Ganglia, Shinken
  • Whedon: configured and available
  • Change passwords on everything: Postgres, Shinken, ...
  • Webcam on office whiteboard (new office location?)
  • Learn virtual machine architecture and modules - Dave
    • Document in a format for future admin training?
    • Find existing introduction material
  • Mirror control for testing, swapping, etc.

DONE (19 Jan 2017)

  • Examine extra "layout" node. - Adam
    • Differences: single PSU, single GPGPU, no VGA.
    • It has InfiniBand and 10Gb cards installed.
  • Networking - Adam, Charlie
    • IP over InfiniBand working on layout
      • Resolved by resetting the IB switch configuration. The error was: ibwarn: [3349] mad_rpc_rmpp: _do_madrpc failed; dport (Lid 1)

FUTURE

  • Centralized password database / manager / location

Current Projects (updated 13 Oct 16)

  • Groups and LDAP and sudo - James
  • Amber - James
  • Edward's setup - Vitalii
  • WebDev access - Nirdesh
  • Puppet - James and Vitalii
  • Bacula - Nirdesh
  • SSL certificate upgrade and documentation - Kristin
  • Listserv merging with archives preserved - Nirdesh
  • Ganglia - Bret
  • Shinken - Vitalii
    • latency, UPS
  • New Layout node - ? and ?
  • Provision Sappho (compute) - after Puppet
  • Provision Kahlo (storage) -
    • replace broken drive
  • I2 setup
    • DTN, storage nodes, head nodes, ports in CST
  • Provision Whedon (compute) - after Puppet
  • Shutdown and startup test - scheduled for Sunday 27 November
  • Disk cleaning - Charlie
  • Password changing in the CS and cluster domains - Vitalii and James
  • Proto setup and maintenance with HIP/Green Science