Choosing a computing resource

From Earlham CS Department
Revision as of 15:09, 21 September 2022 by Pelibby16 (talk | contribs)

Writeup on choosing which computing resources to use for what work

Short version

For X use Y:

  • small jobs (seconds or minutes of runtime): any machine, including your local one
  • hosting a website or web app: one of our web servers (most likely web.cs)
  • controlling a complete OS: a VM (email the admins to request one)
  • GPGPUs: the layout cluster
  • running jobs that need lots of RAM and/or storage: a jumbo server (lovelace or pollock)
  • running jobs you want to split across many machines: a cluster (layout or whedon)

Check out the rest of this document for the more detailed version.

What does your computing problem look like?

Small computing jobs

A small computing job is easy to run: just run it on the computer or server you usually use.

Every system we have, from Bowie to Whedon, will run a basic Python 3 program or build and run C code. Your local machine probably will as well. If you expect your code to complete in seconds or minutes, this is probably your best choice.

You may use `nohup my_command &` for jobs you expect to take a few minutes. The trailing `&` puts your command in the background and returns your shell (so you can type more commands); `nohup` keeps the command running if you log out and saves its output to nohup.out in the current directory.
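If you would rather start the job from a Python script, the sketch below does something roughly similar to `nohup my_command &` using only the standard library. The function name `run_in_background` and the log file name `job.out` are made up for this illustration, and the detaching option used here applies to POSIX systems only.

```python
import subprocess
import sys
from pathlib import Path


def run_in_background(command, log_path="job.out"):
    """Start `command` detached from this terminal and log its output.

    `run_in_background` and `log_path` are hypothetical names for this sketch.
    """
    log_file = open(log_path, "ab")
    process = subprocess.Popen(
        command,
        stdin=subprocess.DEVNULL,   # the job cannot read from the terminal
        stdout=log_file,            # standard output goes to the log file
        stderr=subprocess.STDOUT,   # standard error goes to the same file
        start_new_session=True,     # run in a new session (POSIX only)
    )
    log_file.close()  # the child process keeps its own copy of the handle
    return process


if __name__ == "__main__":
    # Stand-in for `my_command`: a tiny Python program that prints one line.
    job = run_in_background([sys.executable, "-c", "print('job finished')"])
    job.wait()  # only for this demo; normally you would not wait
    print(Path("job.out").read_text())
```

For one-off jobs the shell one-liner is simpler; a wrapper like this is mainly useful when a script needs to launch several jobs and keep track of them.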

Websites and web apps

A few of our machines run web servers. Websites and web apps can be hosted on these machines.

A dedicated workspace where you configure the entire OS for yourself

Speak to the admins about this. We have a hypervisor running on one of our servers, and it hosts our web server among other utilities. We can quickly spin up a VM for you, either Ubuntu- or CentOS-flavored.

Anything requiring a GPGPU

Use the layout cluster. This is an easy answer because none of our other machines contain GPGPUs. :)

If you're new to the term, a GPGPU (general-purpose graphics processing unit) is a GPU that performs computations in problem spaces other than rendering computer graphics.

A GPU can perform some specific computations, such as vector arithmetic, extremely quickly relative to a CPU. This is important in (for example) rendering video game animations. Many problems in the natural and computational sciences also require such calculations, hence the development of GPGPUs. Demand for their use may increase as data science and machine learning continue to grow.
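To make "vector arithmetic" concrete, here is a minimal sketch of one common operation, often called SAXPY (a scalar times one vector, plus another vector). It is plain Python running sequentially on the CPU, not GPU code; it only shows the shape of the computation. Each output element depends on its own index alone, which is what lets a GPU compute many elements at the same time.

```python
def saxpy(a, x, y):
    """Return a list whose i-th element is a * x[i] + y[i].

    No element depends on any other, so on a GPU each index could be
    handled by its own thread. Here the loop simply runs in order.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return [a * xi + yi for xi, yi in zip(x, y)]


result = saxpy(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
print(result)  # [12.0, 24.0, 36.0]
```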

Our NVIDIA GPGPUs (and associated drivers) are installed on the Layout cluster. To program using GPGPUs, load CUDA (read more about CUDA here) on the Layout cluster. Happy coding.

High-performance computing, including scientific and research computing

If your job is expected to take hours or days, as in the case of many scientific computing and research-oriented workflows, you will want to use a system designed to handle it.

To choose one, decide whether your problem is better suited to a jumbo server or a cluster, as described below.

Jumbo servers

tl;dr

  • Jumbo server: one big machine, lots of RAM and storage compared to CPU
  • Best for shared memory parallelism
  • Example problem: DNA sequence workflows
  • Hostnames: lovelace, pollock

Jumbo servers (née phat nodes) are in many respects just bigger versions of the computers you’re accustomed to running. They have one or two CPUs, lots of RAM, and lots of disk.

A jumbo server is the best solution for problems that require a lot of data to be loaded into memory and handled all at once, with minimal communications overhead.
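As a rough illustration of shared memory parallelism, the sketch below loads one sequence into memory once and lets several threads read different index ranges of it, with no copying or messaging between workers. The function names are made up for this example. Note also that in standard CPython, threads mostly do not speed up CPU-bound work, so treat this as a picture of the programming model rather than a performance recipe.

```python
from concurrent.futures import ThreadPoolExecutor


def count_base(sequence, base, start, stop):
    """Count occurrences of `base` in sequence[start:stop] without copying."""
    return sequence.count(base, start, stop)


def parallel_count(sequence, base, workers=4):
    """Split the index range across threads that all read one shared string."""
    n = len(sequence)
    step = max(1, -(-n // workers))  # ceiling division, at least 1
    ranges = [(i, min(i + step, n)) for i in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = pool.map(lambda r: count_base(sequence, base, *r), ranges)
        return sum(partial)


# "GATTACA" contains three A's, so 1000 copies contain 3000.
print(parallel_count("GATTACA" * 1000, "A"))  # 3000
```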

Clusters

tl;dr

  • Cluster: several less powerful machines linked together to perform operations in parallel
  • Best for distributed memory parallelism
  • Example problem: molecular genomics simulations

Clusters consist of three pieces:

  1. a head node that hosts the scheduler, provides Internet services, manages configuration of the system, and supports user access
  2. N compute nodes, where each compute node hosts a pbs_mom and a pbs_client to do the computational work that is handed to it by the head node
  3. a network switch to link all the nodes together

The use of multiple nodes requires a lot of communications overhead. As such, a cluster is well-suited to problems where data can be distributed across many places, at each of which CPUs (and/or GPUs) can work on it.
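The distributed memory pattern can be sketched in a few lines. The code below is a single-process simulation, not real cluster code: on an actual cluster the chunks would be sent over the network (for example with a message-passing library) and each node would hold only its own piece. The names `scatter`, `node_work`, and `gather` are chosen for this illustration.

```python
def scatter(data, n_nodes):
    """Split `data` into n_nodes contiguous chunks of near-equal size."""
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    base, extra = divmod(len(data), n_nodes)
    chunks, start = [], 0
    for rank in range(n_nodes):
        size = base + (1 if rank < extra else 0)  # first `extra` nodes get one more
        chunks.append(data[start:start + size])
        start += size
    return chunks


def node_work(chunk):
    """Work done by one compute node, using only its own chunk."""
    return sum(value * value for value in chunk)


def gather(partials):
    """Combine the per-node results, as the head node would."""
    return sum(partials)


data = list(range(10))
chunks = scatter(data, 3)
print(chunks)  # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(gather(node_work(chunk) for chunk in chunks))  # 285
```

The less the nodes need to talk to each other between the scatter and the gather, the better a problem fits a cluster.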

Detailed Specs

This section includes detailed information on each machine/cluster. This will likely be more information than you need, but it can come in handy for research or other applications where you need to provide information about how your code was run.

Whedon (cluster)

Memory Details

Total Width: 72 bits
Data Width: 64 bits
Size: 256GB (8x32 GB)
Form Factor: DIMM
Set: None
Type: DDR4
Type Detail: Synchronous
Speed: 2133 MT/s
Manufacturer: Samsung
Serial Number: 315FBBB7
Configured Memory Speed: 1866 MT/s

CPU Details

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 63
Model name: Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
Stepping: 2
CPU MHz: 2599.951
CPU max MHz: 3200.0000
CPU min MHz: 1200.0000
BogoMIPS: 4799.75
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 20480K
NUMA node0 CPU(s): 0-7,16-23
NUMA node1 CPU(s): 8-15,24-31
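When quoting these numbers in a write-up, it can help to check that they agree with each other. The sketch below parses a few of the "Key: value" lines above (the CPU fields look like `lscpu` output, which is an assumption) and confirms that threads per core times cores per socket times sockets gives the reported CPU count. The sample text and function names exist only for this example.

```python
LSCPU_SAMPLE = """\
CPU(s): 32
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 2
"""


def parse_fields(text):
    """Turn 'Key: value' lines into a dictionary of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that have no colon
            fields[key.strip()] = value.strip()
    return fields


def logical_cpus(fields):
    """Logical CPUs = threads per core * cores per socket * sockets."""
    return (
        int(fields["Thread(s) per core"])
        * int(fields["Core(s) per socket"])
        * int(fields["Socket(s)"])
    )


fields = parse_fields(LSCPU_SAMPLE)
print(logical_cpus(fields))                           # 32
print(logical_cpus(fields) == int(fields["CPU(s)"]))  # True
```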

Hamilton (cluster)

Layout (cluster)

Examples

On GitLab, you will find a collection of example code, mostly “hello world” programs that let you verify that you have successfully submitted and run a job on the correct resources.