What is Little-Fe?
One of the principal challenges to computational science and high performance computing (HPC) education is that many institutions do not have access to HPC platforms for demonstrations and laboratories. Paul Gray's Bootable Cluster CD (BCCD) project (http://bccd.cs.uni.edu) has made great strides in this area by making it possible to non-destructively, and with little effort, convert a computer lab of Windows or Macintosh computers into an ad-hoc cluster for educational use. Little-Fe takes that concept one step further by merging the BCCD with an inexpensive design for an 8 node portable computational cluster. The result is a machine that weighs less than 50 pounds, travels easily and safely as checked airline baggage, and sets up in 10 minutes wherever there is a 110V outlet and a wall to project an image on. The BCCD's list-packages feature supports curriculum modules in a variety of natural science disciplines, making the combination of Little-Fe and the BCCD a ready-to-run solution for computational science and HPC education.
In addition to making a case for the value of Little-Fe-like clusters, this article describes Little-Fe's hardware and software configuration including plans for a "do-it-yourself" version.
Why Little-Fe is Useful
Besides being fundamentally cool, Little-Fe's principal edge is resource availability for computational science education. To teach a realistic curriculum in computational science, there must be guaranteed and predictable access to HPC resources. There are currently two common barriers to this access. The first is that local policies typically allocate HPC resources under a "research first, pedagogy second" prioritization scheme, which often precludes the use of "compute it now" science applications in the classroom. The second barrier is the capital and on-going maintenance costs associated with owning an HPC resource; this affects most mid-size and smaller educational institutions.
While relatively low-cost Beowulf-style clusters have improved this situation somewhat, HPC resource ownership is still out of reach for many educational institutions. Little-Fe's total cost is less than $2,500, making it easily affordable by a wide variety of K-16 schools.
Little-Fe's second important feature is ease of use, both technically and educationally. Our adoption of the BCCD as the software distribution toolkit makes it possible to smoothly and rapidly advance from bare hardware to science. Further, we have minimized ongoing maintenance since both hardware and software are standardized. Paul Gray from the University of Northern Iowa has successfully maintained the BCCD for many years now via a highly responsive and personable web presence directly benefiting all BCCD users.
The BCCD also provides a growing repository of computational science software and curriculum modules. We are committed to expanding these modules to enhance the use of Little-Fe. More importantly, we seek to increase the amount of quality computational science woven into the classroom, into laboratory explorations, and into student projects. As others build their Little-Fes, our efforts will leverage their support through the development of additional open source curriculum modules.
Portability is useful in a variety of settings, such as workshops, conferences, demonstrations, and the like. Portability is also useful for educators, whether they are illustrating principles in the K-12 arena or passing the machine from one college classroom to another. Little-Fe is an immediate, full-fledged, available computational resource.
Hardware
Little-Fe v1 consisted of eight Travla mini-ITX VIA computers placed in a nearly indestructible Pelican case. To use it you took all the nodes, networking gear, power supplies, etc. out of the case and set it up on a table. Each node was a complete computer with its own hard drive. While this design met the portability, cost, and low power design goals, it was overweight and deployment was both time-consuming and error-prone.
Successive versions of Little-Fe have moved to a physical architecture where the compute nodes are bare Mini-ITX motherboards mounted in a custom designed cage, which in turn is housed in the Pelican case. To accomplish this we stripped the Travla nodes down to their motherboards and replaced their relatively large power supplies with daughter-board style units that mount directly to the motherboard's ATX power connector. These changes saved both space and weight. Little-Fe v2 and beyond use diskless compute nodes; that is, only the head node has a disk drive. The mechanics of this setup are described in the software section of this article. Removing 7 disk drives from the system reduced power consumption considerably and further reduced the weight and packaging complexity.
The current hardware manifest consists of:
- 8 - Via EPIA-M motherboards (http://www.viaembedded.com/product/epia_m_spec.jsp?motherboardId=81)
- 1 - PW200-M power supply (head node, http://www.mini-box.com/s.nl/c.ACCT127230/sc.8/category.13/.f)
- 7 - PW70 power supplies (compute nodes, http://www.mini-box.com/s.nl/c.ACCT127230/sc.8/category.13/.f)
- 1 - SE-600-12 100VAC-12VDC switching power supply (http://www.power-factor-1st.com/shop/enclosed-switching-power-supplies/g2-series/se-600.html)
- 10 - CAT-5 Ethernet cables; 1@250mm, 4@350mm, 4@500mm, 1@3m
- 2 - 40GB 5200RPM laptop form-factor disk drives (one for the head node, one for backup)
- 1 - CD-RW/DVD drive
- 1 - 10 port 100Mb/s Ethernet switch
- 1 - custom motherboard cage
- 1 - Pelican 1660 Case (http://www.pelican.com/pdfs_productes/1660.pdf)
As we continue to develop Little-Fe the parts we employ will evolve. The current parts list can be found at http://contracosta.edu/hpc/resources/Little_Fe/.
Assembling Little-Fe consists of:
- Mounting the power supplies to the motherboards
- Installing the motherboards in the cage
- Mounting the system power supply to the cage
- Cabling the power supplies
- Mounting the Ethernet switch and installing the network cabling
- Mounting the disk drive and CD-RW/DVD drive to the cage and installing the power and data cables
- Installing the cooling fans in the cage
- Plugging in the monitor, keyboard, and mouse
- Performing the initial power-up tests
- Configuring the BIOS on each motherboard to boot via the LAN and PXE
Cooling Little-Fe has been an on-going challenge which we have only recently begun to master. The problem has not been the total amount of heat generated, but rather airflow to particular locations on the motherboards during compute-intensive loads. By re-using the 25mm fans which came with the Travla cases we have been able to improve inter-board cooling within the motherboard cage. The importance of testing heat dissipation under a variety of system loads became painfully clear to us during an NCSI Parallel Computing Workshop at the University of Oklahoma in August 2005. After a presentation on Little-Fe we left it running a particularly large POV-Ray ray tracing job. Not 10 minutes later there was a dramatic "pop" and a small puff of smoke as one of the voltage regulators on one of the motherboards went up in smoke. Fortunately, Little-Fe adapts easily to an architecture of 7 or fewer nodes.
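One way to exercise a node under sustained load before trusting it in front of an audience is a simple CPU burn loop. The following is a minimal sketch in Python, not the procedure used on Little-Fe; the function name `burn` and the arithmetic inside it are arbitrary choices made for illustration, and the sketch assumes a Python interpreter is available on the node.

```python
import time

def burn(seconds: float) -> int:
    """Keep one CPU core busy with floating-point work for roughly `seconds`.

    Returns the number of 10,000-step batches completed.
    """
    deadline = time.perf_counter() + seconds
    batches = 0
    x = 0.0001
    while time.perf_counter() < deadline:
        # Arbitrary arithmetic; the result matters only as a source of load.
        for _ in range(10_000):
            x = (x * x + 0.5) % 1.0
        batches += 1
    return batches

# Short demonstration run; a real soak test would last many minutes.
print(f"completed {burn(0.2)} batches in about 0.2 s")
```

For an actual heat test the duration would be raised in steps, with board temperatures and airflow checked between runs, so that a marginal component shows itself on the bench rather than during a workshop.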
For transportation the cage simply sits inside the Pelican case. The base of the cage is sized so that it fits snugly in the bottom of the case, which prevents Little-Fe from moving around inside the box. The addition of a single layer of foam padding on each of the six sides further cushions Little-Fe.
Software
Early versions of Little-Fe used the Debian Linux distribution as the basis for the system software. This was augmented by a wide variety of system, communication, and computational science packages, each of which had to be installed and configured on each of the 8 nodes. Even with cluster management tools such as C3 this was still a time-consuming process. One of our primary goals has been to reduce the friction associated with using HPC resources for computational science education. This friction is made up of the time and knowledge required to configure and maintain HPC resources. To this end we re-designed Little-Fe's system software to use Paul Gray's Bootable Cluster CD (BCCD) distribution. The BCCD comes ready-to-run with all of the system and scientific software tools necessary to support a wide range of computational science education. Highlights include:
- gcc, g77, and development tools, editors, profiling libraries and debugging utilities
- Cluster Command and Control (C3) tools
- MPICH, LAM-MPI and PVM in every box
- The X Window System
- OpenMosix with openmosixview and userland OpenMosix tools
- Full openPBS Scheduler support
- octave, gnuplot, Mozilla's Firefox, and about 1400 userland utilities
- Network configuration and debugging utilities
- Ganglia and other monitoring packages
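To give a flavor of the kind of exercise this software stack supports, here is a minimal sketch of the cyclic work decomposition found in classic message-passing examples, written in plain Python so that it runs anywhere. It does not use MPICH, LAM-MPI, or PVM: the `rank` and `size` names merely mimic MPI conventions, the function names are hypothetical, and the serial sum over ranks stands in for the reduction that a real MPI program would perform across Little-Fe's nodes.

```python
import math

def partial_pi(rank: int, size: int, intervals: int) -> float:
    """Midpoint-rule share of the integral of 4/(1+x^2) on [0, 1] for one rank."""
    h = 1.0 / intervals
    total = 0.0
    # Cyclic distribution: rank r handles intervals r, r+size, r+2*size, ...
    for i in range(rank, intervals, size):
        x = h * (i + 0.5)
        total += 4.0 / (1.0 + x * x)
    return h * total

def estimate_pi(size: int, intervals: int = 100_000) -> float:
    """Add up the per-rank results, standing in for an MPI reduce step."""
    return sum(partial_pi(rank, size, intervals) for rank in range(size))

for nodes in (8, 7):  # a full Little-Fe, then one with a node out of service
    approx = estimate_pi(nodes)
    print(f"{nodes} ranks: pi ~ {approx:.10f}, error {abs(approx - math.pi):.2e}")
```

Because each rank touches a disjoint set of intervals, the answer does not depend on how many ranks take part; only the amount of work per rank changes. The same program therefore runs unmodified on 8 nodes or on fewer.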
Another important aspect of the BCCD environment is the ability to dynamically install packages to tailor the running environment. The BCCD distribution offers supplemental binary packages that are designed to be added as desired to a running BCCD image to extend curricular explorations, to promote research, to further profile or debug applications, and so on. These supplemental packages, installable using the BCCD "list-packages" tool, add
- functionality, such as program profiling support through perfctr and the Performance API utilities
- curricular components, such as lesson plans that leverage Gromacs
- research tools such as planned support for mpiBLAST and CONDOR
- more utilities, such as pyMPI support
- and workarounds for less-than-optimal configurations
More information about the BCCD can be found at http://bccd.cs.uni.edu.
While the name would imply that it is exclusively used for running off of a CDROM image, the BCCD has evolved to support many other forms of operation including network or PXE-booting, running from a RAM disk, and, more recently, running from USB "pen" drives. The BCCD is designed to be completely non-intrusive to the local hard drive; that is, you boot from the CD. For teaching lab configurations this is very important. Little-Fe's environment, by contrast, permits permanent installation on the local hard drive. This both simplifies on-going use and improves performance for some types of loads. To accomplish this "fossilization" on the head node's hard disk, the following steps are performed:
- Download and burn the current BCCD x86 ISO image from http://bccd.cs.uni.edu.
- Place the CD in Little-Fe's drive and boot the head node.
- Login as root.
- Follow the Little-Fe Bootstrap instructions at http://cluster.earlham.edu.
- Reboot the head node.
- Login as root.
- Run "$ list-packages" and install the Little-Fe Configuration package.
- The configuration package will be downloaded and run by list-packages. The script will prompt you for information about your hardware and network configuration.
- Reboot the head node.
- Boot each of the compute nodes.
- Login as bccd.
- Start teaching computational science.
This set of steps is only required when the BCCD is initially installed on the Little-Fe hardware. Successive uses only require booting the head node and then each of the compute nodes.
The BCCD image was motivated by, and its evolution is sustained by, efforts in the teaching of high performance computing. Curricular modules developed for the BCCD are installed through the above-mentioned "list-packages" tool. Some of the curricular modules that have been developed for the BCCD image were used in the recent week-long National Computational Science Institute workshop on parallel and cluster computing, held this past summer at the OU Supercomputing Center for Education and Research. They include content on molecular dynamics using Gromacs, application performance evaluation using PAPI, and linear algebra explorations that compare BLAS from LINPACK, ATLAS, and Kazushige Goto. Through these and other curricular packages that are being developed, the Little-Fe environment offers eye-catching educational content which is extremely robust and requires minimal system setup.
Future Plans for Little-Fe
Little-Fe is very much a work-in-progress. Each time we use it as part of a workshop, class, etc. we learn more about ways to improve the design and extend its reach to additional educational settings. We are currently working on these items:
- Standardization of motherboard cage design for commercial fabrication.
- Detailed, step-by-step plans for assembling the hardware and installing the software.
- Gigabit network fabric.
- Head node motherboard with a full set of peripheral connections, 7 compute nodes with just Ethernet. Compute node motherboards can then be cheaper, consume less power, and generate less heat.
- An as yet unrealized design goal was to be able to use Little-Fe with no external power. We originally thought to do this just via a UPS. Currently we are considering solar panels to support truly standalone usage. We are also considering MPI's regenerative braking feature, which is built into some implementations of MPI_Finalize.
- Cheaper/lighter/faster. Moore's law will affect us, as it will all other compute technology. For instance, we hear that in five years we will have a 64 core processor on our desks. Is it reasonable to expect that the five-year-in-the-future Little-Fe will have at least 256 processors, and will be capable of exploring SMP, ccNUMA, and cluster architectures simultaneously? We hope so.