Archive:LittleFe Cluster
Contents
- 1 N O T I C E
- 2 Application for an Intel/EAPF LittleFe
- 3 How to contribute to the liberation package
- 4 Little-Fe PPC
- 5 Fossilizing the BCCD
- 6 Intel Letter
- 7 To Be Sorted
N O T I C E
Much of the content below is stale, though there are a few good nuggets. We're going to harvest those and move them to the new LittleFe website at some point RSN.
------------------------------------------------------------------------------------------------
Application for an Intel/EAPF LittleFe
Notes:
- Flavors of application:
- Designated: Assembled unit delivered to your institution.
- OU Build-out: Attend the Intermediate Parallel Programming and Cluster Computing workshop at OU on XXX
- SC11 Build-out: Attend the Build-out hosted by the SC Education Program at SC11 in Seattle
- Items the three application types have in common:
- Individual and institutional commitment, shown with a letter outlining their commitment to incorporating the LittleFe/BCCD into their curriculum and to developing new parallel programming or cluster computing curriculum modules. The letter should be on institutional letterhead and signed by someone(s) with the authority to make those commitments.
- Take-back clause: if, after one year of quarterly check-ins, the plans outlined in the letter have not been met, we can recall the unit.
This and related policies will need to be vetted by the "granting agency", ACM/EAPF. Check with Donna Capo.
We also need to identify who pays shipping, if take-back is necessary.
- Items that are different for each application:
- The build-it-yourself options require that a team of two people (faculty and student or two faculty) apply.
Designated
- This option is only available to institutions selected by representatives of Intel and the SC Education Program. Between 5 and 10 LittleFe units will be available through this mechanism. Institutions can request to be part of the OU or SC11 build group.
Build-out at Intermediate Parallel Workshop @ OU in Norman, Oklahoma
- Ability to arrive by Sunday morning for the build session, which will take place Sunday afternoon.
- Built units are FOB OU. Recipients are responsible for all shipping costs from OU back to their home institution, typically an extra bag charge for recipients that arrived by airplane.
- Between 5 and 10 LittleFe units will be available through this mechanism.
Build-out at SC11 in Seattle, Washington
- Availability for all or part of the two build slots on each of Sunday and Monday afternoon, and for the three build slots on each of Tuesday and Wednesday afternoon of the conference. Preference may be given to institutions with the needed build-slot availability.
- Built units are FOB SC11. Recipients are responsible for all shipping costs from SC11 back to their home institution, typically an extra bag charge for recipients that arrived by airplane.
- Between 5 and 10 LittleFe units will be available through this mechanism.
How to contribute to the liberation package
Building a test liberation.tar.gz
su to root on hopper. The entire liberation build environment is checked out into /root/bccd-liberation/. To build a test version of the liberation package:
cvs update
cvs commit
./deploy-liberation-pkg.sh <dst path> noupload
This will do the checkout and tarring of the liberation package for you; it will end up in the destination path you provide. To install this package, copy it to a LittleFe (wget, scp), untar it into /usr/local/, and follow the instructions at http://www.littlefe.net/mediawiki/index.php/Liberation_Instructions, skipping the step involving downloading the liberation package.
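For example, the install on a LittleFe might look something like this (the download URL is a placeholder; adjust it to wherever you put your test package, or scp it over instead):
# on lf0, as root
wget http://example.org/liberation.tar.gz    # placeholder URL
tar xzf liberation.tar.gz -C /usr/local/
# then follow the Liberation Instructions, skipping the download step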
Things worth editing
There are a few important things worth editing in the bccd-liberation checkout. There are three overlays; these are directory structures that will be copied over the standard BCCD either on the server (lf0), on the clients (lfn, n>0), or on both.
- x86-server
- x86-client
- x86-common
Beyond this there are two scripts that are run: liberate and prepareserver. Commands that go into liberate should be the commands needed to copy the BCCD onto a single machine. Commands that go into prepareserver are those needed to set up lf0 as a server for PXE booting the client nodes; anything that edits the clients' install should also go into prepareserver.
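For orientation, the checkout probably looks roughly like this (layout assumed from the description above, not verified against the repository):
/root/bccd-liberation/
    x86-server/                 # overlay copied onto the server (lf0) only
    x86-client/                 # overlay copied onto the clients (lf1..lfn) only
    x86-common/                 # overlay copied onto every node
    liberate                    # commands to copy the BCCD onto a single machine
    prepareserver               # commands to set up lf0 as a PXE boot server for the clients
    deploy-liberation-pkg.sh    # builds and (optionally) uploads liberation.tar.gz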
Tagging & deploying a release
To tag and deploy a liberation package release, edit deploy-liberation-pkg.sh and change the $TAG variable to whatever string you want to tag the release with.
Now, run deploy-liberation-pkg.sh without any arguments. This will build liberation.tar.gz, liberation.tar.gz.sig and upload these files to bccd.cs.uni.edu.
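For example (the tag string below is just an illustration, not a real release name):
cd /root/bccd-liberation/
vi deploy-liberation-pkg.sh    # change the $TAG variable, e.g. TAG="liberation-YYYY-MM-DD"
./deploy-liberation-pkg.sh     # builds liberation.tar.gz and liberation.tar.gz.sig, then uploads them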
Little-Fe PPC
This page contains information about the 4 node PPC (Pegasos) version of Little-Fe.
This version of Little-Fe PPC is based on a Debian GNU/Linux installation. It employs UnionFS to facilitate consolidation of system and cluster software on a single hard drive (attached to lf0). All other nodes netboot from the main image by masking the server-specific files with a lightweight overlay.
lf0 can be found at lf-ppc.cluster.earlham.edu.
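Netbooting a unified root this way generally means lf0 exports the root filesystem over NFS (the clients boot with root=/dev/nfs, per the firmware settings below). A minimal /etc/exports entry on lf0 might look like the following; the path and subnet are placeholders, not the actual cluster values:
# /etc/exports on lf0 -- placeholder path and subnet
/srv/littlefe-root  192.168.1.0/255.255.255.0(rw,no_root_squash,sync,no_subtree_check)
After editing, run exportfs -ra to re-export.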
Documentation and bug reporting
- To report a bug, please use Bugzilla
We are documenting the production of Little-Fe in two ways: First, bugs are filed in bugzilla (and hopefully fixed). Second, we're putting flesh on a set of instructions to build a replica of the original Little-Fe PPC. The latter is probably (but not necessarily) based on the former. The distinction is mainly that Bugzilla will show how things were done wrong, while the wiki-based instructions will show how to do things right the first time.
Adding/Setting up a new node in the Debian Unified Root
These are current as of May 19, 2006.
Server Configuration
- add MAC addresses for the 100Mbit and Gbit network interfaces to /etc/dhcp3/dhcpd.conf (a sample entry is sketched below)
- restart dhcp with /etc/init.d/dhcp3-server restart
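A dhcpd.conf host entry might look something like this; the MAC and IP addresses here are made up, so substitute the real values for each interface on each node:
# /etc/dhcp3/dhcpd.conf -- one host block per interface per node (example addresses)
host lf1-100mbit {
    hardware ethernet 00:11:22:33:44:55;
    fixed-address 192.168.1.101;
}
host lf1-gbit {
    hardware ethernet 00:11:22:33:44:56;
    fixed-address 192.168.2.101;
}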
Client Firmware
These are the current client firmware settings necessary to boot lf[1-n] via the Debian unified root setup. They must be set on every client node in order to netboot successfully. If they are not there already, add or correct the following lines in nvedit:
setenv boot-device eth:dhcp,0.0.0.0,,0.0.0.0
setenv boot-file vmlinuz-2.6.15.6 init=/linuxrc root=/dev/nfs ip=dhcp console=ttyS1,115200n1
After this is set up, type setenv auto-boot? true at the main firmware prompt (not in nvedit). Reboot to read in the new environment variables, or set them manually and then type boot.
Creating a new Little-Fe PPC
Follow the instructions on the Diskless Cluster Setup page.
Related pages
- Pegasos (ODW) notes - firmware info, including disk and net booting; serial console info; etc.
Fossilizing the BCCD
Your mileage may vary, and we are not updating this page any longer. We use it internally for reference, but we are now working on Liberating the BCCD (see the main Cluster Computing Group page).
This section outlines the steps required to disassemble a BCCD ISO, manifest it on a hard disk drive, and boot from that hard drive. Most or all of this must be done as root.
Mount the Images
These scripts, used for the lnx-bbc project, might prove to be helpful in working with the BCCD images: FossilScripts
The Basic Images
cd /mnt    # or where ever
mkdir bccd
mount -t iso9660 -o loop bccd-ppc-2005-08-30T00-0500.iso bccd

# on PPC
mkdir initrd
gunzip < bccd/boot/root.bin > initrd.ext2
mount -t ext2 -o loop initrd.ext2 initrd

# on x86
mkdir lnx
mount -o loop bccd/lnx.img lnx
mkdir root
gunzip < lnx/root.bin > root.ext2
mount -o loop root.ext2 root
The singularity
First, decompress the singularity with the cloop utility extract_compressed_fs:
wget http://developer.linuxtag.net/knoppix/sources/cloop_0.66-1.tar.gz
tar xzf cloop_0.66-1.tar.gz
cd cloop-0.66
vim Makefile    # add APPSONLY=1 at the top
make zcode
make extract_compressed_fs
./extract_compressed_fs ../bccd/singularity > ../singularity.romfs
cd ..
The latest currently-available version of cloop (2.01) doesn't work for this purpose; others might (I didn't experiment), but 0.66 definitely does.
Next, mount the singularity (you must have romfs support compiled into the kernel):
mkdir singularity
mount -t romfs -o loop singularity.romfs singularity
Extract the singularity
cd singularity
tar cf - . | (cd /path/to/destination/partition; tar xvf -)
Create a working initrd
Create an initrd for fossilized booting with the linuxrc at http://ppckernel.org/~tobias/bccd/linuxrc:
cd /mnt/root    # or where ever you mounted root.ext2 (from root.bin)
wget http://ppckernel.org/~tobias/bccd/linuxrc    # replace the existing linuxrc
chmod a+x linuxrc
cd ..
umount root
gzip < root.ext2 > /path/to/destination/partition/boot/root.bin
Edit singularity-init
Add / remount read-write hook
Edit /sbin/singularity-init to remount / read-write during init, using the following commands:
debug "Remounting / read-write..." mount -o rw,remount /dev/root /
This can be placed somewhere around the proc mount command.
Prepare for Fossilization of /mnt/rw
Comment out lines concerning /mnt/rw
# mount -n -t tmpfs none /mnt/rw
Add network setup to singularity-init
ifconfig eth0 inet 192.168.10.1 netmask 255.255.255.0 broadcast 192.168.10.255 up
route add default gw 192.168.10.1 eth0
Configure the bootloader
Configure your bootloader (e.g., yaboot, lilo, or grub) as follows:
- boot the kernel /boot/vmlinux on PowerPC or /boot/bzImage on x86
- use the initrd /boot/root.bin
- execute the init script /linuxrc
Here is a sample lilo.conf.
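For an x86 build, a minimal lilo.conf along those lines might look like this; the disk and partition names are placeholders, so adjust them to your hardware:
boot=/dev/hda              # placeholder boot device
prompt
timeout=50

image=/boot/bzImage        # on PowerPC use /boot/vmlinux with yaboot instead of lilo
    label=bccd
    root=/dev/hda1         # placeholder root partition
    initrd=/boot/root.bin
    append="init=/linuxrc"
    read-only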
Setup Compatibility Nodes
Add the following to /linuxrc:
- /sbin/devfsd /dev
De-Obfuscation
Remove Unneeded Symlinks
The deal is that the BCCD is now on a different (read/writeable) medium: a hard disk. Let's un-obfuscate some of the workings. An ls -l on / will reveal a few symlinks: /etc, /home, /local, /tmp, and /var. All of these point to an appropriate directory in /mnt/rw. Since the CD is not writeable, the boot scripts create a ramdisk, copy files from /etc.ro/ to /mnt/rw/etc/ (and likewise for the other directories), and then the /etc symlink points into that writeable ramdisk.
Here's the works:
rm /etc /home /local /tmp /var
mkdir /etc /home /local /tmp /var
cd /etc.ro && tar cf - . | (cd /etc/; tar xvf -)
cd /home.ro && tar cf - . | (cd /home/; tar xvf -)
cd /local.ro && tar cf - . | (cd /local/; tar xvf -)
cd /tmp.ro && tar cf - . | (cd /tmp/; tar xvf -)
cd /var.ro && tar cf - . | (cd /var/; tar xvf -)
You're almost done, except you should remove the part of the boot scripts that copies the files from /<dir>.ro/. Just comment out the lines in /sbin/singularity-init that do the copying (around line 105):
# cp -a /etc.ro /mnt/rw/etc
# cp -a /var.ro /mnt/rw/var
While you're editing /sbin/singularity-init, also comment out these lines:
# rsync -plarv /lib/mozilla-1.6/plugins.ro/ /mnt/rw/plugins/
# chmod 1777 /mnt/rw/tmp
# debug "Making /mnt/rw/tmp/build links"
# mkdir -p /mnt/rw/tmp/build/
# mkdir -p /mnt/rw/tmp/build/staging
# mkdir -p /mnt/rw/tmp/build/staging/singularity
# mkdir -p /mnt/rw/tmp/build/staging/singularity/image
# ln -s /lib /mnt/rw/tmp/build/staging/singularity/image/lib
Configure gcc Environment
Though the BCCD is now fossilized onto the hard drive, the gcc environment does not know this, as it was compiled for the CD. It will look for files in (effectively) /tmp/build/staging/singularity/image/lib ... the directories and symlink whose creation we just commented out. Since /tmp is a fossilized directory, just create the symlink inside of it:
mkdir -p /tmp/build/staging/singularity/image
cd /tmp/build/staging/singularity/image/
ln -s /lib
TODO
- fix the mounting commands so that / is only mounted once (?)
- decide how to handle directories like /etc that are mounted in RAM at /mnt/rw/etc and populated with items from /etc.ro (leave as is, or create a script to simplify the setup for hard disk booting?)
- Kevin's done this, we just need to document
- DONE
- modify init scripts to make them appropriate for hard disk booting (e.g., remove the "Enter a password for the default user" prompt)
- This appears to be done
- finish setting up networking
- create a patch against the original singularity image for /sbin/singularity-init and other modified configuration files for automating the fossilize process
- package up any binary additions with list-packages (see the package instructions in the wiki)
- last but not least, keep track of all the changes we make!
Good luck! Direct questions and comments to tobias@cs.earlham.edu.
Intel Letter
Dr. Stephen Wheat, Director
HPC Platform Office
Intel, USA
Dr. Henry Neeman of the OU Supercomputing Center for Education & Research (OSCER) suggested that we write you about the following issue.
For the past several years, the National Computational Science Institute (www.computationalscience.org) has been teaching workshops on Computational Science & Engineering, and on Parallel & Cluster Computing, to hundreds of faculty across the United States. Our subteam has taken responsibility for teaching the Parallel & Cluster Computing workshops, including three held at the University of Oklahoma and co-sponsored by OSCER, hosted by Dr. Neeman. He believes that there may be substantial synergy between our goals and Intel's.
Recently we have been tasked by the SuperComputing conference series to design and implement the education program for the SC07-SC09 conferences. As you may be aware, the overwhelming majority of the High Performance Computing (HPC) resources deployed currently are dedicated to research rather than education -- yet the nation faces a critical shortage of HPC expertise, largely because of the lack of a broad enough base of university faculty trained in HPC pedagogy.
To address this situation, our group spends a significant portion of our time designing and implementing software and hardware solutions to support teaching parallel and cluster computing and CSE. The Bootable Cluster CD (http://bccd.cs.uni.edu) and Little-Fe (http://cluster.earlham.edu/projects.html) are two manifestations of our work. The BCCD is a live CD that transforms an x86-based lab into an ad-hoc computational cluster. Little-Fe is an inexpensive, portable, 4-8 node computational cluster. The principal cost component of the Little-Fe design is the motherboard and CPUs. Our design is based on small form-factor motherboards, such as the Intel D945GPMLKR Media Series boards.
In order to support computational science curriculum development and delivery, we are gearing up to build a number of Little-Fe units, approximately 20, for use by science faculty across the country. These faculty members, working with their undergraduate student researchers, will develop curriculum modules and deliver workshops and presentations in a variety of venues. The curriculum and workshops are preparatory activities for the education program we are implementing for SC07-SC09.
Because of financial considerations, we currently find ourselves forced to use low cost non-Intel components in our Little-Fe units. However, we are aware that Intel has been a longtime supporter of HPC research and education, and that you in particular have been an advocate for precisely the kind of work that our team has been pursuing.
In light of these points, we wonder if Intel might be interested in either donating a number of these boards and CPUs or permitting us to purchase them at a discount? In exchange we could provide Intel with appropriate credit on both the physical units and in our articles about the project.
Thank you for your time.
Paul Gray
David Joiner
Thomas Murphy
Charles Peck
Intel design
Intel® Desktop Board D945GPMLKR Media Series
http://www.intel.com/products/motherboard/d945gpm/index.htm
microATX (9.60 inches by 9.60 inches [243.84 millimeters by 243.84 millimeters])
10/100/1000 interface and 10/100 interface
HPC Wire article (very stale now)
To Be Sorted
- Design Notes
- Initial Setup
- General Specifications
- Power draw
- Motherboard Information
- BIOS downloads
- Overview and Downloads
- Detailed description
- Article about Linux on VIA-M series boards
- Manual (local PDF file)
- Manual (on-line)
- Marketing glossies (local PDF file)
- Networking
- Transition to DHCP (old)
- Adding WAP
- Startup
- Shutdown
- Todo
- PXE Booting
- Wake-on-LAN
- Pictures
- BIOS Upgrade