Archive:LittleFe Cluster

N O T I C E

Much of the content below is stale, though there are a few good nuggets. We're going to harvest those and move them to the new LittleFe website at some point RSN.
 
'''------------------------------------------------------------------------------------------------'''
 
= Application for an Intel/EAPF LittleFe =
Notes:
* Flavors of application:
** '''Designated:''' Assembled unit delivered to your institution.
** '''OU Build-out:''' Attend the Intermediate Parallel Programming and Cluster Computing workshop at OU on XXX
** '''SC11 Build-out:''' Attend the build-out hosted by the SC Education Program at SC11 in Seattle
* Items the three application types have in common:
** Individual and institutional commitment, demonstrated by a letter outlining the commitment to incorporating the LittleFe/BCCD into the curriculum and to developing new parallel programming or cluster computing curriculum modules.  The letter must be on institutional letterhead and signed by someone with the authority to make those commitments.
** Take-back clause: after one year of quarterly check-ins, if the plans outlined in the letter have not been met, we can recall the unit.  <pre style="color:red">This and related policies will need to be vetted by the "granting agency", ACM/EAPF.  Check with Donna Capo.</pre> <pre style="color:red">We also need to identify who pays shipping, if take-back is necessary.</pre>
* Items that are different for each application:
** The build-it-yourself options require that a team of two people (faculty and student, or two faculty) apply.
  
=== Designated ===
* This option is only available to institutions selected by representatives of Intel and the SC Education Program.  Between 5 and 10 LittleFe units will be available through this mechanism. Institutions can request to be part of the OU or SC11 build group.
=== Build-out at Intermediate Parallel Workshop @ OU in Norman, Oklahoma ===
* Ability to arrive by Sunday morning for the build session, which will take place Sunday afternoon.
* Built units are FOB OU. Recipients are responsible for all shipping costs from OU back to their home institution, typically an extra bag charge for recipients who arrived by airplane.
* Between 5 and 10 LittleFe units will be available through this mechanism.
  
=== Build-out at SC11 in Seattle, Washington ===
* Availability for all or part of the two build slots on each of Sunday and Monday afternoon, and for the three build slots on each of Tuesday and Wednesday afternoon of the conference.  Preference may be given to institutions whose availability matches the build slots that need to be filled.
* Built units are FOB SC11. Recipients are responsible for all shipping costs from SC11 back to their home institution, typically an extra bag charge for recipients who arrived by airplane.
* Between 5 and 10 LittleFe units will be available through this mechanism.
  
=How to contribute to the liberation package=

==Building a test liberation.tar.gz==
su to root on hopper.

All of the liberation build environment is checked out into /root/bccd-liberation/

To build a test version of the liberation package:
<pre>
cvs update
cvs commit
./deploy-liberation-pkg.sh <dst path> noupload
</pre>
This will do the checkout and tarring of the liberation package for you. The tarball will end up in the destination path you provide. To install the package, copy it to a LittleFe (wget, scp), untar it into /usr/local/, and then follow the Liberation Instructions (linked below), skipping the step that downloads the liberation package.

http://www.littlefe.net/mediawiki/index.php/Liberation_Instructions
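As a rough sketch of that install, assuming the test package was staged in /tmp on hopper (the staging path and the use of scp are examples only, use whatever <dst path> you gave the deploy script):
<pre>
# on the LittleFe node being liberated; paths are examples only
scp root@hopper:/tmp/liberation.tar.gz /tmp/   # or wget it from wherever you staged it
cd /usr/local
tar xzf /tmp/liberation.tar.gz
# then follow the Liberation Instructions page, skipping its download step
</pre>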
 
==Things worth editing==
There are a few important things worth editing in the bccd-liberation checkout.

There are three overlays. These are directory structures that will be copied over the standard BCCD, either on the server (lf0), on the clients (lf n>0), or on both:
* x86-server
* x86-client
* x86-common
Beyond this there are two scripts that are run: liberate and prepareserver. Commands that go into liberate are those needed to copy the BCCD onto a single machine. Commands that go into prepareserver are those needed to set up lf0 as a server for PXE booting the client nodes; anything that modifies the client installs should also go into prepareserver.
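For orientation, an overlay is laid out as a miniature root filesystem whose files land at the same paths on the target. The file names below are hypothetical examples, not a listing of the actual checkout:
<pre>
x86-server/etc/exports            # example: NFS export pushed onto lf0 only
x86-server/etc/dhcp3/dhcpd.conf   # example: DHCP/PXE configuration for the client nodes
x86-client/etc/fstab              # example: client-specific mounts
x86-common/etc/hosts              # example: copied to both server and clients
</pre>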
 
==Tagging & deploying a release==
To tag and deploy a liberation package release, edit deploy-liberation-pkg.sh and set the $TAG variable to whatever string you want to tag the release with.

Now run deploy-liberation-pkg.sh without any arguments. This will build liberation.tar.gz and liberation.tar.gz.sig and upload both files to bccd.cs.uni.edu.
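A typical release pass, with a made-up tag string, would look something like:
<pre>
vi deploy-liberation-pkg.sh   # set TAG to the new release string, e.g. TAG=liberation-20060601 (example only)
./deploy-liberation-pkg.sh    # no arguments: builds and uploads liberation.tar.gz and liberation.tar.gz.sig
</pre>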
=Little-Fe PPC=
This page contains information about the 4-node PPC (Pegasos) version of Little-Fe.

This version of Little-Fe PPC is based on a [http://www.debian.org Debian GNU/Linux] installation.  It employs [http://www.am-utils.org/project-unionfs.html UnionFS] to facilitate consolidation of system and cluster software on a single hard drive (attached to <code>lf0</code>).  All other nodes netboot from the main image, masking the server-specific files with a lightweight overlay (see the sketch below).

* <code>lf0</code> can be found at <code>lf-ppc.cluster.earlham.edu</code>
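To illustrate the overlay idea only (the branch paths here are invented; the real commands and options live in the Diskless Cluster Setup page), a UnionFS mount that masks a read-only shared image with a small per-node writable branch looks roughly like:
<pre>
# writable per-node branch listed first so its files mask the shared read-only image
mount -t unionfs -o dirs=/union/rw=rw:/nfsroot/lf=ro unionfs /union/root
</pre>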
==Documentation and bug reporting==

* To report a bug, please use [http://cluster.earlham.edu/bugzilla/enter_bug.cgi?product=lf-ppc Bugzilla]

We are documenting the production of Little-Fe in two ways:  First, bugs are filed in Bugzilla (and hopefully fixed).  Second, we're putting flesh on a set of instructions to build a replica of the original Little-Fe PPC.  The latter is probably (but not necessarily) based on the former.  The distinction is mainly that Bugzilla will show how things were done wrong, while the wiki-based instructions will show how to do things right the first time.
==Adding/Setting up a new node in the Debian Unified Root==
These instructions are current as of May 19, 2006.

===Server Configuration===
* add the MAC addresses for the 100Mbit and Gbit network interfaces to <code>/etc/dhcp3/dhcpd.conf</code> (see the sketch below)
* restart dhcp with <code>/etc/init.d/dhcp3-server restart</code>
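As a sketch of what those dhcpd.conf entries can look like (the host names, MAC addresses, and IP addresses below are placeholders, not the cluster's real values):
<pre>
host lf3-100 {
  hardware ethernet 00:11:22:33:44:55;   # placeholder MAC for the 100Mbit interface
  fixed-address 192.168.10.13;           # placeholder address
}
host lf3-gig {
  hardware ethernet 00:11:22:33:44:66;   # placeholder MAC for the Gbit interface
  fixed-address 192.168.11.13;           # placeholder address
}
</pre>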
===Client Firmware===
These are the current client firmware settings necessary to boot <code>lf[1-n]</code> via the Debian unified root setup.  These must be set on every single client node in order to netboot successfully.  If they are not there already, add or correct the following lines in <code>nvedit</code>:
<pre>
setenv boot-device eth:dhcp,0.0.0.0,,0.0.0.0
setenv boot-file vmlinuz-2.6.15.6 init=/linuxrc root=/dev/nfs ip=dhcp console=ttyS1,115200n1
</pre>
After this is set up, type <code>setenv auto-boot? true</code> at the main firmware prompt (not in <code>nvedit</code>).  Reboot to read in the new environment variables, or set them manually and then type <code>boot</code>.
 
==Creating a new Little-Fe PPC==

Follow the instructions on the [[Diskless Cluster Setup]] page.

==Related pages==
* [[Pegasos|Pegasos (ODW) notes]] - firmware info, including disk and net booting; serial console info; etc.

=Fossilizing the BCCD=
Your mileage may vary, and we are not updating this page any longer.  We use it internally for reference, but we are now working on Liberating the BCCD (see the main Cluster Computing Group page).

This section outlines the steps required to disassemble a BCCD ISO, unpack it onto a hard disk drive, and boot from that hard drive.  Most or all of this '''must be done as root'''.

==Mount the Images==

These scripts, used for the lnx-bbc project, might prove to be helpful in working with the BCCD images:
[[BCCD:FossilScripts|FossilScripts]]
===The Basic Images===
<pre>
cd /mnt # or wherever
mkdir bccd
mount -t iso9660 -o loop bccd-ppc-2005-08-30T00-0500.iso bccd

# on PPC
mkdir initrd
gunzip < bccd/boot/root.bin > initrd.ext2
mount -t ext2 -o loop initrd.ext2 initrd

# on x86
mkdir lnx
mount -o loop bccd/lnx.img lnx
mkdir root
gunzip < lnx/root.bin > root.ext2
mount -o loop root.ext2 root
</pre>

===The singularity===
'''First, decompress the singularity with the cloop utility <code>extract_compressed_fs</code>:'''
<pre>
wget http://developer.linuxtag.net/knoppix/sources/cloop_0.66-1.tar.gz
tar xzf cloop_0.66-1.tar.gz
cd cloop-0.66
vim Makefile # add APPSONLY=1 at the top
make zcode
make extract_compressed_fs
./extract_compressed_fs ../bccd/singularity > ../singularity.romfs
cd ..
</pre>
The latest currently-available version of cloop (2.01) doesn't work for this purpose; others might (I didn't experiment), but 0.66 definitely does.
 

'''Next, mount the singularity (you must have romfs support compiled into the kernel):'''
<pre>
mkdir singularity
mount -t romfs -o loop singularity.romfs singularity
</pre>

==Extract the singularity==
<pre>
cd singularity
tar cf - . | (cd /path/to/destination/partition; tar xvf -)
</pre>

==Create a working initrd==
Create an initrd for fossilized booting with the linuxrc at http://ppckernel.org/~tobias/bccd/linuxrc:
<pre>
cd /mnt/root # or wherever you mounted root.ext2 (from root.bin)
wget http://ppckernel.org/~tobias/bccd/linuxrc # replace the existing linuxrc
chmod a+x linuxrc
cd ..
umount root
gzip < root.ext2 > /path/to/destination/partition/boot/root.bin
</pre>

==Edit singularity-init==
===Add / remount read-write hook===
Edit <code>/sbin/singularity-init</code> to remount / read-write during init, using the following commands:
<pre>
debug "Remounting / read-write..."
mount -o rw,remount /dev/root /
</pre>
This can be placed somewhere around the proc mount command.

===Prepare for Fossilization of /mnt/rw===
Comment out the lines concerning /mnt/rw:
<pre>
# mount -n -t tmpfs none /mnt/rw
</pre>

===Add network setup to singularity-init===
<pre>
# static address for this node; adjust the address and netmask to your network
ifconfig eth0 inet 192.168.10.1 netmask 255.255.255.0 broadcast 192.168.10.255 up
route add default gw 192.168.10.1 eth0
</pre>

==Configure the bootloader==
Configure your bootloader (e.g., yaboot, lilo, or grub) as follows:
* boot the kernel <code>/boot/vmlinux</code> on PowerPC or <code>/boot/bzImage</code> on x86
* use the initrd <code>/boot/root.bin</code>
* execute the init script <code>/linuxrc</code>

Here is a sample [http://ppckernel.org/~tmcnulty/bccd/lilo.conf lilo.conf].
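To give a rough idea of the shape of such a configuration (this is only a sketch, not the linked sample; the disk, partition, and label names are placeholders you must adapt):
<pre>
boot=/dev/hda               # placeholder: disk that receives the boot loader
prompt
timeout=50

image=/boot/bzImage         # fossilized BCCD kernel (x86)
    label=bccd
    initrd=/boot/root.bin   # the initrd rebuilt above
    append="init=/linuxrc root=/dev/hda1"   # placeholder kernel arguments
    read-only
</pre>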
 
 +
===Setup Compatibility Nodes===
 +
Add the following to /linuxrc:
 +
* /sbin/devfsd /dev
 +
 
 +
==De-Obfuscation==

===Remove Unneeded Symlinks===

The BCCD is now on a different (read/writeable) medium, a hard disk, so we can un-obfuscate some of the workings.  An <code>ls -l</code> on / will reveal a few symlinks: /etc, /home, /local, /tmp, and /var, each pointing to a corresponding directory in /mnt/rw.  Because the CD is not writeable, the stock BCCD creates a ramdisk, copies files from /etc.ro/ to /mnt/rw/etc/ (and likewise for the other directories), and the /etc symlink then points at a writeable location.

Here's the works:
<pre>
rm /etc /home /local /tmp /var
mkdir /etc /home /local /tmp /var
cd /etc.ro   && tar cf - . | (cd /etc/;   tar xvf -)
cd /home.ro  && tar cf - . | (cd /home/;  tar xvf -)
cd /local.ro && tar cf - . | (cd /local/; tar xvf -)
cd /tmp.ro   && tar cf - . | (cd /tmp/;   tar xvf -)
cd /var.ro   && tar cf - . | (cd /var/;   tar xvf -)
</pre>

You're almost done, except you should remove the place in the scripts where the bootup copies the files from /<dir>.ro/.  Just comment out the lines in /sbin/singularity-init that do the copying (around line 105):
 

<pre>
# cp -a /etc.ro /mnt/rw/etc
# cp -a /var.ro /mnt/rw/var
</pre>

While you're editing /sbin/singularity-init, also comment out these lines:

<pre>
# rsync -plarv /lib/mozilla-1.6/plugins.ro/ /mnt/rw/plugins/
# chmod 1777 /mnt/rw/tmp
# debug "Making /mnt/rw/tmp/build links"
# mkdir -p /mnt/rw/tmp/build/
# mkdir -p /mnt/rw/tmp/build/staging
# mkdir -p /mnt/rw/tmp/build/staging/singularity
# mkdir -p /mnt/rw/tmp/build/staging/singularity/image
# ln -s /lib /mnt/rw/tmp/build/staging/singularity/image/lib
</pre>

===Configure gcc Environment===

Though the BCCD is now fossilized onto the hard drive, the gcc environment does not know this, as it was compiled for the CD.  It will look for files in (effectively) /tmp/build/staging/singularity/image/lib ... the directories and symlink whose creation we just commented out.  Since /tmp is now a fossilized directory, just create the symlink inside of it:

<pre>
mkdir -p /tmp/build/staging/singularity/image
cd /tmp/build/staging/singularity/image/
ln -s /lib
</pre>

==TODO==
* fix the mounting commands so that / is only mounted once (?)
* decide how to handle directories like /etc that are mounted in ram at /mnt/rw/etc and populated with items from /etc.ro (leave as is, or create a script to simplify the setup for hard disk booting?)
** Kevin's done this, we just need to document it
*** DONE
* modify the init scripts to make them appropriate for hard disk booting (e.g., remove the "Enter a password for the default user" prompt)
** This appears to be done
* finish setting up networking
* create a patch against the original singularity image for /sbin/singularity-init and other modified configuration files, to automate the fossilizing process
* package up any binary additions with list-packages (see the package instructions in the wiki)
* last but not least, keep track of all the changes we make!

Good luck!  Direct questions and comments to [mailto:tobias@cs.earlham.edu tobias@cs.earlham.edu].
 
=Intel Letter=
Dr. Stephen Wheat, Director<br>
HPC Platform Office<br>
Intel, USA<br>

Dr. Henry Neeman of the OU Supercomputing Center for Education & Research (OSCER) suggested that we write you about the following issue.

For the past several years, the National Computational Science Institute (www.computationalscience.org) has been teaching workshops on Computational Science & Engineering, and on Parallel & Cluster Computing, to hundreds of faculty across the United States.  Our subteam has taken responsibility for teaching the Parallel & Cluster Computing workshops, including three held at the University of Oklahoma, co-sponsored by OSCER and hosted by Dr. Neeman.  He believes that there may be substantial synergy between our goals and Intel's.

Recently we have been tasked by the SuperComputing conference series with designing and implementing the education program for the SC07-SC09 conferences.  As you may be aware, the overwhelming majority of the High Performance Computing (HPC) resources currently deployed are dedicated to research rather than education -- yet the nation faces a critical shortage of HPC expertise, largely because of the lack of a broad enough base of university faculty trained in HPC pedagogy.

To address this situation, our group spends a significant portion of our time designing and implementing software and hardware solutions to support teaching parallel and cluster computing and CSE.  The Bootable Cluster CD (http://bccd.cs.uni.edu) and Little-Fe (http://cluster.earlham.edu/projects.html) are two manifestations of our work.  The BCCD is a live CD that transforms an x86-based lab into an ad-hoc computational cluster.  Little-Fe is an inexpensive, portable, 4-8 node computational cluster.  The principal cost component of the Little-Fe design is the motherboard and CPUs.  Our design is based on small form-factor motherboards, such as the Intel D945GPMLKR Media Series boards.

In order to support computational science curriculum development and delivery, we are gearing up to build a number of Little-Fe units, approximately 20, for use by science faculty across the country.  These faculty members, working with their undergraduate student researchers, will develop curriculum modules and deliver workshops and presentations in a variety of venues.  The curriculum and workshops are preparatory activities for the education program we are implementing for SC07-SC09.

Because of financial considerations, we currently find ourselves forced to use low-cost non-Intel components in our Little-Fe units.  However, we are aware that Intel has been a longtime supporter of HPC research and education, and that you in particular have been an advocate for precisely the kind of work that our team has been pursuing.

In light of these points, we wonder whether Intel might be interested in either donating a number of these boards and CPUs or permitting us to purchase them at a discount.  In exchange we could provide Intel with appropriate credit, both on the physical units and in our articles about the project.

Thank you for your time.

Paul Gray<br>
David Joiner<br>
Thomas Murphy<br>
Charles Peck
 
==Intel design==
Intel® Desktop Board D945GPMLKR Media Series

http://www.intel.com/products/motherboard/d945gpm/index.htm

* microATX form factor (9.60 inches by 9.60 inches [243.84 millimeters by 243.84 millimeters])
* One 10/100/1000 Ethernet interface and one 10/100 Ethernet interface
  
 
[[HPCWireArticle|HPC Wire article (very stale now)]]

Hardware Manifest

FAQ

[[LittleFe:Setup|Setup]]
  
=To Be Sorted=
 
* [[Design Notes|Design Notes]]
 
* [[LittleFe:Initial Setup|Initial Setup]]
