CiSE-EOT-Article

From Earlham CS Department
Revision as of 22:45, 17 January 2008 by Charliep (talk | contribs)

Overview

CiSE Special Issue on SC/HPC Education

  • Computing in Science and Engineering (CiSE) is publishing a special issue in 2008 to highlight the role of education in developing a skilled and knowledgeable workforce capable of harnessing the power of tomorrow's high performance computing (HPC) infrastructure to solve global scale, multi-disciplinary problems critical to society and to the world.
  • The 2008 CiSE special issue on HPC will incorporate successful and innovative strategies from high school through graduate school, from all fields including traditionally under-represented fields of study, and from all institutions. Addressing society's most pressing and complex problems will only be realized with the next generation of scientists, technologists, engineers and mathematicians being well educated and experienced in adapting and using the ever-advancing HPC environments. We are interested in HPC educational problems and solutions facing all people, in particular minorities, women, and people with disabilities.

Background Information

Old Notes

The Paper

Title

Essential attributes of delivering successful HPC EOT

Authors

Dave Joiner, Charlie Peck, Tom Murphy

Introduction

  1. While the current models for delivering education, outreach, and training (EOT) in the high performance computing (HPC) realm have served the community well, they often lack key attributes necessary to support the next generation of HPC users. The next generation of HPC users is going to be more diverse along many axes (enumerate).
  2. Once upon a time we described HPC education as broken…is it still?
  3. We predicted wonderful things coming down the pipe. What has come down the pipe, and how has it evolved from our original ideas?
  4. Here we focus on the mechanisms and attributes of the delivery of good HPC EOT, not on the content.
  5. In order to broaden participation in HPC and computational science generally, more of our EOT efforts need to be directed towards faculty and students at smaller institutions, faculty who are primarily teachers, working with the next generation of scientists. These users require materials designed for learning how to use HPC and computational methods at a variety of levels, not just as the tools of a research scientist.

Challenges of modern HPC EOT

Who are the stakeholders?

A little knowledge is said to be a dangerous thing. Imagine how dangerous you can be with a little knowledge of CS, math, AND physics.

Computational Science and Engineering is by its very nature multi-, inter-, and cross-disciplinary, with practitioners in the fields of mathematics, computer science, and domain specific sciences having to work together (or at least use the product of each other's work) in a tightly knit collaboration.

For years the traditional training of computational scientists has been left largely to graduate advisers and a few specialized courses. However, there is a growing trend towards teaching computation at the undergraduate level. The Krell Institute attempts to survey and list every existing computational science program, and their list shows 16 undergraduate degree programs specifically in computational science, and tens of schools with significant coursework available for undergraduate students. However, new programs are being developed and offered every year, and the current list often does not include these new programs or programs outside of traditional departments, such as multi-disciplinary centers (cite private communication). In New Jersey alone, there are 2 undergraduate degree programs, with a 3rd in the pipeline, that do not show up on the current list.

(cite Krell survey, can we get a more recent number/reference? I know of at least 3 programs in NJ alone not on the Krell list)

(can we contact Krell and make sure we are looking at the most recent list? Can we get their choice of a reference for the list?)

Possibly the biggest challenge to getting computation into the undergraduate curriculum is the politics. If you are planning on developing a computational physics degree, for example, with one third of the coursework in CS, one third in physics, and one third in mathematics, which department gets credit for the degree, and why would the other two departments have any incentive to modify their curriculum for what will likely initially be a low enrollment new program in another department? No one gets tenure, after all, for making another department's classes better.

Creating a separate department has its own hurdles, jumping from the frying pan of owning only one third of the student's courses to owning significantly less than one third, as the bulk of the student's coursework is still going to come from courses in existing departments in CS, math, and <name your science here>.

In addition to traditional undergraduate and graduate students, we also have to be concerned with the ongoing process of retraining existing professionals for whom the advance of computing was unforeseen. As computing continues to encroach into new fields and into old processes, the existing workforce faces the challenge of catching up. This training largely falls on professional organizations.

Finally, there is a large body of non-scientists who lack computational literacy. This is not exactly new; we've been dealing with a mathematically illiterate public for some time. However, computational literacy in some ways has lower barriers than mathematical literacy. The "eye-candy" factor cannot be ignored, both of technology with blinking lights and of computational products that are animated and less abstract than mathematical products. A 3D animation of the formation and motion of a cloud system, for example, can convey information to the novice that is more immediate than an isocontour plot of pressures across a flattened map.

Broader engagement

  1. For the purposes of EOT the current working definition of "underserved group" is too narrow. For the next generation of EOT it not only needs to cover the traditional areas of gender and race but also geography, specifically including rural areas which generally have not been well served by either technology build-out or educational efforts. These communities are full of smaller colleges and universities which could make very good use of both EOT offerings and computational resources.
  2. If we are to broaden access to computational methods for a wide range of traditionally underserved groups (TUGs), the next generation of HPC EOT will need to do a better job of supporting first generation HPC consumers. This audience has much more modest needs for computational power but requires more human capital in the form of support for workshops, virtual rounds, curriculum materials, and software interfaces optimized for HPC pedagogy.

New fields

  1. Humanities, arts, and social sciences
    1. The disciplinary focus of HPC EOT activities has largely been in the natural sciences. Going forward we should be looking to engage a much wider audience, particularly building on efforts in the humanities, arts, and social science (HASS) communities. This will require changes at both the high level, making the language and processes of Grid computing more accessible to this wider audience, and at lower levels where software interfaces, curriculum materials, and support structure will need to be provided for those disciplines.
  2. Engineering

Elements

Built on a nationally recognized curriculum, e.g. CSERD

As is often the case in the era of the web, the problem is not so much a lack of lessons and activities as an inability to find them. They often are out there if you know where to look. However, the quest is time-consuming and the quality varies. Teachers need a place where they can go to find the good stuff.

This is a big challenge, however, as there is little agreement on what the good stuff is. Not only that, there is no agreement on criteria by which to judge the good stuff, and there is a debate within the library community as to whether judging quality is even an achievable goal.

There has been some progress on this front, however. A number of efforts have been underway to define standards of quality for a nationally recognized curriculum.

(cite educational technology standards, cite Ralph Regula standards. Describe)

Additionally, efforts are underway in the digital library community to apply a recognized set of standards for computational modeling (Verification and Validation) to computational lessons (Verification, Validation, and Accreditation). The Computational Science Education Reference Desk (http://cserd.nsdl.org) is collecting computational science activities, organizing and meta-tagging them, and sharing them through the National Science Digital Library. In addition, CSERD is applying the principles of Verification, Validation, and Accreditation to provide a comprehensive list of reviewed computational science activities for classroom use.

Ubiquitous access to supercomputing resources

Access to activities, however, is meaningless without equal access to computational resources. Giving students access to high performance resources not only allows them to practice HPC skills, but also gives them an opportunity to see the computing power that will be on the desktop when they are young professionals. This trend has held true recently not only in raw computing power but in architectural design, as the move towards commodity clusters in the HPC market has been mirrored by the move towards multi-core architectures in the desktop market.

Access to HPC hardware at most institutions, however, is limited to a small group of faculty and graduate students performing research, if the hardware exists on campus at all. What campuses do have in spades is Internet-connected PCs. Students can make use of those PCs in three ways.

First, and most destructively, they could reformat the hard drives on the PCs and install open source Linux operating systems and parallel computing tools. This is a bit extreme, to be sure, but it has often been done with machines being down-sized from the campus computing pool. A variation on the ubiquitous leftover cluster is the portable cluster, possibly made out of parts stripped from old machines, possibly built from newly ordered parts. The design we have been using is called "LittleFe," a play on "Big Iron," the common nickname given to supercomputers. Essentially, you remove the bulkiest and heaviest component of the commodity cluster, the cases. Using plywood-mounted, micro-sized motherboards on aluminum frames, we have created portable clusters of 4-8 nodes, for the cost of a single laptop, that are ruggedized for travel and can be checked as airline luggage.

Second, and less destructively, students could use live-CD operating systems or network-booted systems to temporarily boot lab PCs as diskless nodes. The Bootable Cluster CD is a live-CD clustering solution designed for campus PC labs. A variation of the live-CD solution is the virtualized cluster, where students run a virtual computer node on lab PCs. The virtualization solution has benefits and drawbacks: it can be always on and doesn't interfere with normal lab use, but it often masks the hardware underneath and does not allow students to make full use of the machines.

Finally, and least destructively, students could use the Internet-connected PCs to log on to resources elsewhere in the world and do their work there, through the use of grid middleware.

Including undergraduate and graduate students as assistant instructors

  1. Cost-effectively supports a lower instructor-participant ratio
  2. Broadens our reach in terms of age, gender, racial, and discipline diversity


Authenticity of experience

For the student, authenticity in the problems they face and in the experience of their education is important. We use problem-based learning as one approach to increase the authenticity of the student experience, through the creation of "supercomputer based labs" in which students make use of professional computing tools and libraries to solve problems related to course content and learning goals.

The use of problems as a teaching tool in HPC is particularly interesting given the cross-disciplinary nature of the field. Students solving the same problem need to be able to master at least some aspect of another field, but different students will still take away and give to the problem from their own perspective.

We have a series of lessons based around the study of protein structure using GROMACS. Depending on the student's perspective, many different questions could arise from the same problem. What is the best way to optimize compiler directives for a given hardware? What hardware configuration results in the greatest performance for a given cluster? How does that relate to computing power per dollar or per megawatt?

Broad range of materials aimed at a variety of starting points

  1. Bringing the interfaces to the users, rather than bringing the users to the interfaces.


A current effort using these tools is a workshop with middle and high school teachers from the Office of Diné Science, Math and Technology, who are responsible for education within the Navajo Nation. The workshop is focusing on developing computational awareness using the Shodor Foundation's Interactivate tools, which have been an ongoing fertile entry point into computational science for teachers and students for over a decade. The workshop is also training teachers to use existing Vensim and NetLogo system dynamics models with their students. Using the tools to run existing models is a wonderful way to ease students towards being able to develop their own models. The system dynamics tools are complex and sophisticated, but students are quite capable of following narrow cookbook usage instructions, on which they can expand depending on their interest and ability.

An intro to computational science course was taught at Contra Costa College in Fall 2007 to a class of predominantly high school sophomores. Middle College High School is collocated at the community college, where its students are able to take college courses. A handful of the students were ultimately able to competently develop Vensim models, after getting over the mental barrier that they were doing integral calculus without the benefit of a calculus or even a pre-calculus foundation. System dynamics codes rely on understanding difference equations. If we had the ability to turn mathematics education inside out, we would precede a course in calculus with one on system dynamics modeling. Integral calculus would then be more easily understood as taking the size of the time-step to zero, rather than just taking it as close to zero as necessary for the model to produce accurate results relative to the acceptable amount of error. The biggest difficulty with the class was that it hinged on a student being able to reason their way to a solution, rather than regurgitating desired data. This was something of an epiphany for the instructor, which will facilitate the next teaching of the course. It is also a strong argument for formally including system dynamics modeling at the high school level to foster higher reasoning skills.
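The link between difference equations and integral calculus described above can be illustrated with a minimal sketch. It is not taken from the course materials or from Vensim; the function names and the linear inflow rate are assumptions chosen only so that the exact answer is known. A single stock is advanced with the difference equation stock[n+1] = stock[n] + flow(t[n]) * dt, and the result is compared with the exact integral as the time-step shrinks.

```python
def simulate_stock(flow, t_end, dt, stock0=0.0):
    """Advance a single-stock model with the difference equation
    stock[n+1] = stock[n] + flow(t[n]) * dt."""
    steps = round(t_end / dt)
    stock = stock0
    for n in range(steps):
        stock += flow(n * dt) * dt
    return stock


def inflow(t):
    # Hypothetical inflow rate that grows linearly with time.
    return 2.0 * t


# The exact integral of 2t from t=0 to t=1 is 1.
exact = 1.0
errors = []
for dt in (0.1, 0.01, 0.001):
    approx = simulate_stock(inflow, t_end=1.0, dt=dt)
    errors.append(abs(exact - approx))
    print(f"dt={dt} stock={approx:.4f} error={errors[-1]:.4f}")
```

For this particular inflow the error is proportional to the time-step, so each tenfold reduction in dt brings the model tenfold closer to the integral. A modeler stops when the error is acceptable; calculus takes the limit.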


Long-term engagement through repeated contacts at workshops and other events

Another important factor in authenticity is in relation to professional activity, and a key factor in professional activity is engagement in professional organizations. Through workshops hosted by the National Computational Science Institute, the SCXY conference, and other organizations, we have worked with over XXXXX faculty in fields including XXXX, XXXXX, XXXX, and XXXX.


Through the NCSI/SC workshops, we have seen a number of repeat participants, with our most successful participants coming to multiple workshops and progressing in their level of participation throughout. Typically our participants will show up the first time with a desire to see a lot of things and get a lot of hands-on instruction. We know, however, that we have made a real impact when our participants come back to the workshop and tell us that they would really just like a machine in the corner, our help when they need it, and one week not in their office in which to get work done.

Year-round support via Virtual Rounds (ref Henry's paper)

Participation of all stakeholder disciplines

We can report some success at Kean with one approach to the challenge of bringing different departments together, and that is to focus not just on the undergraduate, but on the introductory undergraduate. The New Jersey Center for Science, Technology, and Math Education is a multi-disciplinary home for a series of science and math based degree programs which have as their core an integrated approach to math, computation, and science. By combining a number of new targeted degree programs, the challenge of low enrollment that a new degree program traditionally faces is mitigated, allowing us at the introductory level to offer special sections of math, CS, physics, biology, and chemistry--allowing the faculty who have long said that they wanted <name your course in another department here> taught differently to live and breathe in the same department as the faculty member actually teaching that course, with an expectation that people in different disciplines would (*gasp*) talk to each other.

One outcome of this is that we can attest to the value of working with faculty in multiple departments in any political process aimed at incorporating computation into undergraduate education. It's a lot easier to go into a <physics, chemistry, biology> class planning on teaching a lesson requiring skills in numerical <calculations of cross products, integration of the changing pressure in a compressing cylinder, correlation between presence of a specific genotype with phenotype> if the students have seen computational environments and used them when learning <linear algebra, calculus, statistics>.

Another component of broadening engagement is the consideration of novel delivery mechanisms. To this end we have begun to work with 3D Internet technologies, specifically metaverses, as a tool for EOT. There are three specific models we are considering in this space: the attractor model, the rounds model, and the science based interactive simulation model.

Attractor -

Rounds -

Science Simulation -

Big audience (Second Life) vs open toolset with more appropriate primitives and controls (Qwaq/Croquet).

Support

  1. Why should you believe what we say?

Conclusion

  1. Unanswered questions
  2. Directives to readers

Possible References

References - check these to make sure what we're saying is either new or emphasized

  • CSERD
  • BCCD
  • LittleFe
  • PPoPP paper
  • HPC Wire articles
  • TeraGrid user statistics
  • Scaffolding paper
  • Henry's rounds paper (is there such an animal?)
  • AJP paper
  • Something Wonderful This Way Comes (full article), a CiSE article from Paul and Tom
  • Mary Beth's papers
  • Ralph Regula school standards