From Earlham CS Department
Math/CS 484 -- The goal of our Ford/Knight project is to distill and organize the principles of visualizing large data sets. Modern science is often done by small groups of people who come from diverse backgrounds, e.g. a mathematician, a biologist, and a computer scientist. We plan to solicit input, in the form of example data sets to work with, from each of the natural and social science departments on campus. This work will provide a foundation for a course, or course module, which we hope to offer in the future. Must see instructor for registration.
16) Course Reflection
In addition to the standard evaluation form, please reflect for a bit and write up a short response that addresses these questions.
- How did the course compare to your expectations of the course?
- What did you find most interesting/useful? Least interesting/useful?
- What do we need to "package" so that other students or faculty could gain from what we've done?
- What's the best format for delivering this material? In-situ for a class or classes? 1 credit class, etc. On-demand sessions?
Please turn this in with your course evaluation form to the envelope in Bobbi's office before the end of the day on Tuesday 11 December.
15) Third Visualization Project
Find a story and build a visualization to support it. You may choose the data sets, although you must incorporate at least three. You can choose to analyze/visualize one variable over multiple data sets or multiple variables over multiple data sets, include geocoding or not, etc. Find the common thread(s) that tie your data sets together and tell the story you want to tell.
Work in pairs:
- Mobeen and Ivan
- Dee and Leif
- Emily and Tristan
- Ryan and Elena
- Mikel and Alex
Use one or more of these toolchains:
Write up a plan for your work, including a short description of the story you are telling, the specific data sets employed, and a sketch of the visualization. This is due in class on Tuesday 4 December. Please bring a printout of your plan to class. Come to class on Thursday 29 November with questions, ideas, etc.
The final visualization (PDF, etc. and script(s)) is due in class on Thursday 6 December. Come to class prepared to give a crisp (< 8 minute) presentation about your visualization. We will be advertising this class session to science students and faculty and encouraging them to attend by bribing them with free pizza.
14) Second Visualization Project Redux
Take the feedback you received on Tuesday morning and, working with your partner, improve your ice core data set visualization. Due in class on Thursday 15 November. Remember to upload your modified script and PDF, PNG, etc. to the wiki.
13) Second Visualization Project
Find a story and build a visualization to support it based on ice core data sets. There are many available, e.g. from multiple locations in Antarctica and other locations. These data sets typically contain depth, measurements of particulate matter, atmospheric chemical compositions, and various climate and date proxies. You can choose to analyze/visualize one variable over multiple locations or multiple variables over a single location. Include at least three dimensions, e.g. location on the earth, depth/date, and climate proxy, or depth/date, chemical marker, and climate proxy, etc.
Work in pairs:
- Mobeen and Mikel
- Dee and Leif
- Emily and Tristan
- Ryan and Elena
- Ivan and Alex
Use one or more of these toolchains:
Write up a plan for your work, including a short description of the story you are telling, the specific data sets employed, and a sketch of the visualization. This is due in class on Tuesday 6 November. Please bring a printout of your plan to class.
The final visualization (PDF, etc. and script(s)) is due in class on Tuesday 13 November. Come to class prepared to give a crisp (< 5 minute) presentation about your visualization.
12a) gnuplot Redux
Take the feedback you received on your gnuplot visualization and rework it. Put the updated output and the script, appropriately labeled, in the usual place. Due before the start of class on Tuesday 6 November.
12) Getting Started with gnuplot
Due in class on Tuesday 30 October
- Identify 3 (or more) data sets that you can use to tell a story with an environmental theme.
- Develop your visualization using at least 25 unique commands in your gnuplot script.
- Use color, bonus points for 3D.
- Post your script and the output (PNG, JPG, etc.) on the student solutions wiki page /before class/ on Tuesday 30 October.
- Come to class on Tuesday prepared to give a < 5 minute crisp presentation about your visualization.
- You should know what your theme/data sets are by class on Thursday 25 October.
11) Science Magazine Review
Due in class on Thursday 25 October
- Browse the issue of Science that is on reserve for this class in Wildman. Find what you believe is a really well done viz and a really poorly done one. Come to class prepared to give a short (< 5 minute) tour of the two, explaining what they are, why the first is good, and why the second is bad.
10) Getting Started with R
Due in class on Thursday 18 October.
- First R lab - Post your first R visualizations /before 12p on Thursday/ to the student solutions page on the wiki, and then during class on Thursday you should briefly describe/discuss each in turn (a maximum of 5 minutes each). Make sure you watch the time so all of you have an opportunity to present your work.
- Explore, or re-explore as the case may be, the R galleries. Look at the scripts that produce the visualizations and figure out how you might leverage some of those patterns.
Due in class on Tuesday 16 October.
- Chapters 1 and 2 in Designing Data Visualizations (previously assigned)
- Chapters 1 and 2 in Visualize This (previously assigned)
- Overview, Form and Structure, Process and Time in Visual Strategies (previously assigned)
- Part II (chapters 3, 4, 5, 6) in Designing Data Visualizations
8) First Visualization (redux)
Due in class on Tuesday 9 October. Use the feedback you received from the class and the professors to refine and improve your first visualization. Post the revised version using your placeholder on the Student Solutions page and bring a printout of it to class. Come to class prepared to give a crisp 4-minute before-and-after presentation to the class.
Finish the reading that was assigned earlier.
7) First Visualization
Due in class on Tuesday 2 October, both a printout and the visualization posted on the wiki. Come to class prepared to spend about 5 minutes presenting your viz to the class on Tuesday morning.
6) Plan for First Visualization
The write-up of the plan for your first visualization project is due in class on Tuesday 25 September. This should include:
- The question you are going to answer or story you are going to tell
- The data sets you will use (including URLs if available)
- Any numerical summaries you will produce
- A hand drawn draft of the visualization
To prepare for this, you should read/watch the following items before you design your visualization or write up your plan.
- David McCandless:
- The beauty of data visualizations (TED) - http://www.ted.com/talks/david_mccandless_the_beauty_of_data_visualization.html
- Military spending - http://www.guardian.co.uk/news/datablog/2010/apr/01/information-is-beautiful-military-spending
- Chapters 1 and 2 in Designing Data Visualizations (on reserve in the science library)
- Chapters 1 and 2 in Visualize This (on reserve in the science library)
5) Second Critique Tour
- For this critique tour we will use IBM's Many Eyes project, http://www-958.ibm.com/software/data/cognos/manyeyes/. Before you start, spend a minute looking around the site and exploring the data sets, tools, etc. that are available.
- Browse the visualizations focusing on ones based on scientific data/questions, http://www-958.ibm.com/software/data/cognos/manyeyes/visualizations?sort=rating
- Identify three (or more) visualizations that share a theme, question, or underlying data set(s). Use the evolving guidelines, Evaluating Infographics, to produce a critique of each of the visualizations that you choose. Write up each of those critiques.
- Due in class on Thursday 20 September.
4) First Critique Tour
This assignment is to be done in class on Tuesday 11 September, 2012. In pairs, review and critique one of these infographics from http://visual.ly/
- Human Languages on the Internet - Ivan, Mikel
- The Internet in 2015 - Leif, Dee
- Worldwide Internet Usage - Elena, Emily
- Technology and eCommerce - Tristan, Alex
- Responsive Web Design - Mobeen, Ryan
Each group should:
- Evaluate the infographic using the criteria listed below.
- Locate a second infographic, on Visual.ly or elsewhere, that covers roughly the same ground and evaluate it similarly.
- Prepare and deliver a 4 minute presentation which summarizes your findings during the last portion of class this morning.
Consider the guidelines we are developing, Evaluating Infographics, as you examine the infographics.
3) First Workshop - Histograms
This assignment is designed to consolidate your knowledge of histograms and give you experience generating one with a modest data set. You must do the work by hand; you may optionally use a software tool to produce it as well. Make sure you document each step of your work. This workshop is due Thursday 13 September.
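For anyone who wants to check hand-built bin counts with a software tool, a minimal sketch follows. It is written in Python purely as an illustration (the workshop does not prescribe a tool), the readings are made-up values, and the function name histogram_counts is hypothetical. It assumes equal-width, half-open bins and a non-empty data set.

```python
def histogram_counts(values, start, bin_width):
    """Tally values into equal-width, half-open bins [lo, lo + bin_width).

    Assumes `values` is non-empty and every value is >= start.
    Returns a dict mapping each bin's lower edge to its count.
    """
    counts = {}
    for v in values:
        index = int((v - start) // bin_width)  # which bin this value lands in
        counts[index] = counts.get(index, 0) + 1
    first, last = min(counts), max(counts)
    # Include empty bins between the first and last so gaps stay visible.
    return {start + i * bin_width: counts.get(i, 0)
            for i in range(first, last + 1)}


# Hypothetical readings, invented for this example only.
readings = [2.1, 2.4, 3.0, 3.7, 3.9, 4.2, 4.8, 5.5, 5.6, 6.9]
table = histogram_counts(readings, start=2.0, bin_width=1.0)

# Print a text histogram, one row per bin.
for lo, n in table.items():
    print(f"[{lo:.1f}, {lo + 1.0:.1f})  {'#' * n}")
```

The same tally can be done on paper: pick the bin edges first, then make one mark per reading in the bin whose lower edge it meets or exceeds. Comparing the two results is a quick check on the hand work.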
2) First Lab - Measuring the Real World
Measuring the real world, the PDF. This lab is due Sunday 9 September at 3p US-ET. Turn in a (BW) printout of your writeup and visualization, along with the URL of the on-line (color) version of the visualization if it is available. Put the paper copy in Charlie's Box A in the wooden tower in the Math/CS/Physics lounge on the West end of second floor of Dennis Hall at Earlham College in Richmond, IN, US (planet Earth).
1) First Reading and Tips and Techniques Tour
Listed below are the assignments for each chunk. Note that everyone should read the startup materials.
- Startup - Everyone
- Web site - Leif
- Making presentations - Mikel
- News graphics - Ivan
- Financial Data - Elena
- Decision making - Emily
- Narrative - Dee
- Aesthetics - Tristan
- Graphic design - Alex
- Scientific and engineering - Mobeen
- Animations - Ryan
As you read your chunks, look for bits of guidance, advice, technique, etc. that you feel are useful. Summarize each of these in our Tips and Techniques Google Doc. Make sure each entry contains an appropriate citation and follows the pattern/example at the top of the document. This tour is due Sunday 2 September.
Visualization Galleries (some with embedded tools, e.g. Many Eyes and Gapminder)
- Visually - http://visual.ly/
- IBM's Many Eyes - http://www-958.ibm.com/software/data/cognos/manyeyes/
- Tableau Public Visualization Software - http://www.tableausoftware.com/public/
- R Gallery - http://gallery.r-enthusiasts.com/
- R code for figures in the book _R Graphics_ -- http://www.stat.auckland.ac.nz/~paul/RGraphics/rgraphics.html
- Hans Rosling's Gapminder - http://www.gapminder.org/
- Thinking with Google - http://www.thinkwithgoogle.com/insights/library/infographics/
- R graphics tutorials from the author of Visualize This - http://flowingdata.com/category/tutorials/
- A very useful R blog:
- general, with some excellent examples - http://blog.revolutionanalytics.com/graphics/
- geographic maps - http://blog.revolutionanalytics.com/2009/10/geographic-maps-in-r.html
- Download a PDF copy of A Practical Guide to Geostatistical Mapping -- http://spatial-analyst.net/book/
- Amazon - http://aws.amazon.com/datasets
- Google - http://www.google.com/publicdata/directory
- US Census - http://www.census.gov/main/www/access.html
- Project Gutenberg - http://www.gutenberg.org/
- US Government public data - http://www.data.gov/
- UK Government public data - http://data.gov.uk/
- IBM's Many Eyes - http://www-958.ibm.com/software/data/cognos/manyeyes/datasets/
Advice and Technique
- http://mazamascience.com/WorkingWithData/?p=958 - A script based introduction to R
- Sourceforge, examples, manual - http://gnuplot.sourceforge.net/
- Wikipedia gnuplot diagrams (many with source) - http://commons.wikimedia.org/wiki/Category:Gnuplot_diagrams
- Tutorial, FAQ - http://t16web.lanl.gov/Kawano/gnuplot/index-e.html
- Project - http://www.gnuplot.info/
- Thursday 23 August
- Anscombe's data sets - http://en.wikipedia.org/wiki/Anscombe's_quartet
- Sunday 26 August (retrieve notes from board pictures)
- Relative error, absolute error, systematic error, and related topics
- Standard deviation
- Precision and accuracy
- Thursday 30 August (harvest from Mic)
- Tuesday 4 September (harvest notes from board picture)
- Thursday 6 September
- Answered questions about first lab.
- Demonstrated how to upload files to the wiki, used for lab reports in PDF form.
- Tuesday 11 September
- Discussion about when to aggregate, how many readings to take and related issues
- First critique tour (in-class)
- Thursday 13 September
- Last of the first critique tour presentations
- Discuss next critique tour
- Tuesday 18 September
- Thursday 20 September
- Tuesday 25 September
- In-class review and critique lab
- Thursday 27 September
- Return and review first lab
- Q and A about first visualization project
- Tuesday 2 October
- First visualization presentations
- Thursday 4 October
- First visualization presentations (two stragglers)
- Tuesday 9 October
- First visualization presentations (redux)
- Tuesday 16 October
- R tour
- Thursday 18 October
- Mic and Charlie were at the board meeting; the class reviewed R material
- Tuesday 23 October
- Thursday 25 October
- Tuesday 30 October
- Reviewed gnuplot lab, distributed Second Viz Project
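The error-analysis topics listed for Sunday 26 August (absolute error, relative error, systematic error, standard deviation, precision and accuracy) can be illustrated with a few lines of code. This is only a sketch under stated assumptions: the readings and the accepted value are invented, the helper names absolute_error and relative_error are hypothetical, and Python is used here for illustration even though the course labs use R and gnuplot.

```python
import statistics


def absolute_error(measured, accepted):
    """Size of the difference between a measurement and the accepted value."""
    return abs(measured - accepted)


def relative_error(measured, accepted):
    """Absolute error as a fraction of the accepted value."""
    return abs(measured - accepted) / abs(accepted)


# Hypothetical data: five readings of a quantity whose accepted value is 10.0.
readings = [10.2, 10.1, 10.3, 10.2, 10.2]
accepted = 10.0

mean = statistics.mean(readings)
spread = statistics.stdev(readings)  # sample standard deviation
offset = mean - accepted             # signed offset of the mean

print(f"mean reading        : {mean:.3f}")
print(f"standard deviation  : {spread:.3f}")
print(f"absolute error      : {absolute_error(mean, accepted):.3f}")
print(f"relative error      : {relative_error(mean, accepted):.1%}")
print(f"offset from accepted: {offset:+.3f}")
```

In this made-up example the readings cluster tightly (small standard deviation, so the measurement is precise) but sit consistently above the accepted value (so it is not very accurate). A consistent offset like this is one sign of a possible systematic error.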