Difference between revisions of "CS382:Unit-mashup"

From Earlham CS Department
(Lecture Notes)
(Quiz Questions)
 
(14 intermediate revisions by 4 users not shown)
Line 21: Line 21:
  
 
== Lecture Notes ==
 
* Introduction
+
==== Introduction ====
** At this point students have already created/worked with a couple models and created basic graphs to visualize them.  Talk about how even with just the simple models created so far, understanding the data is hard without having a visual representation of it.
+
At this point students have already created/worked with a couple models and created basic graphs to visualize them.  Talk about how even with just the simple models created so far, understanding the data is hard without having a visual representation of it.  
** Visualization is a graphical representation of data for the purpose of allowing humans to understand aspects of the data.
 
*** Couple of illustrative but basic graphs as examples.
 
**** [http://upload.wikimedia.org/wikipedia/commons/2/29/Minard.png| March of Napoleon]
 
**** [http://www.gapminder.org/| Gapminder]
 
** Tufte's aspects of visualization, just a run through (From "The Visual Display of Quantitative Information"):
 
*** Show the data.
 
*** Induce the viewer to think about the substance rather than about the methodology, graphic design, the technology of graphic production, or something else.
 
*** Avoid distorting what the data have to say.
 
*** Present many numbers in a small space.
 
*** Make large data sets coherent.
 
*** Encourage the eye to compare different pieces of data.
 
*** Reveal the data at several levels of detail, from broad overview to the fine structure.
 
*** Serve a reasonably clear purpose: description, exploration, tabulation, or decoration.
 
*** Be closely integrated with the statistical and verbal descriptions of a data set.
 
** Show some more complex examples like the Napoleon one, an interesting mashup.
 
* Issues of visualization
 
** Objective. There is always a goal or objective when visualizing by which one can judge effectiveness. For this class I don't think uses like marketing need to be mentioned, but the difference between using visualization to explore data and using it to explain data to others certainly should be.
 
** Data Selection. When given a set of data, one often wants to home in on a subset of that data to look at.
 
** Psychology. Visualization is fundamentally about how humans perceive visual information so you have to think about the ways in which you want to take advantage of human psychology.
 
** Systemization. While elaborate visualizations like the Napoleon one are very compelling, in Computer Science we are often more interested in visualizations that can be systematically generated.
 
* Go through a couple of examples of creating visualizations referring back to Tufte's list and the issues.
 
* Types of Visualizations  (A sampling)
 
** Tables
 
** Graphs
 
** Charts
 
** Sparklines
 
** Time Series
 
** Data maps and mashups
 
  
<font color="red">Seems a bit short.  Acquiring data, conditioning data, tools to use for those and visualization.  
+
Visualization is a graphical representation of data for the purpose of allowing humans to understand aspects of the data. Just having data isn't enough.  We have to understand what the data means.  Without visualization our models are a lot less useful because we have no other way to understand what's happening in our models short of complicated numerical analysis.  It also gives us a way to share useful information by compacting it into a powerful visualization that communicates all the information quickly.
  
Consider showing really good graphics (Napoleon, earthquake video, etc.) and really bad ones (Tufte's examples) as part of the lecture.  It is much easier to show good and bad than to explain it.</font>
+
Show [http://www.gapminder.org/| Gapminder] and go through a good example.  Talk about how this way of presenting the data makes information
 +
immediately obvious.
 +
 
 +
Edward Tufte was one of the early pioneers of Visualization.
 +
 
 +
Tufte's aspects of visualization, just a run through (From "The Visual Display of Quantitative Information"):
 +
* Show the data.
 +
* Induce the viewer to think about the substance rather than about the methodology, graphic design, the technology of graphic production, or something else.
 +
* Avoid distorting what the data have to say.
 +
* Present many numbers in a small space.
 +
* Make large data sets coherent.
 +
* Encourage the eye to compare different pieces of data.
 +
* Reveal the data at several levels of detail, from broad overview to the fine structure.
 +
* Serve a reasonably clear purpose: description, exploration, tabulation, or decoration.
 +
* Be closely integrated with the statistical and verbal descriptions of a data set.
 +
 
 +
==== Issues of Visualization ====
 +
 
 +
* Objective. There is always a goal or objective when visualizing by which one can judge effectiveness. For this class I don't think uses like marketing need to be mentioned, but the difference between using visualization to explore data and using it to explain data to others certainly should be.
 +
* Data Selection. When given a set of data, one often wants to home in on a subset of that data to look at.
 +
* Psychology. Visualization is fundamentally about how humans perceive visual information so you have to think about the ways in which you want to take advantage of human psychology.
 +
* Systemization. While elaborate visualizations like the Napoleon one are very compelling, in Computer Science we are often more interested in visualizations that can be systematically generated.
 +
 
 +
==== Types of Visualizations ====
 +
 
 +
We are all familiar with common visualizations like graphs and tables, but now with computers we can create much more advanced visualizations that give us more information.
 +
 
 +
* 3D physical models give us interactive 3D structures, potentially updated in real time as new information arrives, that present a lot of information while letting us choose which aspect of the data to focus on. For example, visualizing a model of a car crashing into something lets people actually see what's happening to all the different parts of the car, potentially from a number of different angles.
 +
 +
* Animated models allow us to use time as a dimension so we can put together more information.  The animation could either represent how the model changes over time, or represent some other aspect of the model.  The Gapminder visualization uses time to let us look at the different aspects of the world at each point in time, so that we can see not only how the different countries interact but also how those interactions have changed throughout the years.
 +
 
 +
* Data maps and mashups are another new type of visualization that comes from now having detailed maps of the world tied to a variety of different information sources.  We have everything from satellite photos to road maps to weather data to census data.  By combining these different sources on a single map, we can easily see how the different aspects correlate.  In the lab students will be collecting temperature data and mashing it with satellite images.
  
 
== Lab ==  
 
Use online tools to generate tabular data from the U.S. Census and then use Spreadsheet software and Google mashups to explore visualization.
+
The lab teaches spreadsheet visualization tools and Google Maps so that students gain, respectively, immediately practical skills and an alternate way to think about data.
 +
 
 +
High-level outline: Students will be instructed to sample temperatures at several different points within some region of campus. They will pick their own points, recording each with a provided GPS unit. They will then enter the data into a spreadsheet and graph the results using different kinds of graphs. After that, the groups will combine their data into one Google Map, putting in a pushpin for each sample point and coloring the pin appropriately.
  
 
==== Process ====
 
* Part 1
+
# With provided thermometer, go to your group's assigned region ( [http://maps.google.com/maps/ms?ie=UTF8&msa=0&msid=116517761457321127401.000467a02c69be02ec8e4&ll=39.822949,-84.913845&spn=0.006254,0.009656&t=h&z=17 Region Map] )
** Open up Spreadsheet with a small set of provided data. <font color="blue">Do you have the 'provided data' already?</font>
+
# Pick ten points in your region and sample their temperatures. Record each point's coordinates. Be sure to pick points such that you will get a variety of temperatures (e.g., spots shaded by trees and buildings, sunny parking lots, points near steam tunnel exhaust grates).
** Generate a sequence of 3 or 4 graphs and charts based on the data.
+
# Enter data into Open Office, with a column for coordinates and a column for temperature readings. Generate bar graphs based on this data.
*** For each graph identify pros and cons of that type of graph for this data.
+
# Average your temperatures into one value and add it to the collaborative class Google Docs spreadsheet. Graph the averages.
*** Identify best graph type for data.  <font color="blue">Will some of this have been covered already?  How will they know what they're looking for in pros/cons?</font>
+
# Using your datapoints, add push pins to the collaborative class Google Docs map. Color the pins based on temperature ranges. Work with the other lab groups to come up with a sensible color scheme based on the range of temperatures you've found.
* Part 2
 
** Open a browser to the [http://usa.ipums.org/usa/ | IPUMS website] <font color="darkmagenta">Break down the site navigation: Where do they need to go? How do they register? etc</font>
 
*** Select a specific (provided) set of criteria <font color="blue">What are the criteria?</font>
 
*** Generate the data based on the criteria <font color="blue">???</font>
 
*** Download data in text form
 
*** Import data into Spreadsheet <font color="darkmagenta">This too will need a step-by-step breakdown</font>
 
** Given your experience from Part 1, what type of visualization do you think is most appropriate and why.
 
** Generate a given type of graph.
 
*** What, if anything, interesting can you see from the graph?
 
** Use tool to generate google mashup. <font color="blue">This is very vague.</font>  <font color="darkmagenta">What is "tool"? Specify what the tools are, where we find them and how we use them</font>
 
*** What differences in terms of visual information are there between the graph and the mashup?
 
* Part 3
 
** Go back to the IPUMS site and generate data based on your own set of criteria.
 
** Create a graph and mashup as in Part 2.  Use general visualization guidelines to maximize the effectiveness of your visualizations. <font color="blue">What do you mean by 'general visualization guidelines'?  Or is this something that will have been covered at this point?</font>  <font color="darkmagenta">This section will be a lot more lucid with something for someone to sit down and try. Evaluation of their visualizations aside, we need to know: what are the provided data for part 1? How do we navigate the IPUMS site? What are these "criteria" for choosing the data? How do we get the data into a spreadsheet? What is "Tool"? (is it kind of like [http://www.penny-arcade.com/comic/2009/3/9/|"Book"]?) Where do we find Tool? How do we use Tool to make a mashup?</font>
 
  
 
==== Write-up ====
 
* Notes from lab including results from Part 3  <font color="blue">You might want to be a little more specific because people are lazy.  Or they might go, uh, I didn't take any notes...</font> <font color="darkmagenta">These notes might be their answers to the questions like "identify pros and cons" or "what's interesting about this graph"?</font>
+
* Which tool was most appropriate for visualizing this data and why?  
* What types of visualizations seem most appropriate for visualizing this type of census data and why?
+
* What was the most difficult part of the lab? Why?
* What could you determine from the visualizations you created in Part 3?
+
* Explore alternative ways to get the same data (instead of getting it yourself). Are there other data sources that would provide granular enough temperature data (Hint: use google)?
* What are the advantages and disadvantages of mashups vs traditional graphs?
+
* Describe the collaborative process with Google Spreadsheets and Google Maps. What was difficult? What was easier? Compare and contrast this experience with non-collaborative software. How would each group's data have been collected into one document otherwise?
* When creating the visualizations for Part 3, what guidelines did you follow to make the visualization more effective and how did you follow them?
 
* In what way could your visualizations be improved given more time or more advanced tools?
 
 
 
<font color="blue">This section is very good!</font> <font color="darkmagenta">Ditto</font>
 
  
 
==== Software ====  
 
* Spreadsheet software.
+
* Web Browser
* Web browser.
+
* Google account
* Access to IPUMS.
+
* Open Office
  
 
==== Bill of Materials ====
 
* None assuming we don't go and buy spreadsheet software and/or web browser.
+
* GPS
 +
* Thermometer
 +
 
 +
==== Lab Notes ====
 +
* It is impossible to create 3 axis graphs in open office.  Perhaps add more detail on how to create a working bar graph in OO.
 +
* GPS is mighty inaccurate.  I am working on trying to get the differential working but perhaps it would be advantageous to have points taken from multiple areas.
 +
* We need more thermometers.
 +
* I didn't do the google docs part as I had no collaborators.
 +
* The instructions need to be more detailed on how to add points to a map on google maps
 +
Samuel Wein
  
 
== Evaluation ==  
 
 
==== CRS Questions ====  
 
* What's the best type of visualization for X set of data?
+
What would be the best type of visualization for understanding the relationship between geographical region and political affiliation?
* XXX
+
* Bar graph
* XXX
+
* Table
 +
* <b>Mashup</b>
 +
* 3d model
  
 
==== Quiz Questions ====  
 
* XXX A question.
+
* Explain why visualization is useful when creating models.
 +
* How is time a useful dimension in a visualization?
 +
* What criteria should one use for choosing a type of visualization for a particular set of data?
  
 
= Visualization - Metadata =  
 
XXX This section contains information about the goals of the unit and the approaches taken to meet them.
 
  
 
== Scheduling ==  
 
Line 114: Line 113:
  
 
== Concepts, Techniques and Tools ==  
 
XXX This is a placeholder for a list of items from the context page.
+
 
  
 
== General Education Alignment ==
 
Line 157: Line 156:
  
 
== Inquiry Based Learning ==  
 
XXX Some prose.
+
At the moment there isn't a whole lot of this in the unit. In the lab the students will have an opportunity to explore what they can do with Google Maps and with graphs.
  
 
= Visualization Mechanics =  
 
 
== To Do ==
 
<font color="red">Consider doing something based on IBM's Many Eyes tool.</font>
+
 
 +
Try to find a tool to allow 3d graphs for the lab
 +
 
 
== Comments ==
 
  
Line 171: Line 172:
 
Seriously consider OpenOffice</font>
 
Seriously consider OpenOffice</font>
 
= Authorship =  
 
Matthew Edlefsen
+
Matt Edlefsen
 +
Nate Smith

Latest revision as of 13:54, 7 May 2009

Visualization

Overview

The goal of this unit is to teach students to:

  • Understand the goals of visualization.
  • Know what the issues involved in visualization are.
  • Be able to recognize and reason about the different types of visualization.
  • Be introduced to a sampling of the tools used to visualize data.

Background Reading for Teachers and TAs

Reading Assignments for Students

  • Needs to be created, I think. Agreed.

Reference Material

Lecture Notes

Introduction

At this point students have already created/worked with a couple models and created basic graphs to visualize them. Talk about how even with just the simple models created so far, understanding the data is hard without having a visual representation of it.

Visualization is a graphical representation of data for the purpose of allowing humans to understand aspects of the data. Just having data isn't enough. We have to understand what the data means. Without visualization our models are a lot less useful because we have no other way to understand what's happening in our models short of complicated numerical analysis. It also gives us a way to share useful information by compacting it into a powerful visualization that communicates all the information quickly.

Show Gapminder and go through a good example. Talk about how this way of presenting the data makes information immediately obvious.

Edward Tufte is one of the early pioneers of visualization.

Tufte's aspects of visualization, just a run through (From "The Visual Display of Quantitative Information"):

  • Show the data.
  • Induce the viewer to think about the substance rather than about the methodology, graphic design, the technology of graphic production, or something else.
  • Avoid distorting what the data have to say.
  • Present many numbers in a small space.
  • Make large data sets coherent.
  • Encourage the eye to compare different pieces of data.
  • Reveal the data at several levels of detail, from broad overview to the fine structure.
  • Serve a reasonably clear purpose: description, exploration, tabulation, or decoration.
  • Be closely integrated with the statistical and verbal descriptions of a data set.

Issues of Visualization

  • Objective. There is always a goal or objective when visualizing by which one can judge effectiveness. For this class I don't think uses like marketing need to be mentioned, but the difference between using visualization to explore data and using it to explain data to others certainly should be.
  • Data Selection. When given a set of data, one often wants to home in on a subset of that data to look at.
  • Psychology. Visualization is fundamentally about how humans perceive visual information so you have to think about the ways in which you want to take advantage of human psychology.
  • Systemization. While elaborate visualizations like the Napoleon one are very compelling, in Computer Science we are often more interested in visualizations that can be systematically generated.
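
As a rough illustration of the Data Selection and Systemization points, the sketch below picks a subset of some readings and generates a plain-text bar chart from it with no hand drawing. The readings, the 60-degree cutoff, and the function name `text_bar_chart` are all made up for this example, and the code assumes every value is positive.

```python
def text_bar_chart(rows, width=30):
    """Render (label, value) pairs as text bars scaled to the largest value.

    Assumes every value is positive, so the largest value is never zero.
    """
    largest = max(value for _, value in rows)
    lines = []
    for label, value in rows:
        bar = "#" * round(width * value / largest)
        lines.append(f"{label:<12} {bar} {value}")
    return "\n".join(lines)


# Hypothetical readings (label, degrees F); not real lab data.
readings = [
    ("parking lot", 71.0),
    ("tree shade", 58.0),
    ("steam grate", 80.0),
    ("lawn", 64.0),
]

# Data selection: keep only the readings above an arbitrary 60-degree cutoff.
selected = [(label, value) for label, value in readings if value > 60]

# Systemization: the same function produces a chart for any selection.
chart = text_bar_chart(selected)
print(chart)
```

Because the chart is produced by a function rather than drawn by hand, changing the cutoff or swapping in a different data set regenerates the visualization with no extra work.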

Types of Visualizations

We are all familiar with common visualizations like graphs and tables, but now with computers we can create much more advanced visualizations that give us more information.

  • 3D physical models give us interactive 3D structures, potentially updated in real time as new information arrives, that present a lot of information while letting us choose which aspect of the data to focus on. For example, visualizing a model of a car crashing into something lets people actually see what's happening to all the different parts of the car, potentially from a number of different angles.
  • Animated models allow us to use time as a dimension so we can put together more information. The animation could either represent how the model changes over time, or represent some other aspect of the model. The Gapminder visualization uses time to let us look at the different aspects of the world at each point in time, so that we can see not only how the different countries interact but also how those interactions have changed throughout the years.
  • Data maps and mashups are another new type of visualization that comes from now having detailed maps of the world tied to a variety of different information sources. We have everything from satellite photos to road maps to weather data to census data. By combining these different sources on a single map, we can easily see how the different aspects correlate. In the lab students will be collecting temperature data and mashing it with satellite images.

Lab

The lab teaches spreadsheet visualization tools and Google Maps so that students gain, respectively, immediately practical skills and an alternate way to think about data.

High-level outline: Students will be instructed to sample temperatures at several different points within some region of campus. They will pick their own points, recording each with a provided GPS unit. They will then enter the data into a spreadsheet and graph the results using different kinds of graphs. After that, the groups will combine their data into one Google Map, putting in a pushpin for each sample point and coloring the pin appropriately.

Process

  1. With the provided thermometer, go to your group's assigned region (Region Map).
  2. Pick ten points in your region and sample their temperatures. Record each point's coordinates with the provided GPS unit. Be sure to pick points such that you will get a variety of temperatures (e.g., spots shaded by trees and buildings, sunny parking lots, points near steam tunnel exhaust grates).
  3. Enter the data into Open Office, with a column for coordinates and a column for temperature readings. Generate bar graphs based on this data.
  4. Average your temperatures into one value and add it to the collaborative class Google Docs spreadsheet. Graph the averages.
  5. Using your data points, add pushpins to the collaborative class Google Map. Color the pins based on temperature ranges. Work with the other lab groups to come up with a sensible color scheme based on the range of temperatures you've found.
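
For instructors who want to check a group's numbers, steps 4 and 5 can be sketched in a few lines of Python. This is only an illustration: the sample readings are invented, the function names are hypothetical, and splitting the observed range into four equal-width color bins is just one possible scheme (the lab leaves the actual color scheme to the groups).

```python
from statistics import mean

# Hypothetical readings: (latitude, longitude, temperature in degrees F).
# These values are invented for illustration; they are not lab data.
samples = [
    (39.8229, -84.9138, 58.0),
    (39.8231, -84.9142, 64.5),
    (39.8226, -84.9135, 71.0),
    (39.8233, -84.9140, 61.5),
]


def average_temperature(readings):
    """Step 4: collapse a group's readings into one average value."""
    return mean(temp for _, _, temp in readings)


def pin_color(temp, low, high, colors=("blue", "green", "yellow", "red")):
    """Step 5: map a temperature to a color using equal-width bins.

    The range [low, high] is split into len(colors) bins; this binning
    rule is an assumption, not something the lab prescribes.
    """
    if high == low:
        return colors[0]
    fraction = (temp - low) / (high - low)
    index = min(int(fraction * len(colors)), len(colors) - 1)
    return colors[max(index, 0)]


temps = [temp for _, _, temp in samples]
low, high = min(temps), max(temps)

group_average = average_temperature(samples)
pins = [(lat, lon, pin_color(temp, low, high)) for lat, lon, temp in samples]

print(group_average)
for pin in pins:
    print(pin)
```

In the lab itself, `low` and `high` would come from the whole class's readings rather than a single group's, so that every group's pins share one color scale.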

Write-up

  • Which tool was most appropriate for visualizing this data and why?
  • What was the most difficult part of the lab? Why?
  • Explore alternative ways to get the same data (instead of getting it yourself). Are there other data sources that would provide granular enough temperature data (Hint: use Google)?
  • Describe the collaborative process with Google Spreadsheets and Google Maps. What was difficult? What was easier? Compare and contrast this experience with non-collaborative software. How would each group's data have been collected into one document otherwise?

Software

  • Web Browser
  • Google account
  • Open Office

Bill of Materials

  • GPS
  • Thermometer

Lab Notes

  • It is impossible to create three-axis graphs in Open Office. Perhaps add more detail on how to create a working bar graph in Open Office.
  • GPS is mighty inaccurate. I am working on trying to get the differential working, but perhaps it would be advantageous to have points taken from multiple areas.
  • We need more thermometers.
  • I didn't do the Google Docs part as I had no collaborators.
  • The instructions need to be more detailed on how to add points to a map in Google Maps.

Samuel Wein

Evaluation

CRS Questions

What would be the best type of visualization for understanding the relationship between geographical region and political affiliation?

  • Bar graph
  • Table
  • Mashup
  • 3d model

Quiz Questions

  • Explain why visualization is useful when creating models.
  • How is time a useful dimension in a visualization?
  • What criteria should one use for choosing a type of visualization for a particular set of data?

Visualization - Metadata

Scheduling

Should come before anything too complicated, but after basic modeling concepts.

Concepts, Techniques and Tools

General Education Alignment

Analytical Reasoning Requirement

Abstract Reasoning

From the [Catalog Description] Courses qualifying for credit in Abstract Reasoning typically share these characteristics:

  • They focus substantially on properties of classes of abstract models and operations that apply to them.
    • None.
  • They provide experience in generalizing from specific instances to appropriate classes of abstract models.
    • None.
  • They provide experience in solving concrete problems by a process of abstraction and manipulation at the abstract level. Typically this experience is provided by word problems which require students to formalize real-world problems in abstract terms, to solve them with techniques that apply at that abstract level, and to convert the solutions back into concrete results.
    • None.

Quantitative Reasoning

From the [Catalog Description] General Education courses in Quantitative Reasoning foster students' abilities to generate, interpret and evaluate quantitative information. In particular, Quantitative Reasoning courses help students develop abilities in such areas as:

  • Using and interpreting formulas, graphs and tables.
    • Complete. They will be doing many graphs and tables in this Unit.
  • Representing mathematical ideas symbolically, graphically, numerically and verbally.
    • Partial. This unit definitely attempts to represent something graphically, but I don't think quite in the way that they mean.
  • Using mathematical and statistical ideas to solve problems in a variety of contexts.
    • Partial. Looks at using statistical ideas to solve problems in the single context of Visualization.
  • Using simple models such as linear dependence, exponential growth or decay, or normal distribution.
    • None.
  • Understanding basic statistical ideas such as averages, variability and probability.
    • None.
  • Making estimates and checking the reasonableness of answers.
    • None.
  • Recognizing the limitations of mathematical and statistical methods.
    • Partial. Visualization does speak to the limitations of both visualization itself and the model a visualization represents.

Scientific Inquiry Requirement

From the [Catalog Description] Scientific inquiry:

  • Develops students' understanding of the natural world.
    • None.
  • Strengthens students' knowledge of the scientific way of knowing — the use of systematic observation and experimentation to develop theories and test hypotheses.
    • None.
  • Emphasizes and provides first-hand experience with both theoretical analysis and the collection of empirical data.
    • Complete. Deals with collection of data from data sources and theoretical analysis of how to visualize it.

Scaffolded Learning

This unit asks students to take the types of considerations they used to build graphs not only in the previous couple units but during their entire academic history and extend them into a more general framework of visualization.

Inquiry Based Learning

At the moment there isn't a whole lot of this in the unit. In the lab the students will have an opportunity to explore what they can do with Google Maps and with graphs.

Visualization Mechanics

To Do

Try to find a tool to allow 3d graphs for the lab

Comments

Fixed both. With a tool as sporty as Google Earth available for geographic visualizations, wouldn't it be nice to use that too, in conjunction with the Census data?

Include a visualization with KML and Google Earth

Seriously consider OpenOffice
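
If the KML and Google Earth suggestion above is picked up, the file itself is simple enough to generate by script. The sketch below is one possible way to do it, not part of the current lab: the sample point, its label, and the function names are hypothetical. It relies on two facts about KML 2.2: the namespace is `http://www.opengis.net/kml/2.2`, and a point's coordinates are written as longitude, latitude, altitude, in that order.

```python
from xml.sax.saxutils import escape


def placemark(name, lat, lon, description=""):
    """Return one KML Placemark element for a single sample point."""
    # KML writes coordinates as longitude,latitude,altitude (note the order).
    return (
        "  <Placemark>\n"
        f"    <name>{escape(name)}</name>\n"
        f"    <description>{escape(description)}</description>\n"
        f"    <Point><coordinates>{lon},{lat},0</coordinates></Point>\n"
        "  </Placemark>\n"
    )


def build_kml(points):
    """Wrap (name, lat, lon, description) tuples in a KML 2.2 document."""
    body = "".join(placemark(*point) for point in points)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        "<Document>\n" + body + "</Document>\n"
        "</kml>\n"
    )


# Hypothetical sample point; the reading is invented for illustration.
points = [("Sample 1", 39.8229, -84.9138, "58.0 F")]
kml_text = build_kml(points)
print(kml_text)
```

Saving `kml_text` to a file with a `.kml` extension should give something Google Earth can open, with one pushpin per sample point.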

Authorship

Matt Edlefsen, Nate Smith