Difference between revisions of "CS382:Unit-mashup"

From Earlham CS Department
(Lab activity - materials, process and software)
(Quiz Questions)
 
(59 intermediate revisions by 6 users not shown)
Line 1: Line 1:
= Data Visualization with Mashups =
+
= Visualization =  
* <font color="blue">Where's your abstract?  I'm wondering if you're supposed to be focusing solely on mashups or if there are other types of visualization (like WebMO type stuff computational chemistry visualization) that should be included.</font>
+
== Overview ==
 +
The goal of this unit is to teach students to:
 +
* Understand the goals of visualization.
 +
* Know what the issues involved in visualization are.
 +
* Be able to recognize and reason about the different types of visualization.
 +
* Be introduced to a sampling of the tools used to visualize data.
  
== Background reading, Resources ==
+
== Background Reading for Teachers and TAs ==
 
* [http://davidhuynh.net/media/papers/2007/iswc2007-potluck.pdf web tool for non-programmers for making mashups]
 
* [http://davidhuynh.net/media/papers/2007/iswc2007-potluck.pdf web tool for non-programmers for making mashups]
 
* [http://media.wiley.com/product_data/excerpt/12/04705151/0470515112.pdf chapter 1 of book on power of geo mashups]
 
* [http://media.wiley.com/product_data/excerpt/12/04705151/0470515112.pdf chapter 1 of book on power of geo mashups]
 +
* [http://en.wikipedia.org/wiki/Information_visualization Wikipedia page on Information Visualization]
 +
* [http://en.wikipedia.org/wiki/Visualization_(computer_graphics) Wikipedia page on Visualization]
 +
* "The Visual Display of Quantitative Information" by Edward Tufte
 +
* "The Elements of Graphing Data" by William Cleveland
  
== Lecture notes - outline form ==
+
== Reading Assignments for Students ==
=== Introduction ===
+
* Needs to be created I think <font color="red">Agreed.</font>
* Models and other datasets describing the world are very large and/or complex.
 
** Example the US Censes is X rows and Y columns
 
* In most cases with datasets this large you can't just stare at the raw data and get a feel for what it's saying.
 
* Even to develop statistical methods to get useful information you still need a notion of what you're looking for.
 
* Visualization is a way we can get a look at general trends or anomalies in an intuitive way.
 
* Visualization works better with larger data sets that you can clump.
 
  
=== Types of visualizations ===
+
== Reference Material ==
* Overlays (geographical) - example
 
* Semantic webs - example
 
* Geometrical ( graphical environments where the size/shape/movement/etc of objects is tied to data ) - example
 
* more
 
  
=== Process ===
+
== Lecture Notes ==
* Given a certain problem or question, determine what general catagories of information are needed.
+
==== Introduction ====
** What body of information is needed.
+
At this point students have already created/worked with a couple models and created basic graphs to visualize them. Talk about how even with just the simple models created so far, understanding the data is hard without having a visual representation of it.  
* Data collection
 
** finding good sources of data
 
** methods of collecting ( online databases, field work, etc... )
 
** coordinating multiple data sources
 
* Determaning what tools/type of visualization is most appropriate
 
* Encoding data to be useful (KML,etc..)
 
* Drawing general conclusions from the visualization
 
* Use more exact methods like statistics to show truthiness.
 
  
== Classroom response questions - at least three ==
+
Visualization is a graphical representation of data for the purpose of allowing humans to understand aspects of the data. Just having data isn't enough.  We have to understand what the data means.  Without visualization our models are a lot less useful because we have no other way to understand what's happening in our models short of complicated numerical analysis.  It also gives us a way to share useful information by compacting it into a powerful visualization that communicates all the information quickly.
* Whats the best type of visualization for X set of data?
 
  
== Lab activity - materials, process and software ==
+
Show [http://www.gapminder.org/| Gapminder] and go through a good exampleTalk about how this way of presenting the data makes information
Buliding a google mashup/KML document tying 2 or more datasets together. Datasets will be provided but each group would have distinct information to useTools for retrieving and inputting data would be provided but the students would still have to learn KML and interacting with Google Earth/maps. The databases are prefereably something local/personal that can provide interesting results when visualized geographically.
+
immediately obvious.
  
Possible Data sources:
+
Edward Tufte was one of the early pioneers of Visualization.
* Quaker meetings, <metric X> mashed up on map ( [http://quakermeetings.com database; we own it] )
 
* Using WebDB to map dorms to major, possibly over time (This one may not work due to not being able to access relevant data)
 
* US Censes
 
* [http://code.google.com/apis/kml/documentation KML documentation]
 
* [http://prefuse.org Prefux visualization suite]
 
  
<font color="blue">Good start, though needs to be narrowed down. Perhaps you guys can look into each of these and evaluate its feasibility.</font>
+
Tufte's aspects of visualization, just a run through (From "The Visual Display of Quantitative Information"):
 +
* Show the data.
 +
* Induce the viewer to think about the substance rather than about the methodology, graphic design, the technology of graphic production, or something else.
 +
* Avoid distorting what the data have to say.
 +
* Present many numbers in a small space.
 +
* Make large data sets coherent.
 +
* Encourage the eye to compare different pieces of data.
 +
* Reveal the data at several levels of detail, from broad overview to the fine structure.
 +
* Serve a reasonably clear purpose: description, exploration, tabulation, or decoration.
 +
* Be closely integrated with the statistical and verbal descriptions of a data set.
  
== Scheduling - early, late, dependencies on other units, length of unit ==
+
==== Issues of Visualization ====
=== Timing ===
 
Doesn't matter
 
=== Length ===
 
One week.
 
  
== Archived stuff ==
+
* Objective. There is always a goal or objective when visualizing, and effectiveness can be judged against it. Things like marketing probably should not be covered in this class, but the difference between using visualization to explore data and using it to explain data to others certainly should be.
 +
* Data Selection. When given a set of data, one often wants to zero in on a subset of that data to look at.
 +
* Psychology. Visualization is fundamentally about how humans perceive visual information so you have to think about the ways in which you want to take advantage of human psychology.
 +
* Systemization. While elaborate visualizations like the Napoleon one are very compelling, in Computer Science we are often more interested in visualizations that can be systematically generated.
 +
 
 +
==== Types of Visualizations ====
 +
 
 +
We are all familiar with common visualizations like graphs and tables, but now with computers we can create much more advanced visualizations that give us more information.
 +
 
 +
* 3D physical models give us interactive 3D structures, potentially updated with new information in real time, that present a lot of information while letting us choose which aspect of the data to focus on. For example, visualizing a model of a car crash lets people see what is happening to each part of the car, potentially from a number of different angles.
 +
 +
* Animated models allow us to use time as a dimension so we can present more information. The animation could represent how the model changes over time, or it could represent some other aspect of the model. The Gapminder visualization uses time to show the state of the world at each point in time, so that we can see not only how the different countries interact but also how those interactions have changed over the years.
 +
 
 +
* Data maps and mashups are another new type of visualization that comes from now having detailed maps of the world together with a variety of different information sources. We have everything from satellite photos to road maps to weather data to census data. With all these sources combined on a single map, we can easily see how the different aspects correlate. In the lab students will collect temperature data and mash it up with satellite images.
 +
 
 +
== Lab ==
 +
Learning spreadsheet visualization tools and Google Maps to gain, respectively, immediately practical and useful skills and an alternate way to think about data.
 +
 
 +
High-level outline: Students will be instructed to sample temperatures at several different points within some region of campus. They will pick their own points, recording each with a provided GPS unit. They will then enter the data into a spreadsheet and graph the results using different kinds of graphs. After that, the groups will combine their data into one Google Map, adding a pushpin for each sample point and coloring the pin according to its temperature.
 +
 
 +
==== Process ====
 +
# With provided thermometer, go to your group's assigned region ( [http://maps.google.com/maps/ms?ie=UTF8&msa=0&msid=116517761457321127401.000467a02c69be02ec8e4&ll=39.822949,-84.913845&spn=0.006254,0.009656&t=h&z=17 Region Map] )
 +
# Pick ten points in your region and sample their temperatures. Record each point's coordinates. Be sure to pick points that will give you a variety of temperatures (e.g., spots shaded by trees and buildings, sunny parking lots, points near steam tunnel exhaust grates).
 +
# Enter data into Open Office, with a column for coordinates and a column for temperature readings. Generate bar graphs based on this data.
 +
# Average your temperatures into one value and add it to the collaborative class Google Docs spreadsheet. Graph the averages.
 +
# Using your datapoints, add push pins to the collaborative class Google Docs map. Color the pins based on temperature ranges. Work with the other lab groups to come up with a sensible color scheme based on the range of temperatures you've found.
 +
 
 +
==== Write-up ====
 +
* Which tool was most appropriate for visualizing this data and why?
 +
* What was the most difficult part of the lab? Why?
 +
* Explore alternative ways to get the same data (instead of getting it yourself). Are there other data sources that would provide granular enough temperature data (Hint: use google)?
 +
* Describe the collaborative process with Google Spreadsheets and Google Maps. What was difficult? What was easier? Compare and contrast this experience with non-collaborative software. How would each group's data have been collected into one document?
 +
 
 +
==== Software ====
 +
* Web Browser
 +
* Google account
 +
* Open Office
 +
 
 +
==== Bill of Materials ====
 +
* GPS
 +
* Thermometer
 +
 
 +
==== Lab Notes ====
 +
* It is impossible to create 3 axis graphs in open office.  Perhaps add more detail on how to create a working bar graph in OO.
 +
* GPS is mighty inaccurate.  I am working on trying to get the differential working but perhaps it would be advantageous to have points taken from multiple areas.
 +
* We need more thermometers.
 +
* I didn't do the google docs part as I had no collaborators.
 +
* The instructions need to be more detailed on how to add points to a map on google maps
 +
Samuel Wein
 +
 
 +
== Evaluation ==
 +
==== CRS Questions ====
 +
What would be the best type of visualization for understanding the relationship between geographical region and political affiliation:
 +
* Bar graph
 +
* Table
 +
* <b>Mashup</b>
 +
* 3d model
 +
 
 +
==== Quiz Questions ====
 +
* Explain why visualization is useful when creating models.
 +
* How is time a useful dimension in a visualization?
 +
* What criteria should one use for choosing a type of visualization for a particular set of data?
 +
 
 +
= Visualization - Metadata =
 +
 
 +
== Scheduling ==
 +
Should come before anything too complicated, but after basic modeling concepts.
 +
 
 +
== Concepts, Techniques and Tools ==
 +
 
 +
 
 +
== General Education Alignment ==
 +
=== Analytical Reasoning Requirement ===
 +
==== Abstract Reasoning ====
 +
From the [[http://www.earlham.edu/curriculumguide/academics/analytical.html Catalog Description]] ''Courses qualifying for credit in Abstract Reasoning typically share these characteristics:''
 +
* ''They focus substantially on properties of classes of abstract models and operations that apply to them.''
 +
** None.
 +
* ''They provide experience in generalizing from specific instances to appropriate classes of abstract models.''
 +
** None.
 +
* ''They provide experience in solving concrete problems by a process of abstraction and manipulation at the abstract level. Typically this experience is provided by word problems which require students to formalize real-world problems in abstract terms, to solve them with techniques that apply at that abstract level, and to convert the solutions back into concrete results.''
 +
** None.
 +
 
 +
==== Quantitative Reasoning ====
 +
From the [[http://www.earlham.edu/curriculumguide/academics/analytical.html Catalog Description]] ''General Education courses in Quantitative Reasoning foster students' abilities to generate, interpret and evaluate quantitative information. In particular, Quantitative Reasoning courses help students develop abilities in such areas as:''
 +
* ''Using and interpreting formulas, graphs and tables.''
 +
** Complete. They will be doing many graphs and tables in this Unit.
 +
* ''Representing mathematical ideas symbolically, graphically, numerically and verbally.''
 +
** Partial. This unit definitely attempts to represent something graphically, but I don't think quite in the way that they mean.
 +
* ''Using mathematical and statistical ideas to solve problems in a variety of contexts.''
 +
** Partial. Looks at using statistical ideas to solve problems in the single context of Visualization.
 +
* ''Using simple models such as linear dependence, exponential growth or decay, or normal distribution.''
 +
** None.
 +
* ''Understanding basic statistical ideas such as averages, variability and probability.''
 +
** None.
 +
* ''Making estimates and checking the reasonableness of answers.''
 +
** None.
 +
* ''Recognizing the limitations of mathematical and statistical methods.''
 +
** Partial. Visualization does speak to the limitations of both visualization itself and the model a visualization represents.
 +
 
 +
=== Scientific Inquiry Requirement ===
 +
From the [[http://www.earlham.edu/curriculumguide/academics/scientific.html Catalog Description]] ''Scientific inquiry:''
 +
* ''Develops students' understanding of the natural world.''
 +
** None.
 +
* ''Strengthens students' knowledge of the scientific way of knowing — the use of systematic observation and experimentation to develop theories and test hypotheses.''
 +
** None.
 +
* ''Emphasizes and provides first-hand experience with both theoretical analysis and the collection of empirical data.''
 +
** Complete. Deals with collection of data from data sources and theoretical analysis of how to visualize it.
 +
 
 +
== Scaffolded Learning ==
 +
This unit asks students to take the types of considerations they used to build graphs not only in the previous couple units but during their entire academic history and extend them into a more general framework of visualization.
 +
 
 +
== Inquiry Based Learning ==
 +
At the moment there isn't a whole lot of this in the unit. In the lab the students will have an opportunity to explore what they can do with google maps and with graphs.
 +
 
 +
= Visualization Mechanics =
 +
== To Do ==
 +
 
 +
Try to find a tool to allow 3d graphs for the lab
 +
 
 +
== Comments ==
 +
 
 +
Fixed both.
 +
<font color="red">With a tool as sporty as Google Earth available to do geographic visualizations wouldn't it be nice to use that too in conjunction with the Census data?
 +
 
 +
Include a visualization with KML and Google Earth
 +
 
 +
Seriously consider OpenOffice</font>
 +
= Authorship =
 +
Matt Edlefsen
 +
Nate Smith

Latest revision as of 12:54, 7 May 2009

Visualization

Overview

The goal of this unit is to teach students to:

  • Understand the goals of visualization.
  • Know what the issues involved in visualization are.
  • Be able to recognize and reason about the different types of visualization.
  • Be introduced to a sampling of the tools used to visualize data.

Background Reading for Teachers and TAs

Reading Assignments for Students

  • Needs to be created, I think. (Reviewer comment: Agreed.)

Reference Material

Lecture Notes

Introduction

At this point students have already created or worked with a couple of models and created basic graphs to visualize them. Talk about how, even with the simple models created so far, the data is hard to understand without a visual representation.

Visualization is a graphical representation of data for the purpose of allowing humans to understand aspects of the data. Just having data isn't enough. We have to understand what the data means. Without visualization our models are a lot less useful because we have no other way to understand what's happening in our models short of complicated numerical analysis. It also gives us a way to share useful information by compacting it into a powerful visualization that communicates all the information quickly.

Show Gapminder and go through a good example. Talk about how this way of presenting the data makes information immediately obvious.

Edward Tufte was one of the early pioneers of Visualization.

Tufte's aspects of visualization, just a run through (From "The Visual Display of Quantitative Information"):

  • Show the data.
  • Induce the viewer to think about the substance rather than about the methodology, graphic design, the technology of graphic production, or something else.
  • Avoid distorting what the data have to say.
  • Present many numbers in a small space.
  • Make large data sets coherent.
  • Encourage the eye to compare different pieces of data.
  • Reveal the data at several levels of detail, from broad overview to the fine structure.
  • Serve a reasonably clear purpose: description, exploration, tabulation, or decoration.
  • Be closely integrated with the statistical and verbal descriptions of a data set.

Issues of Visualization

  • Objective. There is always a goal or objective when visualizing, and effectiveness can be judged against it. Things like marketing probably should not be covered in this class, but the difference between using visualization to explore data and using it to explain data to others certainly should be.
  • Data Selection. When given a set of data, one often wants to zero in on a subset of that data to look at.
  • Psychology. Visualization is fundamentally about how humans perceive visual information, so you have to think about how you want to take advantage of human psychology.
  • Systemization. While elaborate visualizations like Minard's map of Napoleon's Russian campaign are very compelling, in Computer Science we are often more interested in visualizations that can be systematically generated.

Types of Visualizations

We are all familiar with common visualizations like graphs and tables, but now with computers we can create much more advanced visualizations that give us more information.

  • 3D physical models give us interactive 3D structures, potentially updated with new information in real time, that present a lot of information while letting us choose which aspect of the data to focus on. For example, visualizing a model of a car crash lets people see what is happening to each part of the car, potentially from a number of different angles.
  • Animated models allow us to use time as a dimension so we can present more information. The animation could represent how the model changes over time, or it could represent some other aspect of the model. The Gapminder visualization uses time to show the state of the world at each point in time, so that we can see not only how the different countries interact but also how those interactions have changed over the years.
  • Data maps and mashups are another new type of visualization that comes from now having detailed maps of the world together with a variety of different information sources. We have everything from satellite photos to road maps to weather data to census data. With all these sources combined on a single map, we can easily see how the different aspects correlate. In the lab students will collect temperature data and mash it up with satellite images.
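
At its core, a mashup is a join of two datasets on something they share, usually a place. The following is a minimal sketch of that idea and not part of the lab materials: the two dictionaries, their values, and the helper name `mash_up` are all hypothetical.

```python
# Hypothetical datasets keyed by a shared location name.
temperatures = {"Parking lot": 27.5, "Shaded tree": 18.5, "Steam grate": 31.0}
coordinates = {
    "Parking lot": (39.8229, -84.9138),
    "Shaded tree": (39.8231, -84.9141),
}

def mash_up(values, locations):
    """Join two datasets on their shared keys, dropping unmatched entries."""
    shared = sorted(values.keys() & locations.keys())
    return [
        {
            "name": name,
            "lat": locations[name][0],
            "lon": locations[name][1],
            "value": values[name],
        }
        for name in shared
    ]

# Only locations present in both datasets can be placed on the map.
points = mash_up(temperatures, coordinates)
```

Entries that appear in only one dataset ("Steam grate" here) are dropped, which is one reason coordinating multiple data sources takes care.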

Lab

Students learn spreadsheet visualization tools and Google Maps to gain, respectively, immediately practical skills and an alternate way to think about data.

High-level outline: Students will be instructed to sample temperatures at several different points within some region of campus. They will pick their own points, recording each with a provided GPS unit. They will then enter the data into a spreadsheet and graph the results using different kinds of graphs. After that, the groups will combine their data into one Google Map, adding a pushpin for each sample point and coloring the pin according to its temperature.

Process

  1. With provided thermometer, go to your group's assigned region ( Region Map )
  2. Pick ten points in your region and sample their temperatures. Record each point's coordinates. Be sure to pick points that will give you a variety of temperatures (e.g., spots shaded by trees and buildings, sunny parking lots, points near steam tunnel exhaust grates).
  3. Enter data into Open Office, with a column for coordinates and a column for temperature readings. Generate bar graphs based on this data.
  4. Average your temperatures into one value and add it to the collaborative class Google Docs spreadsheet. Graph the averages.
  5. Using your datapoints, add push pins to the collaborative class Google Docs map. Color the pins based on temperature ranges. Work with the other lab groups to come up with a sensible color scheme based on the range of temperatures you've found.
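
Steps 4 and 5 can be sketched in a few lines of Python. This is only an illustration under assumed data: the sample readings, the four-color scheme, and the helper names `average` and `color_for` are hypothetical, and the lab itself uses spreadsheets and Google Maps rather than code.

```python
# Hypothetical readings for one group: (latitude, longitude, temperature in °C).
samples = [
    (39.8229, -84.9138, 18.5),
    (39.8231, -84.9141, 22.0),
    (39.8226, -84.9135, 27.5),
]

def average(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)

def color_for(temp, low, high, colors=("blue", "green", "yellow", "red")):
    """Map a temperature to one of several equal-width color bins."""
    if high == low:
        return colors[0]
    # Fraction of the way from low to high, clamped to [0, 1].
    frac = min(max((temp - low) / (high - low), 0.0), 1.0)
    # The very top of the range falls into the last bin.
    index = min(int(frac * len(colors)), len(colors) - 1)
    return colors[index]

temps = [t for _, _, t in samples]
group_average = average(temps)
pin_colors = [color_for(t, min(temps), max(temps)) for t in temps]
```

In the lab, `low` and `high` would be the class-wide temperature range agreed on with the other groups, not a single group's minimum and maximum as in this sketch.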

Write-up

  • Which tool was most appropriate for visualizing this data and why?
  • What was the most difficult part of the lab? Why?
  • Explore alternative ways to get the same data (instead of getting it yourself). Are there other data sources that would provide granular enough temperature data (Hint: use google)?
  • Describe the collaborative process with Google Spreadsheets and Google Maps. What was difficult? What was easier? Compare and contrast this experience with non-collaborative software. How would each group's data have been collected into one document?

Software

  • Web Browser
  • Google account
  • Open Office

Bill of Materials

  • GPS
  • Thermometer

Lab Notes

  • It is impossible to create 3-axis graphs in OpenOffice. Perhaps add more detail on how to create a working bar graph in OpenOffice.
  • GPS is mighty inaccurate. I am trying to get differential correction working, but perhaps it would be advantageous to have points taken from multiple areas.
  • We need more thermometers.
  • I didn't do the google docs part as I had no collaborators.
  • The instructions need to be more detailed on how to add points to a map on google maps

Samuel Wein

Evaluation

CRS Questions

What would be the best type of visualization for understanding the relationship between geographical region and political affiliation:

  • Bar graph
  • Table
  • Mashup
  • 3d model

Quiz Questions

  • Explain why visualization is useful when creating models.
  • How is time a useful dimension in a visualization?
  • What criteria should one use for choosing a type of visualization for a particular set of data?

Visualization - Metadata

Scheduling

Should come before anything too complicated, but after basic modeling concepts.

Concepts, Techniques and Tools

General Education Alignment

Analytical Reasoning Requirement

Abstract Reasoning

From the [Catalog Description] Courses qualifying for credit in Abstract Reasoning typically share these characteristics:

  • They focus substantially on properties of classes of abstract models and operations that apply to them.
    • None.
  • They provide experience in generalizing from specific instances to appropriate classes of abstract models.
    • None.
  • They provide experience in solving concrete problems by a process of abstraction and manipulation at the abstract level. Typically this experience is provided by word problems which require students to formalize real-world problems in abstract terms, to solve them with techniques that apply at that abstract level, and to convert the solutions back into concrete results.
    • None.

Quantitative Reasoning

From the [Catalog Description] General Education courses in Quantitative Reasoning foster students' abilities to generate, interpret and evaluate quantitative information. In particular, Quantitative Reasoning courses help students develop abilities in such areas as:

  • Using and interpreting formulas, graphs and tables.
    • Complete. They will be doing many graphs and tables in this Unit.
  • Representing mathematical ideas symbolically, graphically, numerically and verbally.
    • Partial. This unit definitely attempts to represent something graphically, though probably not quite in the way the catalog means.
  • Using mathematical and statistical ideas to solve problems in a variety of contexts.
    • Partial. Looks at using statistical ideas to solve problems in the single context of Visualization.
  • Using simple models such as linear dependence, exponential growth or decay, or normal distribution.
    • None.
  • Understanding basic statistical ideas such as averages, variability and probability.
    • None.
  • Making estimates and checking the reasonableness of answers.
    • None.
  • Recognizing the limitations of mathematical and statistical methods.
    • Partial. Visualization does speak to the limitations of both visualization itself and the model a visualization represents.

Scientific Inquiry Requirement

From the [Catalog Description] Scientific inquiry:

  • Develops students' understanding of the natural world.
    • None.
  • Strengthens students' knowledge of the scientific way of knowing — the use of systematic observation and experimentation to develop theories and test hypotheses.
    • None.
  • Emphasizes and provides first-hand experience with both theoretical analysis and the collection of empirical data.
    • Complete. Deals with collection of data from data sources and theoretical analysis of how to visualize it.

Scaffolded Learning

This unit asks students to take the kinds of considerations they have used to build graphs, not only in the previous couple of units but throughout their academic history, and extend them into a more general framework of visualization.

Inquiry Based Learning

At the moment there isn't a whole lot of this in the unit. In the lab the students will have an opportunity to explore what they can do with Google Maps and with graphs.

Visualization Mechanics

To Do

Try to find a tool to allow 3d graphs for the lab

Comments

Fixed both. With a tool as sporty as Google Earth available to do geographic visualizations wouldn't it be nice to use that too in conjunction with the Census data?

Include a visualization with KML and Google Earth

Seriously consider OpenOffice
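
As a starting point for the KML and Google Earth suggestion above, the sketch below writes colored placemarks with Python's standard library. The sample names, coordinates, and colors are hypothetical, and `placemark` and `build_kml` are illustrative helpers, not an existing tool from this unit.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"

def placemark(name, lat, lon, kml_color):
    """Build one KML Placemark with a colored pin icon."""
    pm = ET.Element("Placemark")
    ET.SubElement(pm, "name").text = name
    style = ET.SubElement(pm, "Style")
    icon_style = ET.SubElement(style, "IconStyle")
    # KML colors are hex aabbggrr (alpha, blue, green, red), not rrggbb.
    ET.SubElement(icon_style, "color").text = kml_color
    point = ET.SubElement(pm, "Point")
    # KML coordinates are written longitude first, then latitude.
    ET.SubElement(point, "coordinates").text = f"{lon},{lat},0"
    return pm

def build_kml(samples):
    """Wrap a list of (name, lat, lon, color) tuples in a KML document."""
    kml = ET.Element("kml", xmlns=KML_NS)
    doc = ET.SubElement(kml, "Document")
    for name, lat, lon, color in samples:
        doc.append(placemark(name, lat, lon, color))
    return ET.tostring(kml, encoding="unicode")

# Hypothetical sample points: opaque red for warm, opaque blue for cool.
samples = [
    ("Parking lot", 39.8229, -84.9138, "ff0000ff"),
    ("Shaded tree", 39.8231, -84.9141, "ffff0000"),
]
kml_text = build_kml(samples)
```

Saving `kml_text` to a file with a `.kml` extension should give something Google Earth can open, though that has not been tested against the lab's data.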

Authorship

Matt Edlefsen, Nate Smith