Difference between revisions of "CS382:Unit-mashup"

From Earlham CS Department
(Lecture 1)
Line 20: Line 20:
  
 
== Lecture Notes ==
 
=== Lecture 1 ===
 
 
* Introduction
 
 
** At this point students have already created/worked with a couple models and created basic graphs to visualize them.  Talk about how even with just the simple models created so far, understanding the data is hard without having a visual representation of it.
 
Line 42: Line 41:
 
** Systemization. While elaborate visualizations like the Napoleon one are very compelling, in Computer Science we are often more interested in visualizations that can be systematically generated.
 
 
* Go through a couple of examples of creating visualizations referring back to Tufte's list and the issues.
 
 
+
* Types of Visualizations  (A sampling)
=== Lecture 2 ===
+
** Tables
* Backgrou
+
** Graphs
** Models and other datasets describing the world are very large and/or complex.
+
** Charts
*** Example: the US Census is X rows and Y columns
+
** Sparklines
** In most cases with datasets this large you can't just stare at the raw data and get a feel for what it's saying.
+
** Time Series
** Even to develop statistical methods to get useful information you still need a notion of what you're looking for.
+
** Data maps and mashups
** Visualization is a way we can get a look at general trends or anomalies in an intuitive way.
 
** Visualization works better with larger data sets that you can clump.
 
* Types of visualizations
 
** Overlays (geographical) - example
 
** Semantic webs - example
 
** Geometrical (graphical environments where the size, shape, movement, etc. of objects is tied to data) - example
 
** more
 
* Process
 
** Given a certain problem or question, determine what general categories of information are needed.
 
*** What body of information is needed.
 
** Data collection
 
*** finding good sources of data
 
*** methods of collecting (online databases, field work, etc.)
 
*** coordinating multiple data sources
 
** Determining which tools and type of visualization are most appropriate
 
** Encoding data in a useful format (KML, etc.)
 
** Drawing general conclusions from the visualization
 
** Use more exact methods like statistics to show truthiness.
 
  
 
== Lab ==  
 
Building a Google mashup/KML document tying 2 or more datasets together. Datasets will be provided, but each group would have distinct information to use. Tools for retrieving and inputting data would be provided, but the students would still have to learn KML and how to interact with Google Earth/Maps. The databases are preferably something local/personal that can provide interesting results when visualized geographically.
+
Use online tools to generate tabular data from the U.S. Census and then use R to explore visualization.
  
Possible Data sources:
+
* Use a web site for generating census tables and walk through generating a predetermined table.
* Quaker meetings, <metric X> mashed up on map ( [http://quakermeetings.com database; we own it] )
+
* Load up R and generate simple pie charts and bar graphs using the predetermined table
* Using WebDB to map dorms to major, possibly over time (This one may not work due to not being able to access relevant data)
+
* Use provided tools in R for generating other types of more complex graphs (e.g. Trellis plots), apply them to the data, and then explain the differences between them.
* US Census
+
* Use the web site to come up with your own data sets and play around with generating different visualizations in R.
* [http://code.google.com/apis/kml/documentation KML documentation]
+
* Come up with 1 or 2 interesting examples and explain why you used the visualization you used and what you learned from the visualization.
* [http://prefuse.org prefuse visualization suite]
 
  
 
==== Software ====  
 
XXX What title, version, supported platforms, license, etc.
+
* R
  
 
==== Bill of Materials ====  
 
XXX A list of all the required stuff with quantities and cost estimates.
 
  
 
== Evaluation ==  
 
Line 141: Line 120:
  
 
= To Do =
 
* Charlie and Matt are going to talk today (3/9/09) at 3:00.
 

Revision as of 09:21, 23 March 2009

Data Visualization

Overview

The goal of this unit is to teach students to:

  • Understand the goals of visualization.
  • Know what the issues involved in visualization are.
  • Be able to recognize and reason about the different types of visualization.
  • Be introduced to a sampling of the tools used to visualize data.

Background Reading for Teachers and TAs

Reading Assignments for Students

Reference Material

Lecture Notes

  • Introduction
    • At this point students have already created/worked with a couple models and created basic graphs to visualize them. Talk about how even with just the simple models created so far, understanding the data is hard without having a visual representation of it.
    • Visualization is a graphical representation of data for the purpose of allowing humans to understand aspects of the data.
      • A couple of basic but illustrative graphs as examples.
    • Tufte's aspects of visualization, just a run-through (from "The Visual Display of Quantitative Information"):
      • Show the data.
      • Induce the viewer to think about the substance rather than about the methodology, graphic design, the technology of graphic production, or something else.
      • Avoid distorting what the data have to say.
      • Present many numbers in a small space.
      • Make large data sets coherent.
      • Encourage the eye to compare different pieces of data.
      • Reveal the data at several levels of detail, from broad overview to the fine structure.
      • Serve a reasonably clear purpose: description, exploration, tabulation, or decoration.
      • Be closely integrated with the statistical and verbal descriptions of a data set.
    • Show some more complex examples like the Napoleon one, an interesting mashup.
  • Issues of visualization
    • Objective. Every visualization has a goal or objective against which its effectiveness can be judged. This class probably need not cover uses such as marketing, but it should cover the difference between using visualization to explore data and using it to explain data to others.
    • Data Selection. Given a set of data, one often wants to home in on a subset of it to examine.
    • Psychology. Visualization is fundamentally about how humans perceive visual information so you have to think about the ways in which you want to take advantage of human psychology.
    • Systemization. While elaborate visualizations like the Napoleon one are very compelling, in Computer Science we are often more interested in visualizations that can be systematically generated.
  • Go through a couple of examples of creating visualizations referring back to Tufte's list and the issues.
  • Types of Visualizations (A sampling)
    • Tables
    • Graphs
    • Charts
    • Sparklines
    • Time Series
    • Data maps and mashups
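
One of the listed types, the sparkline, is small enough to sketch in a few lines, and it also illustrates Tufte's point about presenting many numbers in a small space. The following is a minimal sketch in Python, not part of the unit's materials; the function name `sparkline` and the sample readings are assumptions made up for illustration.

```python
# Unicode block characters ordered from the lowest bar to the highest.
BARS = "▁▂▃▄▅▆▇█"

def sparkline(values):
    """Map each number to one of eight bar heights, scaled to the series' own min and max."""
    values = list(values)
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series has no variation to show; draw it at mid height.
        return BARS[3] * len(values)
    last = len(BARS) - 1
    return "".join(BARS[round((v - lo) / span * last)] for v in values)

# Hypothetical monthly readings, invented for illustration.
readings = [3, 5, 4, 8, 12, 9, 6, 2]
print(sparkline(readings))
```

Because the bars are scaled to the series' own minimum and maximum, two sparklines drawn this way are not directly comparable unless they share a scale, which is worth raising alongside "Avoid distorting what the data have to say."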

Lab

Use online tools to generate tabular data from the U.S. Census and then use R to explore visualization.

  • Use a web site for generating census tables and walk through generating a predetermined table.
  • Load up R and generate simple pie charts and bar graphs using the predetermined table.
  • Use provided tools in R for generating other types of more complex graphs (e.g. Trellis plots), apply them to the data, and then explain the differences between them.
  • Use the web site to come up with your own data sets and play around with generating different visualizations in R.
  • Come up with 1 or 2 interesting examples and explain why you used the visualization you used and what you learned from the visualization.
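
When explaining why one chart was chosen over another, it may help to see what a pie chart and a bar graph each compute from the same table. The sketch below is in Python rather than the lab's R tooling, uses no plotting library, and assumes a non-empty table of positive counts. The helper names `shares` and `bar_chart` and the counts themselves are invented placeholders, not real census figures.

```python
def shares(table):
    """Pie-chart view: each category as a fraction of the total."""
    total = sum(table.values())
    return {label: count / total for label, count in table.items()}

def bar_chart(table, width=40):
    """Bar-graph view: one text bar per category, scaled to the largest count."""
    largest = max(table.values())
    lines = []
    for label, count in table.items():
        bar = "#" * round(count / largest * width)
        lines.append(f"{label:<12} {bar} {count}")
    return "\n".join(lines)

# Placeholder counts invented for illustration; not real census figures.
households = {"Own": 600, "Rent": 300, "Other": 100}

for label, share in shares(households).items():
    print(f"{label:<12} {share:.0%}")
print(bar_chart(households))
```

The pie view discards the absolute counts and keeps only proportions, while the bar view keeps magnitudes and makes categories easy to compare side by side. That difference is one thing students can point to when justifying their choice.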

Software

  • R

Bill of Materials

Evaluation

CRS Questions

  • What's the best type of visualization for data set X?
  • XXX
  • XXX

Quiz Questions

  • XXX A question.

Visualizing Data - Metadata

XXX This section contains information about the goals of the unit and the approaches taken to meet them.

Scheduling

Doesn't matter.

Concepts and Techniques

XXX This is a placeholder for a list of items from the context page.

General Education Alignment

  • Analytical Reasoning Requirement
    • Abstract Reasoning - From the [Catalog Description] Courses qualifying for credit in Abstract Reasoning typically share these characteristics:
      • They focus substantially on properties of classes of abstract models and operations that apply to them.
        • XXX Analysis of this unit's support or not for this item.
      • They provide experience in generalizing from specific instances to appropriate classes of abstract models.
        • XXX Analysis of this unit's support or not for this item.
      • They provide experience in solving concrete problems by a process of abstraction and manipulation at the abstract level. Typically this experience is provided by word problems which require students to formalize real-world problems in abstract terms, to solve them with techniques that apply at that abstract level, and to convert the solutions back into concrete results.
        • XXX Analysis of this unit's support or not for this item.
    • Quantitative Reasoning - From the [Catalog Description] General Education courses in Quantitative Reasoning foster students' abilities to generate, interpret and evaluate quantitative information. In particular, Quantitative Reasoning courses help students develop abilities in such areas as:
      • Using and interpreting formulas, graphs and tables.
        • XXX Analysis of this unit's support or not for this item.
      • Representing mathematical ideas symbolically, graphically, numerically and verbally.
        • XXX Analysis of this unit's support or not for this item.
      • Using mathematical and statistical ideas to solve problems in a variety of contexts.
        • XXX Analysis of this unit's support or not for this item.
      • Using simple models such as linear dependence, exponential growth or decay, or normal distribution.
        • XXX Analysis of this unit's support or not for this item.
      • Understanding basic statistical ideas such as averages, variability and probability.
        • XXX Analysis of this unit's support or not for this item.
      • Making estimates and checking the reasonableness of answers.
        • XXX Analysis of this unit's support or not for this item.
      • Recognizing the limitations of mathematical and statistical methods.
        • XXX Analysis of this unit's support or not for this item.
  • Scientific Inquiry Requirement - From the [Catalog Description] Scientific inquiry:
    • Develops students' understanding of the natural world.
      • XXX Analysis of this unit's support or not for this item.
    • Strengthens students' knowledge of the scientific way of knowing — the use of systematic observation and experimentation to develop theories and test hypotheses.
      • XXX Analysis of this unit's support or not for this item.
    • Emphasizes and provides first-hand experience with both theoretical analysis and the collection of empirical data.
      • XXX Analysis of this unit's support or not for this item.

Scaffolded Learning

XXX Some prose.

Inquiry Based Learning

XXX Some prose.

To Do