Difference between revisions of "CS382:Unit-mashup"

From Earlham CS Department
m
Line 1: Line 1:
 
= Data Visualization with Mashups =
 
= Data Visualization with Mashups =
  
== Background reading, one or more pointers/documents and a brief synopsis of what's covered in them ==
+
== Background reading, Resources ==
 
* [http://davidhuynh.net/media/papers/2007/iswc2007-potluck.pdf web tool for non-programmers for making mashups]
 
* [http://davidhuynh.net/media/papers/2007/iswc2007-potluck.pdf web tool for non-programmers for making mashups]
 
* [http://media.wiley.com/product_data/excerpt/12/04705151/0470515112.pdf chapter 1 of book on power of geo mashups]
 
* [http://media.wiley.com/product_data/excerpt/12/04705151/0470515112.pdf chapter 1 of book on power of geo mashups]
* [http://code.google.com/apis/kml/documentation KML documentation]
+
 
 
== Lecture notes - outline form ==
 
== Lecture notes - outline form ==
* what data do I need?
+
=== Introduction ===
* where do I find data?
+
* Models and other datasets describing the world are very large and/or complex.
 +
** Example: the US Census is X rows and Y columns
 +
* In most cases with datasets this large you can't just stare at the raw data and get a feel for what it's saying.
 +
* Even to develop statistical methods to get useful information you still need a notion of what you're looking for.
 +
* Visualization is a way we can get a look at general trends or anomalies in an intuitive way.
 +
* Visualization works better with larger data sets that you can clump.
 +
 
 +
=== Types of visualizations ===
 +
* Overlays (geographical) - example
 +
* Semantic webs - example
 +
* Geometrical ( graphical environments where the size/shape/movement/etc of objects is tied to data ) - example
 +
* more
 +
 
 +
=== Process ===
 +
* Given a certain problem or question, determine what general categories of information are needed.
 +
** What body of information is needed.
 +
* Data collection
 +
** finding good sources of data
 +
** methods of collecting (online databases, field work, etc.)
 
** coordinating multiple data sources
 
** coordinating multiple data sources
* how do i encode that data to be useful?
+
* Determining which tools and type of visualization are most appropriate
* what can i discover through visualization?
+
* Encoding data to be useful (KML, etc.)
* how much data do i need to be useful?
+
* Drawing general conclusions from the visualization
* examples
+
* Use more exact methods, such as statistics, to test whether those conclusions hold.
  
 
== Classroom response questions - at least three ==
 
== Classroom response questions - at least three ==
 +
 
== Lab activity - materials, process and software ==
 
== Lab activity - materials, process and software ==
 +
Building a Google mashup/KML document tying two or more datasets together. Datasets will be provided, but each group will have distinct information to use. Tools for retrieving and inputting data will be provided, but students will still have to learn KML and how to interact with Google Earth/Maps. The datasets are preferably something local or personal that can provide interesting results when visualized geographically.
 +
 +
Possible Data sources:
 
* Quaker meetings, <metric X> mashed up on map ( [http://quakermeetings.com database; we own it] )
 
* Quaker meetings, <metric X> mashed up on map ( [http://quakermeetings.com database; we own it] )
 
* Using WebDB to map dorms to major, possibly over time (This one may not work due to not being able to access relevant data)
 
* Using WebDB to map dorms to major, possibly over time (This one may not work due to not being able to access relevant data)
 +
* US Census
 +
* [http://code.google.com/apis/kml/documentation KML documentation]
 +
* [http://prefuse.org Prefuse visualization suite]
  
 
== Scheduling - early, late, dependencies on other units, length of unit ==
 
== Scheduling - early, late, dependencies on other units, length of unit ==
Line 23: Line 48:
 
Doesn't matter
 
Doesn't matter
 
=== Length ===
 
=== Length ===
One week; but if we find enough material it could be (and would serve well as) two
+
One week.
  
 
== Archived stuff ==
 
== Archived stuff ==

Revision as of 10:50, 18 February 2009

Data Visualization with Mashups

Background reading, Resources

  • web tool for non-programmers for making mashups (http://davidhuynh.net/media/papers/2007/iswc2007-potluck.pdf)
  • chapter 1 of book on power of geo mashups (http://media.wiley.com/product_data/excerpt/12/04705151/0470515112.pdf)

Lecture notes - outline form

Introduction

  • Models and other datasets describing the world are very large and/or complex.
    • Example: the US Census is X rows and Y columns
  • In most cases with datasets this large you can't just stare at the raw data and get a feel for what it's saying.
  • Even to develop statistical methods to get useful information you still need a notion of what you're looking for.
  • Visualization is a way we can get a look at general trends or anomalies in an intuitive way.
  • Visualization works better with larger data sets that you can clump.
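
One way to read "clumping" is as aggregation: instead of plotting every record, group nearby points and plot one count per group. The sketch below is only an illustration of that idea, assuming the data is a list of (latitude, longitude) pairs; the function name `clump_by_grid` and the one-degree cell size are made up for this example.

```python
import math
from collections import Counter


def clump_by_grid(points, cell_size=1.0):
    """Count how many (lat, lon) points fall into each grid cell.

    cell_size is the width of a cell in degrees (a hypothetical default).
    Returns a Counter keyed by (lat_cell, lon_cell) integer indices.
    """
    counts = Counter()
    for lat, lon in points:
        # floor() keeps negative coordinates in the correct cell
        cell = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        counts[cell] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample points, not real data
    sample = [(39.8, -84.9), (39.2, -84.1), (40.5, -84.9)]
    for cell, n in sorted(clump_by_grid(sample).items()):
        print(cell, n)
```

Each cell count can then drive the size or color of a single marker, which is usually easier to read than thousands of overlapping points.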

Types of visualizations

  • Overlays (geographical) - example
  • Semantic webs - example
  • Geometrical ( graphical environments where the size/shape/movement/etc of objects is tied to data ) - example
  • more

Process

  • Given a certain problem or question, determine what general categories of information are needed.
    • What body of information is needed?
  • Data collection
    • finding good sources of data
    • methods of collecting (online databases, field work, etc.)
    • coordinating multiple data sources
  • Determining which tools and type of visualization are most appropriate
  • Encoding data to be useful (KML, etc.)
  • Drawing general conclusions from the visualization
  • Use more exact methods, such as statistics, to test whether those conclusions hold.
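
The encoding step can be sketched in a few lines. The example below is a minimal illustration, not the course's tooling: it assumes each record is a dict with the hypothetical keys `name`, `lat`, `lon`, and an optional `description`, and it writes one KML Placemark per record using Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# Namespace used by KML 2.2 documents
KML_NS = "http://www.opengis.net/kml/2.2"


def records_to_kml(records):
    """Return a KML document (as a string) with one Placemark per record.

    The record keys (name, lat, lon, description) are assumptions
    made for this sketch.
    """
    ET.register_namespace("", KML_NS)  # serialize without a prefix
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for rec in records:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = rec["name"]
        if rec.get("description"):
            desc = ET.SubElement(placemark, f"{{{KML_NS}}}description")
            desc.text = rec["description"]
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        coords = ET.SubElement(point, f"{{{KML_NS}}}coordinates")
        # KML expects longitude first, then latitude, then altitude
        coords.text = f"{rec['lon']},{rec['lat']},0"
    return ET.tostring(kml, encoding="unicode")


if __name__ == "__main__":
    # Made-up sample record, not real data
    print(records_to_kml([{"name": "Sample site", "lat": 39.8, "lon": -84.9}]))
```

Saving the returned string to a `.kml` file should give something Google Earth can open. Note the coordinate order: swapping latitude and longitude is a common mistake when encoding data by hand.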

Classroom response questions - at least three

Lab activity - materials, process and software

Building a Google mashup/KML document tying two or more datasets together. Datasets will be provided, but each group will have distinct information to use. Tools for retrieving and inputting data will be provided, but students will still have to learn KML and how to interact with Google Earth/Maps. The datasets are preferably something local or personal that can provide interesting results when visualized geographically.
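
"Tying two or more datasets together" usually comes down to joining them on a shared key before anything is drawn. The sketch below shows one possible approach, assuming one dataset holds locations and the other holds a metric, and that both carry a common `id` field; the field names and the function `join_datasets` are hypothetical.

```python
def join_datasets(locations, metrics):
    """Inner-join two lists of dicts on their shared 'id' key.

    locations: dicts with id, name, lat, lon (assumed layout)
    metrics:   dicts with id, value          (assumed layout)
    Rows without a match in the other dataset are dropped.
    """
    metric_by_id = {row["id"]: row["value"] for row in metrics}
    joined = []
    for loc in locations:
        if loc["id"] in metric_by_id:
            # Copy the location row and attach the matching metric
            joined.append({**loc, "value": metric_by_id[loc["id"]]})
    return joined


if __name__ == "__main__":
    # Made-up sample rows, not real data
    locations = [
        {"id": 1, "name": "Site A", "lat": 39.8, "lon": -84.9},
        {"id": 2, "name": "Site B", "lat": 40.1, "lon": -85.2},
    ]
    metrics = [{"id": 1, "value": 42}]
    for row in join_datasets(locations, metrics):
        print(row)
```

An inner join is a deliberate choice here: a location with no metric has nothing to display. Groups whose data has gaps may prefer to keep unmatched rows and mark them as missing instead.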

Possible Data sources:

  • Quaker meetings, <metric X> mashed up on map (http://quakermeetings.com database; we own it)
  • Using WebDB to map dorms to major, possibly over time (this one may not work, since the relevant data may not be accessible)
  • US Census
  • KML documentation (http://code.google.com/apis/kml/documentation)
  • Prefuse visualization suite (http://prefuse.org)

Scheduling - early, late, dependencies on other units, length of unit

Timing

Doesn't matter

Length

One week.

Archived stuff