Difference between revisions of "Visualizations"
Revision as of 16:00, 21 August 2012
Short-term To Do List
- Figure out which books the library should purchase; probably put them on reserve through the fall (charlie)
- Look at on-line courses in this area (mic)
- Consider Tufte in Chicago in August (mic)
Overview
Math/CS 484 -- The goal of our Ford/Knight project is to distill and organize the principles of visualizing large data sets. Modern science is often done by small groups of people who come from diverse backgrounds, e.g. a mathematician, a biologist, and a computer scientist. We plan to solicit input, in the form of example data sets to work with, from each of the natural and social science departments on campus. This work will provide a foundation for a course, or course module, which we hope to offer in the future. Must see instructor for registration.
(DRAFT !!) Tentative Schedule
- Week 1 -- Visualization Basics
- lab on data collection
- begin work on course products
- guide -- do's and don'ts for good infographics
- transferable vignettes
- ??
- Week 2 -- Visualization Basics
- lab on turning reports into data into information
- continue work on course products
- Week 3 -- Exploratory Data Analysis
- lab on EDA -- numerical and graphical summaries
- continue work on course products
- Week 4 -- Exploratory Data Analysis
- lab on EDA
- continue work on course products
- Week 5 -- Visualization Tools (notice the links below)
- Tools assignment -- low tech, high tech
- continue work on course products
- Week 6 -- Visualization Tools
- Tools assignment -- critical reviews of existing visualizations
- continue work on course products
- Week 7 -- Visualization Tools
- Tools assignment
- continue work on course products
- Week 8 -- Visualization Tools
- Tools assignment
- continue work on course products
- Week 9 -- Projects
- Projects assignment
- continue work on course products
- Week 10 -- Projects
- Projects assignment -- documenting choices and assumptions
- continue work on course products
- Week 11 -- Projects
- Projects assignment
- continue work on course products
- Week 12 -- Projects
- Projects assignment
- continue work on course products
- Week 13 -- Projects
- Projects assignment
- continue work on course products
- Week 14 -- Projects
- Projects assignment
- continue work on course products
- Week 15 -- Projects
- Projects presentation
- complete work on course products
Examples
- Good and Bad Statistical Graphs -- http://www.datavis.ca/gallery/
- Eurozone debt - http://www.bbc.co.uk/news/business-15748696
- Wikileaks US embassy cables - http://datavisualization.ch/datasets/wikileaks-us-embassy-cables/
- Stopping SOPA and PIPA - http://visual.ly/stop-sopa
- Auto accident statistics in Britain - http://www.bbc.co.uk/news/magazine-16631597
- A snapshot of the rapidly changing world of computing, communications and technology - http://www.nytimes.com/interactive/2011/12/06/science/1206-world.html?ref=science
- Words by the millions - http://www.nytimes.com/2012/03/25/business/words-by-the-millions-sorted-by-software.html?_r=1&ref=technology
- County health rankings - http://www.countyhealthrankings.org/app
- live wind map - http://hint.fm/wind/index.html
- Factual - http://www.nytimes.com/2012/03/25/business/factuals-gil-elbaz-wants-to-gather-the-data-universe.html?ref=technology
- worldwide health data - http://www.youtube.com/watch?v=jbkSRLYSojo&feature=player_embedded
- Obama's budget proposal - http://www.nytimes.com/interactive/2012/02/13/us/politics/2013-budget-proposal-graphic.html?emc=eta1
- Interactive earthquake map - http://pnsn.org/tremor
- http://visual.ly/education-vs-incarceration - and their tool for building vizs
- shot analysis for NBA finals - http://www.nytimes.com/interactive/2012/06/11/sports/basketball/nba-shot-analysis.html
- European debt -- http://www.aljazeera.com/indepth/interactive/2012/06/20126127221845926.html
- Map of the Market (link behaves oddly, but you can get there) -- http://www.smartmoney.com/map-of-the-market/
- Gallery of R Visualizations -- http://addictedtor.free.fr/graphiques/
- nice quicktime example of the "starchart" Filmfinder -- http://hcil2.cs.umd.edu/video/1994/1994_visualinfo.mpg -- dated but very good
- 2010 U.S. Election Visualizations -- http://www.csc.ncsu.edu/faculty/healey/US_election/
- Gun-related deaths by US State -- http://www.aljazeera.com/indepth/interactive/2012/07/2012726141159587596.html
- Minard's Map of French Wine -- http://en.wikipedia.org/wiki/File:Minard%E2%80%99s_map_of_French_wine_exports_for_1864.jpg#file
- Minard's Map of Napoleon's Russian Invasion -- http://en.wikipedia.org/wiki/File:Minard.png#file
- Krulwich - http://www.npr.org/blogs/krulwich/2012/03/21/149095154/mirror-mirror-on-the-wall-do-the-data-tell-it-all?sc=fb&cc=fp
- Defections of Syrian Leaders -- http://www.aljazeera.com/indepth/interactive/syriadefections/2012730840348158.html
Press
NPR did a couple of interesting segments on Big Data, visualizations, and the search for mathematicians and others who can do that stuff. (December, 2011)
- Part 1 - http://www.npr.org/2011/11/29/142521910/the-digital-breadcrumbs-that-lead-to-big-data?ps=rs
- Part 2 - http://www.npr.org/2011/11/30/142893065/the-search-for-analysts-to-make-sense-of-big-data
New York Times article from December, 2011 on bioinformatics and visualization, MicJ
Other
- http://www.r-bloggers.com/how-the-new-york-times-uses-r-for-data-visualization/
- At some point nyt.com supported a "viz lab" where people could use their data sets to build their own visualizations. I can't find a current reference to it (as of 20 January 2012).
- IBM's Many Eyes - http://www.many-eyes.com
- http://www.cc.gatech.edu/~stasko/7450/syllabus.html
Presentations
- David McCandless: The beauty of data visualizations (TED) - http://www.ted.com/talks/david_mccandless_the_beauty_of_data_visualization.html
- What we learned from 5 million books (TED) - http://www.ted.com/talks/what_we_learned_from_5_million_books.html
- Google's ngram interface: http://books.google.com/ngrams/
- Baby names -- NameVoyager (http://www.babynamewizard.com/voyager)
- Wordle (http://www.wordle.net/ )
- Raw Milk Laws in the US (http://farmtoconsumer.org/raw_milk_map.htm)
- International Milk Production (http://chartsbin.com/view/1492)
- Perception in Visualization -- http://www.csc.ncsu.edu/faculty/healey/PP/
Keywords
- infographics
- Big data
- work flow(s)
The People
- Mic Jackson, Mathematics & Environmental Science
- Charlie Peck, Computer Science
- Diana Ainembabazi
- Ivan Babic
- Leif DeJong
- Ryan Lake
- Mobeen Ludin
- Emily Pavlovic
- Mikel Qafa
- Alex Reid
- Elena Sergienko
- Tristan Wright
Tools
- GPlates - plate tectonics visualizations, multi-platform (http://www.gplates.org/)
- open source visualization toolkits
- Prefuse ( http://prefuse.org/ ),
- Flare ( http://flare.prefuse.org/ )
- Protovis ( http://vis.stanford.edu/protovis/ )
- groundbreaking visualization projects
- Many Eyes ( http://www.many-eyes.com )
- IBM Visualization and Behavior Group (http://researcher.watson.ibm.com/researcher/view_project.php?id=3419)
- a review of Tableau software (http://infosthetics.com/archives/2010/06/social_visualization_software_review_tableau_public.html)
- another (http://bitools.org/tableau-software/)
- a Tableau competitor (http://www.inetsoft.com/info/alternative_to_tableau_visualization_dashboards/?utm_vendor=google&utm_source=northamerica&utm_campaign=visual&utm_medium=search&utm_content=12577228682&utm_term=tableau%20software%20review&gclid=CKPZmvbyoLECFQ8CQAody2v2bg)
- Polaris interactive database visualization (http://www.graphics.stanford.edu/projects/polaris/)
- Spotfire (http://www.cs.umd.edu/hcil/spotfire/)
Topics
- Long-term turtle size, sex, age, climate by year from Western Nebraska (JohnI)
- Von Bertalanffy growth model, special case of Fisher models?
- Long-term iguana size, sex, age, climate (8 years only) from Bahamas (Exumas island) (JohnI)
- Von Bertalanffy growth model, special case of Fisher models?
- Why do turtles lay the number, size, type and frequency of eggs that they do?
- What are the common patterns?
- Which dimensions aren't accounted for?
- Latitude and longitude?
- Habitat?
- Phylogeny?
- Climate?
- What other data sets are available?
- How to distinguish between variations within a species vs different species
- Standardized morphometric data (as opposed to meristic data, e.g. counts of the number of scales between body parts), size-standardized
- Currently using multivariate statistics, about 25 variables
- Looking for one image with all populations and variables
- Looking for structure
- Phylogenetic reconstruction, visualizing trees with multiple models (JohnI)
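For the von Bertalanffy growth model mentioned under the turtle and iguana topics, a minimal sketch may help fix ideas. It uses the standard form L(t) = L_inf * (1 - exp(-k * (t - t0))). Python is used here only for illustration (the page itself names R as the tool), and the function name and all parameter values are made up rather than fitted to any of the data sets above.

```python
import math

def von_bertalanffy_length(age, l_inf, k, t0=0.0):
    """Expected length at `age` under the von Bertalanffy growth model."""
    return l_inf * (1.0 - math.exp(-k * (age - t0)))

# Hypothetical parameters for illustration only (not fitted to any data set).
L_INF = 200.0   # asymptotic length, e.g. in mm
K = 0.15        # growth rate coefficient, per year
T0 = -0.5       # theoretical age at which length would be zero

ages = list(range(0, 41, 5))
lengths = [von_bertalanffy_length(a, L_INF, K, T0) for a in ages]

# Print a small age/length table; growth slows as length approaches L_INF.
for a, length in zip(ages, lengths):
    print(f"age {a:2d} yr -> {length:6.1f} mm")
```

Plotting length against age for each sex or population would be one natural first visualization of such a fit.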
Techniques
- Principal component analysis
- Discriminant function analysis
- Data conditioning and translation, CSV and XML
- Gridded and non-gridded data
- Ideas that Michael suggested
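As one possible illustration of the "data conditioning and translation, CSV and XML" item, the sketch below converts a small CSV table into XML using only the Python standard library. The column names, tag names (`dataset`, `record`) and sample values are hypothetical, and it assumes every CSV header is already a valid XML tag name.

```python
import csv
import io
import xml.etree.ElementTree as ET

def csv_to_xml(csv_text, root_tag="dataset", row_tag="record"):
    """Translate CSV text into an XML tree: one element per row, one child per column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    root = ET.Element(root_tag)
    for row in reader:
        record = ET.SubElement(root, row_tag)
        for name, value in row.items():
            field = ET.SubElement(record, name)
            # Light conditioning: trim stray whitespace, tolerate missing cells.
            field.text = (value or "").strip()
    return root

# Made-up sample data, for illustration only.
sample = "site,year,length_mm\nA,2010, 112.5\nB,2011,98.0\n"

root = csv_to_xml(sample)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Real data sets would likely need more conditioning (units, missing-value codes, type conversion) before translation.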
Sources
- Mic's books
- Charlie's books
- Dave's viz workshop at Kean
- Web sources
- The Organisation for Economic Co-operation and Development (OECD) statistics -- http://www.oecd.org/statistics/
Schedule
- Looking for 2-3 hours of meeting time, possibly one shorter and one longer
- Noon on Monday, Thursday, or Friday
- 4p-7p Monday, Wednesday, Thursday, Friday (modulo sport practice)
The Plan
1) Planning items
- Are there any field trip opportunities?
- Figure out what books to order
- What are the likely conference opportunities?
- Are there any other tools besides R that we should be considering?
- GRASS?
2) Things to learn
- Is there a somewhat canonical process or technique that one can reliably apply to go from readings -> data -> information? At which stage(s) is/are a visualization helpful?
- How to utilize geocoding attributes?
- How to utilize timestamp attributes?
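One way to start on the geocoding and timestamp questions above is sketched here: timestamps are bucketed by month, and latitude/longitude pairs are turned into great-circle distances with the haversine formula. The observations are invented, the ISO 8601 timestamp format is an assumption about how the data would arrive, and Python stands in for whatever tool is eventually chosen.

```python
import math
from collections import Counter
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Hypothetical observations: (ISO 8601 timestamp, latitude, longitude).
observations = [
    ("2012-06-03T09:15:00", 41.50, -102.00),
    ("2012-06-17T14:40:00", 41.52, -102.03),
    ("2012-07-02T08:05:00", 41.49, -101.98),
]

# Timestamp attribute: count observations per calendar month.
per_month = Counter(
    datetime.fromisoformat(ts).strftime("%Y-%m") for ts, _, _ in observations
)

# Geocoding attribute: distance of each observation from the first one.
_, lat0, lon0 = observations[0]
distances = [haversine_km(lat0, lon0, lat, lon) for _, lat, lon in observations]

print(dict(per_month))
print([round(d, 2) for d in distances])
```

Counts per time bucket feed directly into a time-series plot, and the distances (or the raw coordinates) into a map.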
3) Things to read
4) Things to do during the class
5) Questions
- Which parts of statistics do people need to know?
- correlation for PCA
- What linear algebra do people need to know?
- matrix operations for PCA
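To make the two PCA items above concrete, here is a small sketch of how correlation and matrix operations fit together: standardize the columns, form the correlation matrix, and find its leading eigenvector. It uses plain Python and power iteration in place of a library eigen-solver (in R one would more likely call a built-in routine). The data values are made up, and the starting vector is an arbitrary choice that could in rare cases fail to converge to the leading component.

```python
import math

def correlation_matrix(rows):
    """Return the Pearson correlation matrix of the columns of `rows`."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    # Sample standard deviation of each column (n - 1 in the denominator).
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    # Standardize: subtract the column mean, divide by the column deviation.
    z = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)]
            for a in range(p)]

def mat_vec(m, v):
    """Multiply matrix `m` by vector `v`."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def first_component(corr, iterations=200):
    """Approximate the leading eigenvector and eigenvalue by power iteration."""
    v = [float(i + 1) for i in range(len(corr))]  # arbitrary starting vector
    for _ in range(iterations):
        w = mat_vec(corr, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient: variance captured along the unit vector v.
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(corr, v)))
    return v, eigenvalue

# Made-up measurements: three variables on five specimens.
data = [
    [10.0, 20.0, 5.0],
    [12.0, 25.0, 6.5],
    [14.0, 27.0, 6.0],
    [16.0, 33.0, 8.0],
    [18.0, 36.0, 8.5],
]

corr = correlation_matrix(data)
v, eigenvalue = first_component(corr)
share = eigenvalue / len(corr)  # fraction of total variance on the first component
print([round(x, 3) for x in v], round(share, 3))
```

The statistics needed are means, standard deviations and correlation. The linear algebra is matrix-vector multiplication, vector norms and the idea of an eigenvector.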
6) Tools
- R under Linux/OSX
7) Possible sources for data sets
- John Iverson
- turtle birthing data
- phylogenetic reconstruction
- Mike Deibel
- Kathy Milar
- Meg Streepy
- GPlates - visualizing plate tectonics