Difference between revisions of "Robbie-big-data"

From Earlham CS Department

Revision as of 11:19, 9 December 2011

Snapshot of YouTube as of Feb. 22, 2008: 749,362 videos crawled.

Project Tasks
  1. Identifying and downloading the target data set
  2. Data cleaning and pre-processing
  3. Load the data into your Postgres instance
  4. Develop queries to explore your ideas in the data
  5. Develop and document the model function you are exploring in the data
  6. Develop a visualization to show the model/patterns in the data
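Tasks 3 and 4 above (loading and querying) can be illustrated with a minimal sketch. It is not the project's actual schema or code: the table name `videos`, its columns, and the sample rows are all hypothetical, and Python's built-in `sqlite3` module stands in for the Postgres instance so the example runs without a database server. The same SQL should carry over to Postgres with only minor changes (for example, `%s` placeholders with a Postgres driver).

```python
import sqlite3

# Hypothetical sample rows: (video_id, uploader, upload_date, views, comments).
# The real crawl has 749,362 videos; these values are made up for illustration.
SAMPLE_ROWS = [
    ("v1", "alice", "2007-05-01", 1200, 10),
    ("v2", "bob", "2007-06-15", 300, 2),
    ("v3", "alice", "2008-01-20", 4500, 31),
    ("v4", "carol", "2008-02-10", 90, 0),
    ("v5", "alice", "2008-02-21", 50, 1),
    ("v6", "bob", "2008-02-22", 700, 5),
]


def load_videos(conn, rows):
    """Create the (assumed) videos table and bulk-insert the cleaned rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS videos ("
        "video_id TEXT PRIMARY KEY, uploader TEXT NOT NULL, "
        "upload_date TEXT NOT NULL, views INTEGER NOT NULL, "
        "comments INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO videos VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()


def top_uploaders(conn, limit=10):
    """Return (uploader, uploads, total_comments), most uploads first."""
    cur = conn.execute(
        "SELECT uploader, COUNT(*) AS uploads, SUM(comments) AS total_comments "
        "FROM videos GROUP BY uploader "
        "ORDER BY uploads DESC, uploader ASC LIMIT ?",
        (limit,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_videos(conn, SAMPLE_ROWS)
    for row in top_uploaders(conn, limit=3):
        print(row)
```

A single grouped query like this could feed both a "most prolific uploaders" chart and a "total comments of top uploaders" chart, since it returns the upload count and the comment total per uploader together.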
Tech Details
  • Used a Postgres instance on my personal computer
  • All data is in rdbean08/Public on the ACLs
Results
  • Visualization:
      • Most Prolific Uploaders
      • Total Comments of Top Uploaders
      • Most Popular Tags by Uploads
      • Most Popular Tags by Views
      • Number of Uploads Over Time
      • Number of Views Over Time
  • The story
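
The tag and time-series visualizations listed under Results rest on simple aggregations. The sketch below shows one way such aggregates might be computed before plotting. The record layout `(upload_date, views, tags)` and the sample values are assumptions made for illustration, not the project's actual data or method.

```python
from collections import Counter

# Hypothetical records: (upload_date as YYYY-MM-DD, views, list of tags).
SAMPLE_VIDEOS = [
    ("2007-05-01", 1200, ["music", "live"]),
    ("2007-05-19", 300, ["comedy"]),
    ("2008-01-20", 4500, ["music"]),
    ("2008-02-10", 90, ["comedy", "live"]),
]


def tags_by_uploads(videos):
    """Count how many videos carry each tag (a tag counts once per video)."""
    counts = Counter()
    for _date, _views, tags in videos:
        counts.update(set(tags))
    return counts


def tags_by_views(videos):
    """Sum the views of every video carrying each tag."""
    totals = Counter()
    for _date, views, tags in videos:
        for tag in set(tags):
            totals[tag] += views
    return totals


def uploads_per_month(videos):
    """Return sorted (YYYY-MM, upload count) pairs for a time-series plot."""
    counts = Counter(date[:7] for date, _views, _tags in videos)
    return sorted(counts.items())


if __name__ == "__main__":
    print(tags_by_uploads(SAMPLE_VIDEOS).most_common(3))
    print(tags_by_views(SAMPLE_VIDEOS).most_common(3))
    print(uploads_per_month(SAMPLE_VIDEOS))
```

Ranking tags by uploads and by views can give different orderings, which is presumably why the page lists them as separate visualizations.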