Difference between revisions of "Tristan-big-data"
Revision as of 10:22, 3 December 2011
- Examining Trends in a Performance Sport
- Data set: WCA Database
Project Tasks
- Identifying and downloading the target data set
The WCA dataset is distributed as a file of SQL INSERT statements and was easy to obtain. It can be downloaded from http://worldcubeassociation.org/results/misc/export.html.
- Data cleaning and pre-processing
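The .sql file was written in a non-Postgres dialect (the backtick-quoted identifiers and smallint(n) types point to MySQL rather than MS-SQL or Oracle SQL), so some bulk edits to the file were needed. The main changes were replacing smallint(n) with int and removing the backticks around identifiers, so that `tablename` becomes tablename.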
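The two substitutions described above can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used for the project: the function names and file paths are hypothetical, and the real export may need further edits that are not covered here.

```python
import re

# Matches an integer type with a display width, e.g. smallint(6).
# Only smallint is handled because it is the only type named in the notes.
SMALLINT_WIDTH = re.compile(r"\bsmallint\(\d+\)", re.IGNORECASE)


def to_postgres(line: str) -> str:
    """Apply the two rewrites to a single line of the dump."""
    line = SMALLINT_WIDTH.sub("int", line)
    # Naive: this also removes backticks that occur inside string
    # literals, which could alter free-text values in INSERT rows.
    return line.replace("`", "")


def convert_file(src_path: str, dst_path: str) -> None:
    """Stream the dump line by line so a large file is never held in memory."""
    # utf-8 is an assumption about the export's encoding.
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(to_postgres(line))
```

A call such as `convert_file("export.sql", "export_pg.sql")` (both file names hypothetical) would write the converted dump. Note that once the backticks are removed, Postgres folds unquoted identifiers to lowercase, so a table created as Results is stored as results.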
- Load the data into your Postgres instance
It took a few attempts to get the whole script working, but it eventually ran successfully in my directory on BigFe.
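One way to make repeated load attempts easier to debug is to have psql stop at the first failing statement and roll back the whole run. The sketch below assumes the converted dump is loaded with the psql command-line client. The database name and file name are placeholders, and this is not necessarily how the load was performed on BigFe.

```python
import subprocess
from typing import List


def build_psql_command(sql_path: str, dbname: str) -> List[str]:
    """Build a psql invocation that aborts on the first SQL error."""
    return [
        "psql",
        "-d", dbname,              # target database (placeholder name)
        "-v", "ON_ERROR_STOP=1",   # stop at the first failing statement
        "--single-transaction",    # roll everything back if any statement fails
        "-f", sql_path,            # script to execute
    ]


def load_dump(sql_path: str, dbname: str) -> None:
    """Run psql; raises CalledProcessError if psql exits with a non-zero status."""
    subprocess.run(build_psql_command(sql_path, dbname), check=True)
```

With these options a failed attempt leaves the database unchanged, so the script can be corrected and rerun without first dropping partially created tables.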
- Develop queries to explore your ideas in the data
- Develop and document the model function you are exploring in the data
- Develop a visualization to show the model/patterns in the data
Tech Details
- Node: as3
- Path to storage space: /scratch/big-data/tristan
Results
- The visualization(s)
- The story