Difference between revisions of "Mobeen-big-data"

Revision as of 08:57, 14 December 2011

This data set contains 10000054 ratings and 95580 tags applied to 10681 movies by 71567 users of the online movie recommender service MovieLens.
Link to data set: http://www.grouplens.org/node/12

The Big-Data-M directory contains the following directories and files:

# Backupfiles: The Backupfiles directory contains the data set that was downloaded from MovieLens.

# Clean_Data: The Clean_Data directory has all the data files that were formatted using the Perl/Python scripts.

# Q_results: The Q_results directory has

The original data was in the .dat format. One Perl script and one Python script were written to change the format and clean the data.
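The conversion scripts themselves are not shown on this page, so the following is only a minimal sketch of what such a step might look like. It assumes the `::` field separator used by the MovieLens 10M .dat files; the function name `dat_to_csv` and the column names are hypothetical, chosen for illustration.

```python
import csv
import io


def dat_to_csv(dat_lines, out, header):
    """Write '::'-delimited lines to `out` as CSV and return the row count.

    `header` lists the column names (assumed names, not from the data set).
    Lines whose field count does not match the header are skipped.
    """
    writer = csv.writer(out)
    writer.writerow(header)
    written = 0
    for line in dat_lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # ignore blank lines
        fields = line.split("::")
        if len(fields) != len(header):
            continue  # malformed record: drop it as part of cleaning
        writer.writerow(fields)
        written += 1
    return written


# Small in-memory demonstration with two movie records.
sample = [
    "1::Toy Story (1995)::Adventure|Animation|Children|Comedy|Fantasy\n",
    "11::American President, The (1995)::Comedy|Drama|Romance\n",
]
buffer = io.StringIO()
rows = dat_to_csv(sample, buffer, ["movie_id", "title", "genres"])
```

Using the `csv` module rather than a plain string replace matters here because some movie titles contain commas; the writer quotes those fields so they survive as a single column.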

For this project, my aim was to discover the timeline of movie genres. In other words, I wanted to find out which types of movies people watch during which periods of time. I also tried to look for the pattern
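One way to approach a genre timeline is to count ratings per genre per year. The sketch below is an illustration under stated assumptions, not the queries actually used in this project: it assumes that each rating carries a Unix timestamp in seconds, that "period of time" means the year in which the rating was made, and that genres have already been split into lists. The function name `genre_counts_by_year` is hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone


def genre_counts_by_year(ratings, movie_genres):
    """Count ratings per (year, genre).

    ratings      -- iterable of (movie_id, unix_timestamp) pairs
    movie_genres -- dict mapping movie_id to a list of genre names
    """
    counts = Counter()
    for movie_id, timestamp in ratings:
        # Interpret the timestamp as UTC so results do not depend on the local zone.
        year = datetime.fromtimestamp(int(timestamp), tz=timezone.utc).year
        # A movie with several genres contributes one count to each of them.
        for genre in movie_genres.get(movie_id, ()):
            counts[(year, genre)] += 1
    return counts


# Tiny illustrative input (made-up ratings, not taken from the data set).
genres = {"1": ["Comedy", "Drama"], "2": ["Comedy"]}
ratings = [("1", 978300760), ("2", 978300760), ("1", 1105000000)]
timeline = genre_counts_by_year(ratings, genres)
```

Sorting the resulting keys by year gives a simple timeline that can then be inspected for patterns.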

@@ Line 11: / Line 11: @@
 The Big-Data-M directory contains the following directories and files:
-*  '''Directories: Backupfiles  Clean_Data  Q_results    Scripts'''
+*  '''Directories:'''
+'''# Backupfiles:''' The Backupfiles directory contains the data set that was downloaded from MovieLens.
-*  '''Files:       bigdata.sql  movies.csv  ratings.csv  tags.csv'''
+'''# Clean_Data:'''  The Clean_Data directory has all the data files that were formatted using the Perl/Python scripts.
- - The Backupfiles directory contains the data set that was downloaded from MovieLens.
+'''# Q_results:'''   The Q_results directory has
+# Scripts
-  - The Clean_Data directory has all the data files that were formatted using the Perl/Python scripts.
+*  '''Files:       bigdata.sql  movies.csv  ratings.csv  tags.csv'''
-  - The Q_results directory has
 ==== 2. Data cleaning and pre-processing ====