From Earlham CS Department

CS430 - Database Systems

Big Data Project

The Big Data Project is made up of components, each of which will be built during a future assignment. The first component is a pair of directories: one of data sets and one of query interfaces to data sets.

The data sets should be public and freely available, and large under some definition of that word. Query interfaces are web-based tools that allow people to explore one or more data sets, usually with visualizations. Social Explorer and Google's ngrams are good examples of query interfaces.

For the first assignment you should identify and describe two data sets and one query interface. A data set entry may be related to the query interface entry only if the data set is separately available as a stand-alone entity. This is an assignment by template: I've provided one of each type of entry in the documents linked below.

Stake your claim to entries sooner rather than later. The protocol is to check whether something is already taken; if it is not, fill in a title, etc. as a placeholder until you complete the entry.

Be sure to edit by section rather than by page to minimize update conflicts.