Annotated-directory-big-data

From Earlham CS Department
Revision as of 11:26, 4 October 2011 by Charliep (talk | contribs)

This is an annotated directory of public, freely available, "large" data sets. For now they are in no particular order.

Google ngrams

  • URL - http://books.google.com/ngrams/datasets
  • Description - The ngram databases on which Google's ngram viewer is built. A variety of corpora are available, e.g. by language, the "Google Million", and English fiction. Each set lists ngrams together with their frequency and date information.
  • Curator - CharlieP
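
As a rough illustration of working with the ngram lists described above, the following Python sketch sums the frequency of each ngram across years. It assumes each record is one tab-separated line whose first three fields are the ngram, the year, and a match count. That layout is an assumption about the downloaded files and should be checked against the dataset page. The function names and the sample rows are hypothetical and invented for this example; they are not real counts.

```python
from collections import defaultdict


def parse_ngram_line(line):
    """Split one record into (ngram, year, match_count).

    Assumes tab-separated fields in the order ngram, year, match count;
    any further fields on the line are ignored.
    """
    fields = line.rstrip("\n").split("\t")
    return fields[0], int(fields[1]), int(fields[2])


def total_by_ngram(lines):
    """Sum match counts over all years for each ngram."""
    totals = defaultdict(int)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        ngram, _year, count = parse_ngram_line(line)
        totals[ngram] += count
    return dict(totals)


# Invented sample rows for illustration only, not real data.
SAMPLE = [
    "big data\t2001\t7\t5\t3\n",
    "big data\t2002\t9\t6\t4\n",
    "data set\t2001\t4\t4\t2\n",
]

if __name__ == "__main__":
    for ngram, total in sorted(total_by_ngram(SAMPLE).items()):
        print(ngram, total)
```

For a real file, the same function can be fed an open file handle instead of SAMPLE, since it reads one line at a time and does not need the whole file in memory.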

Another Data Set

  • URL -
  • Description -
  • Curator -
