Browsing by Subject "Publicly accessible"
Exploratory analysis of a terabyte scale web corpus
(2014) In this paper, we present a preliminary analysis of the largest publicly accessible web dataset: the Common Crawl Corpus. We measure nine web characteristics at two levels of granularity using MapReduce and we comment ...