Browse by subject "Web dataset"
Showing items 1-1 of 1
Exploratory analysis of a terabyte scale web corpus
(2014) In this paper we present a preliminary analysis of the largest publicly accessible web dataset: the Common Crawl Corpus. We measure nine web characteristics at two levels of granularity using MapReduce and we comment ...