Enabling persistent identification of groups of duplicates in data aggregators

Data aggregators harvest, deduplicate, and make available content from disparate data sources in different domains, such as cultural, academic, and scientific content. The availability of aggregated data as Linked Data depends on how information evolves at the data sources, so published data must be handled carefully to comply with Linked Data guidelines, such as persistent identification over time. In this paper we present the problem of disambiguating groups of duplicates in settings where the Information Space is regenerated in its entirety in every harvesting cycle of data aggregation, and we propose an approach that aims to provide persistent identifiers for groups over time. © 2016 IEEE.

URI

http://hdl.handle.net/11615/70421

Collections

Publications in journals, conferences, book chapters, etc. [19705]