dc.creator: Halkidi, M. (en)
dc.creator: Koutsopoulos, I. (en)
dc.description.abstract: Extraction of patterns out of streaming data that are generated from geographically dispersed devices is a major challenge in data mining. The sequential, distributed fashion in which data become available to the decision maker, together with the fact that the decision maker needs to rely only on recently received data due to storage and communication constraints, render the objective of keeping track of data evolution a nontrivial one. We consider a set of distributed nodes that communicate directly with a central location. We address the problem of clustering distributed streaming data through a two-level clustering approach. We adopt belief propagation techniques to perform stream clustering at both levels. At the node level, a batch of data arrives at each time slot, and the goal is to maintain a set of salient data (local exemplars) at each time slot, which best represents the data received up to that slot. At each epoch, the local exemplars from distributed nodes are sent to the central location, which in turn performs a second-level clustering on them to derive a data synopsis global for the whole system. The local exemplars that emerge from the second level clustering procedure are fed back to the nodes with appropriately modified weights which reflect their importance in global clustering. As demonstrated by our experiments, the two-level belief propagation-based clustering approach together with the feedback is ideal for handling data from different nodes, as it has the same performance in terms of clustering quality with the case where the clustering is performed on the raw data sent from nodes to the central location. © 2011 IEEE. (en)
dc.subject: Belief propagation (en)
dc.subject: Clustering approach (en)
dc.subject: Clustering procedure (en)
dc.subject: Clustering quality (en)
dc.subject: Communication constraints (en)
dc.subject: Data evolution (en)
dc.subject: Data synopsis (en)
dc.subject: Decision makers (en)
dc.subject: Distributed nodes (en)
dc.subject: Distributed streaming (en)
dc.subject: Global clustering (en)
dc.subject: Second level (en)
dc.subject: Streaming data (en)
dc.subject: Time slots (en)
dc.subject: Decision making (en)
dc.subject: Information management (en)
dc.subject: Data handling (en)
dc.title: Online clustering of distributed streaming data using belief propagation techniques (en)

Files in this item

There are no files associated with this item.

This item appears in the following collection(s)