
dc.creator | Akritidis L., Fevgas A., Tsompanopoulou P., Bozanis P. | en
dc.date.accessioned | 2023-01-31T07:30:38Z
dc.date.available | 2023-01-31T07:30:38Z
dc.date.issued | 2020
dc.identifier | 10.1142/S0218213020600088
dc.identifier.issn | 02182130
dc.identifier.uri | http://hdl.handle.net/11615/70360
dc.description.abstract | Big Data analytics is currently one of the fastest-emerging areas of research for both organizations and enterprises. The need to deploy efficient machine learning algorithms over huge amounts of data has led to the development of parallelization frameworks and of specialized libraries (such as Mahout and MLlib) that implement the most important of these algorithms. Moreover, recent advances in storage technology have resulted in the introduction of high-performance devices, broadly known as Solid State Drives (SSDs). Compared to traditional Hard Disk Drives (HDDs), SSDs offer considerably higher performance and lower power consumption. Motivated by these appealing features and the growing need for efficient large-scale data processing, we compare the performance of several machine learning algorithms on MapReduce clusters whose nodes are equipped with HDDs, SSDs, and devices that implement the latest 3D XPoint technology. In particular, we evaluate several dataset preprocessing methods, such as vectorization and dimensionality reduction; two supervised classifiers, Naive Bayes and Linear Regression; and the popular k-Means clustering algorithm. We use an experimental cluster equipped with the three aforementioned storage devices under different configurations, and two large datasets, Wikipedia and HIGGS. The experiments show that the benefits derived from using SSDs depend on the cluster setup and the nature of the applied algorithms. © 2020 World Scientific Publishing Company. | en
dc.language.iso | en | en
dc.source | International Journal on Artificial Intelligence Tools | en
dc.source.uri | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85086830601&doi=10.1142%2fS0218213020600088&partnerID=40&md5=29904be167078461b120d1de03e80822
dc.subject | Classification (of information) | en
dc.subject | Data Analytics | en
dc.subject | Dimensionality reduction | en
dc.subject | K-means clustering | en
dc.subject | Large dataset | en
dc.subject | Learning systems | en
dc.subject | Virtual storage | en
dc.subject | Large-scale data processing | en
dc.subject | Lower-power consumption | en
dc.subject | MapReduce clusters | en
dc.subject | Parallelizations | en
dc.subject | Pre-processing method | en
dc.subject | Solid state drives | en
dc.subject | Storage technology | en
dc.subject | Supervised classifiers | en
dc.subject | Learning algorithms | en
dc.subject | World Scientific Publishing Co. Pte Ltd | en
dc.title | Evaluating the Effects of Modern Storage Devices on the Efficiency of Parallel Machine Learning Algorithms | en
dc.type | conferenceItem | en


Files in this item

Files | Size | Type | View

There are no files associated with this item.

This item appears in the following collections
