Show simple item record

dc.creator: Provatas N., Konstantinou I., Koziris N.
dc.date.accessioned: 2023-01-31T09:50:47Z
dc.date.available: 2023-01-31T09:50:47Z
dc.date.issued: 2021
dc.identifier: 10.1109/BigData52589.2021.9672001
dc.identifier.isbn: 9781665439022
dc.identifier.uri: http://hdl.handle.net/11615/78383
dc.description.abstract: Over the last years, deep learning has gained popularity in various domains, introducing complex models to handle the data explosion. However, while such model architectures can support the enormous amount of data, a single computing node cannot train a model on the whole data set in a timely fashion. Thus, specialized distributed architectures have been proposed, most of which follow data-parallelism schemes such as the widely used parameter server approach. In this setup, each worker contributes to the training process in an asynchronous manner. While asynchronous training does not suffer from synchronization overheads, it introduces the problem of stale gradients, which can cause the model to diverge during training. In this paper, we examine different schemes for assigning data to workers that facilitate the asynchronous learning approach. Specifically, we propose two different algorithms to perform the data sharding. Our experimental evaluation indicated that when stratification is taken into account, the validation results show up to 6X less variance compared to standard shard creation. When further data exploration for hidden stratification is performed, validation metrics can be slightly improved; this method also reduces the variance of training and validation metrics by up to 8X and 2X, respectively. © 2021 IEEE.
dc.language.iso: en
dc.source: Proceedings - 2021 IEEE International Conference on Big Data, Big Data 2021
dc.source.uri: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85125330204&doi=10.1109%2fBigData52589.2021.9672001&partnerID=40&md5=a1b9acff3ed689490f707361a0329d80
dc.subject: Deep learning
dc.subject: Complex model
dc.subject: Computing nodes
dc.subject: Data explosion
dc.subject: Distributed training
dc.subject: Modeling architecture
dc.subject: Single computing
dc.subject: Training process
dc.subject: Validation metric
dc.subject: Workers'
dc.subject: Information management
dc.subject: Institute of Electrical and Electronics Engineers Inc.
dc.title: Is Systematic Data Sharding able to Stabilize Asynchronous Parameter Server Training?
dc.type: conferenceItem
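
The abstract above credits class-stratified data sharding across asynchronous parameter-server workers with reducing the variance of training and validation metrics. The paper's two sharding algorithms are not reproduced in this record; the sketch below is a minimal, hypothetical illustration of what stratified shard creation for a set of workers might look like. The function name, the round-robin assignment, and the NumPy usage are all assumptions for illustration, not the authors' method.

```python
import numpy as np

def stratified_shards(labels, num_workers, seed=0):
    """Assign sample indices to worker shards so that each shard
    approximately preserves the global class distribution.

    Illustrative sketch only; not the algorithm from the paper.
    """
    rng = np.random.default_rng(seed)
    shards = [[] for _ in range(num_workers)]
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        # Deal this class's samples round-robin across workers,
        # so every shard receives a near-equal share of the class.
        for i, sample in enumerate(idx):
            shards[i % num_workers].append(sample)
    return [np.array(s, dtype=int) for s in shards]

# Example: 10 samples, 2 classes, 2 workers. Each shard ends up with
# roughly half of every class, unlike naive contiguous sharding, which
# would give worker 0 only class 0 and worker 1 only class 1.
labels = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
for w, shard in enumerate(stratified_shards(labels, num_workers=2)):
    print(f"worker {w}: classes {labels[shard]}")
```

Under this construction each worker's shard sees roughly the global label distribution, which is the property the abstract associates with the reported reduction in validation-metric variance.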


Files in this item

Files | Size | Format | View

No files in this item.

This item appears in the following collections
