Low power general purpose loop acceleration for NDP applications

Author
Tziouvaras A., Dimitriou G., Foukalas F., Stamoulis G.
Date
2020
Language
en
DOI
10.1145/3437120.3437288
Subjects
Data handling
Data transfer
Dynamic random access storage
Energy utilization
Integrated circuit design
Application execution
Design and implements
Dynamic random access memory
Loop acceleration
Loop accelerators
Modern processors
Performance bottlenecks
Post layout simulation
Pipeline processing systems
Association for Computing Machinery
Abstract
Modern processor architectures face a throughput scaling problem as the performance bottleneck shifts from the core pipeline to the data transfers between the dynamic random access memory (DRAM) and the processor chip. To address this issue, researchers have proposed the near-data processing (NDP) paradigm, in which instruction execution is moved to the DRAM die, thus reducing the data movement between the processor and the DRAM. Previous NDP works focus on specific application types, and general purpose application execution is therefore neglected. In this work we propose an NDP methodology for low power general purpose loop acceleration. To this end, we design and implement a hardware loop accelerator from the ground up to improve the throughput and lower the power consumption of general purpose loops. We adopt a novel loop scheduling approach that enables the loop accelerator to exploit the dataflow parallelism of the executing loop, and we implement our design on the logic layer of a hybrid memory cube (HMC) DRAM. Post-layout simulations demonstrate an average speedup of 20.5x when executing kernels from various scientific fields, while energy consumption is reduced by a factor of 9.3 relative to execution on the host CPU. © 2020 ACM.
URI
http://hdl.handle.net/11615/80272
Collections
  • Publications in journals, conferences, book chapters, etc. [19735]