Open Access
JNWPU, Volume 43, Number 1, February 2025
Page(s): 119-127
DOI: https://doi.org/10.1051/jnwpu/20254310119
Published online: 18 April 2025