Volume 40, Number 2, April 2022
Page(s): 344–351
Published online: 03 June 2022
- Zhao R, Hu Y, Dotzel J, et al. Improving neural network quantization without retraining using outlier channel splitting[C]//International Conference on Machine Learning, 2019
- Wang K, Liu Z, Lin Y, et al. HAQ: hardware-aware automated quantization with mixed precision[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019
- Ma Y, Cao Y, Vrudhula S, et al. Performance modeling for CNN inference accelerators on FPGA[J]. IEEE Trans on Computer-Aided Design of Integrated Circuits and Systems, 2019, 39(4): 843–856
- Azizimazreah A, Chen L. Shortcut mining: exploiting cross-layer shortcut reuse in DCNN accelerators[C]//2019 IEEE International Symposium on High Performance Computer Architecture, 2019
- Hennessy J, Patterson D. A new golden age for computer architecture: domain-specific hardware/software co-design, enhanced[C]//ACM/IEEE 45th Annual International Symposium on Computer Architecture, 2018
- Judd P, Albericio J, Hetherington T, et al. Stripes: bit-serial deep neural network computing[C]//2016 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016
- Lee J, Kim C, Kang S, et al. UNPU: an energy-efficient deep neural network accelerator with fully variable weight bit precision[J]. IEEE Journal of Solid-State Circuits, 2018, 54(1): 173–185
- Sharify S, Lascorz A D, Siu K, et al. Loom: exploiting weight and activation precisions to accelerate convolutional neural networks[C]//2018 55th ACM/ESDA/IEEE Design Automation Conference, 2018: 1–6
- Sharma H, Park J, Suda N, et al. Bit fusion: bit-level dynamically composable architecture for accelerating deep neural network[C]//2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture, 2018: 764–775
- Yin S, Tang S, Lin X, et al. A high throughput acceleration for hybrid neural networks with efficient resource management on FPGA[J]. IEEE Trans on Computer-Aided Design of Integrated Circuits and Systems, 2018, 38(4): 678–691
- Ma Y, Cao Y, Vrudhula S, et al. Automatic compilation of diverse CNNs onto high-performance FPGA accelerators[J]. IEEE Trans on Computer-Aided Design of Integrated Circuits and Systems, 2018, 39(2): 424–437
- Wei X, Yu C H, Zhang P, et al. Automated systolic array architecture synthesis for high throughput CNN inference on FPGAs[C]//Proceedings of the 54th Annual Design Automation Conference, 2017
- Guo K, Sui L, Qiu J, et al. Angel-eye: a complete design flow for mapping CNN onto embedded FPGA[J]. IEEE Trans on Computer-Aided Design of Integrated Circuits and Systems, 2017, 37(1): 35–47
- Azizimazreah A, Chen L. Polymorphic accelerators for deep neural networks[J]. IEEE Trans on Computers, 2022, 71(3): 534–546
- Dong Z, Yao Z, Arfeen D, et al. HAWQ-v2: Hessian aware trace-weighted quantization of neural networks[J]. Advances in Neural Information Processing Systems, 2020, 33: 18518–18529