Zhen Lin
Title
Cited by
Year
Enabling efficient preemption for SIMT architectures with lightweight context switching
Z Lin, L Nyland, H Zhou
SC'16: Proceedings of the International Conference for High Performance …, 2016
Cited by: 47; Year: 2016
Accelerate GPU concurrent kernel execution by mitigating memory pipeline stalls
H Dai, Z Lin, C Li, C Zhao, F Wang, N Zheng, H Zhou
2018 IEEE International Symposium on High Performance Computer Architecture …, 2018
Cited by: 46; Year: 2018
Automatic data placement into GPU on-chip memory resources
C Li, Y Yang, Z Lin, H Zhou
2015 IEEE/ACM International Symposium on Code Generation and Optimization …, 2015
Cited by: 42; Year: 2015
Implementation and evaluation of deep neural networks (DNN) on mainstream heterogeneous systems
J Gu, M Zhu, Z Zhou, F Zhang, Z Lin, Q Zhang, M Breternitz
Proceedings of 5th Asia-Pacific Workshop on Systems, 1-7, 2014
Cited by: 34; Year: 2014
In-place zero-space memory protection for CNN
H Guan, L Ning, Z Lin, X Shen, H Zhou, SH Lim
Advances in Neural Information Processing Systems 32, 2019
Cited by: 24; Year: 2019
Scatter-and-gather revisited: High-performance side-channel-resistant AES on GPUs
Z Lin, U Mathur, H Zhou
Proceedings of the 12th Workshop on General Purpose Processing Using GPUs, 2-11, 2019
Cited by: 14; Year: 2019
Selectively GPU cache bypassing for un-coalesced loads
C Zhao, F Wang, Z Lin, H Zhou, N Zheng
2016 IEEE 22nd International Conference on Parallel and Distributed Systems …, 2016
Cited by: 14; Year: 2016
Coordinated CTA combination and bandwidth partitioning for GPU concurrent kernel execution
Z Lin, H Dai, M Mantor, H Zhou
ACM Transactions on Architecture and Code Optimization (TACO) 16 (3), 1-27, 2019
Cited by: 13; Year: 2019
Exploring memory persistency models for GPUs
Z Lin, M Alshboul, Y Solihin, H Zhou
2019 28th International Conference on Parallel Architectures and Compilation …, 2019
Cited by: 12; Year: 2019
GPU performance vs. thread-level parallelism: Scalability analysis and a novel way to improve TLP
Z Lin, M Mantor, H Zhou
ACM Transactions on Architecture and Code Optimization (TACO) 15 (1), 1-21, 2018
Cited by: 10; Year: 2018
GLES: A practical GPGPU optimizing compiler using data sharing and thread coarsening
Z Lin, X Gao, H Wan, B Jiang
Languages and Compilers for Parallel Computing: 27th International Workshop …, 2015
Cited by: 7; Year: 2015
The Demand for a Sound Baseline in GPU Memory Architecture Research
H Dai, C Li, Z Lin, H Zhou
Proceedings of the Workshop on Duplicating, Deconstructing and Debunking (WDDD), 2017
Cited by: 4; Year: 2017
Poster: Accelerate GPU concurrent kernel execution by mitigating memory pipeline stalls
H Dai, Z Lin, C Li, C Zhao, F Wang, N Zheng, H Zhou
2017 26th International Conference on Parallel Architectures and Compilation …, 2017
Cited by: 3; Year: 2017