Analyzing CUDA workloads using a detailed GPU simulator
A Bakhoda, GL Yuan, WWL Fung, H Wong, TM Aamodt
2009 IEEE International Symposium on Performance Analysis of Systems and …, 2009
Demystifying GPU microarchitecture through microbenchmarking
H Wong, MM Papadopoulou, M Sadooghi-Alvandi, A Moshovos
2010 IEEE International Symposium on Performance Analysis of Systems …, 2010
Implications of historical trends in the electrical efficiency of computing
J Koomey, S Berard, M Sanchez, H Wong
IEEE Annals of the History of Computing 33 (3), 46-54, 2010
Comparing FPGA vs. custom CMOS and the impact on processor microarchitecture
H Wong, V Betz, J Rose
Proceedings of the 19th ACM/SIGDA international symposium on Field …, 2011
Pangaea: a tightly-coupled IA32 heterogeneous chip multiprocessor
H Wong, A Bracy, E Schuchman, TM Aamodt, JD Collins, PH Wang, ...
2008 International Conference on Parallel Architectures and Compilation …, 2008
Micro-benchmarking the GT200 GPU
MM Papadopoulou, M Sadooghi-Alvandi, H Wong
Computer Group, ECE, University of Toronto, Tech. Rep., 2009
Intel Ivy Bridge Cache Replacement Policy
H Wong
http://blog.stuffedcow.net/2013/01/ivb-cache-replacement/, 2013
Quantifying the gap between FPGA and custom CMOS to aid microarchitectural design
H Wong, V Betz, J Rose
IEEE Transactions on Very Large Scale Integration (VLSI) Systems 22 (10 …, 2013
A Comparison of Intel's 32nm and 22nm Core i5 CPUs: Power, Voltage, Temperature, and Frequency
H Wong
http://blog.stuffedcow.net/2012/10/intel32nm-22nm-core-i5-comparison/, 2012
High performance instruction scheduling circuits for out-of-order soft processors
H Wong, V Betz, J Rose
2016 IEEE 24th Annual International Symposium on Field-Programmable Custom …, 2016
Microarchitecture and circuits for a 200 MHz out-of-order soft processor memory system
H Wong, V Betz, J Rose
ACM Transactions on Reconfigurable Technology and Systems (TRETS) 10 (1), 1-22, 2016
Store-to-Load Forwarding and Memory Disambiguation in x86 Processors
H Wong
http://blog.stuffedcow.net/2014/01/x86-memory-disambiguation/, 2014
The performance potential for single application heterogeneous systems
H Wong, TM Aamodt
8th Workshop on Duplicating, Deconstructing, and Debunking, 2009
Efficient methods for out-of-order load/store execution for high-performance soft processors
H Wong, V Betz, J Rose
2013 International Conference on Field-Programmable Technology (FPT), 442-445, 2013
Measuring Reorder Buffer Capacity
H Wong
http://blog.stuffedcow.net/2013/05/measuring-rob-capacity/, 2013
High-performance instruction scheduling circuits for superscalar out-of-order soft processors
H Wong, V Betz, J Rose
ACM Transactions on Reconfigurable Technology and Systems (TRETS) 11 (1), 1-22, 2018
TLB and Pagewalk Coherence in x86 Processors
H Wong
http://blog.stuffedcow.net/2015/08/pagewalk-coherence/, 2015
Microbenchmarking Return Address Branch Prediction
H Wong
http://blog.stuffedcow.net/2018/04/ras-microbenchmarks/, 2018
A superscalar out-of-order x86 soft processor for FPGA
HTH Wong
University of Toronto (Canada), 2017
Architectures and limits of GPU-CPU heterogeneous systems
HTH Wong
University of British Columbia, 2008
