Mahmoud Khairy
AMD Research
Verified email at amd.com
Title
Cited by
Year
Accel-Sim: An extensible simulation framework for validated GPU modeling
M Khairy, Z Shen, TM Aamodt, TG Rogers
2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture …, 2020
Cited by: 219* | Year: 2020
AccelWattch: A power modeling framework for modern GPUs
V Kandiah, S Peverelle, M Khairy, J Pan, A Manjunath, TG Rogers, ...
MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture …, 2021
Cited by: 58 | Year: 2021
Efficient utilization of GPGPU cache hierarchy
M Khairy, M Zahran, AG Wassal
Proceedings of the 8th Workshop on General Purpose Processing using GPUs, 36-47, 2015
Cited by: 39 | Year: 2015
A survey of architectural approaches for improving GPGPU performance, programmability and heterogeneity
M Khairy, AG Wassal, M Zahran
Journal of Parallel and Distributed Computing 127, 65-88, 2019
Cited by: 24 | Year: 2019
Locality-centric data and threadblock management for massive GPUs
M Khairy, V Nikiforov, D Nellans, TG Rogers
2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture …, 2020
Cited by: 21 | Year: 2020
A quantitative evaluation of contemporary GPU simulation methodology
A Jain, M Khairy, TG Rogers
Proceedings of the ACM on Measurement and Analysis of Computing Systems 2 (2 …, 2018
Cited by: 18 | Year: 2018
Principal kernel analysis: A tractable methodology to simulate scaled GPU workloads
C Avalos Baddouh, M Khairy, RN Green, M Payer, TG Rogers
MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture …, 2021
Cited by: 14 | Year: 2021
SACAT: Streaming-aware conflict-avoiding thrashing-resistant GPGPU cache management scheme
M Khairy, M Zahran, A Wassal
IEEE Transactions on Parallel and Distributed Systems 28 (6), 1740-1753, 2016
Cited by: 10 | Year: 2016
TPU vs GPU vs Cerebras vs Graphcore: A fair comparison between ML hardware
M Khairy
https://khairy2011.medium.com/tpu-vs-gpu-vs-cerebras-vs-graphcore-a-fair …, 2020
Cited by: 4 | Year: 2020
SST_GPU: An Execution-Driven CUDA Kernel Scheduler and Streaming-Multiprocessor Compute Model
M Khairy, M Zhang, R Green, SD Hammond, RJ Hoekstra, T Rogers, ...
Sandia National Lab. (SNL-NM), Albuquerque, NM (United States), 2019
Cited by: 3 | Year: 2019
SIMR: Single Instruction Multiple Request Processing for Energy-Efficient Data Center Microservices
M Khairy, A Alawneh, A Barnes, TG Rogers
2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO), 441-463, 2022
Cited by: 1 | Year: 2022
A SIMT Analyzer for Multi-Threaded CPU Applications
A Alawneh, M Khairy, TG Rogers
2022 IEEE International Symposium on Performance Analysis of Systems and …, 2022
Cited by: 1 | Year: 2022
System and methods for single instruction multiple request processing
TG Rogers, M Khairy
US Patent App. 18/072,492, 2023
Year: 2023
An Academic’s Attempt to Clear the Fog of the Machine Learning Accelerator War
M Khairy, T Rogers
https://www.sigarch.org/an-academics-attempt-to-clear-the-fog-of-the-machine …, 2021
Year: 2021
Balar: A SST GPU Component for Performance Modeling and Profiling
C Hughes, SD Hammond, M Khairy, M Zhang, R Green, T Rogers, ...
Sandia National Lab. (SNL-NM), Albuquerque, NM (United States), 2019
Year: 2019
ISPASS 2023
A Ferreron, A Samajdar, A Gutierrez, A Shriraman, A Rodrigues, B Asgari, ...
SANDIA REPORT
M Zhang, M Khairy, T Rogers