Richard Vuduc
Title
Cited by
Year
Optimization of sparse matrix-vector multiplication on emerging multicore platforms
S Williams, L Oliker, R Vuduc, J Shalf, K Yelick, J Demmel
SC'07: Proceedings of the 2007 ACM/IEEE Conference on Supercomputing, 1-12, 2007
Cited by: 865, Year: 2007
OSKI: A library of automatically tuned sparse matrix kernels
R Vuduc, JW Demmel, KA Yelick
Journal of Physics: Conference Series 16 (1), 521, 2005
Cited by: 594, Year: 2005
Model-driven autotuning of sparse matrix-vector multiply on GPUs
JW Choi, A Singh, RW Vuduc
ACM SIGPLAN Notices 45 (5), 115-126, 2010
Cited by: 443, Year: 2010
Sparsity: Optimization framework for sparse matrix kernels
EJ Im, K Yelick, R Vuduc
The International Journal of High Performance Computing Applications 18 (1 …, 2004
Cited by: 358, Year: 2004
Automatic performance tuning of sparse matrix kernels
RW Vuduc, JW Demmel
University of California, Berkeley, 2003
Cited by: 288, Year: 2003
Self-adapting linear algebra algorithms and software
J Demmel, J Dongarra, V Eijkhout, E Fuentes, A Petitet, R Vuduc, ...
Proceedings of the IEEE 93 (2), 293-312, 2005
Cited by: 248, Year: 2005
A performance analysis framework for identifying potential benefits in GPGPU applications
J Sim, A Dasgupta, H Kim, R Vuduc
Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of …, 2012
Cited by: 209, Year: 2012
A massively parallel adaptive fast-multipole method on heterogeneous architectures
I Lashuk, A Chandramowlishwaran, H Langston, TA Nguyen, R Sampath, ...
Proceedings of the Conference on High Performance Computing Networking …, 2009
Cited by: 195, Year: 2009
Petascale direct numerical simulation of blood flow on 200k cores and heterogeneous architectures
A Rahimian, I Lashuk, S Veerapaneni, A Chandramowlishwaran, ...
SC'10: Proceedings of the 2010 ACM/IEEE International Conference for High …, 2010
Cited by: 178, Year: 2010
Fast sparse matrix-vector multiplication by exploiting variable block structure
R Vuduc, HJ Moon
High Performance Computing and Communications, 807-816, 2005
Cited by: 165, Year: 2005
Performance optimizations and bounds for sparse matrix-vector multiply
R Vuduc, JW Demmel, KA Yelick, S Kamil, R Nishtala, B Lee
SC'02: Proceedings of the 2002 ACM/IEEE Conference on Supercomputing, 26-26, 2002
Cited by: 160, Year: 2002
On the limits of GPU acceleration
R Vuduc, A Chandramowlishwaran, J Choi, M Guney, A Shringarpure
Proceedings of the 2nd USENIX conference on Hot topics in parallelism 13, 2010
Cited by: 158, Year: 2010
Many-thread aware prefetching mechanisms for GPGPU applications
J Lee, NB Lakshminarayana, H Kim, R Vuduc
2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture, 213-224, 2010
Cited by: 149, Year: 2010
Falcon: fault localization in concurrent programs
S Park, RW Vuduc, MJ Harrold
Proceedings of the 32nd ACM/IEEE International Conference on Software …, 2010
Cited by: 146, Year: 2010
A roofline model of energy
JW Choi, D Bedard, R Fowler, R Vuduc
2013 IEEE 27th International Symposium on Parallel and Distributed …, 2013
Cited by: 138, Year: 2013
POET: Parameterized optimizations for empirical tuning
Q Yi, K Seymour, H You, R Vuduc, D Quinlan
2007 IEEE International Parallel and Distributed Processing Symposium, 1-8, 2007
Cited by: 128, Year: 2007
Statistical models for empirical search-based performance tuning
R Vuduc, JW Demmel, JA Bilmes
International Journal of High Performance Computing Applications 18 (1), 65-94, 2004
Cited by: 128, Year: 2004
When prefetching works, when it doesn’t, and why
J Lee, H Kim, R Vuduc
ACM Transactions on Architecture and Code Optimization (TACO) 9 (1), 1-29, 2012
Cited by: 121, Year: 2012
When cache blocking of sparse matrix vector multiply works and why
R Nishtala, RW Vuduc, JW Demmel, KA Yelick
Applicable Algebra in Engineering, Communication and Computing 18 (3), 297-311, 2007
Cited by: 119, Year: 2007
Tuned and wildly asynchronous stencil kernels for hybrid CPU/GPU systems
S Venkatasubramanian, RW Vuduc
Proceedings of the 23rd international conference on Supercomputing, 244-255, 2009
Cited by: 110, Year: 2009