William Dally
Bell Professor of Engineering, Stanford University; Chief Scientist, NVIDIA
Verified email at stanford.edu
Title
Cited by
Year
Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding
S Han, H Mao, WJ Dally
arXiv preprint arXiv:1510.00149, 2015
Cited by: 5956, Year: 2015
SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size
FN Iandola, S Han, MW Moskewicz, K Ashraf, WJ Dally, K Keutzer
arXiv preprint arXiv:1602.07360, 2016
Cited by: 4739, Year: 2016
Route packets, not wires: on-chip interconnection networks
WJ Dally, B Towles
Proceedings of the 38th annual design automation conference, 684-689, 2001
Cited by: 4519, Year: 2001
Principles and practices of interconnection networks
WJ Dally, BP Towles
Elsevier, 2004
Cited by: 4376, Year: 2004
Learning both weights and connections for efficient neural networks
S Han, J Pool, J Tran, WJ Dally
arXiv preprint arXiv:1506.02626, 2015
Cited by: 4104, Year: 2015
Deadlock-free message routing in multiprocessor interconnection networks
WJ Dally, CL Seitz
California Institute of Technology, 1988
Cited by: 2918, Year: 1988
EIE: Efficient inference engine on compressed deep neural network
S Han, X Liu, H Mao, J Pu, A Pedram, MA Horowitz, WJ Dally
ACM SIGARCH Computer Architecture News 44 (3), 243-254, 2016
Cited by: 2017, Year: 2016
Virtual-channel flow control
WJ Dally
IEEE Transactions on Parallel and Distributed systems 3 (2), 194-205, 1992
Cited by: 1894, Year: 1992
Exascale computing study: Technology challenges in achieving exascale systems
K Bergman, S Borkar, D Campbell, W Carlson, W Dally, M Denneau, ...
Defense Advanced Research Projects Agency Information Processing Techniques …, 2008
Cited by: 1396, Year: 2008
Performance analysis of k-ary n-cube interconnection networks
WJ Dally
IEEE transactions on Computers 39 (06), 775-785, 1990
Cited by: 1393, Year: 1990
Digital systems engineering
WJ Dally, JW Poulton
Cambridge university press, 1998
Cited by: 1315, Year: 1998
The torus routing chip
WJ Dally, CL Seitz
Distributed computing 1 (4), 187-196, 1986
Cited by: 1273*, Year: 1986
Memory access scheduling
S Rixner, WJ Dally, UJ Kapasi, P Mattson, JD Owens
ACM SIGARCH Computer Architecture News 28 (2), 128-138, 2000
Cited by: 1261, Year: 2000
The GPU computing era
J Nickolls, WJ Dally
IEEE micro 30 (2), 56-69, 2010
Cited by: 1177, Year: 2010
Trained ternary quantization
C Zhu, S Han, H Mao, WJ Dally
arXiv preprint arXiv:1612.01064, 2016
Cited by: 835, Year: 2016
Deadlock-free adaptive routing in multicomputer networks using virtual channels
WJ Dally, H Aoki
IEEE transactions on Parallel and Distributed Systems 4 (4), 466-475, 1993
Cited by: 774, Year: 1993
Scnn: An accelerator for compressed-sparse convolutional neural networks
A Parashar, M Rhu, A Mukkara, A Puglielli, R Venkatesan, B Khailany, ...
ACM SIGARCH Computer Architecture News 45 (2), 27-40, 2017
Cited by: 740, Year: 2017
A delay model and speculative architecture for pipelined routers
LS Peh, WJ Dally
Proceedings HPCA Seventh International Symposium on High-Performance …, 2001
Cited by: 715, Year: 2001
Design tradeoffs for tiled CMP on-chip networks
J Balfour, WJ Dally
ACM International conference on supercomputing 25th anniversary volume, 390-401, 2006
Cited by: 698, Year: 2006
Deep gradient compression: Reducing the communication bandwidth for distributed training
Y Lin, S Han, H Mao, Y Wang, WJ Dally
arXiv preprint arXiv:1712.01887, 2017
Cited by: 684, Year: 2017