PTQ4ViT: Post-training quantization for vision transformers with twin uniform quantization Z Yuan, C Xue, Y Chen, Q Wu, G Sun European conference on computer vision, 191-207, 2022 | 133 | 2022 |
LLM inference unveiled: Survey and roofline model insights Z Yuan, Y Shang, Y Zhou, Z Dong, Z Zhou, C Xue, B Wu, Z Li, Q Gu, ... arXiv preprint arXiv:2402.16363, 2024 | 57 | 2024 |
PTQ4ViT: Post-training quantization framework for vision transformers with twin uniform quantization Z Yuan, C Xue, Y Chen, Q Wu, G Sun arXiv preprint arXiv:2111.12293, 2021 | 39 | 2021 |
Latency-aware spatial-wise dynamic networks Y Han, Z Yuan, Y Pu, C Xue, S Song, G Sun, G Huang Advances in Neural Information Processing Systems 35, 36845-36857, 2022 | 23 | 2022 |
The dawn of AI-native EDA: Promises and challenges of large circuit models L Chen, Y Chen, Z Chu, W Fang, TY Ho, Y Huang, S Khan, M Li, X Li, ... arXiv preprint arXiv:2403.07257, 2024 | 20 | 2024 |
The dawn of AI-native EDA: Opportunities and challenges of large circuit models L Chen, Y Chen, Z Chu, W Fang, TY Ho, R Huang, Y Huang, S Khan, M Li, ... arXiv preprint arXiv:2403.07257, 2024 | 4 | 2024 |
PTQ-SL: Exploring the sub-layerwise post-training quantization Z Yuan, Y Chen, C Xue, C Zhang, Q Wang, G Sun arXiv preprint arXiv:2110.07809, 2021 | 3 | 2021 |
Large circuit models: opportunities and challenges L Chen, Y Chen, Z Chu, W Fang, TY Ho, R Huang, Y Huang, S Khan, M Li, ... Science China Information Sciences 67 (10), 200402, 2024 | 1 | 2024 |
Theseus: Exploring efficient wafer-scale chip design for large language models J Zhu, C Xue, Y Chen, Z Wang, G Sun arXiv preprint arXiv:2407.02079, 2024 | 1 | 2024 |
Theseus: Towards High-Efficiency Wafer-Scale Chip Design Space Exploration for Large Language Models J Zhu, C Xue, Y Chen, Z Wang, G Sun arXiv e-prints, arXiv:2407.02079, 2024 | 1 | 2024 |
A Software-Hardware Co-design Solution for 3D Inner Structure Reconstruction X Li, Z Zhou, Q Zheng, G Sun, Q Wang, C Xue Proceedings of the 61st ACM/IEEE Design Automation Conference, 1-6, 2024 | | 2024 |
Oltron: Algorithm-Hardware Co-design for Outlier-Aware Quantization of LLMs with Inter-/Intra-Layer Adaptation C Xue, C Zhang, X Jiang, Z Gao, Y Lin, G Sun Proceedings of the 61st ACM/IEEE Design Automation Conference, 1-6, 2024 | | 2024 |