Matrix engines for high performance computing: A paragon of performance or grasping at straws? J Domke, E Vatai, A Drozd, P Chen, Y Oyama, L Zhang, S Salaria, ... 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2021 | 33 | 2021 |
A versatile software systolic execution model for GPU memory-bound kernels Peng Chen, Wahib Mohamed, Takizawa Shinichiro, Takano Ryousei, Matsuoka Satoshi Proceedings of the International Conference for High Performance Computing …, 2019 | 27 | 2019 |
Boosting the predictive performance with aqueous solubility dataset curation J Meng, P Chen, M Wahib, M Yang, L Zheng, Y Wei, S Feng, W Liu Scientific Data 9 (1), 71, 2022 | 12 | 2022 |
Automatic generation of high-performance convolution kernels on ARM CPUs for deep learning J Meng, C Zhuang, P Chen, M Wahib, B Schmidt, X Wang, H Lan, D Wu, ... IEEE Transactions on Parallel and Distributed Systems 33 (11), 2885-2899, 2022 | 9 | 2022 |
iFDK: a scalable framework for instant high-resolution image reconstruction Peng Chen, Wahib Mohamed, Takizawa Shinichiro, Takano Ryousei, Matsuoka Satoshi Proceedings of the International Conference for High Performance Computing …, 2019 | 8* | 2019 |
Efficient Algorithms for the Summed Area Tables Primitive on GPUs Peng Chen, Wahib Mohamed, Takizawa Shinichiro, Takano Ryousei, Matsuoka Satoshi IEEE International Conference on Cluster Computing (CLUSTER), 2018 | 8 | 2018 |
At the locus of performance: A case study in enhancing CPUs with copious 3D-stacked cache J Domke, E Vatai, B Gerofi, Y Kodama, M Wahib, A Podobas, S Mittal, ... arXiv preprint arXiv:2204.02235, 2022 | 5 | 2022 |
Persistent Kernels for Iterative Memory-bound GPU Applications L Zhang, M Wahib, P Chen, J Meng, X Wang, S Matsuoka arXiv preprint arXiv:2204.02064, 2022 | 4 | 2022 |
Scalable FBP decomposition for cone-beam CT reconstruction P Chen, M Wahib, X Wang, T Hirofuchi, H Ogawa, A Biguri, R Boardman, ... Proceedings of the International Conference for High Performance Computing …, 2021 | 4 | 2021 |
Physics-Based Iterative Reconstruction for Dual Source and Flying Focal Spot Computed Tomography X Wang, RD MacDougall, P Chen, CA Bouman, SK Warfield arXiv e-prints, arXiv: 2001.09471, 2021 | 4* | 2021 |
PERKS: a Locality-Optimized Execution Model for Iterative Memory-bound GPU Applications L Zhang, M Wahib, P Chen, J Meng, X Wang, T Endo, S Matsuoka Proceedings of the 37th International Conference on Supercomputing, 167-179, 2023 | 3 | 2023 |
Evolutionary Architecture Search for Generative Adversarial Networks Based On Weight Sharing Y Xue, W Tong, F Neri, P Chen, T Luo, L Zhen, X Wang IEEE Transactions on Evolutionary Computation, 2023 | 2 | 2023 |
Revisiting Temporal Blocking Stencil Optimizations L Zhang, M Wahib, P Chen, J Meng, X Wang, T Endo, S Matsuoka Proceedings of the 37th International Conference on Supercomputing, 251-263, 2023 | 2 | 2023 |
Performance portable back-projection algorithms on CPUs: Agnostic data locality and vectorization optimizations P Chen, M Wahib, X Wang, S Takizawa, T Hirofuchi, H Ogawa, ... Proceedings of the ACM International Conference on Supercomputing, 316-328, 2021 | 2 | 2021 |
At the locus of performance: Quantifying the effects of copious 3D-stacked cache on HPC workloads J Domke, E Vatai, B Gerofi, Y Kodama, M Wahib, A Podobas, S Mittal, ... ACM Transactions on Architecture and Code Optimization 20 (4), 1-26, 2023 | 1 | 2023 |
Ultra-Long Sequence Distributed Transformer X Wang, I Lyngaas, A Tsaris, P Chen, S Dash, MC Shekar, T Luo, ... arXiv preprint arXiv:2311.02382, 2023 | 1 | 2023 |
Simeuro: A Hybrid CPU-GPU Parallel Simulator for Neuromorphic Computing Chips H Zhang, NM Ho, YP Dogukan, P Chen, M Wahib, TT Nguyen, J Meng, ... IEEE Transactions on Parallel and Distributed Systems, 2023 | 1 | 2023 |
Image gradient decomposition for parallel and memory-efficient ptychographic reconstruction X Wang, A Tsaris, D Mukherjee, M Wahib, P Chen, M Oxley, ... Proceedings of the International Conference for High Performance Computing …, 2022 | 1 | 2022 |
Pushing the Limits for 2D Convolution Computation On CUDA-enabled GPUs P Chen, M Wahib, S Takizawa, S Matsuoka Technical Report, HPC-163, 2018 | 1 | 2018 |
Real-time High-resolution X-Ray Computed Tomography MW Du Wu, Peng Chen, Xiao Wang, Isaac Lyngaas, Takaaki Miyajima, Toshio Endo ... In Proceedings of the ACM International Conference on Supercomputing (ICS 2024), 2024 | | 2024 |