PaLM: Scaling language modeling with pathways A Chowdhery, S Narang, J Devlin, M Bosma, G Mishra, A Roberts, ... Journal of Machine Learning Research 24 (240), 1-113, 2023 | 3452 | 2023 |
Gemini: a family of highly capable multimodal models Gemini Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ... arXiv preprint arXiv:2312.11805, 2023 | 511 | 2023 |
Efficiently scaling transformer inference R Pope, S Douglas, A Chowdhery, J Devlin, J Bradbury, J Heek, K Xiao, ... Proceedings of Machine Learning and Systems 5, 2023 | 136 | 2023 |
Pareto-optimal quantized ResNet is mostly 4-bit AA Abdolrashidi, L Wang, S Agrawal, J Malmaud, O Rybakov, C Leichner, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021 | 24 | 2021 |
4-bit Conformer with native quantization aware training for speech recognition S Ding, P Meadowlark, Y He, L Lew, S Agrawal, O Rybakov arXiv preprint arXiv:2203.15952, 2022 | 20 | 2022 |
Queuing analysis for multiple-antenna cognitive radio wireless networks with beamforming S Agrawal, V Rana, AK Jagannatham IEEE Signal Processing Letters 24 (3), 334-338, 2017 | 6 | 2017 |
STEP: Learning N:M structured sparsity masks from scratch with precondition Y Lu, S Agrawal, S Subramanian, O Rybakov, C De Sa, A Yazdanbakhsh International Conference on Machine Learning, 22812-22824, 2023 | 5 | 2023 |
Training recipe for N:M structured sparsity with decaying pruning mask SC Kao, A Yazdanbakhsh, S Subramanian, S Agrawal, U Evci, T Krishna arXiv preprint arXiv:2209.07617, 2022 | 4 | 2022 |
JaxPruner: A concise library for sparsity research JH Lee, W Park, NE Mitchell, J Pilault, JSO Ceron, HB Kim, N Lee, ... Conference on Parsimony and Learning, 515-528, 2024 | 3 | 2024 |
Streaming Parrotron for on-device speech-to-speech conversion O Rybakov, F Biadsy, X Zhang, L Jiang, P Meadowlark, S Agrawal arXiv preprint arXiv:2210.13761, 2022 | 2 | 2022 |
USM-Lite: Quantization and Sparsity Aware Fine-Tuning for Speech Recognition with Universal Speech Models S Ding, D Qiu, D Rim, Y He, O Rybakov, B Li, R Prabhavalkar, W Wang, ... ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 1 | 2024 |
Progressive Gradient Flow for Robust N:M Sparsity Training in Transformers AR Bambhaniya, A Yazdanbakhsh, S Subramanian, SC Kao, S Agrawal, ... arXiv preprint arXiv:2402.04744, 2024 | | 2024 |
4-bit Conformer with Accurate Quantization Training for Speech Recognition S Ding, O Rybakov, P Meadowlark, S Agrawal, Y He, L Lew US Patent App. 18/186,774, 2023 | | 2023 |
Sparsify the Weights but Let the Gradients Flow! A Yazdanbakhsh, AR Bambhaniya, S Subramanian, SC Kao, S Agrawal, ... | | |