RoBERTa: A robustly optimized BERT pretraining approach Y Liu, M Ott, N Goyal, J Du, M Joshi, D Chen, O Levy, M Lewis, ... arXiv preprint arXiv:1907.11692, 2019 | 29680* | 2019 |
Supervised contrastive learning for pre-trained language model fine-tuning B Gunel, J Du, A Conneau, V Stoyanov arXiv preprint arXiv:2011.01403, 2020 | 492 | 2020 |
Pretrained language models for biomedical and clinical tasks: understanding and extending the state-of-the-art P Lewis, M Ott, J Du, V Stoyanov Proceedings of the 3rd clinical natural language processing workshop, 146-157, 2020 | 240 | 2020 |
Self-training improves pre-training for natural language understanding J Du, E Grave, B Gunel, V Chaudhary, O Celebi, M Auli, V Stoyanov, ... arXiv preprint arXiv:2010.02194, 2020 | 169 | 2020 |
Box office prediction based on microblog J Du, H Xu, X Huang Expert Systems with Applications 41 (4), 1680-1689, 2014 | 118 | 2014 |
Pretrained encyclopedia: Weakly supervised knowledge-pretrained language model W Xiong, J Du, WY Wang, V Stoyanov arXiv preprint arXiv:1912.09637, 2019 | 114 | 2019 |
Larger-scale transformers for multilingual masked language modeling N Goyal, J Du, M Ott, G Anantharaman, A Conneau arXiv preprint arXiv:2105.00572, 2021 | 111 | 2021 |
Efficient large scale language modeling with mixtures of experts M Artetxe, S Bhosale, N Goyal, T Mihaylov, M Ott, S Shleifer, XV Lin, J Du, ... arXiv preprint arXiv:2112.10684, 2021 | 105 | 2021 |
Few-shot learning with multilingual generative language models XV Lin, T Mihaylov, M Artetxe, T Wang, S Chen, D Simig, M Ott, N Goyal, ... Proceedings of the 2022 Conference on Empirical Methods in Natural Language …, 2022 | 63 | 2022 |
Answering complex open-domain questions with multi-hop dense retrieval W Xiong, XL Li, S Iyer, J Du, P Lewis, WY Wang, Y Mehdad, W Yih, ... arXiv preprint arXiv:2009.12756, 2020 | 60 | 2020 |
Few-shot learning with multilingual language models XV Lin, T Mihaylov, M Artetxe, T Wang, S Chen, D Simig, M Ott, N Goyal, ... arXiv preprint arXiv:2112.10668, 2021 | 52 | 2021 |
RoBERTa: A robustly optimized BERT pretraining approach. CoRR abs/1907.11692 Y Liu, M Ott, N Goyal, J Du, M Joshi, D Chen, O Levy, M Lewis, ... arXiv preprint arXiv:1907.11692, 2019 | 48 | 2019 |
RoBERTa: A robustly optimized BERT pretraining approach (arXiv: 1907.11692). arXiv Y Liu, M Ott, N Goyal, J Du, M Joshi, D Chen, O Levy, M Lewis, ... | 47 | 2019 |
RoBERTa: A robustly optimized BERT pretraining approach. arXiv Preprint (2019) Y Liu, M Ott, N Goyal, J Du, M Joshi, D Chen, O Levy, M Lewis, ... arXiv preprint arXiv:1907.11692, 2019 | 43 | 2019 |
Speechmatrix: A large-scale mined corpus of multilingual speech-to-speech translations PA Duquenne, H Gong, N Dong, J Du, A Lee, V Goswami, C Wang, J Pino, ... arXiv preprint arXiv:2211.04508, 2022 | 32 | 2022 |
RoBERTa: A robustly optimized BERT pretraining approach. arXiv e-prints Y Liu, M Ott, N Goyal, J Du, M Joshi, D Chen, O Levy, M Lewis, ... arXiv preprint arXiv:1907.11692, 2019 | 29 | 2019 |
Knowledge-augmented language model and its application to unsupervised named-entity recognition A Liu, J Du, V Stoyanov arXiv preprint arXiv:1904.04458, 2019 | 28 | 2019 |
Improving in-context few-shot learning via self-supervised training M Chen, J Du, R Pasunuru, T Mihaylov, S Iyer, V Stoyanov, Z Kozareva arXiv preprint arXiv:2205.01703, 2022 | 24 | 2022 |
RoBERTa: A robustly optimized BERT pretraining approach, 2019, CoRR Y Liu, M Ott, N Goyal, J Du, M Joshi, D Chen, O Levy, M Lewis, ... arXiv preprint arXiv:1907.11692, 2019 | 23 | 2019 |
Speech-to-speech translation for a real-world unwritten language PJ Chen, K Tran, Y Yang, J Du, J Kao, YA Chung, P Tomasello, ... arXiv preprint arXiv:2211.06474, 2022 | 18 | 2022 |