Attention is all you need. A Vaswani, N Shazeer, N Parmar, J Uszkoreit, L Jones, AN Gomez, Ł Kaiser, I Polosukhin. Advances in Neural Information Processing Systems 30, 2017. | 138718 | 2017 |
An image is worth 16x16 words: Transformers for image recognition at scale. A Dosovitskiy, ... arXiv preprint arXiv:2010.11929, 2020. | 45331 | 2020 |
MLP-Mixer: An all-MLP architecture for vision. IO Tolstikhin, N Houlsby, A Kolesnikov, L Beyer, X Zhai, T Unterthiner, ... Advances in Neural Information Processing Systems 34, 24261-24272, 2021. | 2654 | 2021 |
Self-attention with relative position representations. P Shaw, J Uszkoreit, A Vaswani. arXiv preprint arXiv:1803.02155, 2018. | 2619 | 2018 |
Natural Questions: a benchmark for question answering research. T Kwiatkowski, J Palomaki, O Redfield, M Collins, A Parikh, C Alberti, ... Transactions of the Association for Computational Linguistics 7, 453-466, 2019. | 2614 | 2019 |
Image Transformer. N Parmar, A Vaswani, J Uszkoreit, L Kaiser, N Shazeer, A Ku, D Tran. International Conference on Machine Learning, 4055-4064, 2018. | 1997 | 2018 |
A decomposable attention model for natural language inference. AP Parikh, O Täckström, D Das, J Uszkoreit. arXiv preprint arXiv:1606.01933, 2016. | 1735 | 2016 |
Universal Transformers. M Dehghani, S Gouws, O Vinyals, J Uszkoreit, Ł Kaiser. arXiv preprint arXiv:1807.03819, 2018. | 961 | 2018 |
Music Transformer. CZA Huang, A Vaswani, J Uszkoreit, N Shazeer, I Simon, C Hawthorne, ... arXiv preprint arXiv:1809.04281, 2018. | 921 | 2018 |
Object-centric learning with slot attention. F Locatello, D Weissenborn, T Unterthiner, A Mahendran, G Heigold, ... Advances in Neural Information Processing Systems 33, 11525-11538, 2020. | 757 | 2020 |
Tensor2Tensor for neural machine translation. A Vaswani, S Bengio, E Brevdo, F Chollet, AN Gomez, S Gouws, L Jones, ... arXiv preprint arXiv:1803.07416, 2018. | 636 | 2018 |
How to train your ViT? Data, augmentation, and regularization in vision transformers. A Steiner, A Kolesnikov, X Zhai, R Wightman, J Uszkoreit, L Beyer. arXiv preprint arXiv:2106.10270, 2021. | 621 | 2021 |
One model to learn them all. L Kaiser, AN Gomez, N Shazeer, A Vaswani, N Parmar, L Jones, ... arXiv preprint arXiv:1706.05137, 2017. | 396 | 2017 |
Transforming machine translation: a deep learning system reaches news translation quality comparable to human professionals. M Popel, M Tomkova, J Tomek, Ł Kaiser, J Uszkoreit, O Bojar, ... Nature Communications 11 (1), 1-15, 2020. | 330 | 2020 |