An empirical study of incorporating pseudo data into grammatical error correction. S Kiyono, J Suzuki, M Mita, T Mizumoto, K Inui. arXiv preprint arXiv:1909.00502, 2019. Cited by 176.
ESPnet-ST: All-in-one speech translation toolkit. H Inaguma, S Kiyono, K Duh, S Karita, NEY Soplin, T Hayashi, ... arXiv preprint arXiv:2004.10234, 2020. Cited by 175.
Encoder-decoder models can benefit from pre-trained masked language models in grammatical error correction. M Kaneko, M Mita, S Kiyono, J Suzuki, K Inui. arXiv preprint arXiv:2005.00987, 2020. Cited by 165.
Lessons on parameter sharing across layers in transformers. S Takase, S Kiyono. arXiv preprint arXiv:2104.06022, 2021. Cited by 78.
Rethinking perturbations in encoder-decoders for fast training. S Takase, S Kiyono. arXiv preprint arXiv:2104.01853, 2021. Cited by 44.
SHAPE: Shifted absolute position embedding for transformers. S Kiyono, S Kobayashi, J Suzuki, K Inui. arXiv preprint arXiv:2109.05644, 2021. Cited by 39.
Effective adversarial regularization for neural machine translation. M Sato, J Suzuki, S Kiyono. Proceedings of the 57th Annual Meeting of the Association for Computational …, 2019. Cited by 37.
Massive exploration of pseudo data for grammatical error correction. S Kiyono, J Suzuki, T Mizumoto, K Inui. IEEE/ACM Transactions on Audio, Speech, and Language Processing 28, 2134-2145, 2020. Cited by 21.
Tohoku-AIP-NTT at WMT 2020 news translation task. S Kiyono, T Ito, R Konno, M Morishita, J Suzuki. Proceedings of the Fifth Conference on Machine Translation, 145-155, 2020. Cited by 17.
A self-refinement strategy for noise reduction in grammatical error correction. M Mita, S Kiyono, M Kaneko, J Suzuki, K Inui. arXiv preprint arXiv:2010.03155, 2020. Cited by 17.
On layer normalizations and residual connections in transformers. S Takase, S Kiyono, S Kobayashi, J Suzuki. arXiv preprint arXiv:2206.00330, 2022. Cited by 15.
Pseudo zero pronoun resolution improves zero anaphora resolution. R Konno, S Kiyono, Y Matsubayashi, H Ouchi, K Inui. arXiv preprint arXiv:2104.07425, 2021. Cited by 14.
B2T connection: Serving stability and performance in deep transformers. S Takase, S Kiyono, S Kobayashi, J Suzuki. arXiv preprint arXiv:2206.00330, 2022. Cited by 12.
Source-side prediction for neural headline generation. S Kiyono, S Takase, J Suzuki, N Okazaki, K Inui, M Nagata. arXiv preprint arXiv:1712.08302, 2017. Cited by 12.
Mixture of expert/imitator networks: Scalable semi-supervised learning framework. S Kiyono, J Suzuki, K Inui. Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 4073-4081, 2019. Cited by 10.
Diverse lottery tickets boost ensemble from a single pretrained model. S Kobayashi, S Kiyono, J Suzuki, K Inui. arXiv preprint arXiv:2205.11833, 2022. Cited by 9.
An empirical study of contextual data augmentation for Japanese zero anaphora resolution. R Konno, Y Matsubayashi, S Kiyono, H Ouchi, R Takahashi, K Inui. arXiv preprint arXiv:2011.00948, 2020. Cited by 9.
Unsupervised token-wise alignment to improve interpretation of encoder-decoder models. S Kiyono, S Takase, J Suzuki, N Okazaki, K Inui, M Nagata. Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and …, 2018. Cited by 9.
Spike No More: Stabilizing the pre-training of large language models. S Takase, S Kiyono, S Kobayashi, J Suzuki. arXiv preprint arXiv:2312.16903, 2023. Cited by 6.