End-to-end neural speaker diarization with self-attention Y Fujita, N Kanda, S Horiguchi, Y Xue, K Nagamatsu, S Watanabe 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2019 | 292 | 2019 |
End-to-end speaker diarization for an unknown number of speakers with encoder-decoder based attractors S Horiguchi, Y Fujita, S Watanabe, Y Xue, K Nagamatsu arXiv preprint arXiv:2005.09921, 2020 | 211 | 2020 |
Encoder-decoder based attractors for end-to-end neural diarization S Horiguchi, Y Fujita, S Watanabe, Y Xue, P García IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 1493-1507, 2022 | 69 | 2022 |
Online end-to-end neural diarization with speaker-tracing buffer Y Xue, S Horiguchi, Y Fujita, S Watanabe, P García, K Nagamatsu 2021 IEEE Spoken Language Technology Workshop (SLT), 841-848, 2021 | 58 | 2021 |
End-to-end neural diarization: Reformulating speaker diarization as simple multi-label classification Y Fujita, S Watanabe, S Horiguchi, Y Xue, K Nagamatsu arXiv preprint arXiv:2003.02966, 2020 | 53 | 2020 |
Neural speaker diarization with speaker-wise chain rule Y Fujita, S Watanabe, S Horiguchi, Y Xue, J Shi, K Nagamatsu arXiv preprint arXiv:2006.01796, 2020 | 50 | 2020 |
Voice conversion for emotional speech: Rule-based synthesis with degree of emotion controllable in dimensional space Y Xue, Y Hamada, M Akagi Speech Communication 102, 54-67, 2018 | 47 | 2018 |
Simultaneous speech recognition and speaker diarization for monaural dialogue recordings with target-speaker acoustic models N Kanda, S Horiguchi, Y Fujita, Y Xue, K Nagamatsu, S Watanabe 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 31-38, 2019 | 45 | 2019 |
Towards neural diarization for unlimited numbers of speakers using global and local attractors S Horiguchi, S Watanabe, P García, Y Xue, Y Takashima, Y Kawaguchi 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 98-105, 2021 | 43 | 2021 |
The Hitachi-JHU DIHARD III system: Competitive end-to-end neural diarization and x-vector clustering systems combined by DOVER-Lap S Horiguchi, N Yalta, P García, Y Takashima, Y Xue, D Raj, Z Huang, ... arXiv preprint arXiv:2102.01363, 2021 | 43 | 2021 |
Online streaming end-to-end neural diarization handling overlapping speech and flexible numbers of speakers Y Xue, S Horiguchi, Y Fujita, Y Takashima, S Watanabe, P García, ... arXiv preprint arXiv:2101.08473, 2021 | 30 | 2021 |
Emotional speech synthesis system based on a three-layered model using a dimensional approach Y Xue, Y Hamada, M Akagi 2015 Asia-Pacific Signal and Information Processing Association Annual …, 2015 | 13 | 2015 |
Acoustic and articulatory analysis and synthesis of shouted vowels Y Xue, M Marxen, M Akagi, P Birkholz Computer Speech & Language 66, 101156, 2021 | 11 | 2021 |
Voice conversion to emotional speech based on three-layered model in dimensional approach and parameterization of dynamic features in prosody Y Xue, Y Hamada, M Akagi 2016 Asia-Pacific Signal and Information Processing Association Annual …, 2016 | 9 | 2016 |
Encoder-decoder based attractor calculation for end-to-end neural diarization S Horiguchi, Y Fujita, S Watanabe, Y Xue, P García arXiv preprint arXiv:2106.10654, 2021 | 5 | 2021 |
A study on applying target prediction model to parameterize power envelope of emotional speech Y Xue, M Akagi 2016 RISP International Workshop on Nonlinear Circuits, Communications and …, 2016 | 4 | 2016 |
Acoustic and articulatory analysis and synthesis of shouted vowels Y Xue, M Marxen, M Akagi, P Birkholz Proc. Audit. Res. Meet, 695-700, 2018 | 1 | 2018 |
Voice conversion system to emotional speech in multiple languages based on three-layered model for dimensional space Y Xue, Y Hamada, R Elbarougy, M Akagi 2016 Conference of The Oriental Chapter of International Committee for …, 2016 | 1 | 2016 |
Emotional voice conversion system for multiple languages based on three-layered model in dimensional space Y Xue, Y Hamada, M Akagi Journal of the Acoustical Society of America 140 (4_Supplement), 2960-2960, 2016 | 1 | 2016 |
End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors K Nagamatsu, Y Xue, Y Fujita, S Horiguchi, S Watanabe Interspeech 2020, 2020 | | 2020 |