Multi-band MelGAN: Faster waveform generation for high-quality text-to-speech G Yang, S Yang, K Liu, P Fang, W Chen, L Xie 2021 IEEE Spoken Language Technology Workshop (SLT), 492-498, 2021 | 253 | 2021 |
Controllable emotion transfer for end-to-end speech synthesis T Li, S Yang, L Xue, L Xie 2021 12th International Symposium on Chinese Spoken Language Processing …, 2021 | 88 | 2021 |
MSEmoTTS: Multi-scale emotion transfer, prediction, and control for emotional speech synthesis Y Lei, S Yang, X Wang, L Xie IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 853-864, 2022 | 73 | 2022 |
A deep bidirectional LSTM approach for video-realistic talking head B Fan, L Xie, S Yang, L Wang, FK Soong Multimedia Tools and Applications 75, 5287-5309, 2016 | 69 | 2016 |
Statistical parametric speech synthesis using generative adversarial networks under a multi-task learning framework S Yang, L Xie, X Chen, X Lou, X Zhu, D Huang, H Li 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2017 | 66 | 2017 |
Fine-grained emotion strength transfer, control and prediction for emotional speech synthesis Y Lei, S Yang, L Xie 2021 IEEE Spoken Language Technology Workshop (SLT), 423-430, 2021 | 64 | 2021 |
Controlling emotion strength with relative attribute for end-to-end speech synthesis X Zhu, S Yang, G Yang, L Xie 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2019 | 56 | 2019 |
Pre-alignment guided attention for improving training efficiency and model stability in end-to-end speech synthesis X Zhu, Y Zhang, S Yang, L Xue, L Xie IEEE Access 7, 65955-65964, 2019 | 39 | 2019 |
Accent and speaker disentanglement in many-to-many voice conversion Z Wang, W Ge, X Wang, S Yang, W Gan, H Chen, H Li, L Xie, X Li 2021 12th International Symposium on Chinese Spoken Language Processing …, 2021 | 37 | 2021 |
On the localness modeling for the self-attention based end-to-end speech synthesis S Yang, H Lu, S Kang, L Xue, J Xiao, D Su, L Xie, D Yu Neural Networks 125, 121-130, 2020 | 37 | 2020 |
Controllable context-aware conversational speech synthesis J Cong, S Yang, N Hu, G Li, L Xie, D Su Interspeech, 2021, 4658-4662, 2021 | 34 | 2021 |
Data efficient voice cloning from noisy samples with domain adversarial training J Cong, S Yang, L Xie, G Yu, G Wan arXiv preprint arXiv:2008.04265, 2020 | 34 | 2020 |
Glow-WaveGAN: Learning speech representations from GAN-based variational auto-encoder for high fidelity flow-based speech synthesis J Cong, S Yang, L Xie, D Su Interspeech, 2021, 2021 | 30 | 2021 |
Learning Hierarchical Representations for Expressive Speaking Style in End-to-End Speech Synthesis X An, Y Wang, S Yang, Z Ma, L Xie 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2019 | 22 | 2019 |
Enhancing Hybrid Self-attention Structure with Relative-position-aware Bias for Speech Synthesis S Yang, H Lu, S Kang, L Xie, D Yu 2019 IEEE International Conference on Acoustics, Speech and Signal …, 2019 | 20 | 2019 |
On the training of DNN-based average voice model for speech synthesis S Yang, Z Wu, L Xie 2016 Asia-Pacific Signal and Information Processing Association Annual …, 2016 | 18 | 2016 |
Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis and Any-to-any Voice Conversion Y Lei, S Yang, J Cong, L Xie, D Su Interspeech, 2022, 2022 | 17 | 2022 |
Improving Mandarin End-to-End Speech Synthesis by Self-Attention and Learnable Gaussian Bias F Yang, S Yang, P Zhu, P Yan, L Xie 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2019 | 17 | 2019 |
The USTC system for Blizzard Challenge 2017 YJ Hu, C Ding, LJ Liu, ZH Ling, LR Dai Proc. Blizzard Challenge Workshop, 2017 | 16 | 2017 |
Cross-speaker emotion transfer through information perturbation in emotional speech synthesis Y Lei, S Yang, X Zhu, L Xie, D Su IEEE Signal Processing Letters 29, 1948-1952, 2022 | 14 | 2022 |