Ziyang Ma
Title
Cited by
Year
MT4SSL: Boosting self-supervised speech representation learning by integrating multiple targets
Z Ma, Z Zheng, C Tang, Y Wang, X Chen
Proc. Interspeech 2023, 2022
Cited by: 18, Year: 2022
LauraGPT: Listen, attend, understand, and regenerate audio with GPT
Q Chen, Y Chu, Z Gao, Z Li, K Hu, X Zhou, J Xu, Z Ma, W Wang, S Zheng, ...
arXiv preprint arXiv:2310.04673, 2023
Cited by: 11, Year: 2023
Hierarchical deep residual reasoning for temporal moment localization
Z Ma, X Han, X Song, Y Cui, L Nie
Proceedings of the 3rd ACM International Conference on Multimedia in Asia, 1-7, 2021
Cited by: 10, Year: 2021
Leveraging speech PTM, text LLM, and emotional TTS for speech emotion recognition
Z Ma, W Wu, Z Zheng, Y Guo, Q Chen, S Zhang, X Chen
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 4, Year: 2024
ELLA-V: Stable Neural Codec Language Modeling with Alignment-guided Sequence Reordering
Y Song, Z Chen, X Wang, Z Ma, X Chen
arXiv preprint arXiv:2401.07333, 2024
Cited by: 4, Year: 2024
Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation
Z Ma, Z Zheng, G Yang, Y Wang, C Zhang, X Chen
Proc. Interspeech 2023, 2023
Cited by: 4, Year: 2023
TESSP: Text-enhanced self-supervised speech pre-training
Z Yao, S Ren, S Chen, Z Ma, P Guo, L Xie
arXiv preprint arXiv:2211.13443, 2022
Cited by: 4, Year: 2022
VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching
Y Guo, C Du, Z Ma, X Chen, K Yu
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 3*, Year: 2024
EAT: Self-Supervised Pre-Training with Efficient Audio Transformer
W Chen, Y Liang, Z Ma, Z Zheng, X Chen
arXiv preprint arXiv:2401.03497, 2024
Cited by: 3, Year: 2024
Fast-HuBERT: an Efficient Training Framework for Self-Supervised Speech Representation Learning
G Yang, Z Ma, Z Zheng, Y Song, Z Niu, X Chen
2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-7, 2023
Cited by: 3, Year: 2023
Front-end adapter: Adapting front-end input of speech based self-supervised learning for speech recognition
X Chen, Z Ma, C Tang, Y Wang, Z Zheng
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 3, Year: 2023
Improving few-shot learning for talking face system with TTS data augmentation
Q Chen, Z Ma, T Liu, X Tan, Q Lu, K Yu, X Chen
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 3, Year: 2023
Towards universal speech discrete tokens: A case study for ASR and TTS
Y Yang, F Shen, C Du, Z Ma, K Yu, D Povey, X Chen
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 2, Year: 2024
Unsupervised Active Learning: Optimizing Labeling Cost-Effectiveness for Automatic Speech Recognition
Z Zheng, Z Ma, Y Wang, X Chen
Proc. Interspeech 2023, 2023
Cited by: 2, Year: 2023
Improving Code-Switching and Named Entity Recognition in ASR with Speech Editing based Data Augmentation
Z Liang, Z Song, Z Ma, C Du, K Yu, X Chen
Proc. Interspeech 2023, 2023
Cited by: 2, Year: 2023
Hourglass-AVSR: Down-Up Sampling-Based Computational Efficiency Model for Audio-Visual Speech Recognition
F Yu, H Wang, Z Ma, S Zhang
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 1, Year: 2024
ChatMusician: Understanding and Generating Music Intrinsically with LLM
R Yuan, H Lin, Y Wang, Z Tian, S Wu, T Shen, G Zhang, Y Wu, C Liu, ...
arXiv preprint arXiv:2402.16153, 2024
Cited by: 1, Year: 2024
emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
Z Ma, Z Zheng, J Ye, J Li, Z Gao, S Zhang, X Chen
arXiv preprint arXiv:2312.15185, 2023
Cited by: 1, Year: 2023
Exploring effective distillation of self-supervised speech models for automatic speech recognition
Y Wang, C Tang, Z Ma, Z Zheng, X Chen, WQ Zhang
2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-6, 2023
Cited by: 1, Year: 2023
Sound Event Detection by Aggregating Pre-Trained Embeddings from Different Layers
X Xu, Z Ma, F Yang, G Yang, M Wu, X Chen
Technical report, DCASE2023 Challenge, 2023
Cited by: 1, Year: 2023