Ruibo Fu
Tao Wang
DPI-TTS: Directional Patch Interaction for Fast-Converging and Style Temporal Modeling in Text-to-Speech
Jan 1, 2025
VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing
Jan 1, 2024
Text Prompt is Not Enough: Sound Event Enhanced Prompt Adapter for Target Style Audio Generation
Jan 1, 2024
PPPR: Portable Plug-in Prompt Refiner for Text to Audio Generation
Jan 1, 2024
MINT: a Multi-modal Image and Narrative Text Dubbing Dataset for Foley Audio Content Planning and Generation
Jan 1, 2024
Minimally-supervised speech synthesis with conditional diffusion model and language model: A comparative study of semantic coding
Jan 1, 2024
Mel-Refine: A Plug-and-Play Approach to Refine Mel-Spectrogram in Audio Generation
Jan 1, 2024
Learning speech representation from contrastive token-acoustic pretraining
Jan 1, 2024
ICAGC 2024: Inspirational and Convincing Audio Generation Challenge 2024
Jan 1, 2024
Emotion selectable end-to-end text-based speech editing
Jan 1, 2024