MINT: a Multi-modal Image and Narrative Text Dubbing Dataset for Foley Audio Content Planning and Generation

Jan 1, 2024·
Ruibo Fu, Shuchen Shi, Hongming Guo, Tao Wang, Chunyu Qiang, Zhengqi Wen, Jianhua Tao, Xin Qi, Yi Lu, Xiaopeng Wang, Others
Type
Publication
arXiv preprint arXiv:2406.10591