Letstalk: Latent diffusion transformer for talking video synthesis
Jan 1, 2024
Haojie Zhang, Zhihao Liang, Ruibo Fu, Zhengqi Wen, Xuefei Liu, Chenxing Li, Jianhua Tao, Yaling Liang
Type: Journal article
Publication: arXiv preprint arXiv:2411.16748
Last updated on Jan 1, 2024