VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing

Jan 1, 2024

Chunyu Qiang, Wang Geng, Yi Zhao, Ruibo Fu, Tao Wang, Cheng Gong, Tianrui Wang, Qiuyu Liu, Jiangyan Yi, Zhengqi Wen, and others

Type: Journal article

Publication: arXiv preprint arXiv:2408.05758

Last updated on Jan 1, 2024
