Emotional speech synthesis

Published:

Status: Available ✅

Audio Processing


Emotional speech synthesis aims to generate speech that conveys emotion in addition to linguistic content, and it has the potential to reshape human-machine interaction across many domains. Infusing synthesized speech with different emotions can make machine-generated speech more natural and more effective, with applications in virtual agents, human-computer interfaces, entertainment, therapy, and assistive technologies. The long-term promise is machines that communicate emotions convincingly and empathetically, changing how people interact and engage with artificial systems.

The main objectives of this thesis are:

  • Analyze the state-of-the-art techniques for emotional speech synthesis.
  • Leverage modern deep learning architectures to design a novel approach for this task.
  • Demonstrate the effectiveness of the proposed approach on benchmark datasets (e.g., IEMOCAP).

References:

  1. Emotional Speech Synthesis: A Review
  2. Speech Synthesis with Mixed Emotions
  3. Hume AI