Voice Personalization

Given a text in any language (or in English), the solution should speak it in the voice and tone of a specific person. Ideally, the personalized voice also conveys the emotion associated with the words and phrases in the text. The emotions are limited to four: happy, angry, sad and neutral.

Technical design

Proposed Model: Because the synthesized speech must carry emotion, we will take the embedding vector generated by the emotion analysis module and give it as an input to the Generative Adversarial Network (GAN)-based…
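
As a rough illustration of the conditioning idea above, the sketch below attaches an emotion embedding to per-frame text features before they would be handed to a generator. Everything named here is an assumption made for illustration: the `Emotion` enum, the embedding size `EMBED_DIM`, the random lookup table standing in for the output of the emotion analysis module, and the `condition_frames` helper. The GAN-based component itself is not modelled.

```python
import random
from enum import Enum


class Emotion(Enum):
    """The four emotion classes named in the requirements."""
    HAPPY = 0
    ANGRY = 1
    SAD = 2
    NEUTRAL = 3


EMBED_DIM = 4  # hypothetical embedding size, chosen only for the example


def make_embedding_table(dim=EMBED_DIM, seed=0):
    """Stand-in for the emotion analysis module: one fixed vector per emotion.

    In a real system these vectors would be learned or predicted from the
    text; here they are random placeholders.
    """
    rng = random.Random(seed)
    return {e: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for e in Emotion}


def condition_frames(text_frames, emotion, table):
    """Append the emotion embedding to every per-frame text feature vector."""
    emb = table[emotion]
    return [list(frame) + emb for frame in text_frames]


table = make_embedding_table()

# Two hypothetical frames of text features, three values each.
text_frames = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]

# Each frame grows from 3 values to 3 + EMBED_DIM values.
conditioned = condition_frames(text_frames, Emotion.HAPPY, table)
print(len(conditioned), len(conditioned[0]))
```

Concatenation is only one way to pass the emotion vector in; it is used here because it keeps the text features and the emotion signal separate and easy to inspect.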


