juraj on Nostr:
Do you know of any great text-to-speech models that do intonation well? They need to be open weights. They do not need to clone voices.
I've tried Suno Bark, but it sometimes hallucinates. I need the reading to be literally what's written. I also tried F5-TTS, but its intonation is not great and the speaking rate varies a lot, so when it reads multiple texts, the speed of the output speech differs between generations. Its duration predictor is also not great and sometimes causes cutoffs.
Have I missed something?
English only is OK for now.
Published at 2024-11-04 15:45:14
Event JSON
{
"id": "dff2b5fb88745e8a09ca8995841bdfe895a4c1d4a3c5414d8f668c355c7d4ab7",
"pubkey": "dab6c6065c439b9bafb0b0f1ff5a0c68273bce5c1959a4158ad6a70851f507b6",
"created_at": 1730735114,
"kind": 1,
"tags": [],
"content": "Do you know of any great text to speech models that do intonation well? Open weights. They do not need to clone voices.\n\nI've tried suno bark, but it sometimes hallucinates. I need the reading to be literally what's written. Also tried f5-tts, intonation is not great and the speed varies a lot, so when it's reading multiple texts, the speed of output speech is different between generation. The duration predictor is also not great and sometimes causes cutoffs.\n\nHave I missed something? \n\nEnglish only for now is ok.",
"sig": "a2e954e5565ea87bc22b67dd0aa8f89b044d0d9a946d7352c27766c22208e88d46831fdcdfdad07a7ede49569d36f2d58e8a6669c5c4f48aa01b92d9c1d8b49e"
}