2024-09-22 13:43:14

With the rise of really good transcription models, TTS that’s actually enjoyable to listen to, and LLMs that can carry a conversation and understand complex commands, why haven’t we seen an explosion of really good voice interfaces?

It seems obvious to me, but I’ve only seen Apple make a serious attempt, with the latest Siri update. There are so many times when I’m doing something with my hands, driving, etc., and I wish I could give commands to my RSS reader or just chat with an LLM that has arXiv and Wikipedia connected via RAG.

Author Public Key

npub1uqeexjx2djkfwzxdnrnrrch5h2k4xn0uapcgsxm94ftaxrlhy5lqywjckg

jsm on Nostr: With the rise of really good transcription models, TTS that’s actually enjoyable to ...