Overview of Text to Speech Approaches

Name: Overview of Text to Speech Approaches
Start: 2024-11-20T18:00:00-06:00
End: 2024-11-20T19:00:00-06:00

November 20, 2024 @ 6:00 pm – 7:00 pm

Thanks to Josh Phillips for hosting last week while I was out! I should have the recording up soon at https://hsv.ai/videos/

This week we will take a look at Text to Speech models in three different categories:

Products that create audio from text in an offline mode
APIs that can be integrated into a product
Open source models that you can host

Each of these presents different challenges that we’ll cover, such as latency, realism, and hallucination. Here’s the list of products and models so far. If you don’t see your favorite in the list, let me know and we’ll check it out as well:

Parler
Coqui
Bark
OpenAI (6 models)
BASE TTS (Amazon)
MetaVoice
MeloTTS
ElevenLabs
Facebook MMS

Also – a few of us went to the Huntsville AI and Machine Learning Technology Exchange and Expo last week, so we might do an overview of those topics if time permits.

Links & Other Events:

2024 AI Symposium Recorded Sessions – https://www.youtube.com/playlist?list=PLvvHQqQynqmtkQuLsvfFtmy0OohBcA5H4
2025 AI Symposium – https://www.rocketcenter.com/institute
Hugging Face Text to Speech – https://huggingface.co/tasks/text-to-speech

Details:

Date – 11/20/2024
Time – 6-7pm
Zoom – https://us02web.zoom.us/j/89971705398?pwd=CndWhnWX6sbtgQLaAbn8CctPjzcxjV.1

As always, I really appreciate the support and replies to these emails. You can also help by following, sharing, liking, and commenting on my posts on LinkedIn and Facebook – especially the ones posted directly to the Huntsville AI page on LinkedIn – https://www.linkedin.com/company/huntsville-ai

Details

Date: November 20, 2024
Time: 6:00 pm – 7:00 pm