🪴 Anil's Garden


Better speech synthesis through scaling

19 Dec 2025 · 1 min read

paper
tts
speech
annotated

Title: Better speech synthesis through scaling
Authors: James Betker
Published: 12th May 2023 (Friday) @ 04:19:49
Link: http://arxiv.org/abs/2305.07243v2

Abstract

In recent years, the field of image generation has been revolutionized by the application of autoregressive transformers and DDPMs. These approaches model image generation as a step-wise probabilistic process and leverage large amounts of compute and data to learn the image distribution. This methodology of improving performance need not be confined to images. This paper describes a way to apply advances in the image generative domain to speech synthesis. The result is TorToise, an expressive, multi-voice text-to-speech system. All model code and trained weights have been open-sourced at https://github.com/neonbjb/tortoise-tts.
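To make "step-wise probabilistic process" a little more concrete, here is a minimal sketch of the forward (noising) half of a DDPM, where a clean signal is blended with Gaussian noise according to a variance schedule. It is not taken from the TorToise code base. The function names, the toy sine-wave "signal", and the linear schedule values (1e-4 to 0.02 over 1000 steps, a commonly used default) are all assumptions chosen for illustration.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (illustrative values)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noise_sample(x0, t, alpha_bars, rng):
    """Draw x_t ~ N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I) directly."""
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    return [signal_scale * v + noise_scale * rng.gauss(0.0, 1.0) for v in x0]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)

# Hypothetical stand-in for real data: one period of a sine wave.
x0 = [math.sin(2 * math.pi * i / 16) for i in range(16)]

x_early = noise_sample(x0, 10, alpha_bars, rng)   # still close to x0
x_late = noise_sample(x0, 999, alpha_bars, rng)   # almost pure noise
```

A trained diffusion model learns to reverse this process one step at a time, which is where the compute and data mentioned in the abstract come in. The sketch covers only the fixed noising side, which needs no learning.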

TorToise TTS Code - GitHub
TorToiSe - Spending Compute for High Quality TTS
TorToise TTS Hugging Face Space

Backlinks

Speech and Audio - Rolodex - Papers, Models and Releases
