List of Datasets for Automatic Speech Recognition (ASR) and Text To Speech Synthesis (TTS)
This list contains datasets aimed at both ASR (sometimes called STT, speech-to-text) and TTS. Rule of thumb: ASR and TTS datasets are interchangeable if used carefully.
-
- spoken digits (0 - 9) by 60 different speakers
-
- provide samples for various languages
- FSDD (Free Spoken Digit Dataset)
- spoken digits by 6 speakers
-
- famous for LibriSpeech and LibriTTS