VCTK
CSTR’s VCTK Corpus (Centre for Speech Technology Voice Cloning Toolkit) includes speech data uttered by 109 native speakers of English with various accents. Each speaker reads out about 400 sentences: most were selected from a newspaper, and the rest come from the Rainbow Passage and an elicitation paragraph intended to identify the speaker’s accent. The newspaper texts were taken from The Herald (Glasgow), with permission from Herald & Times Group. Each speaker reads a different set of the newspaper sentences, and each set was selected using a greedy algorithm designed to maximise contextual and phonetic coverage. The Rainbow Passage and elicitation paragraph are the same for all speakers. The corpus was recorded for building HMM-based text-to-speech synthesis systems, especially speaker-adaptive HMM-based speech synthesis that uses average voice models trained on multiple speakers together with speaker adaptation technologies. The file was previously available on the CSTR website, and was referenced in the Google DeepMind work on WaveNet: https://arxiv.org/pdf/1609.03499.pdf
Image: Detail showing a rainbow from “Late Autumn Landscape, Cambuskenneth” by Thomas Fenwick © The University of Edinburgh, all rights reserved. (N.B. Recordings include The Rainbow Passage.)
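The description above mentions that each speaker's newspaper sentences were chosen by a greedy algorithm aimed at maximising contextual and phonetic coverage. The corpus documentation quoted here does not give the details of that algorithm, so the following is only a minimal sketch of greedy coverage selection in general. The function names `units` and `greedy_select` are hypothetical, and letter bigrams are used as a crude stand-in for phonetic contexts; a real system would presumably work on phone sequences from a pronunciation lexicon.

```python
def units(sentence):
    """Return the set of coverage units for a sentence.

    Assumption: lowercase letter bigrams stand in for phonetic contexts.
    """
    text = "".join(c for c in sentence.lower() if c.isalpha() or c == " ")
    return {text[i:i + 2] for i in range(len(text) - 1)}


def greedy_select(sentences, k):
    """Pick up to k sentences, each time taking the one that adds the most new units."""
    covered = set()
    chosen = []
    remaining = list(sentences)
    while remaining and len(chosen) < k:
        # Score every candidate by how many not-yet-covered units it contributes.
        best = max(remaining, key=lambda s: len(units(s) - covered))
        if not units(best) - covered:
            break  # no remaining sentence adds anything new
        chosen.append(best)
        covered |= units(best)
        remaining.remove(best)
    return chosen, covered


if __name__ == "__main__":
    pool = ["aaa", "abc", "abcd"]
    picked, covered = greedy_select(pool, 2)
    print(picked)
    print(sorted(covered))
```

Greedy selection of this kind does not guarantee the best possible set, but it is simple and tends to give good coverage for a fixed sentence budget.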
Items in this Collection
-
CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit (version 0.92)
Yamagishi, Junichi; Veaux, Christophe; MacDonald, Kirsten
This CSTR VCTK Corpus includes speech data uttered by 110 English speakers with various accents. Each speaker reads out about 400 sentences, which were selected from a newspaper, the rainbow passage and an elicitation …
-
Device Recorded VCTK (Small subset version)
Sarfjoo, Seyyed Saeed; Yamagishi, Junichi
This dataset is a new variant of the voice cloning toolkit (VCTK) dataset: device-recorded VCTK (DR-VCTK), where the high-quality speech signals recorded in a semi-anechoic chamber using professional audio devices are …
-
SUPERSEDED - Device Recorded VCTK (Small subset version)
Sarfjoo, Seyyed Saeed; Yamagishi, Junichi
This item has been replaced by the one which can be found at https://doi.org/10.7488/ds/2316. This dataset is a new variant of the voice cloning toolkit (VCTK) dataset: device-recorded VCTK (DR-VCTK), where the …
-
Noisy reverberant speech database for training speech enhancement algorithms and TTS models
Valentini-Botinhao, Cassia
Noisy reverberant speech database. The database was designed to train and test speech enhancement (noise suppression and dereverberation) methods that operate at 48kHz. Clean speech was made reverberant and noisy by …
-
Noisy speech database for training speech enhancement algorithms and TTS models
Valentini-Botinhao, Cassia
Clean and noisy parallel speech database. The database was designed to train and test speech enhancement methods that operate at 48kHz. A more detailed description can be found in the papers associated with the database. …
-
96kHz version of the CSTR VCTK Corpus
Veaux, Christophe; Yamagishi, Junichi
This dataset includes a 96kHz version of the CSTR VCTK Corpus, with speech data uttered by 109 native speakers of English with various accents. The main dataset can be found at https://doi.org/10.7488/ds/1994 (containing …
-
SUPERSEDED - CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit
Veaux, Christophe; Yamagishi, Junichi; MacDonald, Kirsten
This item has been replaced by the one which can be found at https://doi.org/10.7488/ds/2645. This CSTR VCTK Corpus (Centre for Speech Technology Voice Cloning Toolkit) includes speech data uttered by 109 native …
-
SUPERSEDED - CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit
Veaux, Christophe; Yamagishi, Junichi; MacDonald, Kirsten
SUPERSEDED - This item has been replaced by the one which can be found at https://doi.org/10.7488/ds/1994. This CSTR VCTK Corpus (Centre for Speech Technology Voice Cloning Toolkit) includes speech data uttered by 109 …
-
Reverberant speech database for training speech dereverberation algorithms and TTS models
Valentini-Botinhao, Cassia
Reverberant speech database. The database was designed to train and test speech dereverberation methods that operate at 48kHz. Clean speech was made reverberant by convolving it with a room impulse response. The room impulse …