VCTK

Excerpt


Browse by

 

The collection's logo

CSTR’s VCTK Corpus (Centre for Speech Technology Voice Cloning Toolkit) includes speech data uttered by 109 native speakers of English with various accents. Each speaker reads out about 400 sentences, most of which were selected from a newspaper plus the Rainbow Passage and an elicitation paragraph intended to identify the speaker’s accent. The newspaper texts were taken from The Herald (Glasgow), with permission from Herald & Times Group. Each speaker reads a different set of the newspaper sentences, where each set was selected using a greedy algorithm designed to maximise the contextual and phonetic coverage. The Rainbow Passage and elicitation paragraph are the same for all speakers. This corpus was recorded for the purpose of building HMM-based text-to-speech synthesis systems, especially for speaker-adaptive HMM-based speech synthesis using average voice models trained on multiple speakers and speaker adaptation technologies. The file was previously available on the CSTR website, and was referenced in the Google DeepMind work on WaveNet: https://arxiv.org/pdf/1609.03499.pdf .

Image: Detail showing a rainbow from “Late Autumn Landscape, Cambuskenneth” by Thomas Fenwick © The University of Edinburgh, all rights reserved. (N.B. Recordings include The Rainbow Passage.)

Items in this Collection