Title: The Zero Resource Speech Challenge 2021: Spoken language modelling
Authors: Ewan Dunbar, Mathieu Bernard, Nicolas Hamilakis, Tu Anh Nguyen, Maureen de Seyssel, Patricia Rozé, Morgane Rivière, Eugene Kharitonov, Emmanuel Dupoux
Published: 29th April 2021 (Thursday) @ 23:53:37
Link: http://arxiv.org/abs/2104.14700v2
Abstract
We present the Zero Resource Speech Challenge 2021, which asks participants to learn a language model directly from audio, without any text or labels. The challenge is based on the Libri-light dataset, which provides up to 60k hours of audio from English audio books without any associated text. We provide a pipeline baseline system consisting of an encoder based on contrastive predictive coding (CPC), a quantizer (k-means) and a standard language model (BERT or LSTM). The metrics evaluate the learned representations at the acoustic (ABX discrimination), lexical (spot-the-word), syntactic (acceptability judgment) and semantic levels (similarity judgment). We present an overview of the eight submitted systems from four groups and discuss the main results.
Contains descriptions of the four Zero Resource Benchmark 2021 metrics, neatly presented in Table 1:
- ABX (Acoustic phonetic) - Libri-light dataset
- Spot-the-word (lexicon) - sWUGGY dataset
    - p(a) > p(b)?
        - (brick, *blick)
        - (squalled, *squilled)
- Semantic similarity judgment (lexical) - sSIMI dataset
    - d(a, b) ≈ d_h(a, b)?
        - (abduct, kidnap): 8.63
        - (abduct, tap): 0.5
- Acceptability judgment (syntax) - sBLIMP dataset
    - p(a) > p(b)? (dogs eat meat, *dogs eats meat) (the boy can't help himself, *the boy can't help herself)
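Two of the metrics above (spot-the-word and acceptability) score a model by pairwise comparison: for each (valid, invalid) pair, the model is correct if it assigns a higher pseudo log-probability to the valid member. A minimal sketch of that scoring, where `score` and the toy numbers are hypothetical stand-ins for a real model's pseudo log-probabilities (not values from the paper):

```python
# Sketch of pairwise-comparison scoring (sWUGGY / sBLIMP style).
# A real system would derive scores from a language model over discovered
# units; here `toy_scores` is a hypothetical lookup for illustration.

def pairwise_accuracy(pairs, score):
    """Fraction of (positive, negative) pairs where score(pos) > score(neg)."""
    correct = sum(1 for pos, neg in pairs if score(pos) > score(neg))
    return correct / len(pairs)

# Hypothetical pseudo log-probabilities (higher = more word-like / acceptable):
toy_scores = {
    "brick": -12.3, "blick": -15.8,                    # word vs. nonword
    "dogs eat meat": -20.1, "dogs eats meat": -22.4,   # grammatical vs. not
}
pairs = [("brick", "blick"), ("dogs eat meat", "dogs eats meat")]
print(pairwise_accuracy(pairs, toy_scores.get))  # → 1.0
```

The sSIMI metric is scored differently: model distances d(a, b) are correlated against human similarity judgments rather than compared pairwise.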
The Zero Resource Speech Benchmark (series)
- The objective of the Zero Resource Speech Challenge (ZRC) series is to enable researchers to build a spoken dialogue system directly from raw audio recordings (no text, no labels!)
- This is difficult, so ZRC breaks down the problem into more manageable subtasks, and provides metrics for cumulative progress
- It currently supports 4 subtasks: