Datasets

Over the years, I’ve helped create a number of datasets for research on speech processing, particularly in challenging listening environments. Below is a selection of these datasets, with links to where they can be accessed. If you use any of them in your work, please cite the associated publications. And if you have any questions about the datasets, feel free to contact me.

Distant-mic and noise-robust ASR

CHiME-5 / CHiME-6

Real dinner parties recorded in people's homes using distant microphone arrays. Fully transcribed.

Download from OpenSLR

CHiME-3

Read speech recorded over a multi-microphone tablet in noisy urban environments (streets, buses, cafes, etc.).

Download from the LDC

CHiME-2 WSJ

Binaural speech-in-noise simulations. Extension of CHiME-1 using materials from the WSJ corpus.

Download from the LDC

CHiME-2 Grid

Binaural speech-in-noise simulations. Extension of CHiME-1 to simulate moving speech sources.

Download from the LDC

CHiME-1

Target sentences mixed into real binaural recordings of a noisy domestic environment.

Download from MyAirBridge

Hearing Impairment and Hearing Aids

CHiME-9 ECHI

Four-person conversations in cafeteria noise, recorded over close-talk mics, hearing aids and Meta Aria glasses.

Download from HuggingFace

Clarity Prediction Challenge 3

Data for the 3rd Clarity Prediction Challenge: intelligibility scores from hearing-impaired listeners presented with speech processed by experimental hearing aid algorithms.

Download from Zenodo

CEC3 Real Dynamic Backgrounds

Simulated hearing aid inputs with speech mixed into ambisonic recordings of real environments.

Download from Zenodo

CEC3 Real Hearing Aid Recordings

Speech plus multiple speech and noise distractors, produced using loudspeaker arrays and recorded over head-worn hearing aid microphones.

Download from Zenodo

CEC3 Real Impulse Responses

Simulated hearing aid inputs for speech plus multiple speech and noise distractors mixed using real ambisonic impulse responses.

Download from Zenodo

Speech Perception

AV Grid Corpus

A multi-talker audiovisual sentence corpus designed to support joint computational-behavioral studies in speech perception.

Download from Zenodo

The AV Lombard Grid Corpus

An extension of the Grid corpus with normal and Lombard style speech. Contains frontal and profile video.

Download

The Clarity Sentences

A set of 10,000 studio-recorded sentences from 40 British English speakers, designed for the evaluation of speech enhancement and speech intelligibility algorithms.

Download from Figshare

Music

Cadenza Lyrics Intelligibility Prediction (CLIP) Dataset

Music excerpts with hearing loss simulation and human intelligibility scores.

Download from Zenodo

Cadenza Woodwind

Synthesized dataset of small ensembles of woodwind instruments for demixing and rebalancing research.

Download from Zenodo

Miscellaneous

CHiME Home

A subset of the CHiME-1 domestic recordings, annotated for a set of domestic acoustic events.

Download from DCASE