Datasets
Over the years, I’ve helped create a number of datasets for research on speech processing, particularly in challenging listening environments. Below is a selection of these datasets, with links to where they can be accessed. If you use any of them in your work, please cite the associated publications. And if you have any questions about the datasets, feel free to contact me.
Distant mic and noise robust ASR
-
CHiME-5 / CHiME-6
Real dinner parties recorded in people's homes using distant microphone arrays. Fully transcribed.
-
CHiME-3
Read speech recorded over a multi-microphone tablet in noisy urban environments (streets, buses, cafes, etc.).
-
CHiME-2 WSJ
Binaural speech in noise simulations. Extension of CHiME-1 using materials from the WSJ corpus.
-
CHiME-2 Grid
Binaural speech in noise simulations. Extension of CHiME-1 to simulate moving speech sources.
-
CHiME-1
Target sentences mixed into real binaural recordings of a noisy domestic home.
Hearing Impairment and Hearing Aids
-
CHiME-9 ECHI
Four-person conversations in cafeteria noise recorded over close-talk mics, hearing aids and Meta Aria glasses.
-
Clarity Prediction Challenge 3
Data for the 3rd Clarity Prediction Challenge: intelligibility scores from hearing-impaired listeners presented with speech processed by experimental hearing aid algorithms.
-
CEC3 Real Dynamic Backgrounds
Simulated hearing aid inputs with speech mixed into ambisonic recordings of real environments.
-
CEC3 Real Hearing Aid Recordings
Speech plus multiple speech and noise distractors produced using loudspeaker arrays and recorded over head-worn hearing aid microphones.
-
CEC3 Real impulse responses
Simulated hearing aid inputs for speech plus multiple speech and noise distractors mixed using real ambisonic impulse responses.
Speech Perception
-
AV Grid Corpus
A multi-talker audiovisual sentence corpus designed to support joint computational-behavioral studies in speech perception.
-
The AV Lombard Grid Corpus
An extension of the Grid corpus with normal and Lombard style speech. Contains frontal and profile video.
-
The Clarity Sentences
A set of 10,000 studio-recorded sentences from 40 British English speakers, designed for the evaluation of speech enhancement and speech intelligibility algorithms.
Music
-
Cadenza Lyrics Intelligibility Prediction (CLIP) dataset
Dataset of music excerpts with hearing loss simulation and human intelligibility scores.
-
Cadenza Woodwind
Synthesized dataset of small ensembles of woodwind instruments for demixing and rebalancing research.
Miscellaneous
-
CHiME Home
A subset of the CHiME-5 dataset annotations for a set of domestic acoustic events.