Datasets

Over the years, I’ve helped create a number of datasets for research on speech processing, particularly in challenging listening environments. Below is a selection of these datasets, with links to where they can be accessed. If you use any of them in your work, please cite the associated publications. And if you have any questions about the datasets, feel free to contact me.

Distant Microphone and Noise-Robust ASR

  • CHiME-5 / CHiME-6


    Real dinner parties recorded in people's homes using distant microphone arrays. Fully transcribed.

    Download from OpenSLR

  • CHiME-3


    Read speech recorded over a multi-microphone tablet in noisy urban environments (streets, buses, cafes, etc.).

    Download from the LDC

  • CHiME-2 WSJ


    Binaural speech in noise simulations. Extension of CHiME-1 using materials from the WSJ corpus.

    Download from the LDC

  • CHiME-2 Grid


    Binaural speech in noise simulations. Extension of CHiME-1 to simulate moving speech sources.

    Download from the LDC

Hearing Impairment and Hearing Aids

  • CHiME-9 ECHI


    Four-person conversations in cafeteria noise, recorded over close-talk microphones, hearing aids and Meta Aria glasses.

    Download from HuggingFace

  • Clarity Prediction Challenge 3


    Data for the 3rd Clarity Prediction Challenge: intelligibility scores from hearing-impaired listeners presented with speech processed by experimental hearing aid algorithms.

    Download from Zenodo

  • CEC3 Real Dynamic Backgrounds


    Simulated hearing aid inputs with speech mixed into ambisonic recordings of real environments.

    Download from Zenodo

  • CEC3 Real Hearing Aid Recordings


    Speech plus multiple speech and noise distractors, produced using loudspeaker arrays and recorded over head-worn hearing aid microphones.

    Download from Zenodo

  • CEC3 Real Impulse Responses


    Simulated hearing aid inputs for speech plus multiple speech and noise distractors mixed using real ambisonic impulse responses.

    Download from Zenodo

Speech Perception

  • AV Grid Corpus


    A multi-talker audiovisual sentence corpus designed to support joint computational-behavioral studies in perception.

    Download from Zenodo

  • The AV Lombard Grid Corpus


    An extension of the Grid corpus with normal and Lombard style speech. Contains frontal and profile video.

    Download

  • The Clarity Sentences


    A set of 10,000 studio-recorded sentences from 40 British English speakers, designed for the evaluation of speech enhancement and speech intelligibility algorithms.

    Download from Figshare

Music

  • Cadenza Lyrics Intelligibility Prediction (CLIP) dataset


    A dataset of music excerpts with hearing loss simulation and human intelligibility scores.

    Download from Zenodo

  • Cadenza Woodwind


    Synthesized dataset of small ensembles of woodwind instruments for demixing and rebalancing research.

    Download from Zenodo

Miscellaneous

  • CHiME Home


    A subset of the CHiME-5 dataset annotations for a set of domestic acoustic events.

    Download from DCASE