Overview
Casual Conversations is composed of over 45,000 videos (3,011 participants) and is intended for assessing the performance of already trained computer vision and audio models, for the purposes permitted in our data user agreement. The videos feature paid individuals who agreed to participate in the project and who provided their own age and gender labels. The videos were recorded in the U.S. with a diverse set of adults across age, gender and apparent skin tone groups. A group of trained annotators labeled each participant's apparent skin tone using the Fitzpatrick scale and annotated whether each video was recorded in low ambient lighting conditions. Spoken words in all videos were also manually transcribed by human annotators, and the transcriptions are available with the dataset.
Casual Conversations Dataset Version 1.0
The Casual Conversations dataset is designed to help researchers evaluate the accuracy of their computer vision and audio models across attributes such as age, gender, apparent skin tone and ambient lighting.
Key Application
Machine learning, ML Fairness
Intended Use Cases
Assist in measuring algorithmic fairness in terms of age, gender, apparent skin tone, ambient lighting conditions, and speech recognition
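A minimal sketch of what measuring performance across subgroups could look like, assuming model predictions have already been joined with the dataset's annotations. The record layout, the function names and the sample values are hypothetical; they are not part of the dataset's distribution format.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return {group: accuracy} for an iterable of (group, correct) pairs."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for group, correct in records:
        totals[group] += 1
        hits[group] += int(bool(correct))
    return {group: hits[group] / totals[group] for group in totals}


def max_accuracy_gap(per_group):
    """Largest difference in accuracy between any two groups."""
    values = list(per_group.values())
    return max(values) - min(values)


# Hypothetical results: (apparent skin tone group, was the prediction correct?)
sample = [
    ("type_I", True), ("type_I", True), ("type_I", False), ("type_I", True),
    ("type_VI", True), ("type_VI", False), ("type_VI", False), ("type_VI", True),
]

per_group = accuracy_by_group(sample)
gap = max_accuracy_gap(per_group)
print(per_group, gap)
```

The same grouping could be keyed on age bucket, gender or lighting condition instead; the gap between the best and worst group is only one of several possible summary measures.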
Primary Data Type
Video (mp4)
Data Function
Testing, training (without using provided annotations)
Dataset Characteristics
Total number of subjects/actors: 3,011
Total number of video recordings: 45,186
Average length per video: ~1 minute
Gender (self-provided): 3,011
Skin Tone (human labelled): 3,011
Lighting (human labelled): 45,186
Speech Transcriptions (human labelled): 45,186
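Because each video comes with a human transcription, speech recognition output can be scored against it and then compared across subgroups. The sketch below computes word error rate (WER) using a standard word-level edit distance. The function name and the sample sentences are illustrative assumptions, not taken from the dataset.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one reference word ("up") is missing from the hypothesis.
wer = word_error_rate("i grew up in new orleans", "i grew in new orleans")
print(round(wer, 3))
```

Averaging this score separately for each age, gender or skin tone group gives a per-group view of transcription quality.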
Nature Of Content
Video recordings of individuals who are asked random questions from a pre-approved list and give "unscripted" answers
Privacy PII
Participants de-identified with unique numbers
License
Limited; see full license language for use
Summary of license permissions
You can evaluate models on the provided labels
You cannot train any model with the provided labels
Access Cost
Open access
Data Collection
Data sources
Vendor data collection efforts
Data selection
All videos are opted-in for data use in ML by the participants
Sampling Methods
Unsampled
Geographic distribution
100% US, cities: Atlanta, Houston, Miami, New Orleans, and Richmond
Labelling Methods
Human Labels
Label types
Human-labels: free-form text labels
Labeling procedure - Human
Participants provided their own age and gender labels
Annotators labelled apparent skin tone and ambient lighting, and transcribed speech
Validation Methods
Human validated
Validator description(s)
Human validated
Validation tasks
Human validators verify labels
Human validators flag PII
Human validators filter data
Validation policy summary
All labels are verified by human validators based in the U.S.
Validators flag any PII content
This dataset features the original video recordings created by Facebook for the Deepfake Detection Challenge (DFDC) dataset. The AI research community can use Casual Conversations as one important stepping stone toward normalizing subgroup measurement and fairness research. With Casual Conversations, we hope to spur further research in this important, emerging field.
If you are an individual who appears in this dataset and would like your videos to be removed, please contact: casualconversations@fb.com