Published September 23, 2018 | Version v1.0.0
Dataset Open
- 1. Spotify
- 2. NYU
Description
The OpenMIC-2018 dataset is made available through a collaboration between Spotify and MARL@NYU. Additionally, the cost of annotation was sponsored by Spotify, whose contributions to open-source research can be found online at the developer site, engineering blog, and public GitHub.
If you use this dataset, please cite the following work:
Humphrey, Eric J., Durand, Simon, and McFee, Brian. “OpenMIC-2018: An Open Dataset for Multiple Instrument Recognition.” In Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR), 2018.
The dataset is made available by Spotify AB under a Creative Commons Attribution 4.0 International (CC BY 4.0) license. The full terms of this license are included alongside this dataset.
This dataset contains the following:
- 10-second snippets of audio, in a directory format like "audio/{0:3}/{0}.ogg".format(sample_key)
- VGGish features as JSON objects, in a directory format like "vggish/{0:3}/{0}.json".format(sample_key)
- MD5 checksums for each OGG and JSON file
- Anonymized individual responses, in "openmic-2018-individual-responses.csv"
- Aggregated labels, in "openmic-2018-aggregated-labels.csv"
- Track metadata, with licenses for each audio recording, in "openmic-2018-metadata.csv"
- A Python-friendly NPZ file of features and labels, "openmic-2018.npz"
- Sample partitions for train and test, in "partitions/*.txt"
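The directory layout and checksums listed above could be used roughly as follows. This is a minimal sketch, not part of the dataset's own tooling: the helper names `sample_paths`, `md5_of_file`, and `matches_checksum` are hypothetical, and it assumes that `{0:3}` in the listing is meant as "the first three characters of the sample key". In Python's `str.format`, `{0:3}` is a minimum field width rather than a truncation, so the sketch slices the key explicitly. The layout of the checksum files is not described here, so the expected digest is simply passed in as a string.

```python
import hashlib
from pathlib import Path


def sample_paths(sample_key, root="."):
    """Return (audio_path, vggish_path) for one sample key.

    Assumption: the subdirectory is the first three characters of the
    key, which is how "{0:3}" in the listing is interpreted here.
    """
    root = Path(root)
    prefix = sample_key[:3]
    audio = root / "audio" / prefix / f"{sample_key}.ogg"
    vggish = root / "vggish" / prefix / f"{sample_key}.json"
    return audio, vggish


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a file's MD5 digest with an expected hex string."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For a hypothetical key such as "abc123_0" and a root of "data", `sample_paths` would point at "data/audio/abc/abc123_0.ogg" and "data/vggish/abc/abc123_0.json"; each path can then be passed to `matches_checksum` together with the digest read from the corresponding checksum file.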
Files (2.6 GB)
Additional details
- Humphrey, Eric J., Durand, Simon, and McFee, Brian. “OpenMIC-2018: An Open Dataset for Multiple Instrument Recognition.” In Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR), 2018.