Paper Title

Learning Multi-instrument Classification with Partial Labels

Author

Anhari, Amir Kenarsari

Abstract

Multi-instrument recognition is the task of predicting the presence or absence of different instruments within an audio clip. A considerable challenge in applying deep learning to multi-instrument recognition is the scarcity of labeled data. OpenMIC is a recent dataset containing 20K polyphonic audio clips. The dataset is weakly labeled, in that only the presence or absence of instruments is known for each clip, while the onset and offset times are unknown. The dataset is also partially labeled, in that only a subset of instruments is labeled for each clip. In this work, we investigate the use of attention-based recurrent neural networks to address the weakly-labeled problem. We also use different data augmentation methods to mitigate the partially-labeled problem. Our experiments show that our approach achieves state-of-the-art results on the OpenMIC multi-instrument recognition task.
