|
Title:
|
LYRICCOVERS: A COMPREHENSIVE LARGE-SCALE DATASET OF COVER SONGS WITH LYRICS |
|
Author(s):
|
Maximilian Balluff, Peter Mandl and Christian Wolff |
|
ISBN:
|
978-989-8704-62 |
|
Editors:
|
Paula Miranda and Pedro Isaías |
|
Year:
|
2024 |
|
Edition:
|
Single |
|
Keywords:
|
Cover Song Detection, Music Information Retrieval, Dataset, Lyrics |
|
Type:
|
Full |
|
First Page:
|
325 |
|
Last Page:
|
334 |
|
Language:
|
English |
|
Paper Abstract:
|
This research offers a detailed examination of a novel dataset that collates original musical compositions alongside their
derivative cover versions. Unique in its inclusion of both YouTube links and lyrical content, the dataset comprises
more than 70,000 tracks, encompassing more than 18,000 cover song groupings. It stands as the most diverse compendium
of cover songs currently available for study. The characteristics of the LyricCovers dataset are thoroughly analyzed through
its metadata, and empirical evaluations in the subsequent experimental lyrics analysis section suggest that lyrical analysis
is a fundamental component in the identification and study of cover songs. This work presents a baseline approach to cover
song detection, with an emphasis on lyrical content processing. It describes the extraction of lyrics from the audio files and
the application of the Jina Embeddings 2 Model, fine-tuned with a hard triplet-loss objective, which successfully exploits
lyric similarity to accurately identify cover songs. |
|
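The abstract mentions fine-tuning with a hard triplet-loss objective but gives no formula. As a rough illustration only, the sketch below assumes this means the common "batch-hard" variant: for each anchor, take the farthest lyric embedding from the same cover group and the closest one from a different group. The function names, the cosine distance, and the margin value are all hypothetical choices made for this example, not details taken from the paper.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def batch_hard_triplet_loss(embeddings, group_ids, margin=0.2):
    """Batch-hard triplet loss over one batch of lyric embeddings.

    embeddings: list of vectors, one per track (hypothetical input).
    group_ids:  cover-group label for each track; tracks sharing a
                label are treated as versions of the same song.
    margin:     assumed value, not taken from the paper.
    """
    losses = []
    for i, (anchor, gid) in enumerate(zip(embeddings, group_ids)):
        positives = [
            cosine_distance(anchor, e)
            for j, (e, g) in enumerate(zip(embeddings, group_ids))
            if g == gid and j != i
        ]
        negatives = [
            cosine_distance(anchor, e)
            for e, g in zip(embeddings, group_ids)
            if g != gid
        ]
        # An anchor needs at least one positive and one negative.
        if not positives or not negatives:
            continue
        hardest_positive = max(positives)  # farthest same-group track
        hardest_negative = min(negatives)  # closest other-group track
        losses.append(max(0.0, hardest_positive - hardest_negative + margin))
    return sum(losses) / len(losses) if losses else 0.0


# Toy batch: two well-separated cover groups.
toy_embeddings = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
toy_groups = [0, 0, 1, 1]
toy_loss = batch_hard_triplet_loss(toy_embeddings, toy_groups)
```

In a real fine-tuning setup the embeddings would come from the text encoder and the loss would be computed on tensors so that gradients can flow; the pure-Python version here only shows how the hardest positive and negative are selected.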