WJRH Music Fingerprinting

10 Sep 2017

Introduction

This project aims to use open-source audio fingerprinting software to automate logging the songs currently being played by the DJ at WJRH. The software could also be extended to help clean up and manage RoboDJ by flagging explicit songs and categorising songs by genre, artist, etc. for specialised automatic play (e.g. a genre for the hour). It should also let the website know which track RoboDJ is currently playing.

Ideally, this would also allow students to upload music they would like to have played, e.g. songs they made themselves.

Code

Implementation

Since this will be an ongoing project, expect this list to change.

  • Python 2.7 Backend
  • Possible direct integration with teal (in the future)
  • Pyacoustid
    • Can use a free API key with the AcoustID web service, which allows around three requests per minute
    • Using this web service gives a fairly good, up-to-date reference to work with (a lookup sketch follows this list)
    • They also offer a full dataset of fingerprints and song information, but it is a very large download and apparently only updated once per year (too infrequent)
  • Dejavu
    • This library seems to require a lot of manual work (you set up your own database), but its workflow looks less clunky than Pyacoustid's
    • All data is stored in MySQL (a fingerprinting sketch follows this list)
    • Apparently it only keeps track of the song name and no other information (this is a bummer)
      • I will be modifying Dejavu to resolve this; it will then hold much more information about the song, such as artist, album, genre, etc.
    • It seems to average about 15 megabytes of fingerprints per song (~120k fingerprints)
    • Sadly, it appears the developer is no longer maintaining this project
  • Discogs
    • This is where I will try to gather the most information about a song once it has been identified
    • MusicBrainz seems disorganised, doesn't provide all the information I want, and has fewer songs in its database than Discogs
    • I found the site from this list
    • Ideally, we could use Dejavu to start generating our own database: any unknown files will be run through the web API to get the name, and then we can use that to build our database automatically
    • Any unknown song should be able to be named by the current DJ, who we should then instruct to play as much of the song as possible so we can make a good fingerprint for it in the database
    • Song information retrieval will then go in order: 1) the Dejavu database, 2) the AcoustID remote server for the name, 2.5) other information from Discogs, 3) input from the user (a sketch of this order follows this list). This information will then allow Dejavu to continually improve its database
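
As a rough idea of the Pyacoustid lookup mentioned above, here is a minimal sketch; the API key value and the function name are placeholders, and the key itself would come from a free registration at acoustid.org:

    # Minimal sketch of an AcoustID lookup via pyacoustid (Python 2.7).
    # ACOUSTID_API_KEY is a placeholder for a free key from acoustid.org.
    from __future__ import print_function
    import acoustid

    ACOUSTID_API_KEY = "YOUR_FREE_API_KEY"

    def identify_with_acoustid(path):
        # acoustid.match() runs Chromaprint over the file and queries the
        # AcoustID web service, yielding (score, recording_id, title, artist)
        # tuples, best match first.
        for score, recording_id, title, artist in acoustid.match(ACOUSTID_API_KEY, path):
            return {"score": score, "recording_id": recording_id,
                    "title": title, "artist": artist}
        return None  # nothing matched

    print(identify_with_acoustid("some_song.mp3"))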
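
The Dejavu side would look roughly like the sketch below, assuming a local MySQL database named dejavu; the credentials, directory names, and file paths are placeholders:

    # Rough sketch of fingerprinting and recognition with Dejavu.
    from dejavu import Dejavu
    from dejavu.recognize import FileRecognizer

    config = {
        "database": {
            "host": "127.0.0.1",
            "user": "wjrh",          # placeholder credentials
            "passwd": "password",
            "db": "dejavu",
        }
    }

    djv = Dejavu(config)

    # Fingerprint a directory of station music to build up the database.
    djv.fingerprint_directory("music", [".mp3"])

    # Later, try to recognise an unknown clip captured from the stream.
    match = djv.recognize(FileRecognizer, "captures/unknown_clip.mp3")
    print(match)  # e.g. {'song_name': ..., 'confidence': ..., ...}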
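
Finally, a sketch of the retrieval order from the last bullet, reusing djv, FileRecognizer, and identify_with_acoustid() from the sketches above; the Discogs user token and the helper names are assumptions for illustration only:

    # Retrieval order: 1) Dejavu, 2) AcoustID, 2.5) Discogs, 3) ask the DJ.
    import discogs_client

    discogs = discogs_client.Client("WJRHFingerprinter/0.1",
                                    user_token="YOUR_DISCOGS_TOKEN")

    def discogs_metadata(title, artist):
        # Search Discogs for a matching release and pull the extra fields
        # (genre, year, etc.) we want to store alongside the Dejavu entry.
        results = discogs.search(title, artist=artist, type="release")
        for release in results:
            return {"genres": release.genres, "year": release.year}
        return {}  # nothing found on Discogs

    def lookup(path):
        # 1) Local Dejavu database first.
        match = djv.recognize(FileRecognizer, path)
        if match and match.get("song_name"):
            return match

        # 2) Fall back to the AcoustID web service for the song name.
        remote = identify_with_acoustid(path)
        if remote is None:
            return None  # 3) leave it to input from the current DJ

        # 2.5) Enrich with Discogs metadata; the track can then be
        # fingerprinted into Dejavu so the local database keeps improving.
        remote.update(discogs_metadata(remote["title"], remote["artist"]))
        return remote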

Published on 10 Sep 2017 by Clement Hathaway