Music Fingerprinting Continued...

28 Sep 2017

Progress is being made

It's been a little while (two weeks) since I last wrote a post here, but progress has continued. I have managed to revamp most of Will Drevo's Dejavu project so that it is more feature-rich for use at WJRH. My fork of his project is named Jamaisvu and can be found on my GitHub. It includes the following changes so far:

  • Integration with AcoustID service
    • This allows us to obtain the song name and artist from the file via fingerprinting, rather than just taking the name of the mp3
  • Support for songs that AcoustID cannot identify, by falling back to user input
  • Spotify API usage: the data found by AcoustID or entered manually is fed into the Spotify API, which tells us more about the song
    • This is very important since it was the only somewhat reliable API I could find to tell us if a song is explicit
  • The MySQL database now keeps track of additional information on the songs we add
    • Artist
    • Album
    • Genre
    • Length
    • Explicitness
  • YAML for the config to improve readability
  • ...and other tiny optimisations and fixes
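
The identification flow in the list above can be sketched roughly as follows. This is only an illustration and not the code in Jamaisvu: `SongInfo`, `identify` and the three lookup callables are hypothetical names, and the real AcoustID and Spotify calls are deliberately left out so that the order of the steps stays visible.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SongInfo:
    """Hypothetical record mirroring the extra database fields listed above."""
    title: str
    artist: str
    album: Optional[str] = None
    genre: Optional[str] = None
    length_seconds: Optional[float] = None
    explicit: Optional[bool] = None


def identify(path, lookup_acoustid, ask_user, lookup_spotify):
    """Resolve metadata for one file.

    All three callables are assumptions made for this sketch:
      lookup_acoustid(path) -> (title, artist) or None
      ask_user(path)        -> (title, artist)
      lookup_spotify(title, artist) -> dict of extra fields, or None
    """
    match = lookup_acoustid(path)
    if match is None:
        # The fingerprint was not recognised, so fall back to manual entry.
        match = ask_user(path)
    title, artist = match
    extra = lookup_spotify(title, artist) or {}
    return SongInfo(
        title=title,
        artist=artist,
        album=extra.get("album"),
        genre=extra.get("genre"),
        length_seconds=extra.get("length_seconds"),
        explicit=extra.get("explicit"),
    )
```

Passing the lookups in as arguments keeps the sketch free of network calls, so the fallback logic can be tried out with simple stand-in functions.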

Progress still to be made

Whilst I am quite happy with my results so far, the major issue with the software is its speed. Will touched on this in his blog. The slowdown in the current implementation comes from two key parts of the programme:

  1. The Fast Fourier Transform
  2. Peak Finding

These two areas take the most time. When I run an analysis on a file which has already been fingerprinted, it takes about 37 seconds to complete. Around 16 seconds are spent on each peak-finding pass (one per channel), so roughly 32 of the 37 seconds for a two-channel file, and around 1.5 to 2 seconds on calculating the spectrogram (FFTs). Early tests with a neat library called PyFFT, which is built on PyCUDA, showed that the GPU on my Late 2013 MBP provided a significant increase in performance for computing FFTs. Moving this over to the GTX 580 which I put in the test machine cut that time in half again. Sadly, calculating the FFTs really isn't the biggest worry.
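
To give a feel for why peak finding is the expensive part, here is a minimal pure-Python sketch of the general idea: a cell of the spectrogram counts as a peak if it is loud enough and nothing in its neighbourhood is louder. This is not the code Dejavu or Jamaisvu actually runs; the function name, the `neighbourhood` and `min_amplitude` parameters and the tiny example grid are all made up for illustration.

```python
def find_peaks(spectrogram, neighbourhood=1, min_amplitude=10.0):
    """Return (freq_bin, time_bin) pairs that are local maxima.

    `spectrogram` is a list of rows (one row per frequency bin).
    Cells on a flat plateau all count as peaks in this simple version.
    """
    n_freq = len(spectrogram)
    n_time = len(spectrogram[0]) if n_freq else 0
    peaks = []
    for f in range(n_freq):
        for t in range(n_time):
            value = spectrogram[f][t]
            if value < min_amplitude:
                continue  # too quiet to be worth fingerprinting
            # Clamp the search window to the edges of the spectrogram.
            f_lo, f_hi = max(0, f - neighbourhood), min(n_freq, f + neighbourhood + 1)
            t_lo, t_hi = max(0, t - neighbourhood), min(n_time, t + neighbourhood + 1)
            is_peak = all(
                spectrogram[i][j] <= value
                for i in range(f_lo, f_hi)
                for j in range(t_lo, t_hi)
            )
            if is_peak:
                peaks.append((f, t))
    return peaks


example = [
    [1, 2, 1, 0],
    [2, 50, 3, 0],
    [1, 3, 2, 40],
    [0, 0, 1, 2],
]
print(find_peaks(example))                   # [(1, 1), (2, 3)]
print(find_peaks(example, neighbourhood=2))  # [(1, 1)]
```

The work grows with the number of spectrogram cells multiplied by the size of the neighbourhood, and every cell can be checked independently of the others, which is what makes this step a plausible candidate for the GPU.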

In order to actually solve the performance issues of Dejavu/Jamaisvu, I need to implement a faster peak finding algorithm, preferably on the GPU. I have made a new file in the fork which I will eventually merge into the main programme, so that it automatically uses CUDA when the drivers are present, and eventually OpenCL as well. If the user has neither CUDA nor OpenCL, it will fall back to Drevo's much slower approach.
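
The fallback described above could look something like the sketch below. It is an assumption about how the selection might be structured rather than what the fork currently does: `select_backend` and `module_available` are hypothetical names, and the check only tests whether the PyCUDA or PyOpenCL Python packages can be imported, which does not by itself prove that a working driver or GPU is present.

```python
import importlib.util


def module_available(name):
    """Return True if the module `name` can be found in this environment."""
    return importlib.util.find_spec(name) is not None


def select_backend(is_available=module_available):
    """Pick the fastest backend that appears to be installed.

    Order of preference: CUDA, then OpenCL, then the plain CPU path.
    `is_available` is injectable so the choice can be tested without a GPU.
    """
    if is_available("pycuda"):
        return "cuda"
    if is_available("pyopencl"):
        return "opencl"
    return "cpu"


print(select_backend())
```

A real implementation would probably also want to catch errors raised when the GPU context is first created and drop back to the CPU path at that point.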

Hopefully in the next post, I will have some pretty graphs on this!

Published on 28 Sep 2017 by Clement Hathaway