audio fingerprinting

I've been reading a little bit about classifiers, including this paper on audio fingerprinting.

From the abstract:
Recent years have seen a growing scientific and industrial interest in computing fingerprints of multimedia objects. The growing industrial interest is shown among others by a large number of (startup) companies and the recent request for information on audio fingerprinting technologies by the International Federation of the Phonographic Industry and the Recording Industry Association of America.

The prime objective of multimedia fingerprinting is an efficient mechanism to establish the perceptual equality of two multimedia objects: not by comparing the (typically large) objects themselves, but by comparing the associated fingerprints (small by design)...

I have to admit, "fingerprinting" audio files is an interesting idea. Typically CD databases use things like track length and title to identify CDs, but that information is not fundamentally part of the music. Fingerprinting the tracks would make for a much more effective CDDB. The authors also present a lot of other interesting ideas about how this technology could be applied. In fact, justifications and applications of the technology take up two whole pages of the paper. They are two interesting pages, though.

The real meat of the paper seems to be in the heuristics for matching an audio file against a fingerprint. There's a lot of good old elbow grease involved, including hash tables, which I'm pretty familiar with. The math wasn't so bad; it was mostly just some statistical distributions, including a Gaussian. I wasn't able to follow the reasoning about the "symmetric binary source" all the way through, although I was able to follow the first few steps of the proof.
I can understand why the recording industry is interested in this technology.
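
To make the hash-table idea concrete, here is a minimal sketch of how a lookup like this might work. It is my own illustration, not the paper's code. I'm assuming each song's fingerprint is a list of 32-bit integers, that a noisy query still contains at least one value that matches exactly, and that a candidate counts as a match when the fraction of differing bits falls below a threshold. The function names and the 0.35 threshold are illustrative choices.

```python
from collections import defaultdict


def bit_error_rate(a, b):
    """Fraction of differing bits between two equal-length lists of 32-bit ints."""
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return differing / (32 * len(a))


def build_index(database):
    """Map each 32-bit value to the (song, position) pairs where it occurs."""
    index = defaultdict(list)
    for song, prints in database.items():
        for pos, value in enumerate(prints):
            index[value].append((song, pos))
    return index


def find_match(query, database, index, threshold=0.35):
    """Return (song, offset, ber) for the best candidate under threshold, else None."""
    best = None
    for q_pos, value in enumerate(query):
        # Any exact hit on a single value proposes an alignment to check.
        for song, pos in index.get(value, []):
            start = pos - q_pos
            if start < 0 or start + len(query) > len(database[song]):
                continue  # the query would run off the end of this song
            candidate = database[song][start:start + len(query)]
            ber = bit_error_rate(query, candidate)
            if ber < threshold and (best is None or ber < best[2]):
                best = (song, start, ber)
    return best
```

The hash table only proposes candidate positions; the bit-by-bit comparison is what decides whether a candidate is accepted. That split is what keeps the search cheap: most of the database is never compared against the query at all.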

I'm also interested in whether this system could output any interesting statistics or visualizations about music. It's clear that it analyzes certain patterns in the song. What would those patterns look like if they were displayed? Could audio engineers some day care about these patterns, or is that too far-fetched? If you've ever studied control theory, you know about Bode plots and Nyquist plots, and how much study people put into those. Of course, I have the feeling that this technology will be put to much less exciting uses by the RIAA.

