
A Real Music-Matching Database?






The catalogs that have begun to appear on the net in the last year are a step in the right direction, but they do not go all the way. It's still a big problem for a new act to get noticed amid the ocean of competition. This means that merit alone will not do the trick -- it must be supported by other promotional methods and often by some kind of mentoring from inside the industry.

With some luck, and a sustained effort over a long time, enough word-of-mouth can sometimes allow a new act to get recognition to the point that they can "hit the big time." But this is far from certain. It is very important to understand that, in the current state of the music business, hard work, dedication and "stick-to-it-iveness" guarantee nothing. The myths that run the machine don't acknowledge this, but it is the reality that underlies the music business.

What I would love to see, but which seems a decade or two away, is an online system that automatically (i.e., with well-defined algorithms requiring no subjective judgments by any human beings) evaluates music and stores descriptions of it in a large database. A user might enter a search spec for a piece of music that "sounds about half like A and about half like B" (where A and B are particular pieces the user is familiar with), and the system would automatically come up with the top-N choices, ranked by closeness to the blend specified by the user.
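To make the "half like A, half like B" search concrete, here is a minimal sketch. It assumes that each piece has already been reduced to a list of numbers (a feature vector), which is exactly the part that does not exist yet. The catalog entries, the two-number vectors, and the function names are all made up for illustration, and plain Euclidean distance is just one possible measure of "closeness."

```python
import math

def blend(vec_a, vec_b, weight_a=0.5):
    """Weighted mix of two feature vectors (weight_a=0.5 means half and half)."""
    return [weight_a * x + (1.0 - weight_a) * y for x, y in zip(vec_a, vec_b)]

def rank_by_blend(catalog, name_a, name_b, weight_a=0.5, top_n=3):
    """Return the top_n catalog entries closest to the requested blend of A and B."""
    target = blend(catalog[name_a], catalog[name_b], weight_a)
    candidates = [
        (math.dist(target, vec), name)      # Euclidean distance to the blend
        for name, vec in catalog.items()
        if name not in (name_a, name_b)     # don't suggest the reference pieces
    ]
    candidates.sort()                       # smallest distance first
    return [name for _, name in candidates[:top_n]]

# Hypothetical catalog: every number here is invented for the example.
catalog = {
    "A": [0.0, 0.0],
    "B": [1.0, 1.0],
    "C": [0.5, 0.5],
    "D": [0.9, 0.1],
    "E": [0.4, 0.7],
}

print(rank_by_blend(catalog, "A", "B", weight_a=0.5, top_n=2))
```

Changing weight_a to something like 0.8 would ask for music that is mostly like A with a little of B. The hard part remains producing the vectors in the first place.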

There are selection systems in development (and maybe coming into actual use) where users respond to a questionnaire of some kind about what music they like, and the system then searches for pairs of entries that have higher-than-expected correlations (a simple statistical analysis). For example, if fans of Bruce Springsteen are more than typically likely to also like Don Henley, then that would show up when a Springsteen fan asks the system for suggestions as to other music to look for.
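As a rough illustration of that kind of statistical analysis, the sketch below counts how often two artists show up together in users' responses and compares that with what chance alone would predict (a ratio often called "lift"). This is only one simple way to measure "higher-than-expected"; the actual systems may work differently. The questionnaire data is invented, and "Artist X" is a placeholder.

```python
from collections import Counter
from itertools import combinations

def pair_lifts(responses):
    """Map each pair of artists to observed co-occurrence / expected by chance."""
    n = len(responses)
    single = Counter()
    pair = Counter()
    for liked in responses:
        single.update(liked)
        pair.update(frozenset(p) for p in combinations(sorted(liked), 2))
    lifts = {}
    for p, count in pair.items():
        a, b = tuple(p)
        # P(a and b) / (P(a) * P(b)); values above 1 mean "more than expected"
        lifts[p] = count * n / (single[a] * single[b])
    return lifts

def suggest(responses, artist, min_lift=1.0):
    """Artists liked alongside `artist` more often than chance, best first."""
    scored = []
    for p, lift in pair_lifts(responses).items():
        if artist in p and lift > min_lift:
            (other,) = p - {artist}
            scored.append((lift, other))
    scored.sort(reverse=True)
    return [other for _, other in scored]

# Invented questionnaire results: each set is one user's list of likes.
responses = [
    {"Bruce Springsteen", "Don Henley"},
    {"Bruce Springsteen", "Don Henley"},
    {"Bruce Springsteen", "Artist X"},
    {"Artist X"},
    {"Artist X"},
]

print(suggest(responses, "Bruce Springsteen"))
```

Note that an act nobody has listed yet never appears in any pair, so it can never be suggested.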

But these systems rely on user feedback, which has two serious drawbacks: user subjectivity (why should I have to adhere to what you think?), and lack of comprehensiveness (it still requires a certain minimum number of users to have heard your music and included it in their feedback before you start showing up on the suggestion lists). Basically, it moves from the whims of a few individual reviewers to the whims of a larger group of people. But it's still a case of "majority rules," and an individual who has unusual values will be less likely to find what they're looking for.

The obstacles facing a truly automatic system are twofold.

One: the process of transcribing a complex polyphonic sound sample into a symbolic notation adequate to describe the variables that would be used to "score" the music is not yet at hand. Eric Scheirer, who is working on this task at the MIT Media Lab, predicts that it will take one to two decades to achieve this. (His Master's thesis involved a system that was able to fine-tune the nuances of a performance of a piano piece for which the complete score was already given and entered, but analyzing a multi-instrumental recording with no score is well beyond any current algorithms.)

Two: it is not clear which variables should be stored (deciding this would require understanding mathematically what the human ear recognizes in complex sounds). The MIT Media Lab folks are more optimistic about this (they refer to an eigenvector analysis, sometimes known as a "principal components" or "factor" analysis). However, until this kind of research can be carried out in real life, it is impossible to know the results of such investigations.
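For readers unfamiliar with the term, here is a toy sketch of what an eigenvector (principal components) analysis does: given several measurements per piece, it finds the direction along which the pieces differ most, which is a candidate for a variable worth storing. The two measurements per piece are invented, and the power-iteration method used here is just a compact way to find the leading eigenvector; it is not a description of the Media Lab's approach.

```python
import math

def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length measurement rows."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    centered = [[r[j] - means[j] for j in range(dims)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dims)]
        for i in range(dims)
    ]

def first_principal_component(rows, iterations=200):
    """Leading eigenvector of the covariance matrix, found by power iteration.

    Assumes the data actually varies (nonzero covariance) and that the
    all-ones starting vector is not orthogonal to the leading eigenvector.
    """
    cov = covariance_matrix(rows)
    dims = len(cov)
    vec = [1.0] * dims
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    return vec

# Invented data: two measurements per piece, where the second is always
# twice the first, so all the variation lies along the direction (1, 2).
pieces = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]

pc = first_principal_component(pieces)
print(pc)
```

In this contrived case the two measurements carry the same information, and the analysis shows that a single combined variable would describe the pieces just as well. Real audio would involve many more measurements and far messier data.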

If anyone knows of a source of significant funding for such research, be sure to email the webmaster!

-- Dan Krimm, 3/96