2006-07-30

Bell Labs and the future of technology

This article examines what's going on at Bell Labs today and what it means for the future of technology in general.

In decades past, Bell Labs was famous for doing "blue sky" research that eventually led to, among other things, the transistor and the UNIX operating system. Lately, Bell Labs has been doing what some people would call "applied" computer science research into things like voice recognition, data mining, and wireless security.

"The real problem," Stokes declares, "is that what AT&T is doing today is not your grandfather's R&D, and neither is the work coming out of Google's labs, or Microsoft's, or the labs of any of the other information economy wunderkinds...
I think it's clear that chaotic, market-driven change is a good way to bring ideas quickly and efficiently from concept to profitable product. However, such a rapid churning of the institutional and cultural landscape ultimately may not be conducive to the kind of steady, expensive, long-term investment in fundamental research that produces the really big ideas that somewhere, at some completely unforeseeable point in the future, change the world."

Hmm. I guess Stokes is arguing that we should be putting more money into "fundamental" research... things like pure biology, pure math, and pure physics. Despite the fact that I'm not a "pure" or "theoretical" anything, I think I agree with that. I'm not sure I really agree with his reasons, though. Does fundamental research really give a competitive advantage to the country that does it?

Will knowing the mass of the top quark keep our economy afloat for decades to come? Especially if the mass is revealed in an internationally published scientific journal?
As with all things economic, the question is very hard to answer. One good thing about funding research is that it brings smart people to your country. If they stay, some of them may go into private industry or have kids that do. (Oftentimes, parents pass on their excitement about science and technology to their kids.) This could be helpful.

Overall, though, I don't think fundamental research always gives an economic advantage to the country doing the research. The idea of "scientific capital" is an oversimplification.

I think people should be doing pure research just for the sake of knowledge, rather than because they want to get a competitive advantage. There's nothing wrong with trying to get a competitive advantage, but that's better left to more nimble private corporations.

2006-07-21

heart surgery robots

The "HeartLander" robot could help with heart surgery. I guess the idea here is to make it possible to do minimally invasive heart surgery. The robot crawls on the heart while it's beating, and can reach areas that normally would require more invasive surgery. It uses nitinol wires to steer and to inch forward like an earthworm. Overall, pretty interesting.

2006-07-11

audio fingerprinting

I've been reading a little bit about classifiers, including this paper on audio fingerprinting.

From the abstract:
Recent years have seen a growing scientific and industrial interest in computing fingerprints of multimedia objects. The growing industrial interest is shown among others by a large number of (startup) companies and the recent request for information on audio fingerprinting technologies by the International Federation of the Phonographic Industry and the Recording Industry Association of America.

The prime objective of multimedia fingerprinting is an efficient mechanism to establish the perceptual equality of two multimedia objects: not by comparing the (typically large) objects themselves, but by comparing the associated fingerprints (small by design)...

I have to admit, "fingerprinting" audio files is an interesting idea. Typically CD databases use things like track length and title to identify CDs, but that information is not fundamentally part of the music. Fingerprinting the tracks would make for a much more effective CDDB. The authors also present a lot of other interesting ideas about how this technology could be applied. In fact, justifications and applications of the technology take up two whole pages of the paper. They are two interesting pages, though.

The real meat of the paper seems to be in the heuristics for matching an audio clip against a database of fingerprints. There's a lot of good old elbow grease involved, including hash tables, which I'm pretty familiar with. The math wasn't so bad; it was mostly just some statistical distributions, including a Gaussian. I wasn't able to follow the reasoning about the "symmetric binary source" all the way through, although I was able to follow the first few steps of the proof.
I can understand why the recording industry is interested in this technology.
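
Here's a toy sketch of how I picture the hash-table matching working, based on my reading of the paper. The frame size, band edges, block length, and error threshold below are my own ballpark guesses, not exact figures from the paper:

    import numpy as np
    from collections import defaultdict

    FRAME = 4096   # samples per analysis frame (ballpark guess)
    HOP = 128      # hop between frames; heavy overlap between frames
    BANDS = 33     # 33 bands -> 32 energy-difference bits per sub-fingerprint

    def sub_fingerprints(samples, rate):
        """Turn a mono signal into a list of 32-bit sub-fingerprints.
        Each bit is the sign of an energy difference across neighbouring
        bands and consecutive frames (my rough reading of the scheme)."""
        edges = np.geomspace(300.0, 2000.0, BANDS + 1)   # log-spaced band edges
        freqs = np.fft.rfftfreq(FRAME, 1.0 / rate)
        window = np.hanning(FRAME)
        energies = []
        for start in range(0, len(samples) - FRAME, HOP):
            spectrum = np.abs(np.fft.rfft(samples[start:start + FRAME] * window)) ** 2
            energies.append([spectrum[(freqs >= lo) & (freqs < hi)].sum()
                             for lo, hi in zip(edges[:-1], edges[1:])])
        energies = np.asarray(energies)
        prints = []
        for t in range(1, len(energies)):
            bits = 0
            for b in range(BANDS - 1):
                diff = (energies[t, b] - energies[t, b + 1]) \
                     - (energies[t - 1, b] - energies[t - 1, b + 1])
                bits = (bits << 1) | int(diff > 0)
            prints.append(bits)
        return prints

    def hamming(a, b):
        return bin(a ^ b).count("1")

    class FingerprintDB:
        """Hash-table index from 32-bit sub-fingerprints to (song, position)."""
        def __init__(self):
            self.songs = {}                 # name -> full list of sub-fingerprints
            self.index = defaultdict(list)  # sub-fingerprint -> [(name, pos), ...]

        def add(self, name, prints):
            self.songs[name] = prints
            for pos, fp in enumerate(prints):
                self.index[fp].append((name, pos))

        def lookup(self, query, block=256, max_ber=0.35):
            """Use each query sub-fingerprint as a hash key, then verify the
            candidate alignment by bit error rate over a whole block."""
            if len(query) < block:
                return None
            for q_pos, fp in enumerate(query[:block]):
                for name, pos in self.index.get(fp, []):
                    start = pos - q_pos
                    ref = self.songs[name][start:start + block]
                    if start < 0 or len(ref) < block:
                        continue
                    errors = sum(hamming(r, q) for r, q in zip(ref, query[:block]))
                    if errors / (32.0 * block) <= max_ber:
                        return name, start
            return None

The nice part is that an exact 32-bit hash hit gets you a candidate alignment almost for free, and the more expensive bit-error check only runs on those candidates.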

I'm also interested in whether this system could output any interesting statistics or visualizations about music. It's clear that it analyzes certain patterns in the song-- what would those patterns look like if they were displayed? Could audio engineers some day care about these patterns, or is that too far-fetched? If you've ever studied controls, you know about Bode plots and Nyquist plots, and how much study people put into those. Of course, somehow, I have the feeling that this technology will be put to much less exciting uses by the RIAA.
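
Still, just for fun, here's how one might render a block of the sub-fingerprints from the sketch above as a black-and-white bit image. This is purely my own toy visualization, not anything from the paper:

    import numpy as np
    import matplotlib.pyplot as plt

    def show_fingerprint_block(prints, n_bits=32):
        """Render a run of 32-bit sub-fingerprints as a bit image:
        columns are time (frames), rows are the band-difference bits."""
        bits = np.array([[(fp >> (n_bits - 1 - b)) & 1 for b in range(n_bits)]
                         for fp in prints])
        plt.imshow(bits.T, cmap="gray", aspect="auto", origin="lower",
                   interpolation="nearest")
        plt.xlabel("frame (time)")
        plt.ylabel("bit (band-difference sign)")
        plt.title("fingerprint block")
        plt.show()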

2006-07-02

"why most published research results are false"

With a title like that, you'd expect this article to be controversial, and it doesn't disappoint.
The article seems to focus on the field of medicine.

Basically, the author comes up with some statistical models of how research proceeds. He's concerned with how studies find "relationships," and defines some variables like the probability of finding a relationship when none exists (alpha) and the probability of not finding a relationship that does, in fact, exist (beta). Then the goal becomes computing a metric called the PPV (positive predictive value): the post-study probability that a claimed relationship is actually true.
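
If I'm following the setup, then without the bias term the PPV works out to (1-beta)R / ((1-beta)R + alpha), where R is the pre-study odds that a probed relationship is true. A quick sketch (the example numbers are mine, not the paper's):

    def ppv(R, alpha=0.05, beta=0.2):
        """Positive predictive value of a claimed relationship.
        R     = pre-study odds that a probed relationship is true
        alpha = chance of claiming a relationship that isn't there
        beta  = chance of missing a relationship that is there"""
        true_positives = (1 - beta) * R   # true relationships, correctly claimed
        false_positives = alpha           # non-relationships, wrongly claimed
        return true_positives / (true_positives + false_positives)

    print(ppv(R=0.5, alpha=0.05, beta=0.2))   # plausible hypothesis, decent power: ~0.89
    print(ppv(R=0.02, alpha=0.05, beta=0.7))  # long-shot hypothesis, weak power: ~0.11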

His model gets kind of complicated when he introduces parameters like "bias." Bias seems to be sort of a fudge factor because it has to be provided from "outside" the model-- just like alpha and beta.
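
For what it's worth, my reading is that the bias term u means a fraction u of the results that would otherwise have come out negative get reported as positive anyway, on both the true and the false side. Extending the sketch above (again with made-up numbers):

    def ppv_with_bias(R, alpha=0.05, beta=0.2, u=0.1):
        """PPV when a fraction u of would-be negative results get
        reported as positive anyway (my reading of the bias term)."""
        true_pos = R * ((1 - beta) + u * beta)   # true relationships claimed
        false_pos = alpha + u * (1 - alpha)      # non-relationships claimed
        return true_pos / (true_pos + false_pos)

    print(ppv_with_bias(R=0.5, u=0.0))  # ~0.89 with no bias
    print(ppv_with_bias(R=0.5, u=0.3))  # ~0.56 with heavy bias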

Anyway, once this is all laid out, he starts computing some things based on study sizes, and finds some disturbing patterns. Apparently, based on these purely statistical arguments alone, a lot of published medical findings could be false.

I'm not sure if I really agree with the contention that "most published research results are false." If that were really true, why would we bother funding medicine at all? But it may well be true that we need to re-examine some of the methodologies here. For example, small studies are inherently risky because they have low statistical power, which (in the author's model) pulls the PPV down.
Another problem is that medicine is increasingly focused on making smaller and smaller incremental gains. This makes things a lot harder on study designers... to take an extreme example, if a medicine decreases your chance of cancer by 0.0001%, is that even measurable by current statistical techniques? The author has some more ideas on how to make the situation better, which he presents at the end of the paper-- the most memorable one for me is the idea that we should try to estimate the pre-study odds that a relationship is true.
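
Out of curiosity, I tried putting a number on that "is it even measurable" question. Here's a back-of-the-envelope sketch using the standard two-proportion sample-size formula; the 4% baseline risk is a number I made up for illustration:

    from statistics import NormalDist

    def n_per_arm(p_control, p_treated, alpha=0.05, power=0.8):
        """Rough sample size per arm to detect a difference between two
        proportions (normal-approximation formula)."""
        z_a = NormalDist().inv_cdf(1 - alpha / 2)
        z_b = NormalDist().inv_cdf(power)
        variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
        return (z_a + z_b) ** 2 * variance / (p_control - p_treated) ** 2

    # made-up example: 4% baseline cancer risk, absolute reduction of 0.0001%
    print(f"{n_per_arm(0.04, 0.04 - 0.000001):.0f} people per arm")  # on the order of 10**11

In other words, no: at that effect size you'd need hundreds of billions of subjects per arm, which is more people than exist.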

I'm not sure if I agree with the author's assessment of the effect of having multiple teams work on the same problem. His model seems to say that the more studies there are on a given phenomenon, the less helpful each study is, to the point where more studies actually hurt. To quote:
"With increasing number of independent studies, PPV tends to decrease, unless 1-beta < alpha"

This seems very counterintuitive. I think it's because he's treating the studies as an aggregate: a relationship counts as "found" if at least one of the n studies claims it. So he more or less straightforwardly substitutes beta^n for beta and 1 - (1 - alpha)^n for alpha in table 3 (vs. table 1). This is a little misleading, because reproducibility is at the heart of science. Suggesting that doing more studies on an issue leads to having less knowledge is a pretty radical statement.
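
To see the effect, here's that aggregate PPV as a function of n, using the same illustrative numbers as before (mine, not the paper's):

    def ppv_n_studies(R, alpha=0.05, beta=0.2, n=1):
        """PPV when a relationship counts as 'found' if at least one of n
        independent studies claims it: beta -> beta**n, alpha -> 1 - (1-alpha)**n."""
        beta_n = beta ** n
        alpha_n = 1 - (1 - alpha) ** n
        return (1 - beta_n) * R / ((1 - beta_n) * R + alpha_n)

    for n in (1, 2, 5, 10):
        print(n, round(ppv_n_studies(R=0.5, n=n), 2))  # 0.89, 0.83, 0.69, 0.55

So within the model the math checks out, but only because "found" means "claimed positive by at least one of the n studies," with no credit given for replication.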

Looking at this paper makes medicine seem like kind of a dismal field. Every result is statistical in nature-- nothing is really certain-- and often the companies and organizations funding the research have vested interests. I guess that's the downside of the field. There are probably a lot of upsides as well, like the chance to help people with their medical problems.