Yesterday, we hosted a party here at Forbes house. It's interesting being the host. It's been a good carnival.

Anyway-- I was reading this guy's site about SCIDs (source code in database systems).
It's at http://mindprod.com/projects/scid.html

In his words:
We have been teaching our customers to regard their data as a precious resource that should be milked and reused by finding many possible ways of summarising, viewing and updating it.
However, we programmers have not yet learned to treat our source code as a similar structured data resource. This is an enormous project, but you could start small. The basic idea is you pre-parse your code and put it in a database.

This is an ambitious idea, indeed. He's proposing that we change the whole source-code-as-text paradigm that's been in place-- well-- ever since punch cards died out a half-century ago. In its place, he would put a database which programmers could modify in various ways.

The ultimate goal is to come up with a better way of programming than just writing text on a screen. You could use pointing and clicking to design GUIs, or write little scripts to find similarities in various parts of the database. With an editor that truly understands the language it is written for, the possibilities really are limitless. In essence, everything the compiler knows, the editor now knows.
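To make the "little scripts to find similarities" idea a bit more concrete, here is a toy sketch in Python. It is my own illustration, not anything from the mindprod page: it uses Python's standard ast module as the "pre-parse" step and an in-memory SQLite table as the "database". The table layout, the shape() helper, and the sample functions are all made up for the example.

```python
import ast
import sqlite3

# Hypothetical sample code standing in for a real project.
SOURCE = '''
def area(w, h):
    return w * h

def volume(w, h, d):
    return w * h * d

def product(a, b):
    return a * b
'''

def shape(node):
    """Describe a subtree by node types only, ignoring names and constants."""
    children = ", ".join(shape(c) for c in ast.iter_child_nodes(node))
    return f"{type(node).__name__}({children})"

def load(source, conn):
    """Pre-parse the source and store one row per function definition."""
    conn.execute(
        "CREATE TABLE functions "
        "(name TEXT, line INTEGER, n_args INTEGER, body_shape TEXT)"
    )
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            body_shape = ";".join(shape(stmt) for stmt in node.body)
            conn.execute(
                "INSERT INTO functions VALUES (?, ?, ?, ?)",
                (node.name, node.lineno, len(node.args.args), body_shape),
            )

conn = sqlite3.connect(":memory:")
load(SOURCE, conn)

# Pairs of functions whose bodies have the same structure.
similar = conn.execute(
    """
    SELECT a.name, b.name
    FROM functions a JOIN functions b
      ON a.body_shape = b.body_shape AND a.name < b.name
    ORDER BY a.name, b.name
    """
).fetchall()
print(similar)  # [('area', 'product')]
```

The query pairs up area and product because their bodies differ only in variable names, which is exactly the kind of question that is awkward to ask of plain text. A real SCID would obviously need far more than one table, but the flavor is the same.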

There are a lot of potential problems here. The biggest practical problem is vendor lock-in. Using plain old text for source files is a well-known and well-understood standard. If a company moves to some proprietary SCID, there's no guarantee that they will be able to export their code in a reasonably readable and usable form to another SCID. Why would any vendor make this easy? I guess reverse engineering is sometimes legal, but vendors could set up a number of copyright and patent roadblocks under the current system that would effectively trap any company switching to a proprietary SCID. If I were a manager contemplating this, I would be very afraid.

Secondly, for those using Unix, text is basically the lowest common denominator of the system. Traditional Unix tools like grep, awk, cat, and find don't work on databases. I guess you could write equivalents of these tools for your new database, but that would be a lot of work.
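For a sense of what one of those equivalents might look like, here is a rough sketch of a grep-like lookup over a parsed-code table, again in Python with the standard ast and sqlite3 modules. Everything in it (the names table, the scid_grep function, the sample source) is hypothetical and only meant to show the shape of the work involved.

```python
import ast
import sqlite3

# Hypothetical three-line program; note "total" also appears inside a string.
SOURCE = '''
total = 0
label = "total so far"
print(label, total)
'''

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (ident TEXT, line INTEGER, role TEXT)")
for node in ast.walk(ast.parse(SOURCE)):
    if isinstance(node, ast.Name):
        # Simplification: anything that is not an assignment target counts as a read.
        role = "write" if isinstance(node.ctx, ast.Store) else "read"
        conn.execute(
            "INSERT INTO names VALUES (?, ?, ?)", (node.id, node.lineno, role)
        )

def scid_grep(conn, ident):
    """Return (line, role) for every use of an identifier in the parsed code."""
    return conn.execute(
        "SELECT line, role FROM names WHERE ident = ? ORDER BY line", (ident,)
    ).fetchall()

# What a plain text search sees: every line containing the substring.
text_hits = [
    i for i, line in enumerate(SOURCE.splitlines(), start=1) if "total" in line
]

print(text_hits)                 # [2, 3, 4]
print(scid_grep(conn, "total"))  # [(2, 'write'), (4, 'read')]
```

The text search also matches line 3, where "total" only appears inside a string, while the database lookup returns just the two real uses of the variable. That is the upside. The downside is the one above: this is a single tool for a single language, and grep already works on everything.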

Finally, in a more philosophical sense, the SCID philosophy tends to run counter to the traditional "separation of function" philosophy under which the compiler, text editor, and revision control system were all separate entities. The old philosophy wasn't perfect, but it tended to give the authors of all of these tools a pretty good idea of what their jobs should be. In a very real sense, implementing a high-quality SCID forces us to reconsider the real-world problem of division of responsibility. Should the CVS or Subversion people work with the SCID people on a new feature they want to add? Does the GCC team need to meet with the text editor team before releasing their new revision? It's even possible to argue that only the biggest, most monolithic companies have the resources to produce something with as much vertical integration as a true SCID. And these are not necessarily the organizations we want in control.

As the author mentions in closing, some SCID-like systems have already been built. Perhaps Eclipse is the most famous one. I guess IDEs (integrated development environments) of all kinds could potentially evolve into SCIDs slowly. Generally, most IDEs don't provide very much meta-programming functionality now. It will be interesting to see how this plays out in the future. As I mentioned before, this may not be a very good research topic for academics, because the players with the resources to make these kinds of systems a reality are the big guys like Microsoft and IBM. As with so many other problems the software industry faces today, it's a (lack of) infrastructure problem.

