Databases

Vol. 3 No. 3 – April 2005

Databases

A Call to Arms:
Long anticipated, the arrival of radically restructured database architectures is now finally at hand.

We live in a time of extreme change, much of it precipitated by an avalanche of information that otherwise threatens to swallow us whole. Under the mounting onslaught, our traditional relational database constructs—always cumbersome at best—are now clearly at risk of collapsing altogether. In fact, rarely do you find a DBMS anymore that doesn’t make provisions for online analytic processing. Decision trees, Bayes nets, clustering, and time-series analysis have also become part of the standard package, with allowances for additional algorithms yet to come. Also, text, temporal, and spatial data access methods have been added—along with associated probabilistic logic, since a growing number of applications call for approximated results. Column stores, which store data column-wise rather than record-wise, have enjoyed a rebirth, mostly to accommodate sparse tables, as well as to optimize bandwidth.

by Jim Gray, Mark Compton

Beyond Relational Databases:
There is more to data access than SQL.

The number and variety of computing devices in the environment are increasing rapidly. Real computers are no longer tethered to desktops or locked in server rooms. PDAs, highly mobile tablet and laptop devices, palmtop computers, and mobile telephony handsets now offer powerful platforms for the delivery of new applications and services. These devices are, however, only the tip of the iceberg. Hidden from sight are the many computing and network elements required to support the infrastructure that makes ubiquitous computing possible.

by Margo Seltzer

Databases of Discovery:
Open-ended database ecosystems promote new discoveries in biotech. Can they help your organization, too?

The National Center for Biotechnology Information is responsible for massive amounts of data. A partial list includes the largest public bibliographic database in biomedicine, the U.S. national DNA sequence database, an online free full text research article database, assembly, annotation, and distribution of a reference set of genes, genomes, and chromosomes, online text search and retrieval systems, and specialized molecular biology data search engines. At this writing, NCBI receives about 50 million Web hits per day, at peak rates of about 1,900 hits per second, and about 400,000 BLAST searches per day from about 2.5 million users. The Web site transfers about 0.6 terabytes per day, and people interested in local copies of bulk data FTP about 1.2 terabytes per day.

by James Ostell

File under "Unknowable!":
It’s been ahard day’s night—proving nonexistence!

The Yellow Pages used to advertise along the following lines: “If you can’t find it here, it does not exist.” Shannan Hobbes, my favorite epistemologist, and I would ponder this claim well into the wee hours, testing its validity by searching for vendors of “Square Circles,” “Pet Unicorns,” “Cold Fusion,” “The Largest Prime Number,” “Reigning Bald Kings of France,” and similar quiddities oft debated in the PhilTrans (Philosophical Transactions). Our mounting failures—or, to quickly remove the scandalous ambiguity—our growing number of “not founds” amounted to some sort of inductive verification (of which, more anon). The Yellow Pages, considered as a merely finite hierarchy of marketable strings, has nothing much to tell us of contingent, objective existence. Indeed, we searched in vain for many incontrovertibly real, corporeal “thingies.” No sign of the Renaissance crypto-text, Hypnerotomachia Poliphili, which will be on everyone’s lips as soon as the Da Vinci Code mercifully fades from the best-seller lists.

by Stan Kelly-Bootle

A Conversation with Pat Selinger:
Leading the way to manage the world’s information

Take Pat Selinger of IBM and James Hamilton of Microsoft and put them in a conversation together, and you may hear everything you wanted to know about database technology and weren’t afraid to ask. Selinger, IBM Fellow and vice president of area strategy, information, and interaction for IBM Research, drives the strategy for IBM’s research work spanning the range from classic database systems through text, speech, and multimodal interactions. Since graduating from Harvard with a Ph.D. in applied mathematics, she has spent almost 30 years at IBM, hopscotching between research and development of IBM’s database products.

Kode Vicious Battles On:
Kode Vicious is at it again, dragging you out of your koding quagmires and kombating the enemies of kommon sense.

Dear KV, I’m maintaining some C code at work that is driving me right out of my mind. It seems I cannot go more than three lines in any file without coming across a chunk of code that is conditionally compiled.

by George Neville-Neil