Programming Languages

Vol. 9 No. 5 – May 2011


If You Have Too Much Data, then "Good Enough" Is Good Enough:
In today’s humongous database systems, clarity may be relaxed, but business needs can still be met.

Classic database systems offer crisp answers for a relatively small amount of data. These systems hold their data on one or a relatively small number of computers. With a tightly defined schema and transactional consistency, the results returned from queries are crisp and accurate. New systems have humongous volumes of data, high change rates, and high query rates, and they take lots of computers to hold and process it all. The data quality and meaning are fuzzy. The schema, if present, is likely to vary across the data. The origin of the data may be suspect, and its staleness may vary.

by Pat Helland

Deduplicating Devices Considered Harmful:
A good idea, but it can be taken too far

During the research for their interesting paper, "Reliably Erasing Data From Flash-based Solid State Drives," delivered at the FAST (File and Storage Technologies) conference in San Jose in February, Michael Wei and his co-authors from the University of California, San Diego discovered that at least one flash controller, the SandForce SF-1200, was by default doing block-level deduplication of data written to it. The SF-1200 is used in SSDs (solid-state disks) from, among others, Corsair, ADATA, and Mushkin.

by David Rosenthal

Passing a Language through the Eye of a Needle:
How the embeddability of Lua impacted its design

Scripting languages are an important element in the current landscape of programming languages. A key feature of a scripting language is its ability to integrate with a system language. This integration takes two main forms: extending and embedding. In the first form, you extend the scripting language with libraries and functions written in the system language and write your main program in the scripting language. In the second form, you embed the scripting language in a host program (written in the system language) so that the host can run scripts and call functions defined in the scripts; the main program is the host program. In this setting, the system language is usually called the host language.

by Roberto Ierusalimschy, Luiz Henrique de Figueiredo, Waldemar Celes

Storage Strife:
Beware keeping data in binary format

Where I work, we are very serious about storing all of our data, not just our source code, in our source-code control system. When we started the company, we made the decision to store as much as possible in one place. The problem is that over time we have moved from a pure programming environment to one that includes other people: the kind of people who send e-mails using Outlook and who keep their data in binary and proprietary formats.

by George V. Neville-Neil