


Databases

 




Originally published in Queue vol. 9, no. 4




Related:

Pat Helland - Identity by Any Other Name
The complex cacophony of intertwined systems


Raymond Blum, Betsy Beyer - Achieving Digital Permanence
The many challenges to maintaining stored information and ways to overcome them


Graham Cormode - Data Sketching
The approximate approach is often faster and more efficient.


Heinrich Hartmann - Statistics for Engineers
Applying statistical techniques to operations data



Comments

(newest first)

Benjamin Black | Fri, 22 Apr 2011 21:26:37 UTC

I covered similar ground in a rather similar way in this GigaOm article from last year: http://gigaom.com/cloud/nosql-is-for-the-birds/

b


Michael Schuerig | Wed, 20 Apr 2011 20:01:06 UTC

Michael, I think what you are describing could be called "scaling SQL by non-relational means". The approach is more or less the same as what NoSQL people would do, with the only difference being how data is stored at the bottom: relationally or some other way.

When scaling up, "relationality" gets lost. I don't know, much less claim, that it has to be that way. Scaling up in practice means distributing data and processing over many nodes with fallible connections. Arbitrary relational operations, joins in particular, are no longer practical in such a setting.

Queries are no longer independent of the physical organization of data. To the contrary, the physical organization must be specifically designed to optimize for expected queries.

This comment is by no means intended as a criticism of your or anyone else's approach. It's meant as a reminder of the unavoidable(?) price we pay for scaling, namely that we no longer have a purely logical model of data.


Andrew Wolfe | Wed, 20 Apr 2011 15:18:41 UTC

I am a proud employee of Oracle Corporation, but this is my personal opinion, not written on behalf of Oracle.

The claim that relational databases cannot scale and that NoSQL databases are somehow magically able to perform better on huge datasets is uninformed or outdated. I remember the stunned look on a NoSQL proponent's face around 2008 when I told him I could run multiple simultaneous 10k rows/second imports into a departmental-sized Oracle server. The same organization considered 250+ midtier application servers to be a "scalable" solution for a mid-sized e-commerce site, much better than the single 4-core database server that was envisioned.

Proponents of NoSQL and critics of SQL really have to show due diligence in SQL tuning, not only along the lines of this article but also in leveraging vendor capabilities in hardware and software. A NoSQL solution may seem to get implemented fast, but as you debug the multiple threading, concurrent data loads, interleaving multiple data streams across multiple disks, multiple servers - are you really saving implementation time? Haven't you just shifted time from well-known relational DBA practices to a similar amount of improvisational NoSQL design and development?

In relational databases, scalability is "compromised" by the maintenance of transactional integrity - the so-called "ACID" properties. NoSQL proponents often correctly identify this issue. As someone who has worked with large data sets, I'm not on board with relaxing integrity support. In fact, the very size of the datasets requires vastly MORE attention to correctness, not a mindless obsession with performance numbers. The only thing worth "scaling" is not data throughput, but DURABLE TRANSACTIONS. The single-percentage data error one can tolerate with 500 records becomes intolerable at 500k. At 500M records, one percent is absolutely catastrophic. At that scale you can hardly even find out if you have a data problem without a relational database to sift it for you. It's unaccountable. Frankly, I believe many people choose non-relational data storage not despite, but because it limits their accountability for administering a large data store.

So I'll go Michael Rys one step further. SQL is not just a good thing to scale into millions and billions - it's the ONLY thing that does.

Respectfully,

Andrew D Wolfe Jr









© 2018 ACM, Inc. All Rights Reserved.