Don’t Settle for Eventual Consistency

Stronger properties for low-latency geo-replicated storage

WYATT LLOYD, FACEBOOK; MICHAEL J. FREEDMAN, PRINCETON UNIVERSITY; MICHAEL KAMINSKY, INTEL LABS; DAVID G. ANDERSEN, CARNEGIE MELLON UNIVERSITY

Geo-replicated storage provides copies of the same data at multiple, geographically distinct locations. Facebook, for example, geo-replicates its data (profiles, friends lists, likes, etc.) to data centers on the east and west coasts of the United States, and in Europe. In each data center, a tier of separate Web servers accepts browser requests and then handles those requests by reading and writing data from the storage system.
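
To make that setup concrete, here is a minimal sketch in Go of how a web-tier handler might read from its local replica and push writes asynchronously to remote data centers. The replica names and the toy in-memory store are our own illustrative assumptions, not a description of Facebook's actual infrastructure.

package main

import (
	"fmt"
	"sync"
)

// replica is a toy in-memory key-value store standing in for one data center.
type replica struct {
	mu   sync.RWMutex
	name string
	data map[string]string
}

func newReplica(name string) *replica {
	return &replica{name: name, data: make(map[string]string)}
}

func (r *replica) get(key string) string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return r.data[key]
}

func (r *replica) put(key, val string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.data[key] = val
}

// geoStore models a web tier that reads from its local replica and
// replicates writes asynchronously to the remote data centers,
// trading temporary staleness for low-latency operation.
type geoStore struct {
	local   *replica
	remotes []*replica
}

func (g *geoStore) read(key string) string {
	return g.local.get(key) // served entirely from the local data center
}

func (g *geoStore) write(key, val string) {
	g.local.put(key, val)
	for _, r := range g.remotes {
		go r.put(key, val) // asynchronous, eventually consistent replication
	}
}

func main() {
	east, west, europe := newReplica("east"), newReplica("west"), newReplica("europe")
	store := &geoStore{local: east, remotes: []*replica{west, europe}}
	store.write("profile:42", "Alice")
	fmt.Println(store.read("profile:42"))
}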


Related:
Proving the Correctness of Nonblocking Data Structures
Eventual Consistency Today: Limitations, Extensions, and Beyond
Structured Deferral: Synchronization via Procrastination

A Primer on Provenance

Better understanding of data requires tracking its history and context.

LUCIAN CARATA, SHERIF AKOUSH, NIKILESH BALAKRISHNAN, THOMAS BYTHEWAY, RIPDUMAN SOHAN, MARGO SELTZER, ANDY HOPPER

Assessing the quality or validity of a piece of data is not usually done in isolation. You typically examine the context in which the data appears and try to determine its original sources or review the process through which it was created. This is not so straightforward when dealing with digital data, however: the result of a computation might have been derived from numerous sources and by applying complex successive transformations, possibly over long periods of time.
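
As a rough illustration of what such tracking involves, the Go sketch below (with made-up field names, not drawn from any particular provenance system) records, for each derived datum, its input sources and the transformation that produced it, so a chain of derivations can later be walked backward.

package main

import (
	"fmt"
	"time"
)

// Record captures the provenance of one derived datum: which inputs it
// was derived from and which transformation produced it.
type Record struct {
	ID        string
	Inputs    []*Record // sources this datum was derived from
	Transform string    // description of the step that produced it
	When      time.Time
}

// Lineage walks the derivation chain backward, listing every ancestor.
func (r *Record) Lineage() []string {
	out := []string{fmt.Sprintf("%s (via %q)", r.ID, r.Transform)}
	for _, in := range r.Inputs {
		out = append(out, in.Lineage()...)
	}
	return out
}

func main() {
	raw := &Record{ID: "sensor-dump.csv", Transform: "collected", When: time.Now()}
	clean := &Record{ID: "cleaned.csv", Inputs: []*Record{raw}, Transform: "drop-nulls", When: time.Now()}
	report := &Record{ID: "report.pdf", Inputs: []*Record{clean}, Transform: "aggregate", When: time.Now()}
	for _, line := range report.Lineage() {
		fmt.Println(line)
	}
}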


Related:
Provenance in Sensor Data Management
CTO Roundtable: Storage
Better Scripts, Better Games

 

Queue Portrait #5: Hilary Mason


Bitly’s Chief Data Scientist, Hilary Mason, describes what data science is and how to build systems that make doing data science possible.

Hilary Mason, Chief Data Scientist at Bitly, discusses the current state of data science.

https://vimeo.com/74990264

Proving the Correctness of Nonblocking Data Structures

Nonblocking synchronization can yield astonishing results in terms of scalability and real-time response, but at the expense of verification state space.

MATHIEU DESNOYERS, EFFICIOS

So you’ve decided to use a nonblocking data structure, and now you need to be certain of its correctness. How can this be achieved?

When a multithreaded program is too slow because of a frequently acquired mutex, the programmer’s typical reaction is to question whether this mutual exclusion is indeed required. This doubt becomes even more pronounced if the mutex protects accesses to only a single variable performed using a single instruction at every site. Removing synchronization improves performance, but can it be done without impairing program correctness?
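
For the single-variable, single-instruction case described above, the usual nonblocking answer is an atomic operation. The Go sketch below (a toy counter of our own, not an example from the article) shows a mutex-protected counter next to a lock-free equivalent; proving correctness for anything richer than this is exactly where the verification question begins.

package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// lockedCounter protects a single integer with a mutex.
type lockedCounter struct {
	mu sync.Mutex
	n  int64
}

func (c *lockedCounter) inc() {
	c.mu.Lock()
	c.n++
	c.mu.Unlock()
}

// atomicCounter achieves the same effect without blocking,
// using a single atomic read-modify-write operation.
type atomicCounter struct {
	n atomic.Int64
}

func (c *atomicCounter) inc() {
	c.n.Add(1)
}

func main() {
	var wg sync.WaitGroup
	lc, ac := &lockedCounter{}, &atomicCounter{}
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			lc.inc()
			ac.inc()
		}()
	}
	wg.Wait()
	fmt.Println(lc.n, ac.n.Load()) // both print 1000
}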


Related:

Software and the Concurrency Revolution

Real-World Concurrency

 

Eventual Consistency Today: Limitations, Extensions, and Beyond

How can applications be built on eventually consistent infrastructure given no guarantee of safety?

PETER BAILIS AND ALI GHODSI, UC BERKELEY

In a July 2000 conference keynote, Eric Brewer, now VP of engineering at Google and a professor at the University of California, Berkeley, publicly postulated the CAP (consistency, availability, and partition tolerance) theorem, which would change the landscape of how distributed storage systems were architected.8 Brewer’s conjecture—based on his experiences building infrastructure for some of the first Internet search engines at Inktomi—states that distributed systems requiring always-on, highly available operation cannot guarantee the illusion of coherent, consistent single-system operation in the presence of network partitions, which cut communication between active servers. Brewer’s conjecture proved prescient: in the following decade, with the continued rise of large-scale Internet services, distributed-system architects frequently dropped “strong” guarantees in favor of weaker models—the most notable being eventual consistency.
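
A minimal way to see the trade-off (an example of ours, not Brewer's): when a replica is cut off from its peers, it must either answer from possibly stale local state, staying available, or refuse to answer, staying consistent. The Go sketch below makes that choice explicit.

package main

import (
	"errors"
	"fmt"
)

// node is one replica; partitioned means it cannot reach the other replicas.
type node struct {
	data        map[string]string
	partitioned bool
}

// readAvailable always answers from local state, even when partitioned,
// so it may return stale data (the availability-first choice).
func (n *node) readAvailable(key string) (string, error) {
	return n.data[key], nil
}

// readConsistent refuses to answer during a partition rather than risk
// returning a value other replicas have since overwritten (the consistency-first choice).
func (n *node) readConsistent(key string) (string, error) {
	if n.partitioned {
		return "", errors.New("partitioned: cannot confirm latest value")
	}
	return n.data[key], nil
}

func main() {
	n := &node{data: map[string]string{"x": "1"}, partitioned: true}
	v, _ := n.readAvailable("x")
	fmt.Println("available read:", v)
	if _, err := n.readConsistent("x"); err != nil {
		fmt.Println("consistent read:", err)
	}
}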


Related:

Hazy: Making it Easier to Build and Maintain Big-data Analytics

Racing to unleash the full potential of big data with the latest statistical and machine-learning techniques.

ARUN KUMAR, FENG NIU, AND CHRISTOPHER RÉ, DEPARTMENT OF COMPUTER SCIENCES, UNIVERSITY OF WISCONSIN-MADISON

The rise of big data presents both big opportunities and big challenges in domains ranging from enterprises to sciences. The opportunities include better-informed business decisions, more efficient supply-chain management and resource allocation, more effective targeting of products and advertisements, better ways to “organize the world’s information,” faster turnaround of scientific discoveries, etc.


Related:

The Pathologies of Big Data

Condos and Clouds

How Will Astronomy Archives Survive the Data Tsunami?

 

All Your Database Are Belong to Us

In the big open world of the cloud, highly available distributed objects will rule.

ERIK MEIJER, MICROSOFT

In the database world, the raw physical data model is at the center of the universe, and queries freely assume intimate details of the data representation (indexes, statistics, metadata). This closed-world assumption and the resulting lack of abstraction have the pleasant effect of allowing the data to outlive the application. On the other hand, this makes it hard to evolve the underlying model independently from the queries over the model.
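
One way to picture that tension (a sketch of ours, not Meijer's own example): a query written against the physical representation bakes in table and column layout, whereas an interface over the data lets the representation evolve behind it, at the price of no longer being able to exploit physical details.

package main

import "fmt"

// Closed-world view: the query string assumes the exact table and column
// layout, so any change to the physical representation breaks it.
const rawQuery = "SELECT user_id, liked_at FROM likes_by_user WHERE user_id = ?"

// Open-world view: callers see only behavior; the representation behind
// the interface can change without touching the callers.
type LikeStore interface {
	LikesFor(userID string) ([]string, error)
}

// inMemoryLikes is one possible representation hidden behind the interface.
type inMemoryLikes struct {
	byUser map[string][]string
}

func (s *inMemoryLikes) LikesFor(userID string) ([]string, error) {
	return s.byUser[userID], nil
}

func main() {
	var store LikeStore = &inMemoryLikes{byUser: map[string][]string{"42": {"queue", "provenance"}}}
	likes, _ := store.LikesFor("42")
	fmt.Println(rawQuery) // coupled to the physical model
	fmt.Println(likes)    // independent of it
}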

http://queue.acm.org/detail.cfm?id=2338507

 

Related:

The Rise and Fall of CORBA

How Will Astronomy Archives Survive the Data Tsunami?

Cybercrime 2.0: When the Cloud Turns Dark