
Distributed Development



Originally published in Queue vol. 13, no. 7



Martin Kleppmann, Alastair R. Beresford, Boerge Svingen - Online Event Processing
Achieving consistency where distributed transactions have failed

Andrew Leung, Andrew Spyker, Tim Bozarth - Titus: Introducing Containers to the Netflix Cloud
Approaching container adoption in an already cloud-native infrastructure

Marius Eriksen - Functional at Scale
Applying functional programming principles to distributed computing projects

Caitie McCaffrey - The Verification of a Distributed System
A practitioner's guide to increasing confidence in system correctness


(newest first)

Yomi Bazuaye | Tue, 13 Oct 2015 22:10:26 UTC

Great article, lots of good points to think about here.

I think that the "node reconstruction" scenario that you described adds real value to any big data / distributed system test suite (in fact I'm going to start working on something similar for my current project tomorrow).

Thanks for sharing, Yomi

Jaksa | Fri, 09 Oct 2015 12:28:30 UTC

I'm working on a Paxos implementation (https://github.com/jaksa76/paxos/) and I'm facing the exact problems that you're describing. Testing components in isolation is necessary but not sufficient. I found it easier to test by using a more reactive design of the components: "when this message arrives, change state to this and send these messages". That makes it closer to model checking and makes it easier to test various scenarios on the components, including message reordering, message loss, and processes crashing at specific points. However, not using a blocking request/response primitive made the code very "spaghettish": you need to jump around to follow a single line of thought. I had done a mockup of the algorithm some time ago using RMI and it was waaaay more elegant (although less efficient).
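The reactive style described in this comment can be sketched roughly as follows. This is a minimal illustration, not code from the linked repository: it assumes a heavily simplified single-decree Paxos acceptor, and every name in it (`Acceptor`, `Prepare`, `Accept`, `run`, `check_all_orderings`) is hypothetical. The handler takes one message, updates its state, and returns the messages to send, so a test can replay any ordering or drop messages without sockets or threads.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Optional, Tuple


@dataclass(frozen=True)
class Prepare:
    ballot: int


@dataclass(frozen=True)
class Accept:
    ballot: int
    value: str


@dataclass(frozen=True)
class Promise:
    ballot: int
    accepted: Optional[Tuple[int, str]]


@dataclass(frozen=True)
class Accepted:
    ballot: int
    value: str


@dataclass(frozen=True)
class Nack:
    ballot: int


class Acceptor:
    """Hypothetical, simplified acceptor: a pure message handler."""

    def __init__(self):
        self.promised = -1      # highest ballot promised so far
        self.accepted = None    # (ballot, value) last accepted, if any

    def handle(self, msg):
        """Apply one incoming message and return the outgoing messages."""
        if isinstance(msg, Prepare):
            if msg.ballot > self.promised:
                self.promised = msg.ballot
                return [Promise(msg.ballot, self.accepted)]
            return [Nack(self.promised)]
        if isinstance(msg, Accept):
            if msg.ballot >= self.promised:
                self.promised = msg.ballot
                self.accepted = (msg.ballot, msg.value)
                return [Accepted(msg.ballot, msg.value)]
            return [Nack(self.promised)]
        return []  # unknown messages are ignored


def run(messages):
    """Feed messages to a fresh acceptor in the given order."""
    acceptor = Acceptor()
    replies = []
    for msg in messages:
        replies.extend(acceptor.handle(msg))
    return acceptor, replies


def check_all_orderings(messages):
    """Replay every delivery order and check a simple invariant."""
    checked = 0
    for ordering in permutations(messages):
        acceptor, _ = run(ordering)
        # An accepted ballot can never exceed the promised ballot.
        assert acceptor.accepted is None or acceptor.accepted[0] <= acceptor.promised
        checked += 1
    return checked
```

Message loss is simulated by leaving a message out of the list passed to `run`, and a crash at a specific point by truncating the list. The trade-off mentioned in the comment still applies: the logic of one protocol round ends up spread across several branches of `handle`.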

Sebastian Czort | Sat, 08 Aug 2015 09:45:53 UTC

I would add that - for the vast majority of the tests - I find testing each component in the distributed system in isolation a much simpler approach.

I would mock the messages communicated, as well as the system time, in order to test: messages sent by the component, reordering of received messages, lost incoming messages (e.g. from a slow network or node failures), etc.

Kind regards, Sebastian Czort Syncram Consulting
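The isolation approach in the comment above, with both the messages and the system time mocked, might look something like this. It is only a sketch under stated assumptions: the component under test is an invented heartbeat monitor, and `FakeClock`, `FakeTransport`, `HeartbeatMonitor`, and the `"coordinator"` destination are hypothetical names chosen for illustration.

```python
class FakeClock:
    """Test double for system time; the test advances it by hand."""

    def __init__(self, start=0.0):
        self.now = start

    def time(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


class FakeTransport:
    """Test double for the network; records messages instead of sending."""

    def __init__(self):
        self.sent = []

    def send(self, dest, payload):
        self.sent.append((dest, payload))


class HeartbeatMonitor:
    """Hypothetical component: suspects nodes whose heartbeats stop."""

    def __init__(self, clock, transport, timeout=5.0):
        self.clock = clock
        self.transport = transport
        self.timeout = timeout
        self.last_seen = {}
        self.suspected = set()

    def on_heartbeat(self, node):
        """Record a received heartbeat and clear any suspicion."""
        self.last_seen[node] = self.clock.time()
        self.suspected.discard(node)

    def tick(self):
        """Report each newly silent node once to the coordinator."""
        now = self.clock.time()
        for node, seen in self.last_seen.items():
            if now - seen > self.timeout and node not in self.suspected:
                self.suspected.add(node)
                self.transport.send("coordinator", ("suspect", node))


def lost_heartbeat_scenario():
    """Node "b" loses a heartbeat; node "a" keeps reporting."""
    clock, transport = FakeClock(), FakeTransport()
    monitor = HeartbeatMonitor(clock, transport, timeout=5.0)
    monitor.on_heartbeat("a")
    monitor.on_heartbeat("b")
    clock.advance(3.0)
    monitor.on_heartbeat("a")   # the heartbeat from "b" is lost here
    clock.advance(3.0)
    monitor.tick()
    monitor.tick()              # a second tick must not report "b" again
    return monitor, transport
```

Because the clock and transport are injected, the scenario runs instantly and deterministically, and the test asserts on `transport.sent` rather than on real network traffic.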

Lukasz Gawron | Mon, 13 Jul 2015 10:24:35 UTC

Hi, thanks for this article, I found it interesting. One question, though: should we extend our systems to allow easier testing?

You wrote that it may sometimes require adding endpoints to the API exclusively for testing. I'm not convinced by this approach, as it blurs what the system actually does. For example, let's say the system doesn't support deleting items, but for testing purposes we add a "delete endpoint" and use it for data removal after a test. That adds complexity, as this endpoint needs to be tested too, and it adds risk because it could negatively impact business endpoints. It also becomes less obvious which endpoints are used in real-world scenarios and which are just for testing. It does of course have some pros: the CI server doesn't need to be coupled to a backend service and instead relies only on the public API, and refactoring such services is simpler. Sometimes we also can't access the backend service directly, and extending the API may be the only option to test something.

Can you elaborate on why you chose to add an explicit endpoint just for testing? Did you experience any problems with this approach? Did you try hitting specific nodes directly from tests instead of extending the API?

Regards, Lukasz @lukaszgawron at twitter

Mikhail Golubtsov | Mon, 06 Jul 2015 15:01:45 UTC

Why not take the approach used in Selenium WebDriver? You can wait for a certain condition of the system to hold without explicitly defining a fixed time to wait. WebDriver checks the condition at a configurable interval (every second or so), and as soon as the condition is satisfied, the test passes. There is also a time limit on the wait, and when it expires, the test fails. So you wait longer for failures, but much less for tests that pass! This method is well suited to testing any asynchronous effects.
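The polling wait described in this comment can be sketched in a few lines. This is a generic helper written for illustration, not Selenium's own API; `wait_until` and its parameters are assumed names, and the clock and sleep functions are injectable so the helper itself can be tested quickly.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5,
               clock=time.monotonic, sleep=time.sleep):
    """Poll condition() until it returns a truthy value or timeout elapses.

    Returns True as soon as the condition holds, and False once the
    deadline has passed without the condition becoming true.
    """
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def make_eventually_true(after_calls):
    """Hypothetical stand-in for an asynchronous effect in the system
    under test: becomes true on the given call number."""
    state = {"calls": 0}

    def condition():
        state["calls"] += 1
        return state["calls"] >= after_calls

    return condition
```

In a test this would be used as `assert wait_until(lambda: replica_has(key), timeout=30)`, where `replica_has` is whatever check the system under test offers (a hypothetical name here). A passing test returns as soon as the condition holds, and only a failing test pays the full timeout.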


© 2018 ACM, Inc. All Rights Reserved.