


Distributed Development


Originally published in Queue vol. 13, no. 7





Related:

Andrew Leung, Andrew Spyker, Tim Bozarth - Titus: Introducing Containers to the Netflix Cloud
Approaching container adoption in an already cloud-native infrastructure


Marius Eriksen - Functional at Scale
Applying functional programming principles to distributed computing projects


Caitie McCaffrey - The Verification of a Distributed System
A practitioner's guide to increasing confidence in system correctness


Mark Kobayashi-Hillary - A Passage to India
Most American IT employees take a dim view of offshore outsourcing. It's considered unpatriotic and it drains valuable intellectual capital and jobs from the United States to destinations such as India or China. Online discussion forums on sites such as isyourjobgoingoffshore.com are headlined with titles such as "How will you cope?" and "Is your career in danger?" A cover story in BusinessWeek magazine a couple of years ago summed up the angst most people suffer when faced with offshoring: "Is your job next?"



Comments

(newest first)

Yomi Bazuaye | Tue, 13 Oct 2015 22:10:26 UTC

Great article, lots of good points to think about here.

I think that the "node reconstruction" scenario that you described adds real value to any big data / distributed system test suite (in fact I'm going to start working on something similar for my current project tomorrow).

Thanks for sharing, Yomi


Jaksa | Fri, 09 Oct 2015 12:28:30 UTC

I'm working on a Paxos implementation (https://github.com/jaksa76/paxos/) and I'm facing the exact problems that you're describing. Testing components in isolation is a necessary but not sufficient condition. I found it easier to test by using a more reactive design of the components: "when this message arrives, change state to this and send these message(s)". That makes it closer to model checking and makes it easier to test various scenarios on the components, including message reordering, message loss, and processes crashing at specific points. However, not using a blocking req/response primitive made the code very "spaghettish". You need to jump around to follow a single line of thought. I had done a mockup of the algorithm some time ago using RMI and it was waaaay more elegant (although less efficient).
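The reactive style described in this comment can be sketched as a pure transition function. The following is a minimal illustration in Python, not code from the linked repository: the message classes, `AcceptorState`, `handle`, and `run` are all hypothetical names, and the acceptor logic is a simplified single-decree Paxos acceptor assumed for the example.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


# Hypothetical message types for a simplified single-decree Paxos acceptor.
@dataclass(frozen=True)
class Prepare:
    ballot: int


@dataclass(frozen=True)
class Accept:
    ballot: int
    value: str


@dataclass(frozen=True)
class Promise:
    ballot: int
    accepted_ballot: Optional[int]
    accepted_value: Optional[str]


@dataclass(frozen=True)
class Accepted:
    ballot: int
    value: str


@dataclass(frozen=True)
class Nack:
    ballot: int


@dataclass(frozen=True)
class AcceptorState:
    promised: int = -1                      # highest ballot promised so far
    accepted_ballot: Optional[int] = None   # ballot of the accepted value, if any
    accepted_value: Optional[str] = None


def handle(state: AcceptorState, msg: object) -> Tuple[AcceptorState, List[object]]:
    """'When this message arrives, change state to this and send these messages.'"""
    if isinstance(msg, Prepare):
        if msg.ballot > state.promised:
            reply = Promise(msg.ballot, state.accepted_ballot, state.accepted_value)
            return replace(state, promised=msg.ballot), [reply]
        return state, [Nack(msg.ballot)]
    if isinstance(msg, Accept):
        if msg.ballot >= state.promised:
            new_state = AcceptorState(msg.ballot, msg.ballot, msg.value)
            return new_state, [Accepted(msg.ballot, msg.value)]
        return state, [Nack(msg.ballot)]
    return state, []  # unknown messages are ignored


def run(messages, state: AcceptorState = AcceptorState()):
    """Feed a scripted message sequence to the acceptor; collect everything it sends."""
    sent: List[object] = []
    for msg in messages:
        state, out = handle(state, msg)
        sent.extend(out)
    return state, sent


if __name__ == "__main__":
    # The same messages delivered in two different orders.
    print(run([Prepare(1), Accept(1, "a"), Prepare(2)]))
    print(run([Prepare(2), Prepare(1), Accept(1, "a")]))
```

Because `handle` has no I/O and no hidden state, a test can script reordering or loss simply by permuting or dropping entries in the list passed to `run`, and a crash at a specific point by stopping the sequence early and restarting from a saved `AcceptorState`.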


Sebastian Czort | Sat, 08 Aug 2015 09:45:53 UTC

I would add that - for the vast majority of the tests - I find testing each component in the distributed system in isolation a much simpler approach.

I would mock the messages communicated, as well as the system time, in order to test: messages sent from the component, re-ordering of received messages, lost incoming messages (e.g. from a slow network or node failures), etc.

Kind regards, Sebastian Czort Syncram Consulting
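One way to read the suggestion above is to inject the clock into the component and deliver messages by calling its handler directly, so the test controls both time and delivery order. The sketch below assumes a hypothetical heartbeat-based `FailureDetector` and a hand-rolled `FakeClock`; neither comes from the article or the comment.

```python
from typing import Callable, Dict, List


class FakeClock:
    """Test double for the system clock; time moves only when the test says so."""

    def __init__(self, start: float = 0.0) -> None:
        self.now = start

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


class FailureDetector:
    """Hypothetical component: suspects a peer whose heartbeats stop arriving."""

    def __init__(self, clock: Callable[[], float], timeout: float) -> None:
        self._clock = clock
        self._timeout = timeout
        self._last_seen: Dict[str, float] = {}
        self._last_seq: Dict[str, int] = {}

    def on_heartbeat(self, peer: str, seq: int) -> None:
        # A heartbeat with an old sequence number was re-ordered in transit; ignore it.
        if seq <= self._last_seq.get(peer, -1):
            return
        self._last_seq[peer] = seq
        self._last_seen[peer] = self._clock()

    def suspected(self) -> List[str]:
        now = self._clock()
        return sorted(p for p, seen in self._last_seen.items()
                      if now - seen > self._timeout)


if __name__ == "__main__":
    clock = FakeClock()
    detector = FailureDetector(clock, timeout=5.0)
    detector.on_heartbeat("node-a", seq=1)
    clock.advance(6.0)            # simulate lost heartbeats without sleeping
    print(detector.suspected())
```

In production the same class would be constructed with `time.monotonic` as the clock. The test never sleeps, so slow-network and node-failure scenarios run in microseconds.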


Lukasz Gawron | Mon, 13 Jul 2015 10:24:35 UTC

Hi, thanks for this article. I found it interesting, but it raises one question: should we extend our systems to allow easier testing?

You wrote that testing may sometimes require adding API endpoints exclusively for tests. I'm not convinced by this approach, as it blurs what the system actually does. For example, say the system doesn't support deleting items, but for testing purposes we add a "delete endpoint" and use it to remove data after a test. That adds complexity, because the endpoint itself needs to be tested, and it adds risk, because it could negatively impact business endpoints. It also becomes less obvious which endpoints are used in real-world scenarios and which exist just for testing. The approach does have some advantages, of course: the CI server doesn't need to be coupled to a backend service, since it relies only on the public API, and refactoring such services is simpler. Sometimes we also can't access the backend service directly, and extending the API may be the only way to test something.

Can you elaborate on why you chose to add explicit endpoints just for testing? Did you experience any problems with this approach? Did you try hitting specific nodes directly from tests instead of extending the API?

Regards, Lukasz @lukaszgawron at twitter


Mikhail Golubtsov | Mon, 06 Jul 2015 15:01:45 UTC

Why not take the approach used in Selenium WebDriver? You can wait for the system to reach a certain condition without hard-coding a fixed delay. WebDriver checks the condition every second or so, as configured, and as soon as the condition is satisfied the test passes. There is also a time limit on the wait, and when it expires the test fails. So you wait longer for failures, but much less for tests that pass! This method suits testing any asynchronous effects.
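The polling idea in this comment can be sketched in a few lines of Python. This is not Selenium's API, only a generic helper written for illustration; the name `wait_until` and its parameters are assumptions.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool],
               timeout: float = 10.0,
               interval: float = 0.5) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds have elapsed.

    Returns True as soon as the condition holds, and False once the deadline
    passes. The condition is always checked at least once.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    started = time.monotonic()
    # Stand-in for an asynchronous effect that becomes visible after ~0.2 s.
    ok = wait_until(lambda: time.monotonic() - started > 0.2,
                    timeout=2.0, interval=0.05)
    print("condition met" if ok else "timed out")
```

A passing test returns shortly after the condition becomes true, while a failing test pays the full `timeout`, which matches the trade-off the comment describes.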









© 2018 ACM, Inc. All Rights Reserved.