Quality Assurance

Adopting DevOps Practices in Quality Assurance:
Merging the art and science of software development

Software life-cycle management was, for a very long time, a controlled exercise. The duration of product design, development, and support was predictable enough that companies and their employees scheduled their finances, vacations, surgeries, and mergers around product releases. When developers were busy, QA (quality assurance) had it easy. As the coding portion of a release cycle came to a close, QA took over while support ramped up. Then when the product released, the development staff exhaled, rested, and started the loop again while the support staff transitioned to busily supporting the new product.

by James Roche | October 30, 2013


Automated QA Testing at EA: Driven by Events:
A discussion with Michael Donat, Jafar Husain, and Terry Coatta

To millions of game geeks, the position of QA (quality assurance) tester at Electronic Arts must seem like a dream job. But from the company’s perspective, the overhead associated with QA can look downright frightening, particularly in an era of massively multiplayer games.

by Terry Coatta, Michael Donat, Jafar Husain | May 19, 2014


Black Box Debugging:
It’s all about what takes place at the boundary of an application.

Modern software development practices build applications as a collection of collaborating components. Unlike older practices that linked compiled components into a single monolithic application, modern executables are made up of any number of executable components that exist as separate binary files. This design means that as an application component needs resources from another component, calls are made to transfer control or data from one component to another. Thus, we can observe externally visible application behaviors by watching the activity that occurs across the boundaries of the application’s constituent components.

by James A. Whittaker, Herbert H. Thompson | January 29, 2004


Code Spelunking: Exploring Cavernous Code Bases:
Code diving through unfamiliar source bases is something we do far more often than write new code from scratch--make sure you have the right gear for the job.

Try to remember your first day at your first software job. Do you recall what you were asked to do, after the human resources people were done with you? Were you asked to write a piece of fresh code? Probably not. It is far more likely that you were asked to fix a bug, or several, and to try to understand a large, poorly documented collection of source code. Of course, this doesn’t just happen to new graduates; it happens to all of us whenever we start a new job or look at a new piece of code. With experience we all develop a set of techniques for working with large, unfamiliar source bases.

by George V. Neville-Neil | October 1, 2003


Debugging in an Asynchronous World:
Hard-to-track bugs can emerge when you can’t guarantee sequential execution. The right tools and the right techniques can help.

Pagers, cellular phones, smart appliances, and Web services - these products and services are almost omnipresent in our world, and are stimulating the creation of a new breed of software: applications that must deal with inputs from a variety of sources, provide real-time responses, deliver strong security - and do all this while providing a positive user experience. In response, a new style of application programming is taking hold, one that is based on multiple threads of control and the asynchronous exchange of data, and results in fundamentally more complex applications.

by Michael Donat | October 1, 2003


Fault Injection in Production:
Making the case for resilience testing

When we build Web infrastructures at Etsy, we aim to make them resilient. This means designing them carefully so that they can sustain their (increasingly critical) operations in the face of failure. Thankfully, there have been a couple of decades and reams of paper spent on researching how fault tolerance and graceful degradation can be brought to computer systems. That helps the cause.

by John Allspaw | August 24, 2012


Microsoft’s Protocol Documentation Program:
A Discussion with Nico Kicillof, Wolfgang Grieskamp and Bob Binder

In 2002, Microsoft began the difficult process of verifying much of the technical documentation for its Windows communication protocols.

June 8, 2011


Model-based Testing: Where Does It Stand?:
MBT has positive effects on efficiency and effectiveness, even if it only partially fulfills high expectations.

You have probably heard about MBT (model-based testing), but like many software-engineering professionals who have not used MBT, you might be curious about others’ experience with this test-design method. From mid-June 2014 to early August 2014, we conducted a survey to learn how MBT users view its efficiency and effectiveness. The 2014 MBT User Survey, a follow-up to a similar 2012 survey, was open to all those who have evaluated or used any MBT approach. Its 32 questions included some from a survey distributed at the 2013 User Conference on Advanced Automated Testing. Some questions focused on the efficiency and effectiveness of MBT, providing the figures that managers are most interested in.

by Robert V. Binder, Bruno Legeard, Anne Kramer | January 19, 2015


MongoDB’s JavaScript Fuzzer:
The fuzzer is for those edge cases that your testing didn’t catch.

As MongoDB becomes more feature-rich and complex with time, the need to develop more sophisticated methods for finding bugs grows as well. Three years ago, MongoDB added a home-grown JavaScript fuzzer to its toolkit, and it is now our most prolific bug-finding tool, responsible for detecting almost 200 bugs over the course of two release cycles. These bugs span a range of MongoDB components from sharding to the storage engine, with symptoms ranging from deadlocks to data inconsistency. The fuzzer runs as part of the CI (continuous integration) system, where it frequently catches bugs in newly committed code.

by Robert Guo | March 6, 2017


Orchestrating an Automated Test Lab:
Composing a score can help us manage the complexity of testing distributed apps.

Networking and the Internet are encouraging increasing levels of interaction and collaboration between people and their software. Whether users are playing games or composing legal documents, their applications need to manage the complex interleaving of actions from multiple machines over potentially unreliable connections. As an example, Silicon Chalk is a distributed application designed to enhance the in-class experience of instructors and students. Its distributed nature requires that we test with multiple machines. Manual testing is too tedious, expensive, and inconsistent to be effective. While automating our testing, however, we have found it very labor intensive to maintain a set of scripts describing each machine’s portion of a given test.

by Michael Donat | February 16, 2005


Quality Assurance: Much More than Testing:
Good QA is not only about technology, but also methods and approaches.

Quality assurance isn’t just testing, or analysis, or wishful thinking. Although it can be boring, difficult, and tedious, QA is nonetheless essential. Ensuring that a system will work when delivered requires much planning and discipline. Convincing others that the system will function properly requires even more careful and thoughtful effort. QA is performed through all stages of the project, not just slapped on at the end. It is a way of life.

by Stuart Feldman | February 16, 2005


Resilience Engineering: Learning to Embrace Failure:
A discussion with Jesse Robbins, Kripa Krishnan, John Allspaw, and Tom Limoncelli

In the early 2000s, Amazon created GameDay, a program designed to increase resilience by purposely injecting major failures into critical systems semi-regularly to discover flaws and subtle dependencies. Basically, a GameDay exercise tests a company’s systems, software, and people in the course of preparing for a response to a disastrous event. Widespread acceptance of the GameDay concept has taken a few years, but many companies now see its value and have started to adopt their own versions. This discussion considers some of those experiences.

by Jesse Robbins, Kripa Krishnan, John Allspaw, Thomas A. Limoncelli | September 13, 2012


Sifting Through the Software Sandbox: SCM Meets QA:
Source control—it’s not just for tracking changes anymore.

Thanks to modern SCM (software configuration management) systems, when developers work on a codeline they leave behind a trail of clues that can reveal what parts of the code have been modified, when, how, and by whom. From the perspective of QA (quality assurance) and test engineers, is this all just “data,” or is there useful information that can improve the test coverage and overall quality of a product?

by William W. White | February 16, 2005


Take a Freaking Measurement!:
A koder with attitude, KV answers your questions. Miss Manners he ain’t.

Have you ever worked with someone who is a complete jerk about measuring everything?

by George V. Neville-Neil | January 17, 2008


The Antifragile Organization:
Embracing Failure to Improve Resilience and Maximize Availability

Failure is inevitable. Disks fail. Software bugs lie dormant waiting for just the right conditions to bite. People make mistakes. Data centers are built on farms of unreliable commodity hardware. If you’re running in a cloud environment, then many of these factors are outside of your control. To compound the problem, failure is not predictable and doesn’t occur with uniform probability and frequency. The lack of a uniform frequency increases uncertainty and risk in the system.

by Ariel Tseitlin | June 27, 2013


The Meaning of Maintenance:
Software maintenance is more than just bug fixes.

Isn’t software maintenance a misnomer? I’ve never heard of anyone reviewing a piece of code every year, just to make sure it was still in good shape. It seems like software maintenance is really just a cover for bug fixing. When I think of maintenance I think of taking my car in for an oil change, not fixing a piece of code. Are there any people who actually review code after it has been running in a production environment?

by George V. Neville-Neil | August 14, 2009


The Reliability of Enterprise Applications:
Understanding enterprise reliability

Enterprise reliability is a discipline that ensures applications will deliver the required business functionality in a consistent, predictable, and cost-effective manner without compromising core aspects such as availability, performance, and maintainability. This article describes a core set of principles and engineering methodologies that enterprises can apply to help them navigate the complex environment of enterprise reliability and deliver highly reliable and cost-efficient applications.

by Sanjay Sha | December 3, 2019


Too Darned Big to Test:
Testing large systems is a daunting task, but there are steps we can take to ease the pain.

The increasing size and complexity of software, coupled with concurrency and distributed systems, have made apparent the ineffectiveness of using only handcrafted tests. The misuse of code coverage and avoidance of random testing have exacerbated the problem. We must start again, beginning with good design (including dependency analysis), good static checking (including model property checking), and good unit testing (including good input selection). Code coverage can help select and prioritize tests to make you more efficient, as can the all-pairs technique for controlling the number of configurations.

by Keith Stobie | February 16, 2005


Traipsing Through the QA Tools Desert:
Who’s really to blame for buggy code?

The Jeremiahs of the software world are out there lamenting, “Software is buggy and insecure!” Like the biblical prophet who bemoaned the wickedness of his people, these malcontents tell us we must repent and change our ways. But as someone involved in building commercial software, I’m thinking to myself, “I don’t need to repent. I do care about software quality.” Even so, I know that I have transgressed. I have shipped software that has bugs in it. Why did I do it? Why can’t I ship perfect software all the time?

by Terry Coatta | February 16, 2005


Uprooting Software Defects at the Source:
Source code analysis is an emerging technology in the software industry that allows critical source code defects to be detected before a program runs.

Although the concept of detecting programming errors at compile time is not new, the technology to build effective tools that can process millions of lines of code and report substantive defects with only a small amount of noise has long eluded the market. At the same time, a different type of solution is needed to combat current trends in the software industry that are steadily diminishing the effectiveness of conventional software testing and quality assurance.

by Seth Hallem, David Park, Dawson Engler | January 28, 2004


Verification of Safety-critical Software:
Avionics software safety certification is achieved through objective-based standards.

Avionics software has become a keystone in today’s aircraft design. Advances in avionics systems have reduced aircraft weight, thereby reducing fuel consumption, enabled precision navigation, improved engine performance, and provided a host of other benefits. These advances have turned modern aircraft into flying data centers with computers controlling or monitoring many of the critical systems onboard. The software that runs these aircraft systems must be as safe as we can make it.

by B. Scott Andersen, George Romanski | August 29, 2011


Weathering the Unexpected:
Failures happen, and resilience drills help organizations prepare for them.

Whether it is a hurricane blowing down power lines, a volcanic-ash cloud grounding all flights for a continent, or a humble rodent gnawing through underground fibers -- the unexpected happens. We cannot do much to prevent it, but there is a lot we can do to be prepared for it. To this end, Google runs an annual, company-wide, multi-day Disaster Recovery Testing event -- DiRT -- the objective of which is to ensure that Google’s services and internal business operations continue to run following a disaster.

by Kripa Krishnan | September 16, 2012