Failures happen, and resilience drills help organizations prepare for them.
A discussion with Jesse Robbins, Kripa Krishnan, John Allspaw, and Tom Limoncelli
Making the case for resilience testing
Avionics software safety certification is achieved through objective-based standards.
A Discussion with Nico Kicillof, Wolfgang Grieskamp and Bob Binder
Software maintenance is more than just bug fixes.
Have you ever worked with someone who is a complete jerk about measuring everything?
The Jeremiahs of the software world are out there lamenting, "Software is buggy and insecure!" Like the biblical prophet who bemoaned the wickedness of his people, these malcontents tell us we must repent and change our ways. But as someone involved in building commercial software, I'm thinking to myself, "I don't need to repent. I do care about software quality." Even so, I know that I have transgressed. I have shipped software that has bugs in it. Why did I do it? Why can't I ship perfect software all the time?
Networking and the Internet are encouraging increasing levels of interaction and collaboration between people and their software. Whether users are playing games or composing legal documents, their applications need to manage the complex interleaving of actions from multiple machines over potentially unreliable connections. As an example, Silicon Chalk is a distributed application designed to enhance the in-class experience of instructors and students. Its distributed nature requires that we test with multiple machines. Manual testing is too tedious, expensive, and inconsistent to be effective. While automating our testing, however, we have found it very labor intensive to maintain a set of scripts describing each machine's portion of a given test.
Thanks to modern SCM (software configuration management) systems, when developers work on a codeline they leave behind a trail of clues that can reveal what parts of the code have been modified, when, how, and by whom. From the perspective of QA (quality assurance) and test engineers, is this all just "data," or is there useful information that can improve the test coverage and overall quality of a product?
The increasing size and complexity of software, coupled with concurrency and distributed systems, has made apparent the ineffectiveness of using only handcrafted tests. The misuse of code coverage and avoidance of random testing has exacerbated the problem. We must start again, beginning with good design (including dependency analysis), good static checking (including model property checking), and good unit testing (including good input selection). Code coverage can help select and prioritize tests to make you more efficient, as can the all-pairs technique for controlling the number of configurations.
Quality assurance isn't just testing, or analysis, or wishful thinking. Although it can be boring, difficult, and tedious, QA is nonetheless essential.
Modern software development practices build applications as a collection of collaborating components. Unlike older practices that linked compiled components into a single monolithic application, modern executables are made up of any number of executable components that exist as separate binary files.
Source code analysis is an emerging technology in the software industry that allows critical source code defects to be detected before a program runs. Although the concept of detecting programming errors at compile time is not new, the technology to build effective tools that can process millions of lines of code and report substantive defects with only a small amount of noise has long eluded the market. At the same time, a different type of solution is needed to combat current trends in the software industry that are steadily diminishing the effectiveness of conventional software testing and quality assurance.
Try to remember your first day at your first software job. Do you recall what you were asked to do, after the human resources people were done with you? Were you asked to write a piece of fresh code? Probably not. It is far more likely that you were asked to fix a bug, or several, and to try to understand a large, poorly documented collection of source code.
Pagers, cellular phones, smart appliances, and Web services - these products and services are almost omnipresent in our world, and are stimulating the creation of a new breed of software: applications that must deal with inputs from a variety of sources, provide real-time responses, deliver strong security - and do all this while providing a positive user experience. In response, a new style of application programming is taking hold, one that is based on multiple threads of control and the asynchronous exchange of data, and results in fundamentally more complex applications.