Failure and Recovery

Abstracting the Geniuses Away from Failure Testing

Ordinary users need tools that automate the selection of custom-tailored faults to inject.

by Peter Alvaro, Severine Tymon | October 26, 2017

Too Big NOT to Fail

Embrace failure so it doesn't embrace you.

by Pat Helland, Simon Weaver, Ed Harris | April 5, 2017

This article appears in print in Communications of the ACM, Volume 60, Issue 6.

Forced Exception-Handling

You can never discount the human element in programming.

by George Neville-Neil | March 14, 2017

Injecting Errors for Fun and Profit

Error-detection and correction features are only as good as our ability to test them.

by Steve Chessin | August 6, 2010

This article appears in print in Communications of the ACM, Volume 53, Issue 9.

Self-Healing in Modern Operating Systems

A few early steps show there's a long (and bumpy) road ahead.

by Michael W. Shapiro | December 27, 2004

Error Messages: What's the Problem?

Computer users spend a lot of time chasing down errors, following a trail of clues that starts with an error message and leads sometimes to a solution and sometimes to frustration. Problems with error messages are particularly acute for system administrators (sysadmins), who configure, install, manage, and maintain the computational infrastructure of the modern world and spend much of their effort keeping computers running amid errors and failures.

by Paul P. Maglio, Eser Kandogan | December 6, 2004

Automating Software Failure Reporting

We can only fix those bugs we know about.

by Brendan Murphy | December 6, 2004

Oops! Coping with Human Error in IT Systems

Errors Happen. How to Deal.

by Aaron B. Brown | December 6, 2004
