Securing the Tangled Web

 

Preventing script injection vulnerabilities through software design

CHRISTOPH KERN, GOOGLE

 

Script injection vulnerabilities are a bane of Web application development: deceptively simple in cause and remedy, they are nevertheless surprisingly difficult to prevent in large-scale Web development.

 

XSS (cross-site scripting) arises when insufficient data validation, sanitization, or escaping within a Web application allows an attacker to cause browser-side execution of malicious JavaScript in the application’s context. This injected code can then do whatever the attacker wants, using the privileges of the victim. Exploitation of XSS bugs results in complete (though not necessarily persistent) compromise of the victim’s session with the vulnerable application. This article provides an overview of how XSS vulnerabilities arise and why it is so difficult to avoid them in real-world Web application software development. The article then describes software design patterns developed at Google to address the problem. A key goal of these design patterns is to confine the potential for XSS bugs to a small fraction of an application’s code base, significantly improving one’s ability to reason about the absence of this class of security bugs. In several software projects within Google, this approach has resulted in a substantial reduction in the incidence of XSS vulnerabilities.
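The confinement idea rests on "inherently safe" APIs, in which safety is established where a value is constructed rather than where it is used. The following TypeScript sketch illustrates that pattern as described; the SafeHtml type, its builders, and setInnerHtml are illustrative stand-ins, not Google's actual library.

```typescript
// Minimal sketch of an "inherently safe API" for HTML. Names and details
// are illustrative assumptions, not Google's actual SafeHtml library.

// A value of type SafeHtml is guaranteed, by construction, to be safe to
// render in an HTML context. Only the builders below can create one, so
// reviewing those builders suffices to reason about XSS.
class SafeHtml {
  // The private constructor confines creation to the audited builders.
  private constructor(private readonly html: string) {}

  toString(): string {
    return this.html;
  }

  // Builder: escape untrusted text so it cannot break out of its context.
  static fromText(text: string): SafeHtml {
    const escaped = text
      .replace(/&/g, "&amp;")
      .replace(/</g, "&lt;")
      .replace(/>/g, "&gt;")
      .replace(/"/g, "&quot;")
      .replace(/'/g, "&#39;");
    return new SafeHtml(escaped);
  }

  // Builder: concatenation of safe values is itself safe.
  static concat(...parts: SafeHtml[]): SafeHtml {
    return new SafeHtml(parts.map((p) => p.toString()).join(""));
  }
}

// The rendering sink accepts only SafeHtml, never a raw string, so the
// type checker rules out accidental injection at every call site.
function setInnerHtml(el: Element, html: SafeHtml): void {
  el.innerHTML = html.toString();
}
```

Because only the audited builders can produce a SafeHtml value, reasoning about XSS reduces to reviewing those few functions rather than every call site that renders HTML.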


 

Related:
Fault Injection in Production
High Performance Web Sites
Vicious XSS

Privacy, Anonymity, and Big Data in the Social Sciences

 

Quality social science research and the privacy of human subjects require trust.

JON P. DARIES, JUSTIN REICH, JIM WALDO, ELISE M. YOUNG, JONATHAN WHITTINGHILL, DANIEL THOMAS SEATON, ANDREW DEAN HO, ISAAC CHUANG

 

Open data has tremendous potential for science, but, in human subjects research, there is a tension between privacy and releasing high-quality open data. Federal law governing student privacy and the release of student records suggests that anonymizing student data protects student privacy. Guided by this standard, we de-identified and released a data set from 16 MOOCs (massive open online courses) from MITx and HarvardX on the edX platform. In this article, we show that these and other de-identification procedures necessitate changes to data sets that threaten replication and extension of baseline analyses. To balance student privacy and the benefits of open data, we suggest focusing on protecting privacy without anonymizing data by instead expanding policies that compel researchers to uphold the privacy of the subjects in open data sets. If we want to have high-quality social science research and also protect the privacy of human subjects, we must eventually have trust in researchers. Otherwise, we’ll always have the strict tradeoff between anonymity and science illustrated here.
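To make the tension concrete, here is a toy sketch assuming a k-anonymity-style suppression rule; the field names, the k = 5 threshold, and the statistic are illustrative, not the authors' actual procedure. Dropping rows from small, potentially identifying groups shifts even a simple baseline statistic such as a completion rate, which is exactly the threat to replication described above.

```typescript
// Toy illustration (not the authors' actual pipeline) of why
// de-identification can change a data set enough to threaten replication.

interface Row {
  country: string;    // quasi-identifier
  completed: boolean; // course-completion flag
}

// Suppress every record whose quasi-identifier group has fewer than k
// members, a k-anonymity-style rule; k = 5 is an illustrative threshold.
function suppressRareGroups(rows: Row[], k = 5): Row[] {
  const counts = new Map<string, number>();
  for (const r of rows) {
    counts.set(r.country, (counts.get(r.country) ?? 0) + 1);
  }
  return rows.filter((r) => (counts.get(r.country) ?? 0) >= k);
}

function completionRate(rows: Row[]): number {
  return rows.filter((r) => r.completed).length / rows.length;
}

// Learners from rare countries may complete at different rates than the
// majority, so completionRate(suppressRareGroups(data)) can differ from
// completionRate(data): the released statistic no longer replicates.
```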


 

Related:
Four Billion Little Brothers?: Privacy, mobile phones, and ubiquitous data collection
Communications Surveillance: Privacy and Security at Risk
Modeling People and Places with Internet Photo Collections

ACM and the Professional Programmer

 

How do you, the reader, stay informed about research that influences your work?

VINTON G. CERF

 

In the very early days of computing, professional programming was nearly synonymous with academic research because computers tended to be devices that existed only or largely in academic settings. As computers became commercially available, they began to be found in private-sector, business environments. The 1950s and 1960s brought computing in the form of automation and data processing to the private sector, and along with this came a growing community of professionals whose focus on computing was pragmatic and production-oriented. Computing was (and still is) evolving, and the academic community continued to explore new software and hardware concepts and constructs. New languages were invented (and are still being invented) to try new ideas in the formulation of programs. The introduction of time sharing created new territory to explore. In today’s world, cloud computing is the new time sharing, more or less.


The Network is Reliable

An informal survey of real-world communications failures

PETER BAILIS, UC BERKELEY

KYLE KINGSBURY, JEPSEN NETWORKS

“The network is reliable” tops Peter Deutsch’s classic list, “Eight fallacies of distributed computing” (https://blogs.oracle.com/jag/resource/Fallacies.html), “all [of which] prove to be false in the long run and all [of which] cause big trouble and painful learning experiences.” Accounting for and understanding the implications of network behavior is key to designing robust distributed programs—in fact, six of Deutsch’s “fallacies” directly pertain to limitations on networked communications. This should be unsurprising: the ability (and often requirement) to communicate over a shared channel is a defining characteristic of distributed programs, and many of the key results in the field pertain to the possibility and impossibility of performing distributed computations under particular sets of network conditions.
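A sketch of the practical consequence, assuming a hypothetical /transfer endpoint: when a request times out, the caller cannot know whether the operation executed, so a blind retry may apply it twice. Pairing retries with a stable, client-generated request ID that the server can deduplicate is one common way to account for this; the code below is illustrative, not taken from the article.

```typescript
// Why "the network is reliable" is a fallacy in practice: a timed-out
// request is indeterminate - the operation may or may not have executed -
// so retries are only safe if the server can deduplicate them.

async function callWithTimeout(
  url: string,
  body: unknown,
  ms: number
): Promise<Response> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    return await fetch(url, {
      method: "POST",
      body: JSON.stringify(body),
      signal: controller.signal,
    });
  } finally {
    clearTimeout(timer);
  }
}

// Retrying with the same client-generated request ID lets the server
// detect duplicates, turning an unsafe blind retry into a safe one.
// The /transfer endpoint is hypothetical.
async function reliableTransfer(amount: number): Promise<void> {
  const requestId = crypto.randomUUID(); // same ID across all attempts
  for (let attempt = 0; attempt < 3; attempt++) {
    try {
      await callWithTimeout("/transfer", { requestId, amount }, 2000);
      return;
    } catch {
      // Timeout or network error: outcome unknown; retry with same ID.
    }
  }
  throw new Error("transfer outcome unknown after retries");
}
```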


Related:
Eventual Consistency Today: Limitations, Extensions, and Beyond 
The Antifragile Organization 
Self-Healing Networks

 

Undergraduate Software Engineering

Addressing the Needs of Professional Software Development

MICHAEL J. LUTZ, J. FERNANDO NAVEDA, AND JAMES R. VALLINO 

DEPARTMENT OF SOFTWARE ENGINEERING, ROCHESTER INSTITUTE OF TECHNOLOGY

In the fall semester of 1996 RIT (Rochester Institute of Technology) launched the first undergraduate software engineering program in the United States.9,10 The culmination of five years of planning, development, and review, the program was designed from the outset to prepare graduates for professional positions in commercial and industrial software development.


Related:
Fun and Games: Multi-Language Development
Pride and Prejudice (The Vasa)
A Conversation with John Hennessy and David Patterson

Bringing Arbitrary Compute to Authoritative Data

Many disparate use cases can be satisfied with a single storage system.

MARK CAVAGE AND DAVID PACHECO, JOYENT

While the term big data is vague enough to have lost much of its meaning, today’s storage systems are growing more quickly and managing more data than ever before. Consumer devices generate large numbers of photos, videos, and other large digital assets. Machines are rapidly catching up to humans in data generation through extensive recording of system logs and metrics, as well as applications such as video capture and genome sequencing. Large data sets are now commonplace, and people increasingly want to run sophisticated analyses on the data. In this article, big data refers to a corpus of data large enough to benefit significantly from parallel computation across a fleet of systems, where the efficient orchestration of the computation is itself a considerable challenge.
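The article's premise is that moving a large corpus to the computation is far more expensive than moving the computation to the corpus. A minimal sketch of what such an interface might look like follows; the JobSpec shape and the /jobs endpoint are hypothetical, loosely in the spirit of a map/reduce job run in place on storage nodes, not an actual Joyent API.

```typescript
// Hedged sketch of "bring compute to the data": instead of copying
// objects out of storage, the client submits a job whose phases run on
// the servers that already hold the data. All names are illustrative.

interface JobPhase {
  type: "map" | "reduce";
  exec: string; // command run next to each object (map) or over the
                // concatenated map outputs (reduce)
}

interface JobSpec {
  inputs: string[]; // object paths already stored in the system
  phases: JobPhase[];
}

// Count total log lines without moving any log file across the network:
// each map task runs on the storage node holding its object, and only
// the tiny per-object counts travel to the reducer.
const lineCountJob: JobSpec = {
  inputs: ["/logs/2014-07-01.log", "/logs/2014-07-02.log"],
  phases: [
    { type: "map", exec: "wc -l" },
    { type: "reduce", exec: "awk '{sum += $1} END {print sum}'" },
  ],
};

// Submit the job to a hypothetical /jobs endpoint; the returned text is
// assumed to be a job ID for polling results.
async function submitJob(job: JobSpec): Promise<string> {
  const res = await fetch("/jobs", {
    method: "POST",
    body: JSON.stringify(job),
  });
  return res.text();
}
```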


Related:
Cloud Computing: An Overview
A co-Relational Model of Data for Large Shared Data Banks
Condos and Clouds

Outsourcing Responsibility

What do you do when your debugger fails you?

 

Dear KV,

I’ve been assigned to help with a new project and have been looking over the admittedly skimpy documentation the team has placed on the internal wiki. I spent a day or so staring at what seemed to be a long list of open-source projects that they intend to integrate into the system they have been building, but I couldn’t find where their original work was described. I asked one of the project team members where I might find that documentation and was told that there really isn’t much that they need to document, because all the features they need are available in various projects on GitHub.

I really don’t get why people do not understand that outsourcing work also means outsourcing responsibility, and that in a software project, responsibility and accountability are paramount.

Feeling a Sense of Responsibility


George V. Neville-Neil

 

Quality Software Costs Money – Heartbleed Was Free

How to generate funding for FOSS

POUL-HENNING KAMP

The world runs on free and open-source software, FOSS for short, and to some degree it has predictably infiltrated just about any software-based product anywhere in the world.

What’s not to like about FOSS? Ready-to-run source code, ready to download, no license payments—just take it and run. There may be some fine print in the license to comply with, but nothing too onerous or burdensome.


 

Who Must You Trust?

You must have some trust if you want to get anything done.

THOMAS WADLOW

In his novel The Diamond Age,7 author Neal Stephenson describes a constructed society (called a phyle) based on extreme trust in one’s fellow members. Part of the membership requirements is that, from time to time, each member is called upon to undertake certain tasks to reinforce that trust. For example, a phyle member might be told to go to a particular location at the top of a cliff at a specific time, where he will find bungee cords with ankle harnesses attached. The other ends of the cords trail off into the bushes. At the appointed time he is to fasten the harnesses to his ankles and jump off the cliff. He has to trust that the unseen fellow phyle member who was assigned the job of securing the other end of the bungee to a stout tree actually did his job; otherwise, he will plummet to his death. A third member secretly watches to make sure the first two don’t communicate in any way, relying only on trust to keep tragedy at bay.


 

Related:
The Answer is 42 of Course
Weapons of Mass Assignment
LinkedIn Password Leak: Salt Their Hide

Automated QA Testing at EA: Driven by Events

A discussion with Michael Donat, Jafar Husain, and Terry Coatta

To millions of game geeks, the position of QA (quality assurance) tester at Electronic Arts must seem like a dream job. But from the company’s perspective, the overhead associated with QA can look downright frightening, particularly in an era of massively multiplayer games.


 

Related:
Orchestrating an Automated Test Lab
Finding Usability Bugs with Automated Tests
Adopting DevOps Practices in Quality Assurance