Questioning the Criteria for Evaluating Non-cryptographic Hash Functions:
Maybe we need to think more about non-cryptographic hash functions.
Although cryptographic and non-cryptographic hash functions are everywhere, there seems to be a gap in how they are designed. Many criteria exist for cryptographic hashes, motivated by various security requirements, but on the non-cryptographic side there is a certain amount of folklore that, despite the long history of hash functions, has not been fully explored. While targeting a uniform distribution makes a lot of sense for real-world datasets, it can be a challenge when confronted with a dataset that has particular patterns.
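As a rough illustration of the uniformity concern, the following sketch (illustrative key names and bucket count of our choosing, not from the article) hashes a patterned key set with a deliberately simplistic non-cryptographic hash and inspects how evenly the keys land in buckets:

```python
# Minimal sketch: how evenly does a toy non-cryptographic hash spread
# patterned keys across buckets? (Key names and bucket count are made up.)
from collections import Counter

def toy_hash(key: str) -> int:
    # Deliberately simplistic multiply-and-add hash; real designs
    # (FNV, Murmur, etc.) mix bits far more aggressively.
    h = 0
    for ch in key:
        h = (h * 31 + ord(ch)) & 0xFFFFFFFF
    return h

def bucket_histogram(keys, buckets=64):
    counts = Counter(toy_hash(k) % buckets for k in keys)
    return [counts.get(b, 0) for b in range(buckets)]

if __name__ == "__main__":
    # A patterned dataset: keys that differ only in a trailing counter.
    patterned = [f"sensor-{i:06d}" for i in range(10_000)]
    hist = bucket_histogram(patterned)
    print("min/max bucket load:", min(hist), max(hist))
    # A wide min/max spread signals non-uniformity on this input, even if
    # the same hash looks perfectly uniform on random keys.
```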
Assessing IT Project Success: Perception vs. Reality:
We would not be in the digital age if it were not for the recurrent success of IT projects.
This study has significant implications for practice, research, and education by providing new insights into IT project success. It expands the body of knowledge on project management by reporting project success (and not exclusively project management success), grounded in several objective criteria such as deliverables usage by the client in the post-project stage, hiring of project-related support/maintenance services by the client, contracting of new projects by the client, and vendor recommendation by the client to potential clients. Researchers can find a set of criteria they can use when studying and reporting the success of IT projects, thus expanding the current perspective on evaluation and contributing to more accurate conclusions.
Confidential Computing Proofs:
An alternative to cryptographic zero-knowledge
Proofs are powerful tools for integrity and privacy, enabling the verifier to delegate a computation and still verify its correct execution, and enabling the prover to keep the details of the computation private. Both CCP and ZKP can achieve soundness and zero-knowledge but with important differences. CCP relies on hardware trust assumptions, which yield high performance and additional confidentiality protection for the prover but may be unacceptable for some applications. CCP is also often easier to use, notably with existing code, whereas ZKP comes with a large prover overhead that may be impractical for some applications.
GPTs and Hallucination:
Why do large language models hallucinate?
The findings in this experiment support the hypothesis that GPTs based on LLMs perform well on prompts that are more popular and have reached a general consensus yet struggle on controversial topics or topics with limited data. The variability in the applications' responses underscores that the models depend on the quantity and quality of their training data, paralleling the system of crowdsourcing that relies on diverse and credible contributions. Thus, while GPTs can serve as useful tools for many mundane tasks, their engagement with obscure and polarized topics should be interpreted with caution.
Virtual Machinations: Using Large Language Models as Neural Computers:
LLMs can function not only as databases, but also as dynamic, end-user programmable neural computers.
We explore how Large Language Models (LLMs) can function not just as databases, but as dynamic, end-user programmable neural computers. The native programming language for this neural computer is a Logic Programming-inspired declarative language that formalizes and externalizes the chain-of-thought reasoning as it might happen inside a large language model.
Toward Effective AI Support for Developers:
A survey of desires and concerns
The journey of integrating AI into the daily lives of software engineers is not without its challenges. Yet, it promises a transformative shift in how developers can translate their creative visions into tangible solutions. As we have seen, AI tools such as GitHub Copilot are already reshaping the code-writing experience, enabling developers to be more productive and to spend more time on creative and complex tasks. The skepticism around AI, from concerns about job security to its real-world efficacy, underscores the need for a balanced approach that prioritizes transparency, education, and ethical considerations.
You Don't Know Jack about Bandwidth:
If you're an ISP and all your customers hate you, take heart. This is now a solvable problem.
Bandwidth probably isn't the problem when your employees or customers say they have terrible Internet performance. Once they have something in the range of 50 to 100 Mbps, the problem is latency, how long it takes for the ISP's routers to process their traffic. If you're an ISP and all your customers hate you, take heart. This is now a solvable problem, thanks to a dedicated band of individuals who hunted it down, killed it, and then proved out their solution in home routers.
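For intuition (with illustrative numbers of our own, not the article's measurements), the arithmetic of queueing delay shows why a router's buffered backlog, rather than raw link speed, dominates what users feel:

```python
# Minimal sketch: delay added by data sitting in a router's queue.
# Numbers are illustrative, not taken from the article.
def queueing_delay_ms(buffered_bytes: float, link_mbps: float) -> float:
    return buffered_bytes * 8 / (link_mbps * 1e6) * 1000

for mbps in (10, 50, 100):
    added = queueing_delay_ms(256 * 1024, mbps)   # 256 KB sitting in the queue
    print(f"{mbps:>4} Mbps link, 256 KB queued -> ~{added:.0f} ms of extra latency")
```

Even on a fast link, a persistently full queue adds tens of milliseconds to every round trip, which is what users perceive as terrible Internet performance.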
Transactions and Serverless are Made for Each Other:
If serverless platforms could wrap functions in database transactions, they would be a good fit for database-backed applications.
Database-backed applications are an exciting new frontier for serverless computation. By tightly integrating application execution and data management, a transactional serverless platform enables many new features not possible in either existing serverless platforms or server-based deployments.
Trustworthy AI using Confidential Federated Learning:
Federated learning and confidential computing are not competing technologies.
The principles of security, privacy, accountability, transparency, and fairness are the cornerstones of modern AI regulations. Classic FL was designed with a strong emphasis on security and privacy, at the cost of transparency and accountability. CFL addresses this gap with a careful combination of FL with TEEs and commitments. In addition, CFL brings other desirable security properties, such as code-based access control, model confidentiality, and protection of models during inference. Recent advances in confidential computing such as confidential containers and confidential GPUs mean that existing FL frameworks can be extended seamlessly to support CFL with low overheads.
Confidential Computing or Cryptographic Computing?:
Tradeoffs between cryptography and hardware enclaves
Secure computation via MPC/homomorphic encryption versus hardware enclaves presents tradeoffs involving deployment, security, and performance. Regarding performance, it matters a lot which workload you have in mind. For simple workloads such as summations, low-degree polynomials, or simple machine-learning tasks, both approaches are ready for practical use, but for rich computations such as complex SQL analytics or training large machine-learning models, only the hardware-enclave approach is currently practical enough for many real-world deployment scenarios.
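For intuition about why simple summations are already within easy reach of the cryptographic approach, here is a toy additive-secret-sharing sum (a sketch of the core idea only, not a production MPC protocol or any particular library):

```python
# Minimal sketch: three parties learn the sum of their inputs without any
# party revealing its own input. Toy additive secret sharing, not production MPC.
import random

MODULUS = 2**61 - 1

def share(secret: int, n_parties: int):
    shares = [random.randrange(MODULUS) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % MODULUS)   # shares sum back to the secret
    return shares

def secure_sum(private_inputs):
    n = len(private_inputs)
    # Each party splits its input and sends one share to each peer.
    all_shares = [share(x, n) for x in private_inputs]
    # Each party adds up the shares it received; combining the partial sums
    # yields the total without ever reconstructing an individual input.
    partials = [sum(col) % MODULUS for col in zip(*all_shares)]
    return sum(partials) % MODULUS

print(secure_sum([12, 30, 7]))   # 49, computed without pooling the raw inputs
```

Rich computations such as SQL analytics or large-model training require far more than repeated additions, which is where the performance gap against enclaves opens up.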
Confidential Container Groups:
Implementing confidential computing on Azure container instances
The experiments presented here demonstrate that Parma, the architecture that drives confidential containers on Azure container instances, adds less than one percent additional performance overhead beyond that added by the underlying TEE. Importantly, Parma ensures a security invariant over all reachable states of the container group rooted in the attestation report. This allows external third parties to communicate securely with containers, enabling a wide range of containerized workflows that require confidential access to secure data. Companies obtain the advantages of running their most confidential workflows in the cloud without having to compromise on their security requirements.
Elevating Security with Arm CCA:
Attestation and verification are integral to adopting confidential computing.
Confidential computing has great potential to improve the security of general-purpose computing platforms by taking supervisory systems out of the TCB, thereby reducing the size of the TCB, the attack surface, and the attack vectors that security architects must consider. Confidential computing requires innovations in platform hardware and software, but these have the potential to enable greater trust in computing, especially on devices that are owned or controlled by third parties. Early consumers of confidential computing will need to make their own decisions about the platforms they choose to trust.
DevEx in Action:
A study of its tangible impacts
DevEx (developer experience) is garnering increased attention at many software organizations as leaders seek to optimize software delivery amid the backdrop of fiscal tightening and transformational technologies such as AI. Intuitively, there is acceptance among technical leaders that good developer experience enables more effective software delivery and developer happiness. Yet, at many organizations, proposed initiatives and investments to improve DevEx struggle to get buy-in as business stakeholders question the value proposition of improvements.
Resolving the Human-subjects Status of Machine Learning's Crowdworkers:
What ethical framework should govern the interaction of ML researchers and crowdworkers?
In recent years, machine learning (ML) has relied heavily on crowdworkers both for building datasets and for addressing research questions requiring human interaction or judgment. The diversity of both the tasks performed and the uses of the resulting data render it difficult to determine when crowdworkers are best thought of as workers versus human subjects. These difficulties are compounded by conflicting policies, with some institutions and researchers regarding all ML crowdworkers as human subjects and others holding that they rarely constitute human subjects. Notably, few ML papers involving crowdwork mention IRB oversight, raising the prospect of non-compliance with ethical and regulatory requirements.
How to Design an ISA:
The popularity of RISC-V has led many to try designing instruction sets.
Over the past decade I've been involved in several projects that have designed either ISA (instruction set architecture) extensions or clean-slate ISAs for various kinds of processors (you'll even find my name in the acknowledgments for the RISC-V spec, right back to the first public version). When I started, I had very little idea about what makes a good ISA, and, as far as I can tell, this isn't formally taught anywhere.
Improving Testing of Deep-learning Systems:
A combination of differential and mutation testing results in better test data.
We used differential testing to generate test data that improves the diversity of data points in the test dataset, and then used mutation testing to check the quality of the test data in terms of diversity. Combining differential and mutation testing in this fashion improves the mutation score, a test-data quality metric, indicating an overall improvement in testing effectiveness and in the quality of the test data when testing deep-learning systems.
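As a rough sketch of the metric involved (illustrative code under our own assumptions, not the authors' tooling or models), the mutation score is the fraction of mutants that the test data manages to "kill":

```python
# Minimal sketch: mutation score = killed mutants / total mutants.
# A mutant is a perturbed copy of the system under test; it is "killed"
# when at least one test input makes its output diverge from the original's.

def mutation_score(original, mutants, test_inputs):
    killed = 0
    for mutant in mutants:
        if any(mutant(x) != original(x) for x in test_inputs):
            killed += 1
    return killed / len(mutants) if mutants else 0.0

if __name__ == "__main__":
    original = lambda x: round(x * 2.0)              # stand-in for the trained model
    mutants = [lambda x: round(x * 2.1),             # slightly perturbed behavior
               lambda x: round(x * 2.0) + 1,         # grossly perturbed behavior
               lambda x: round(x * 2.0)]             # equivalent mutant (never killed)
    tests_weak = [0.0]                               # low-diversity test data
    tests_diverse = [0.0, 1.0, 3.7, -2.5]            # more diverse test data
    print(mutation_score(original, mutants, tests_weak))      # 0.33...
    print(mutation_score(original, mutants, tests_diverse))   # 0.66...
```

More diverse test data kills more mutants, which is exactly the improvement the combined differential-plus-mutation approach aims to demonstrate.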
Low-code Development Productivity:
"Is winter coming" for code-based technologies?
This article aims to provide new insights on the subject by presenting the results of laboratory experiments carried out with code-based, low-code, and extreme low-code technologies to study differences in productivity. Low-code technologies have clearly shown higher levels of productivity, providing strong arguments for low-code to dominate the software development mainstream in the short/medium term. The article reports the procedure and protocols, results, limitations, and opportunities for future research.
Use Cases are Essential:
Use cases provide a proven method to capture and explain the requirements of a system in a concise and easily understood format.
While the software industry is a fast-paced and exciting world in which new tools, technologies, and techniques are constantly being developed to serve business and society, it is also forgetful. In its haste for fast-forward motion, it is subject to the whims of fashion and can forget or ignore proven solutions to some of the eternal problems that it faces. Use cases, first introduced in 1986 and popularized later, are one of those proven solutions.
Device Onboarding using FDO and the Untrusted Installer Model:
FDO's untrusted model is contrasted with Wi-Fi Easy Connect to illustrate the advantages of each mechanism.
Automatic onboarding of devices is an important technique to handle the increasing number of "edge" and IoT devices being installed. Onboarding of devices is different from most device-management functions because the device's trust transitions from the factory and supply chain to the target application. To speed the process with automatic onboarding, the trust relationship in the supply chain must be formalized in the device to allow the transition to be automated.
Creating the First Confidential GPUs:
The team at NVIDIA brings confidentiality and integrity to user code and data for accelerated computing.
Today's datacenter GPU has a long and storied 3D graphics heritage. In the 1990s, graphics chips for PCs and consoles had fixed pipelines for geometry, rasterization, and pixels using integer and fixed-point arithmetic. In 1999, NVIDIA invented the modern GPU, which put a set of programmable cores at the heart of the chip, enabling rich 3D scene generation with great efficiency.
Why Should I Trust Your Code?:
Confidential computing enables users to authenticate code running in TEEs, but users also need evidence this code is trustworthy.
For Confidential Computing to become ubiquitous in the cloud, in the same way that HTTPS became the default for networking, a different, more flexible approach is needed. Although there is no guarantee that every malicious code behavior will be caught upfront, precise auditability can be guaranteed: Anyone who suspects that trust has been broken by a confidential service should be able to audit any part of its attested code base, including all updates, dependencies, policies, and tools. To achieve this, we propose an architecture to track code provenance and to hold code providers accountable. At its core, a new Code Transparency Service (CTS) maintains a public, append-only ledger that records all code deployed for confidential services.
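To illustrate the append-only property such a ledger relies on (a minimal hash-chained sketch under our own assumptions, not the actual CTS design, data model, or API):

```python
# Minimal sketch: an append-only ledger in which each entry commits to its
# predecessor, so any later tampering with recorded history is detectable.
# An illustration of the general idea, not the CTS implementation.
import hashlib, json

class AppendOnlyLedger:
    def __init__(self):
        self.entries = []

    def append(self, record: dict) -> str:
        prev = self.entries[-1]["hash"] if self.entries else "genesis"
        payload = json.dumps(record, sort_keys=True)
        entry_hash = hashlib.sha256((prev + payload).encode()).hexdigest()
        self.entries.append({"record": record, "prev": prev, "hash": entry_hash})
        return entry_hash                      # a receipt an auditor can check later

    def verify(self) -> bool:
        prev = "genesis"
        for e in self.entries:
            payload = json.dumps(e["record"], sort_keys=True)
            expected = hashlib.sha256((prev + payload).encode()).hexdigest()
            if e["prev"] != prev or e["hash"] != expected:
                return False
            prev = e["hash"]
        return True

if __name__ == "__main__":
    ledger = AppendOnlyLedger()
    ledger.append({"artifact": "enclave-image-v1", "provenance": "build-pipeline-A"})
    ledger.append({"artifact": "policy-update-7", "provenance": "signer-B"})
    print(ledger.verify())                     # True
    ledger.entries[0]["record"]["artifact"] = "tampered"
    print(ledger.verify())                     # False: history no longer checks out
```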
Hardware VM Isolation in the Cloud:
Enabling confidential computing with AMD SEV-SNP technology
Confidential computing is a security model that fits well with the public cloud. It enables customers to rent VMs while enjoying hardware-based isolation that ensures that a cloud provider cannot purposefully or accidentally see or corrupt their data. SEV-SNP was the first commercially available x86 technology to offer VM isolation for the cloud and is deployed in Microsoft Azure, AWS, and Google Cloud. As confidential computing technologies such as SEV-SNP develop, confidential computing is likely to simply become the default trust model for the cloud.
Confidential Computing: Elevating Cloud Security and Privacy:
Working toward a more secure and innovative future
Confidential Computing (CC) fundamentally improves our security posture by drastically reducing the attack surface of systems. While traditional systems encrypt data at rest and in transit, CC extends this protection to data in use. It provides a novel, clearly defined security boundary, isolating sensitive data within trusted execution environments during computation. This means services can be designed that segment data based on least-privilege access principles, while all other code in the system sees only encrypted data. Crucially, the isolation is rooted in novel hardware primitives, effectively rendering even the cloud-hosting infrastructure and its administrators incapable of accessing the data.
Pointers in Far Memory:
A rethink of how data and computations should be organized
Effectively exploiting emerging far-memory technology requires consideration of operating on richly connected data outside the context of the parent process. Operating-system technology in development offers help by exposing abstractions such as memory objects and globally invariant pointers that can be traversed by devices and newly instantiated compute. Such ideas will allow applications running on future heterogeneous distributed systems with disaggregated memory nodes to exploit near-memory processing for higher performance and to independently scale their memory and compute resources for lower cost.
How Flexible is CXL's Memory Protection?:
Replacing a sledgehammer with a scalpel
CXL, a new interconnect standard for cache-coherent memory sharing, is becoming a reality, but its security leaves something to be desired. Decentralized capabilities are flexible and resilient against malicious actors, and should be considered while CXL is under active development.
Echoes of Intelligence:
Textual interpretation and large language models
We are now in the presence of a new medium disguised as good old text, but that text has been generated by an LLM, without authorial intention—an aspect that, if known beforehand, completely changes the expectations and response a human should have from a piece of text. Should our interpretation capabilities be engaged? If yes, under what conditions? The rules of the language game should be spelled out; they should not be passed over in silence.
You Don't Know Jack about Application Performance:
Knowing whether you're doomed to fail is important when starting a project.
You don't need to do a full-scale benchmark any time you have a performance or capacity planning problem. A simple measurement will provide the bottleneck point of your system: This example program will get significantly slower after eight requests per second per CPU. That's often enough to tell you the most important thing: whether you're going to fail.
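The eight-requests-per-second figure is specific to the article's example program; the general shape of the curve can be sketched with a textbook queueing approximation (illustrative numbers only):

```python
# Minimal sketch: why response time climbs sharply as offered load nears a
# bottleneck's capacity, using the M/M/1 approximation
# response_time = service_time / (1 - utilization). Numbers are illustrative.

SERVICE_TIME_S = 0.1   # 100 ms of CPU per request -> capacity of ~10 req/s per CPU

def est_response_time_s(offered_rps: float, service_time_s: float = SERVICE_TIME_S) -> float:
    utilization = offered_rps * service_time_s
    if utilization >= 1.0:
        return float("inf")        # past the bottleneck: the queue grows without bound
    return service_time_s / (1.0 - utilization)

if __name__ == "__main__":
    for rps in (1, 4, 8, 9, 9.5, 9.9):
        print(f"{rps:>4} req/s -> ~{est_response_time_s(rps) * 1000:6.0f} ms")
```

A quick measurement that locates this knee in the curve tells you whether your planned load sits on the safe side of it.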
Cargo Cult AI:
Is the ability to think scientifically the defining essence of intelligence?
Evidence abounds that the human brain does not innately think scientifically; however, it can be taught to do so. The same species that forms cargo cults around widespread and unfounded beliefs in UFOs, ESP, and anything read on social media also produces scientific luminaries such as Sagan and Feynman. Today's cutting-edge LLMs are also not innately scientific. But unlike the human brain, there is good reason to believe they never will be unless new algorithmic paradigms are developed.
Beyond the Repository:
Best practices for open source ecosystems researchers
Much of the existing research about open source elects to study software repositories instead of ecosystems. An open source repository most often refers to the artifacts recorded in a version control system and occasionally includes interactions around the repository itself. An open source ecosystem refers to a collection of repositories, the community, their interactions, incentives, behavioral norms, and culture. The decentralized nature of open source makes holistic analysis of the ecosystem an arduous task, with communities and identities intersecting in organic and evolving ways. Despite these complexities, the increased scrutiny on software security and supply chains makes it of the utmost importance to take an ecosystem-based approach when performing research about open source.
DevEx: What Actually Drives Productivity:
The developer-centric approach to measuring and improving productivity
Developer experience focuses on the lived experience of developers and the points of friction they encounter in their everyday work. In addition to improving productivity, DevEx drives business performance through increased efficiency, product quality, and employee retention. This paper provides a practical framework for understanding DevEx, and presents a measurement framework that combines feedback from developers with data about the engineering systems they interact with. These two frameworks provide leaders with clear, actionable insights into what to measure and where to focus in order to improve developer productivity.
Designing a Framework for Conversational Interfaces:
Combining the latest advances in machine learning with earlier approaches
Wherever possible, business logic should be described by code rather than training data. This keeps our system's behavior principled, predictable, and easy to change. Our approach to conversational interfaces allows them to be built much like any other application, using familiar tools, conventions, and processes, while still taking advantage of cutting-edge machine-learning techniques.
Opportunity Cost and Missed Chances in Optimizing Cybersecurity:
The loss of potential gain from other alternatives when one alternative is chosen
Opportunity cost should not be an afterthought when making security decisions. One way to ease into considering complex alternatives is to consider the null baseline of doing nothing instead of the choice at hand. Opportunity cost can feel abstract, elusive, and imprecise, but it can be understood by everyone, given the right introduction and framing. Using the approach presented here will make it natural and accessible.
Sharpening Your Tools:
Updating bulk_extractor for the 2020s
This article presents our experience updating the high-performance digital forensics tool BE (bulk_extractor) a decade after its initial release. Between 2018 and 2022, we updated the program from C++98 to C++17. We also performed a complete code refactoring and adopted a unit test framework. DF tools must be frequently updated to keep up with changes in the ways they are used. A description of updates to the bulk_extractor tool serves as an example of what can and should be done.
Three-part Harmony for Program Managers Who Just Don't Get It, Yet:
Open-source software, open standards, and agile software development
This article examines three tools in the system acquisitions toolbox that can work to expedite development and procurement while mitigating programmatic risk: OSS, open standards, and Agile/Scrum software development processes. All three are powerful additions to the DoD acquisition program management toolbox.
To PiM or Not to PiM:
The case for in-memory inferencing of quantized CNNs at the edge
As artificial intelligence becomes a pervasive tool for the billions of IoT (Internet of things) devices at the edge, the data movement bottleneck imposes severe limitations on the performance and autonomy of these systems. PiM (processing-in-memory) is emerging as a way of mitigating the data movement bottleneck while satisfying the stringent performance, energy efficiency, and accuracy requirements of edge imaging applications that rely on CNNs (convolutional neural networks).
Taking Flight with Copilot:
Early insights and opportunities of AI-powered pair-programming tools
Over the next five years, AI-powered tools likely will be helping developers in many diverse tasks. For example, such models may be used to improve code review, directing reviewers to parts of a change where review is most needed or even directly providing feedback on changes. Models such as Codex may suggest fixes for defects in code, build failures, or failing tests. These models are able to write tests automatically, helping to improve code quality and downstream reliability of distributed systems. This study of Copilot shows that developers spend more time reviewing code than actually writing code.
Reinventing Backend Subsetting at Google:
Designing an algorithm with reduced connection churn that could replace deterministic subsetting
Backend subsetting is useful for reducing costs and may even be necessary for operating within the system limits. For more than a decade, Google used deterministic subsetting as its default backend subsetting algorithm; although this algorithm balances the number of connections per backend task, it has a high level of connection churn. Our goal at Google was to design an algorithm with reduced connection churn that could replace deterministic subsetting as the default backend subsetting algorithm.
OCCAM-v2: Combining Static and Dynamic Analysis for Effective and Efficient Whole-program Specialization:
Leveraging scalable pointer analysis, value analysis, and dynamic analysis
OCCAM-v2 leverages scalable pointer analysis, value analysis, and dynamic analysis to create an effective and efficient tool for specializing LLVM bitcode. The extent of the code-size reduction achieved depends on the specific deployment configuration. Each application that is to be specialized is accompanied by a manifest that specifies concrete arguments that are known a priori, as well as a count of residual arguments that will be provided at runtime. The best case for partial evaluation occurs when the arguments are completely concretely specified. OCCAM-v2 uses a pointer analysis to devirtualize calls, allowing it to eliminate the entire body of functions that are not reachable by any direct calls.
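As a conceptual analogy only (plain Python rather than LLVM bitcode, with hypothetical names), partial evaluation specializes a routine once the manifest's concrete arguments are fixed, leaving only the residual arguments to be supplied at runtime:

```python
# Minimal sketch: specializing a general routine on arguments known a priori.
# A conceptual analogy for manifest-driven specialization, not OCCAM-v2's
# actual manifest format or transformation.
def grep(pattern: str, case_sensitive: bool, line: str) -> bool:
    if not case_sensitive:
        pattern, line = pattern.lower(), line.lower()
    return pattern in line

def specialize(func, **known_args):
    # Bake in the concrete arguments; the returned function accepts only
    # the residual arguments provided at runtime.
    def specialized(**residual_args):
        return func(**known_args, **residual_args)
    return specialized

if __name__ == "__main__":
    manifest = {"pattern": "error", "case_sensitive": False}   # known a priori
    match_error = specialize(grep, **manifest)
    print(match_error(line="ERROR: disk full"))   # True; only 'line' is residual
```

A real partial evaluator goes further, simplifying the specialized body itself (for example, folding away the case-sensitivity branch) and pruning code that can no longer be reached.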
The Rise of Fully Homomorphic Encryption:
Often called the Holy Grail of cryptography, commercial FHE is near.
Once commercial FHE is achieved, data access will become completely separated from unrestricted data processing, and provably secure storage and computation on untrusted platforms will become both relatively inexpensive and widely accessible. In ways similar to the impact of the database, cloud computing, PKE, and AI, FHE will invoke a sea change in how confidential information is protected, processed, and shared, and will fundamentally change the course of computing at a foundational level.
Mapping the Privacy Landscape for Central Bank Digital Currencies:
Now is the time to shape what future payment flows will reveal about you.
As central banks all over the world move to digitize cash, the issue of privacy needs to move to the forefront. The path taken may depend on the needs of each stakeholder group: privacy-conscious users, data holders, and law enforcement.
From Zero to One Hundred:
Demystifying zero trust and its implications on enterprise people, process, and technology
Changing network landscapes and rising security threats have imparted a sense of urgency for new approaches to security. Zero trust has been proposed as a solution to these problems, but some regard it as a marketing tool to sell existing best practice, while others praise it as a new cybersecurity standard. This article discusses the history and development of zero trust and why the changing threat landscape has led to a new discourse in cybersecurity.
Privacy of Personal Information:
Going incog in a goldfish bowl
Each online interaction with an external service creates data about the user that is digitally recorded and stored. These interactions may involve credit card transactions, medical consultations, census data collection, voter registration, and so on. Although the data is ostensibly collected to provide citizens with better services, the privacy of the individual is inevitably put at risk. With the growing reach of the Internet and the volume of data being generated, data protection and, specifically, preserving the privacy of individuals, have become particularly important.
The Challenges of IoT, TLS, and Random Number Generators in the Real World:
Bad random numbers are still with us and are proliferating in modern systems.
Many in the cryptographic community scoff at the mistakes made in implementing RNGs. Many cryptographers and members of the IETF resist the call to make TLS more resilient to this class of failures. This article discusses the history, current state, and fragility of the TLS protocol, and it closes with an example of how to improve the protocol. The goal is not to suggest a solution but to start a dialog to make TLS more resilient by showing that TLS can be secure without the assumption of perfect random numbers.
Walk a Mile in Their Shoes:
The Covid pandemic through the lens of four tech workers
Covid has changed how people work in many ways, but many of the outcomes have been paradoxical in nature. What works for one person may not work for the next (or even the same person the next day), and we have yet to figure out how to predict exactly what will work for everyone. As you saw in the composite personas described here, some people struggle with isolation and loneliness, have a hard time connecting socially with their teams, or find the time pressures of hybrid work with remote teams to be overwhelming. Others relish this newfound way of working, enjoying more time with family, greater flexibility to exercise during the day, a better work/life balance, and a stronger desire to contribute to the world.
Long Live Software Easter Eggs!
It's a period of unrest. Rebel developers, striking from continuous deployment servers, have won their first victory. During the battle, rebel spies managed to push an epic commit in the HTML code of https://pro.sony. Pursued by sinister agents, the rebels are hiding in commits, buttons, tooltips, APIs, HTTP headers, and configuration screens.
Autonomous Computing:
We frequently compute across autonomous boundaries but the implications of the patterns to ensure independence are rarely discussed.
Autonomous computing is a pattern for business work using collaborations to connect fiefdoms and their emissaries. This pattern, based on paper forms, has been used for centuries. Here, we explain fiefdoms, collaborations, and emissaries. We examine how emissaries work outside the autonomous boundary and are convenient while remaining outsiders. And we examine how work across different fiefdoms can be initiated, run for long periods of time, and eventually be completed.
Distributed Latency Profiling through Critical Path Tracing:
CPT can provide actionable and precise latency analysis.
Low latency is an important feature for many Google applications such as Search, and latency-analysis tools play a critical role in sustaining low latency at scale. For complex distributed systems that include services that constantly evolve in functionality and data, keeping overall latency to a minimum is a challenging task. In large, real-world distributed systems, existing tools such as RPC telemetry, CPU profiling, and distributed tracing are valuable to understand the subcomponents of the overall system, but are insufficient to perform end-to-end latency analyses in practice.
Middleware 101:
What to know now and for the future
Whether segregating a sophisticated software component into smaller services, transferring data between computers, or creating a general gateway for seamless communication, you can rely on middleware to achieve communication between different devices, applications, and software layers. Following the growing agile movement, the tech industry has adopted the use of fast waterfall models to create stacks of layers for each structural need, including integration, communication, data, and security. Given this scope, emphasis must now be on endpoint connection and agile development. This means that middleware should not serve solely as an object-oriented solution to execute simple request-response commands.
Persistence Programming:
Are we doing this right?
A few years ago, my team was working on a commercial Java development project for Enhanced 911 (E911) emergency call centers. We were frustrated by trying to meet the data-storage requirements of this project using the traditional model of Java over an SQL database. After some reflection about the particular requirements (and nonrequirements) of the project, we took a deep breath and decided to create our own custom persistence layer from scratch.
FPGAs in Client Compute Hardware:
Despite certain challenges, FPGAs provide security and performance benefits over ASICs.
FPGAs (field-programmable gate arrays) are remarkably versatile. They are used in a wide variety of applications and industries where use of ASICs (application-specific integrated circuits) is less economically feasible. Despite the area, cost, and power challenges designers face when integrating FPGAs into devices, they provide significant security and performance benefits. Many of these benefits can be realized in client compute hardware such as laptops, tablets, and smartphones.
The Keys to the Kingdom:
A deleted private key, a looming deadline, and a last chance to patch a new static root of trust into the bootloader
An unlucky fat-fingering precipitated the current crisis: The client had accidentally deleted the private key needed to sign new firmware updates. They had some exciting new features to ship, along with the usual host of reliability improvements. Their customers were growing impatient, but my client had to stall when asked for a release date. How could they come up with a meaningful date? They had lost the ability to sign a new firmware release.
Interpretable Machine Learning:
Moving from mythos to diagnostics
The emergence of machine learning as a society-changing technology in the past decade has triggered concerns about people's inability to understand the reasoning of increasingly complex models. The field of IML (interpretable machine learning) grew out of these concerns, with the goal of empowering various stakeholders to tackle use cases, such as building trust in models, performing model debugging, and generally informing real human decision-making.
Federated Learning and Privacy:
Building privacy-preserving systems for machine learning and data science on decentralized data
Centralized data collection can expose individuals to privacy risks and organizations to legal risks if data is not properly managed. Federated learning is a machine learning setting where multiple entities collaborate in solving a machine learning problem, under the coordination of a central server or service provider. Each client's raw data is stored locally and not exchanged or transferred; instead, focused updates intended for immediate aggregation are used to achieve the learning objective.
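A minimal sketch of one federated-averaging round (a toy linear model, independent of any particular FL framework) shows how only focused updates, never raw data, leave each client:

```python
# Minimal sketch: one round of federated averaging on a toy linear model.
# Each client computes an update from its local data; only the updated
# weights (not the raw data) are sent to the server for aggregation.
import numpy as np

def local_update(weights, X, y, lr=0.1):
    grad = 2 * X.T @ (X @ weights - y) / len(y)   # one gradient step on private data
    return weights - lr * grad                     # only this result leaves the client

def federated_round(weights, clients):
    updates = [local_update(weights, X, y) for X, y in clients]
    sizes = np.array([len(y) for _, y in clients], dtype=float)
    return np.average(updates, axis=0, weights=sizes)   # server-side aggregation

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_w = np.array([2.0, -1.0])
    clients = []
    for _ in range(5):                             # five clients with private datasets
        X = rng.normal(size=(50, 2))
        clients.append((X, X @ true_w + rng.normal(scale=0.1, size=50)))
    w = np.zeros(2)
    for _ in range(100):
        w = federated_round(w, clients)
    print(w)                                       # approaches [2, -1] without pooling data
```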
Meaning and Context in Computer Programs:
Sharing domain knowledge among programmers using the source code as the medium
When you look at a function's source code, how do you know what it means? Is the meaning found in the return values of the function, or is it located inside the function body? What about the function name? Answering these questions is important to understanding how to share domain knowledge among programmers using the source code as the medium. The program is the medium of communication among programmers to share their solutions.
Lamboozling Attackers: A New Generation of Deception:
Software engineering teams can exploit attackers' human nature by building deception environments.
The goal of this article is to educate software leaders, engineers, and architects on the potential of deception for systems resilience and the practical considerations for building deception environments. By examining the inadequacy and stagnancy of historical deception efforts by the information security community, the article also demonstrates why engineering teams are now poised to become significantly more successful owners of deception systems.
Designing UIs for Static Analysis Tools:
Evaluating tool design guidelines with SWAN
Static-analysis tools suffer from usability issues such as a high rate of false positives, lack of responsiveness, and unclear warning descriptions and classifications. Here, we explore the effect of applying a user-centered approach and design guidelines to SWAN, a security-focused static-analysis tool for the Swift programming language. SWAN is an interesting case study for exploring static-analysis tool usability because of its large target audience, its potential to integrate easily into developers' workflows, and its independence from existing analysis platforms.
Human-Centered Approach to Static-Analysis-Driven Developer Tools:
The future depends on good HCI
Complex and opaque systems do not scale easily. A human-centered approach for evolving tools and practices is essential to ensuring that software is scaled safely and securely. Static analysis can unveil information about program behavior, but the goal of deriving this information should not be to accumulate hairsplitting detail. HCI can help direct static-analysis techniques into developer-facing systems that structure information and embody relationships in representations that closely mirror a programmer's thought. The survival of great software depends on programming languages that support, rather than inhibit, communicating, reasoning, and abstract thinking.
Static Analysis at GitHub:
An experience report
The Semantic Code team at GitHub builds and operates a suite of technologies that power symbolic code navigation on github.com. We learned that scale is about adoption, user behavior, incremental improvement, and utility. Static analysis in particular is difficult to scale with respect to human behavior; we often think of complex analysis tools working to find potentially problematic patterns in code and then trying to convince the humans to fix them.
Static Analysis: An Introduction:
The fundamental challenge of software engineering is one of complexity.
Modern static-analysis tools provide powerful and specific insights into codebases. The Linux kernel team, for example, developed Coccinelle, a powerful tool for searching, analyzing, and rewriting C source code; because the Linux kernel contains more than 27 million lines of code, a static-analysis tool is essential both for finding bugs and for making automated changes across its many libraries and modules. Another tool targeted at the C family of languages is Clang scan-build, which comes with many useful analyses and provides an API for programmers to write their own analyses. Like so many things in computer science, the utility of static analysis is self-referential: To write reliable programs, we must also write programs for our programs.
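As a toy illustration of writing one's own analysis (using Python's ast module here, not Coccinelle or scan-build), a few lines suffice to walk a program's syntax tree and flag a simple pattern:

```python
# Minimal sketch: a tiny static analysis that flags calls to eval/exec.
# Illustrative only; real tools track data flow, scopes, and far more.
import ast

SUSPICIOUS = {"eval", "exec"}

def find_suspicious_calls(source: str, filename: str = "<string>"):
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in SUSPICIOUS):
            findings.append((filename, node.lineno, node.func.id))
    return findings

if __name__ == "__main__":
    example = "x = input()\nresult = eval(x)\nprint(result)\n"
    for fname, line, call in find_suspicious_calls(example):
        print(f"{fname}:{line}: call to {call}() may execute untrusted input")
```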
Declarative Machine Learning Systems:
The future of machine learning will depend on it being in the hands of the rest of us.
The people training and using ML models now are typically experienced developers with years of study working within large organizations, but the next wave of ML systems should allow a substantially larger number of people, potentially without any coding skills, to perform the same tasks. These new ML systems will not require users to fully understand all the details of how models are trained and used for obtaining predictions, but will provide them a more abstract interface that is less demanding and more familiar.
Real-world String Comparison:
How to handle Unicode sequences correctly
In many languages, string comparison is a pitfall for beginners. With any Unicode string as input, a comparison often causes problems even for advanced users. The semantic equivalence of different characters in Unicode requires a normalization of the strings before comparing them. This article shows how to handle Unicode sequences correctly. The comparison of two strings for equality often raises questions concerning the difference between comparison by value, comparison of object references, strict equality, and loose equality. The most important aspect is semantic equivalence.
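A minimal Python illustration of the normalization step (the article's advice itself is language-agnostic):

```python
# Minimal sketch: the same visible string in two different Unicode encodings.
# Normalizing both sides before comparing restores semantic equivalence.
import unicodedata

precomposed = "caf\u00e9"      # 'é' as a single code point (U+00E9)
decomposed  = "cafe\u0301"     # 'e' followed by a combining acute accent (U+0301)

print(precomposed == decomposed)                    # False: raw value comparison
print(unicodedata.normalize("NFC", precomposed) ==
      unicodedata.normalize("NFC", decomposed))     # True: compare normalized forms
```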
Digging into Big Provenance (with SPADE):
A user interface for querying provenance
Several interfaces exist for querying provenance. Many are not flexible in allowing users to select a database type of their choice. Some provide query functionality in a data model that is different from the graph-oriented one that is natural for provenance. Others have intuitive constructs for finding results but have limited support for efficiently chaining responses, as needed for faceted search. This article presents a user interface for querying provenance that addresses these concerns and is agnostic to the underlying database being used.
The Complex Path to Quantum Resistance:
Is your organization prepared?
There is a new technology on the horizon that will forever change the information security and privacy industry landscape. Quantum computing, together with quantum communication, will have many beneficial applications but will also be capable of breaking many of today's most popular cryptographic techniques that help ensure data protection, in particular the confidentiality and integrity of sensitive information. These techniques are ubiquitously embedded in today's digital fabric and implemented by many industries such as finance, health care, utilities, and the broader information communication technology (ICT) community.
Biases in AI Systems:
A survey for practitioners
This article provides an organization of various kinds of biases that can occur in the AI pipeline starting from dataset creation and problem formulation to data analysis and evaluation. It highlights the challenges associated with the design of bias-mitigation strategies, and it outlines some best practices suggested by researchers. Finally, a set of guidelines is presented that could aid ML developers in identifying potential sources of bias, as well as avoiding the introduction of unwanted biases. The work is meant to serve as an educational resource for ML developers in handling and addressing issues related to bias in AI systems.
Software Development in Disruptive Times:
Creating a software solution with fast decision capability, agile project management, and extreme low-code technology
In this project, the challenge was to "deploy software faster than the coronavirus spread." In a project with such peculiar characteristics, several factors can influence success, but some clearly stand out: top management support, agility, understanding and commitment of the project team, and the technology used. Conventional development approaches and technologies would simply not be able to meet the requirements promptly.
WebRTC - Realtime Communication for the Open Web Platform:
What was once a way to bring audio and video to the web has expanded into more use cases than we could ever imagine.
In this time of pandemic, the world has turned to Internet-based RTC (realtime communication) as never before. The number of RTC products has, over the past decade, exploded in large part because of cheaper high-speed network access and more powerful devices, but also because of an open, royalty-free platform called WebRTC. WebRTC is growing from enabling useful experiences to being essential in allowing billions to continue their work and education, and keep vital human contact during a pandemic. The opportunities and impact that lie ahead for WebRTC are intriguing indeed.
Toward Confidential Cloud Computing:
Extending hardware-enforced cryptographic protection to data while in use
Although largely driven by economies of scale, the development of the modern cloud also enables increased security. Large data centers provide aggregate availability, reliability, and security assurances. The operational cost of ensuring that operating systems, databases, and other services have secure configurations can be amortized among all tenants, allowing the cloud provider to employ experts who are responsible for security; this is often unfeasible for smaller businesses, where the role of systems administrator is often conflated with many others.
The SPACE of Developer Productivity:
There's more to it than you think.
Developer productivity is about more than an individual's activity levels or the efficiency of the engineering systems relied on to ship software, and it cannot be measured by a single metric or dimension. The SPACE framework captures different dimensions of productivity, and here we demonstrate how this framework can be used to understand productivity in practice and why using it will help teams better understand developer productivity and create better measures to inform their work and teams.
Enclaves in the Clouds:
Legal considerations and broader implications
With organizational data practices coming under increasing scrutiny, demand is growing for mechanisms that can assist organizations in meeting their data-management obligations. TEEs (trusted execution environments) provide hardware-based mechanisms with various security properties for assisting computation and data management. TEEs are concerned with the confidentiality and integrity of data, code, and the corresponding computation. Because the main security properties come from hardware, certain protections and guarantees can be offered even if the host privileged software stack is vulnerable.
Best Practice: Application Frameworks:
While powerful, frameworks are not for everyone.
While frameworks can be a powerful tool, they have some disadvantages and may not make sense for all organizations. Framework maintainers need to provide standardization and well-defined behavior while not being overly prescriptive. When frameworks strike the right balance, however, they can offer large developer productivity gains. The consistency provided by widespread use of frameworks is a boon for other teams such as SRE and security that have a vested interest in the quality of applications. Additionally, the structure of frameworks provides a foundation for building higher-level abstractions such as microservices platforms, which unlock new opportunities for system architecture and automation.
Everything VPN is New Again:
The 24-year-old security model has found a second wind.
The VPN (virtual private network) is 24 years old. The concept was created for a radically different Internet from the one we know today. As the Internet grew and changed, so did VPN users and applications. The VPN had an awkward adolescence in the Internet of the 2000s, interacting poorly with other widely popular abstractions. In the past decade the Internet has changed again, and this new Internet offers new uses for VPNs. The development of a radically new protocol, WireGuard, provides a technology on which to build these new VPNs.
The Die is Cast:
Hardware Security is Not Assured
The future of hardware security will evolve with hardware. As packaging advances and focus moves to beyond Moore's law technologies, hardware security experts will need to keep ahead of changing security paradigms, including system and process vulnerabilities. Research focused on quantum hacking is emblematic of the translation of principles of security on the physical attack plane for emerging communications and computing technologies. Perhaps the commercial market will evolve such that the GAO will run a study on compromised quantum technologies in the not-too-distant future.
The Identity in Everyone's Pocket:
Keeping users secure through their smartphones
Newer phones use security features in many different ways and combinations. As with any security technology, however, using a feature incorrectly can create a false sense of security. As such, many app developers and service providers today do not use any of the secure identity-management facilities that modern phones offer. For those of you who fall into this camp, this article is meant to leave you with ideas about how to bring a hardware-backed and biometrics-based concept of user identity into your ecosystem.
Security Analysis of SMS as a Second Factor of Authentication:
The challenges of multifactor authentication based on SMS, including cellular security deficiencies, SS7 exploits, and SIM swapping
Despite their popularity and ease of use, SMS-based authentication tokens are arguably one of the least secure forms of two-factor authentication. This does not imply, however, that it is an invalid method for securing an online account. The current security landscape is very different from that of two decades ago. Regardless of the critical nature of an online account or the individual who owns it, using a second form of authentication should always be the default option, whatever method is chosen.
Scrum Essentials Cards:
Experiences of Scrum Teams Improving with Essence
This article presents a series of examples and case studies on how people have used the Scrum Essentials cards to benefit their teams and improve how they work. Scrum is one of the most popular agile frameworks used successfully all over the world. It has been taught and used for 15-plus years. It is by far the most-used practice when developing software, and it has been generalized to be applicable for not just software but all kinds of products. It has been taught to millions of developers, all based on the Scrum Guide.
Data on the Outside vs. Data on the Inside:
Data kept outside SQL has different characteristics from data kept inside.
This article describes the impact of services and trust on the treatment of data. It introduces the notion of inside data as distinct from outside data. After discussing the temporal implications of not sharing transactions across the boundaries of services, the article considers the need for immutability and stability in outside data. This leads to a depiction of outside data as a DAG of data items being independently generated by disparate services.
The History, Status, and Future of FPGAs:
Hitting a nerve with field-programmable gate arrays
This article is a summary of a three-hour discussion at Stanford University in September 2019 among the authors. It has been written with combined experiences at and with organizations such as Zilog, Altera, Xilinx, Achronix, Intel, IBM, Stanford, MIT, Berkeley, University of Wisconsin, the Technion, Fairchild, Bell Labs, Bigstream, Google, DIGITAL (DEC), SUN, Nokia, SRI, Hitachi, Silicom, Maxeler Technologies, VMware, Xerox PARC, Cisco, and many others. These organizations are not responsible for the content, but may have inspired the authors in some ways, to arrive at the colorful ride through FPGA space described above.
Debugging Incidents in Google’s Distributed Systems:
How experts debug production issues in complex distributed systems
This article covers the outcomes of research performed in 2019 on how engineers at Google debug production issues, including the types of tools, high-level strategies, and low-level tasks that engineers use in varying combinations to debug effectively. It examines the research approach used to capture data, summarizing the common engineering journeys for production investigations and sharing examples of how experts debug complex distributed systems. Finally, the article extends the Google specifics of this research to provide some practical strategies that you can apply in your organization.
Is Persistent Memory Persistent?:
A simple and inexpensive test of failure-atomic update mechanisms
Power failures pose the most severe threat to application data integrity, and painful experience teaches that the integrity promises of failure-atomic update mechanisms can’t be taken at face value. Diligent developers and operators insist on confirming integrity claims by extensive firsthand tests. This article presents a simple and inexpensive testbed capable of subjecting storage devices, system software, and application software to ten thousand sudden whole-system power-interruption tests per week.
Dark Patterns: Past, Present, and Future:
The evolution of tricky user interfaces
Dark patterns are an abuse of the tremendous power that designers hold in their hands. As public awareness of dark patterns grows, so does the potential fallout. Journalists and academics have been scrutinizing dark patterns, and the backlash from these exposures can destroy brand reputations and bring companies under the lenses of regulators. Design is power. In the past decade, software engineers have had to confront the fact that the power they hold comes with responsibilities to users and to society. In this decade, it is time for designers to learn this lesson as well.
Demystifying Stablecoins:
Cryptography meets monetary policy
Self-sovereign stablecoins are interesting and probably here to stay; however, they face numerous regulatory hurdles from banking, financial tracking, and securities laws. For stablecoins backed by a governmental currency, the ultimate expression would be a CBDC. Since paper currency has been in steady decline (and disproportionately for legitimate transactions), a CBDC could reintroduce cash with technological advantages and efficient settlement while minimizing user fees.
Beyond the Fix-it Treadmill:
The Use of Post-Incident Artifacts in High-Performing Organizations
Given that humanity’s study of the sociological factors in safety is almost a century old, the technology industry’s post-incident analysis practices and how we create and use the artifacts those practices produce are all still in their infancy. So don’t be surprised that many of these practices are so similar, that the cognitive and social models used to parse apart and understand incidents and outages are few and cemented in the operational ethos, and that the byproducts sought from post-incident analyses are far and away focused on remediation items and prevention.
Managing the Hidden Costs of Coordination:
Controlling coordination costs when multiple, distributed perspectives are essential
Some initial considerations to control cognitive costs for incident responders include: (1) assessing coordination strategies relative to the cognitive demands of the incident; (2) recognizing when adaptations represent a tension between multiple competing demands (coordination and cognitive work) and seeking to understand them better rather than unilaterally eliminating them; (3) widening the lens to study the joint cognition system (integration of human-machine capabilities) as the unit of analysis; and (4) viewing joint activity as an opportunity for enabling reciprocity across inter- and intra-organizational boundaries.
Cognitive Work of Hypothesis Exploration During Anomaly Response:
A look at how we respond to the unexpected
Four incidents from web-based software companies reveal important aspects of anomaly response processes when incidents arise in web operations, two of which are discussed in this article. One particular cognitive function examined in detail is hypothesis generation and exploration, given the impact of obscure automation on engineers’ development of coherent models of the systems they manage. Each case was analyzed using the techniques and concepts of cognitive systems engineering. The set of cases provides a window into the cognitive work "above the line" in incident management of complex web-operation systems.
Above the Line, Below the Line:
The resilience of Internet-facing systems relies on what is above the line of representation.
Knowledge and understanding of below-the-line structure and function are continuously in flux. Near-constant effort is required to calibrate and refresh the understanding of the workings, dependencies, limitations, and capabilities of what is present there. In this dynamic situation no individual or group can ever know the system state. Instead, individuals and groups must be content with partial, fragmented mental models that require more or less constant updating and adjustment if they are to be useful.
Revealing the Critical Role of Human Performance in Software:
It’s time to revise our appreciation of the human side of Internet-facing software systems.
Understanding, supporting, and sustaining the capabilities above the line of representation require all stakeholders to be able to continuously update and revise their models of how the system is messy and yet usually manages to work. This kind of openness to continually reexamine how the system really works requires expanding the efforts to learn from incidents.
Blockchain Technology: What Is It Good for?:
Industry’s dreams and fears for this new technology
Business executives, government leaders, investors, and researchers frequently ask the following three questions: (1) What exactly is blockchain technology? (2) What capabilities does it provide? (3) What are good applications? Here we answer these questions thoroughly, provide a holistic overview of blockchain technology that separates hype from reality, and propose a useful lexicon for discussing the specifics of blockchain technology in the future.
The Reliability of Enterprise Applications:
Understanding enterprise reliability
Enterprise reliability is a discipline that ensures applications will deliver the required business functionality in a consistent, predictable, and cost-effective manner without compromising core aspects such as availability, performance, and maintainability. This article describes a core set of principles and engineering methodologies that enterprises can apply to help them navigate the complex environment of enterprise reliability and deliver highly reliable and cost-efficient applications.
Optimizations in C++ Compilers:
A practical journey
There’s a tradeoff to be made in giving the compiler more information: it can make compilation slower. Technologies such as link time optimization can give you the best of both worlds. Optimizations in compilers continue to improve, and upcoming improvements in indirect calls and virtual function dispatch might soon lead to even faster polymorphism.
Hack for Hire:
Investigating the emerging black market of retail email account hacking services
Hack-for-hire services charging $100-$400 per contract were found to produce sophisticated, persistent, and personalized attacks that were able to bypass 2FA via phishing. The demand for these services, however, appears to be limited to a niche market, as evidenced by the small number of discoverable services, an even smaller number of successful services, and the fact that these attackers target only about one in a million Google users.
The Effects of Mixing Machine Learning and Human Judgment:
Collaboration between humans and machines does not necessarily lead to better outcomes.
Based on the theoretical findings from the existing literature, some policymakers and software engineers contend that algorithmic risk assessments such as the COMPAS software can alleviate the incarceration epidemic and the occurrence of violent crimes by informing and improving decisions about policing, treatment, and sentencing. Considered in tandem, however, the findings of this study indicate that collaboration between humans and machines does not necessarily lead to better outcomes, and that human supervision does not sufficiently address problems when algorithms err or demonstrate concerning biases.
Persistent Memory Programming on Conventional Hardware:
The persistent memory style of programming can dramatically simplify application software.
Driven by the advent of byte-addressable non-volatile memory, the persistent memory style of programming will gain traction among developers, taking its rightful place alongside existing paradigms for managing persistent application state. Until NVM becomes available on all computers, developers can use the techniques presented in this article to enjoy the benefits of persistent memory programming on conventional hardware.
Velocity in Software Engineering:
From tectonic plate to F-16
Software engineering occupies an increasingly critical role in companies across all sectors, but too many software initiatives end up both off target and over budget. A surer path is optimized for speed, open to experimentation and learning, agile, and subject to regular course correcting. Good ideas tend to be abundant, though execution at high velocity is elusive. The good news is that velocity is controllable; companies can invest systematically to increase it.
Open-source Firmware:
Step into the world behind the kernel.
Open-source firmware can help bring computing to a more secure place by making the actions of firmware more visible and less likely to do harm. This article’s goal is to make readers feel empowered to demand more from vendors who can help drive this change.
Surviving Software Dependencies:
Software reuse is finally here but comes with risks.
Software reuse is finally here, and its benefits should not be understated, but we’ve accepted this transformation without completely thinking through the potential consequences. The Copay and Equifax attacks are clear warnings of real problems in the way software dependencies are consumed today. There’s a lot of good software out there. Let’s work together to find out how to reuse it safely.
Industry-scale Knowledge Graphs: Lessons and Challenges:
Five diverse technology companies show how it’s done
This article looks at the knowledge graphs of five diverse tech companies, comparing the similarities and differences in their respective experiences of building and using the graphs, and discussing the challenges that all knowledge-driven enterprises face today. The collection of knowledge graphs discussed here covers the breadth of applications, from search, to product descriptions, to social networks.
Garbage Collection as a Joint Venture:
A collaborative approach to reclaiming memory in heterogeneous software systems
Cross-component tracing (CCT) is a way to solve the problem of reference cycles across component boundaries. This problem appears as soon as components can form arbitrary object graphs with nontrivial ownership across API boundaries. An incremental version of CCT is implemented in V8 and Blink, enabling effective and efficient reclamation of memory in a safe manner.
Online Event Processing:
Achieving consistency where distributed transactions have failed
Support for distributed transactions across heterogeneous storage technologies is either nonexistent or suffers from poor operational and performance characteristics. In contrast, OLEP is increasingly used to provide good performance and strong consistency guarantees in such settings. In data systems it is very common for logs to be used as internal implementation details. The OLEP approach is different: it uses event logs, rather than transactions, as the primary application programming model for data management. Traditional databases are still used, but their writes come from a log rather than directly from the application. The use of OLEP is not simply pragmatism on the part of developers, but rather it offers a number of advantages.
Net Neutrality: Unexpected Solution to Blockchain Scaling:
Cloud-delivery networks could dramatically improve blockchains’ scalability, but clouds must be provably neutral first.
Provably neutral clouds are undoubtedly a viable solution to blockchain scaling. By optimizing the transport layer, not only can the throughput be fundamentally scaled up, but the latency can also be dramatically reduced. Indeed, the latency distribution in today’s data centers is already biased toward microsecond timescales for most flows, with millisecond timescales residing only at the tail of the distribution. There is no reason why a BDN point of presence could not achieve similar performance. Adding dedicated optical infrastructure among such BDN points of presence would further increase throughput and reduce latency, creating the backbone of an advanced BDN.
Identity by Any Other Name:
The complex cacophony of intertwined systems
New emerging systems and protocols both tighten and loosen our notions of identity, and that’s good! They make it easier to get stuff done. REST, IoT, big data, and machine learning all revolve around notions of identity that are deliberately kept flexible and sometimes ambiguous. Notions of identity underlie our basic mechanisms of distributed systems, including interchangeability, idempotence, and immutability.
Achieving Digital Permanence:
The many challenges to maintaining stored information and ways to overcome them
Today’s Information Age is creating new uses for and new ways to steward the data that the world depends on. The world is moving away from familiar, physical artifacts to new means of representation that are closer to information in its essence. We need processes to ensure both the integrity and accessibility of knowledge in order to guarantee that history will be known and true.
Metrics That Matter:
Critical but oft-neglected service metrics that every SRE and product owner should care about
Measure your site reliability metrics, set the right targets, and go through the work of measuring the metrics accurately. Then you’ll find that your service runs better, with fewer outages and greater user adoption.
A Hitchhiker’s Guide to the Blockchain Universe:
Blockchain remains a mystery, despite its growing acceptance.
It is difficult these days to avoid hearing about blockchain. Despite the significant potential of blockchain, it is also difficult to find a consistent description of what it really is. This article looks at the basics of blockchain: the individual components, how those components fit together, and what changes might be made to solve some of the problems with blockchain technology.
Tear Down the Method Prisons! Set Free the Practices!:
Essence: a new way of thinking that promises to liberate the practices and enable true learning organizations
This article explains why we need to break out of this repetitive dysfunctional behavior, and it introduces Essence, a new way of thinking that promises to free the practices from their method prisons and thus enable true learning organizations.
Understanding Database Reconstruction Attacks on Public Data:
These attacks on statistical databases are no longer a theoretical danger.
With the dramatic improvement in both computer speeds and the efficiency of solvers for SAT and other NP-hard problems in the last decade, database reconstruction attacks (DRAs) on statistical databases are no longer just a theoretical danger. The vast quantity of data products published by statistical agencies each year may give a determined attacker more than enough information to reconstruct some or all of a target database and breach the privacy of millions of people. Traditional disclosure-avoidance techniques are not designed to protect against this kind of attack.
Benchmarking "Hello, World!":
Six different views of the execution of "Hello, World!" show what is often missing in today’s tools
As more and more software moves off the desktop and into data centers, and more and more cell phones use server requests as the other half of apps, observation tools for large-scale distributed transaction systems are not keeping up. This makes it tempting to look under the lamppost using simpler tools. You will waste a lot of high-pressure time following that path when you have a sudden complex performance crisis.
Using Remote Cache Service for Bazel:
Save time by sharing and reusing build and test output
Remote cache service is a new development that significantly saves time in running builds and tests. It is particularly useful for a large code base and any size of development team. Bazel is an actively developed open-source build and test system that aims to increase productivity in software development. It has a growing number of optimizations to improve the performance of daily development tasks.
Why SRE Documents Matter:
How documentation enables SRE teams to manage new and existing services
SRE (site reliability engineering) is a job function, a mindset, and a set of engineering approaches for making web products and services run reliably. SREs operate at the intersection of software development and systems engineering to solve operational problems and engineer solutions to design, build, and run large-scale distributed systems scalably, reliably, and efficiently. A mature SRE team likely has well-defined bodies of documentation associated with many SRE functions.
How to Live in a Post-Meltdown and -Spectre World:
Learn from the past to prepare for the next battle.
Spectre and Meltdown create a risk landscape that has more questions than answers. This article addresses how these vulnerabilities were triaged when they were announced and the practical defenses that are available. Ultimately, these vulnerabilities present a unique set of circumstances, but for the vulnerability management program at Goldman Sachs, the response was just another day at the office.
Tracking and Controlling Microservice Dependencies:
Dependency management is a crucial part of system and software design.
Dependency cycles will be familiar to you if you have ever locked your keys inside your house or car. You can’t open the lock without the key, but you can’t get the key without opening the lock. Some cycles are obvious, but more complex dependency cycles can be challenging to find before they lead to outages. Strategies for tracking and controlling dependencies are necessary for maintaining reliable systems.
Corp to Cloud: Google’s Virtual Desktops:
How Google moved its virtual desktops to the cloud
Over one-fourth of Googlers use internal, data-center-hosted virtual desktops. This on-premises offering sits in the corporate network and allows users to develop code, access internal resources, and use GUI tools remotely from anywhere in the world. Among its most notable features, a virtual desktop instance can be sized according to the task at hand, has persistent user storage, and can be moved between corporate data centers to follow traveling Googlers. Until recently, our virtual desktops were hosted on commercially available hardware on Google’s corporate network using a homegrown open-source virtual cluster-management system called Ganeti. Today, this substantial and Google-critical workload runs on GCP (Google Cloud Platform).
The Mythos of Model Interpretability:
In machine learning, the concept of interpretability is both important and slippery.
Supervised machine-learning models boast remarkable predictive capabilities. But can you trust your model? Will it work in deployment? What else can it tell you about the world?
Mind Your State for Your State of Mind:
The interactions between storage and applications can be complex and subtle.
Applications have had an interesting evolution as they have moved into the distributed and scalable world. Similarly, storage and its cousin databases have changed side by side with applications. Many times, the semantics, performance, and failure models of storage and applications do a subtle dance as they change in support of changing business requirements and environmental challenges. Adding scale to the mix has really stirred things up. This article looks at some of these issues and their impact on systems.
Workload Frequency Scaling Law - Derivation and Verification:
Workload scalability has a cascade relation via the scale factor.
This article presents equations that relate to workload utilization scaling at a per-DVFS-subsystem level. A relation between frequency, utilization, and scale factor (which itself varies with frequency) is established. Verifying these equations turns out to be tricky, since utilization, which is inherent to the workload, also varies in a seemingly unspecified manner at the granularity of governance samples. Thus, a novel approach called histogram ridge trace is applied. Quantifying the scaling impact is critical when treating DVFS as a building block. Typical applications include DVFS governors and other layers that influence the utilization, power, and performance of the system.
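As a simplified illustration of the kind of relation involved (an assumption for exposition, not the article's derived law), suppose a workload needs a fixed number of busy cycles C per governance sample of length T. Then utilization and frequency trade off directly, and a frequency-dependent scale factor can absorb the deviation of real workloads from that ideal:

u = \frac{C}{f\,T} \quad\Rightarrow\quad u_2 = u_1\,\frac{f_1}{f_2}, \qquad\text{and, allowing for non-ideal workloads,}\qquad u_2 = u_1\,\frac{f_1}{f_2}\,s(f_1, f_2)

where the scale factor s equals 1 for perfectly frequency-scalable work and captures the frequency-dependent deviation otherwise.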
Algorithms Behind Modern Storage Systems:
Different uses for read-optimized B-trees and write-optimized LSM-trees
This article takes a closer look at two storage system design approaches used in a majority of modern databases (read-optimized B-trees and write-optimized LSM (log-structured merge)-trees) and describes their use cases and tradeoffs.
C Is Not a Low-level Language:
Your computer is not a fast PDP-11.
In the wake of the recent Meltdown and Spectre vulnerabilities, it’s worth spending some time looking at root causes. Both of these vulnerabilities involved processors speculatively executing instructions past some kind of access check and allowing the attacker to observe the results via a side channel. The features that led to these vulnerabilities, along with several others, were added to let C programmers continue to believe they were programming in a low-level language, when this hasn’t been the case for decades.
Thou Shalt Not Depend on Me:
A look at JavaScript libraries in the wild
Most websites use JavaScript libraries, and many of those libraries are known to be vulnerable. Understanding the scope of the problem, and the many unexpected ways that libraries are included, is only the first step toward improving the situation. The hope is that the information in this article will help inform better tooling, development practices, and educational efforts for the community.
Designing Cluster Schedulers for Internet-Scale Services:
Embracing failures for improving availability
Engineers looking to build scheduling systems should consider all the failure modes of the underlying infrastructure they use, and how operators of scheduling systems can configure remediation strategies that help keep tenant systems as stable as possible while the owners of those systems troubleshoot.
Canary Analysis Service:
Automated canarying quickens development, improves production safety, and helps prevent outages.
It is unreasonable to expect engineers working on product development or reliability to have statistical knowledge; removing this hurdle led to widespread CAS adoption. CAS has proven useful even for basic cases that don’t need configuration, and has significantly improved Google’s rollout reliability. Impact analysis shows that CAS has likely prevented hundreds of postmortem-worthy outages, and the rate of postmortems among groups that do not use CAS is noticeably higher.
Continuous Delivery Sounds Great, but Will It Work Here?:
It’s not magic, it just requires continuous, daily improvement at all levels.
Continuous delivery is a set of principles, patterns, and practices designed to make deployments predictable, routine affairs that can be performed on demand at any time. This article introduces continuous delivery, presents both common objections and actual obstacles to implementing it, and describes how to overcome them using real-life examples. Continuous delivery is not magic. It’s about continuous, daily improvement at all levels of the organization.
Containers Will Not Fix Your Broken Culture (and Other Hard Truths):
Complex socio-technical systems are hard; film at 11.
We focus so often on technical anti-patterns, neglecting similar problems inside our social structures. Spoiler alert: the solutions to many difficulties that seem technical can be found by examining our interactions with others. Let’s talk about five things you’ll want to know when working with those pesky creatures known as humans.
DevOps Metrics:
Your biggest mistake might be collecting the wrong data.
Delivering value to the business through software requires processes and coordination that often span multiple teams across complex systems, and involves developing and delivering software with both quality and resiliency. As practitioners and professionals, we know that software development and delivery is an increasingly difficult art and practice, and that managing and improving any process or system requires insights into that system. Therefore, measurement is paramount to creating an effective software value stream. Yet accurate measurement is no easy feat.
Monitoring in a DevOps World:
Perfect should never be the enemy of better.
Monitoring can seem quite overwhelming. The most important thing to remember is that perfect should never be the enemy of better. DevOps enables highly iterative improvement within organizations. If you have no monitoring, get something; get anything. Something is better than nothing, and if you’ve embraced DevOps, you’ve already signed up for making it better over time.
Bitcoin’s Underlying Incentives:
The unseen economic forces that govern the Bitcoin protocol
Incentives are crucial for the Bitcoin protocol’s security and effectively drive its daily operation. Miners go to extreme lengths to maximize their revenue and often find creative ways to do so that are sometimes at odds with the protocol. Cryptocurrency protocols should be placed on stronger foundations of incentives. There are many areas left to improve, ranging from the very basics of mining rewards and how they interact with the consensus mechanism, through the rewards in mining pools, and all the way to the transaction fee market itself.
Titus: Introducing Containers to the Netflix Cloud:
Approaching container adoption in an already cloud-native infrastructure
We believe our approach has enabled Netflix to quickly adopt and benefit from containers. Though the details may be Netflix-specific, the approach of providing low-friction container adoption by integrating with existing infrastructure and working with the right early adopters can be a successful strategy for any organization looking to adopt containers.
Abstracting the Geniuses Away from Failure Testing:
Ordinary users need tools that automate the selection of custom-tailored faults to inject.
This article presents a call to arms for the distributed systems research community to improve the state of the art in fault tolerance testing. Ordinary users need tools that automate the selection of custom-tailored faults to inject. We conjecture that the process by which superusers select experiments can be effectively modeled in software. The article describes a prototype validating this conjecture, presents early results from the lab and the field, and identifies new research directions that can make this vision a reality.
Network Applications Are Interactive:
The network era requires new models, with interactions instead of algorithms.
The miniaturization of devices and the prolific interconnectedness of these devices over high-speed wireless networks are completely changing how commerce is conducted. These changes (a.k.a. digital) will profoundly change how enterprises operate. Software is at the heart of this digital world, but the software toolsets and languages were conceived for the host-based era. The issues that already plague software practice (such as high defect rates, poor software productivity, information vulnerability, and poor software project success rates) will be more profound with such an approach. It is time for software to be made simpler, more secure, and more reliable.
Cache Me If You Can:
Building a decentralized web-delivery model
The world is more connected than it ever has been before, and with our pocket supercomputers and IoT (Internet of Things) future, the next generation of the web might just be delivered in a peer-to-peer model. It’s a giant problem space, but the necessary tools and technology are here today. We just need to define the problem a little better.
Bitcoin’s Academic Pedigree:
The concept of cryptocurrencies is built from forgotten ideas in research literature.
We’ve seen repeatedly that ideas in the research literature can be gradually forgotten or lie unappreciated, especially if they are ahead of their time, even in popular areas of research. Both practitioners and academics would do well to revisit old ideas to glean insights for present systems. Bitcoin was unusual and successful not because it was on the cutting edge of research on any of its components, but because it combined old ideas from many previously unrelated fields. This is not easy to do, as it requires bridging disparate terminology, assumptions, etc., but it is a valuable blueprint for innovation.
Metaphors We Compute By:
Code is a story that explains how to solve a particular problem.
Programmers must be able to tell a story with their code, explaining how they solved a particular problem. Like writers, programmers must know their metaphors. Many metaphors will be able to explain a concept, but you must have enough skill to choose the right one that’s able to convey your ideas to future programmers who will read the code. Thus, you cannot use every metaphor you know. You must master the art of metaphor selection, of meaning amplification. You must know when to add and when to subtract. You will learn to revise and rewrite code as a writer does. Once there’s nothing else to add or remove, you have finished your work.
Is There a Single Method for the Internet of Things?:
Essence can keep software development for the IoT from becoming unwieldy.
The Industrial Internet Consortium predicts the IoT (Internet of Things) will become the third technological revolution after the Industrial Revolution and the Internet Revolution. Its impact across all industries and businesses can hardly be imagined. Existing software (business, telecom, aerospace, defense, etc.) is expected to be modified or redesigned, and a huge amount of new software, solving new problems, will have to be developed. As a consequence, the software industry should welcome new and better methods.
Data Sketching:
The approximate approach is often faster and more efficient.
Do you ever feel overwhelmed by an unending stream of information? It can seem like a barrage of new email and text messages demands constant attention, and there are also phone calls to pick up, articles to read, and knocks on the door to answer. Putting these pieces together to keep track of what’s important can be a real challenge. In response to this challenge, the model of streaming data processing has grown in popularity. The aim is no longer to capture, store, and index every minute event, but rather to process each observation quickly in order to create a summary of the current state.
The Calculus of Service Availability:
You’re only as available as the sum of your dependencies.
Most services offered by Google aim to offer 99.99 percent (sometimes referred to as the "four 9s") availability to users. Some services contractually commit to a lower figure externally but set a 99.99 percent target internally. This more stringent target accounts for situations in which users become unhappy with service performance well before a contract violation occurs, as the number one aim of an SRE team is to keep users happy. For many services, a 99.99 percent internal target represents the sweet spot that balances cost, complexity, and availability.
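As a rough worked example (the figures here are illustrative assumptions, not numbers from the article), the unavailability budgets of independent critical dependencies roughly add, so a service aiming for 99.99 percent must leave room for them:

U_{total} \approx U_{service} + \sum_{i=1}^{N} U_{dep,i} \approx 0.005\% + 5 \times 0.001\% = 0.01\% \quad\Rightarrow\quad \text{availability} \approx 99.99\%

Under these assumptions, five critical dependencies at 99.999 percent each just fit inside a 99.99 percent target, which is why critical dependencies generally need to be roughly an order of magnitude (one additional nine) more available than the service that relies on them.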
The IDAR Graph:
An improvement over UML
UML is the de facto standard for representing object-oriented designs. It does a fine job of recording designs, but it has a severe problem: its diagrams don’t convey what humans need to know, making them hard to understand. This is why most software developers use UML only when forced to. People understand an organization, such as a corporation, in terms of a control hierarchy. When faced with an organization of people or objects, the first question usually is, "What’s controlling all this?" Surprisingly, UML has no concept of one object controlling another. Consequently, in every type of UML diagram, no object appears to have greater or lesser control than its neighbors.
Too Big NOT to Fail:
Embrace failure so it doesn’t embrace you.
Web-scale infrastructure implies LOTS of servers working together, often tens or hundreds of thousands of servers all working toward the same goal. How can the complexity of these environments be managed? How can commonality and simplicity be introduced?
The Debugging Mindset:
Understanding the psychology of learning strategies leads to effective problem-solving skills.
Software developers spend 35-50 percent of their time validating and debugging software. The cost of debugging, testing, and verification is estimated to account for 50-75 percent of the total budget of software development projects, amounting to more than $100 billion annually. While tools, languages, and environments have reduced the time spent on individual debugging tasks, they have not significantly reduced the total time spent debugging, nor the cost of doing so. Therefore, a hyperfocus on elimination of bugs during development is counterproductive; programmers should instead embrace debugging as an exercise in problem solving.
MongoDB’s JavaScript Fuzzer:
The fuzzer is for those edge cases that your testing didn’t catch.
As MongoDB becomes more feature-rich and complex with time, the need to develop more sophisticated methods for finding bugs grows as well. Three years ago, MongoDB added a home-grown JavaScript fuzzer to its toolkit, and it is now our most prolific bug-finding tool, responsible for detecting almost 200 bugs over the course of two release cycles. These bugs span a range of MongoDB components from sharding to the storage engine, with symptoms ranging from deadlocks to data inconsistency. The fuzzer runs as part of the CI (continuous integration) system, where it frequently catches bugs in newly committed code.
Making Money Using Math:
Modern applications are increasingly using probabilistic machine-learned models.
A big difference between human-written code and learned models is that the latter are usually not represented by text and hence are not understandable by human developers or manipulable by existing tools. The consequence is that none of the traditional software engineering techniques for conventional programs (such as code reviews, source control, and debugging) are applicable anymore. Since incomprehensibility is not unique to learned code, these aspects are not of concern here.
Pervasive, Dynamic Authentication of Physical Items:
The use of silicon PUF circuits
Authentication of physical items is an age-old problem. Common approaches include the use of bar codes, QR codes, holograms, and RFID (radio-frequency identification) tags. Traditional RFID tags and bar codes use a public identifier as a means of authenticating. A public identifier, however, is static: it is the same each time when queried and can be easily copied by an adversary. Holograms can also be viewed as public identifiers: a knowledgeable verifier knows all the attributes to inspect visually. It is difficult to make hologram-based authentication pervasive; a casual verifier does not know all the attributes to look for.
Uninitialized Reads:
Understanding the proposed revisions to the C language
Most developers understand that reading uninitialized variables in C is a defect, but some do it anyway. What happens when you read uninitialized objects is unsettled in the current version of the C standard (C11). Various proposals have been made to resolve these issues in the planned C2X revision of the standard. Consequently, this is a good time to understand existing behaviors as well as proposed revisions to the standard to influence the evolution of the C language. Given that the behavior of uninitialized reads is unsettled in C11, prudence dictates eliminating uninitialized reads from your code.
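A minimal example of the defect in question (this snippet is illustrative only; it is not taken from the article or the standard):

#include <stdio.h>

/* Reading an uninitialized automatic variable. Under C11 the result is
 * indeterminate at best, and because the variable's address is never taken,
 * the read may be undefined behavior outright. Compilers may warn, return
 * garbage, or optimize on the assumption that the read never happens. */
int main(void) {
    int x;                       /* never initialized */
    if (x > 0) {                 /* uninitialized read: do not do this */
        printf("positive?\n");
    }
    printf("%d\n", x);           /* prints an indeterminate value, if anything */
    return 0;
}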
Heterogeneous Computing: Here to Stay:
Hardware and Software Perspectives
Mentions of the buzzword heterogeneous computing have been on the rise in the past few years and will continue to be heard for years to come, because heterogeneous computing is here to stay. What is heterogeneous computing, and why is it becoming the norm? How do we deal with it, from both the software side and the hardware side? This article provides answers to some of these questions and presents different points of view on others.
Time, but Faster:
A computing adventure about time through the looking glass
The first premise was summed up perfectly by the late Douglas Adams in The Hitchhiker’s Guide to the Galaxy: "Time is an illusion. Lunchtime doubly so." The concept of time, when colliding with decoupled networks of computers that run at billions of operations per second, is... well, the truth of the matter is that you simply never really know what time it is. That is why Leslie Lamport’s seminal paper on Lamport timestamps was so important to the industry, but this article is actually about wall-clock time, or a reasonably useful estimation of it.
Life Beyond Distributed Transactions:
An apostate’s opinion
This article explores and names some of the practical approaches used in the implementation of large-scale mission-critical applications in a world that rejects distributed transactions. Topics include the management of fine-grained pieces of application data that may be repartitioned over time as the application grows. Design patterns support sending messages between these repartitionable pieces of data.
BBR: Congestion-Based Congestion Control:
Measuring bottleneck bandwidth and round-trip propagation time
When bottleneck buffers are large, loss-based congestion control keeps them full, causing bufferbloat. When bottleneck buffers are small, loss-based congestion control misinterprets loss as a signal of congestion, leading to low throughput. Fixing these problems requires an alternative to loss-based congestion control. Finding this alternative requires an understanding of where and how network congestion originates.
Faucet: Deploying SDN in the Enterprise:
Using OpenFlow and DevOps for rapid development
While SDN as a technology continues to evolve and become even more programmable, Faucet and OpenFlow 1.3 hardware together are sufficient to realize benefits today. This article describes specifically how to take advantage of DevOps practices to develop and deploy features rapidly. It also describes several practical deployment scenarios, including firewalling and network function virtualization.
Industrial Scale Agile - from Craft to Engineering:
Essence is instrumental in moving software development toward a true engineering discipline.
There are many, many ways to illustrate how fragile IT investments can be. You just have to look at the way that, even after huge investments in education and coaching, many organizations are struggling to broaden their agile adoption to the whole of their organization - or at the way other organizations are struggling to maintain the momentum of their agile adoptions as their teams change and their systems mature.
Functional at Scale:
Applying functional programming principles to distributed computing projects
Modern server software is demanding to develop and operate: it must be available at all times and in all locations; it must reply within milliseconds to user requests; it must respond quickly to capacity demands; it must process a lot of data and even more traffic; it must adapt quickly to changing product needs; and in many cases it must accommodate a large engineering organization, its many engineers the proverbial cooks in a big, messy kitchen.
Scaling Synchronization in Multicore Programs:
Advanced synchronization methods can boost the performance of multicore software.
Designing software for modern multicore processors poses a dilemma. Traditional software designs, in which threads manipulate shared data, have limited scalability because synchronization of updates to shared data serializes threads and limits parallelism. Alternative distributed software designs, in which threads do not share mutable data, eliminate synchronization and offer better scalability. But distributed designs make it challenging to implement features that shared data structures naturally provide, such as dynamic load balancing and strong consistency guarantees, and are simply not a good fit for every program. Often, however, the performance of shared mutable data structures is limited by the synchronization methods in use today, whether lock-based or lock-free.
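A minimal POSIX-threads sketch of the contrast the article draws (the thread and iteration counts are arbitrary assumptions, and a real sharded design would also pad the per-thread slots to avoid false sharing):

#include <pthread.h>
#include <stdio.h>

#define THREADS 4
#define ITERS   1000000

/* Shared-data design: every increment synchronizes on one lock,
 * so the threads serialize on the hot counter. */
static long shared_count = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *locked_worker(void *arg) {
    (void)arg;
    for (int i = 0; i < ITERS; i++) {
        pthread_mutex_lock(&lock);
        shared_count++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Distributed design: each thread updates its own slot with no
 * synchronization; the total is computed by summing the slots.
 * This scales better, but a feature such as "read the exact current
 * total" is now harder to provide. */
static long per_thread[THREADS];

static void *sharded_worker(void *arg) {
    long id = (long)arg;
    for (int i = 0; i < ITERS; i++)
        per_thread[id]++;
    return NULL;
}

int main(void) {
    pthread_t t[THREADS];

    for (long i = 0; i < THREADS; i++)
        pthread_create(&t[i], NULL, locked_worker, NULL);
    for (int i = 0; i < THREADS; i++)
        pthread_join(t[i], NULL);

    for (long i = 0; i < THREADS; i++)
        pthread_create(&t[i], NULL, sharded_worker, (void *)i);
    for (int i = 0; i < THREADS; i++)
        pthread_join(t[i], NULL);

    long total = 0;
    for (int i = 0; i < THREADS; i++)
        total += per_thread[i];

    printf("locked: %ld, sharded: %ld\n", shared_count, total);
    return 0;
}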
Idle-Time Garbage-Collection Scheduling:
Taking advantage of idleness to reduce dropped frames and memory consumption
Google’s Chrome web browser strives to deliver a smooth user experience. An animation will update the screen at 60 FPS (frames per second), giving Chrome around 16.6 milliseconds to perform the update. Within these 16.6 ms, all input events have to be processed, all animations have to be performed, and finally the frame has to be rendered. A missed deadline will result in dropped frames. These are visible to the user and degrade the user experience. Such sporadic animation artifacts are referred to here as jank. This article describes an approach implemented in the JavaScript engine V8, used by Chrome, to schedule garbage-collection pauses during times when Chrome is idle.
Dynamics of Change: Why Reactivity Matters:
Tame the dynamics of change by centralizing each concern in its own module.
Professional programming is about dealing with software at scale. Everything is trivial when the problem is small and contained: it can be elegantly solved with imperative programming or functional programming or any other paradigm. Real-world challenges arise when programmers have to deal with large amounts of data, network requests, or intertwined entities, as in UI (user interface) programming.
Cluster-level Logging of Containers with Containers:
Logging Challenges of Container-Based Cloud Deployments
This article shows how cluster-level logging infrastructure can be implemented using open-source tools and deployed using the very same abstractions that are used to compose and manage the software systems being logged. Collecting and analyzing log information is an essential aspect of running production systems: it helps ensure their reliability and provides important auditing information. Many tools, such as Fluentd and Logstash, have been developed to help aggregate and collect logs for specific software components (e.g., an Apache web server) running on specific servers.
The Hidden Dividends of Microservices:
Microservices aren’t for every company, and the journey isn’t easy.
Microservices are an approach to building distributed systems in which services are exposed only through hardened APIs; the services themselves have a high degree of internal cohesion around a specific and well-bounded context or area of responsibility, and the coupling between them is loose. Such services are typically simple, yet they can be composed into very rich and elaborate applications. The effort required to adopt a microservices-based approach is considerable, particularly in cases that involve migration from more monolithic architectures. The explicit benefits of microservices are well known and numerous, however, and can include increased agility, resilience, scalability, and developer productivity.
Debugging Distributed Systems:
Challenges and options for validation and debugging
Distributed systems pose unique challenges for software developers. Reasoning about concurrent activities of system nodes and even understanding the system’s communication topology can be difficult. A standard approach to gaining insight into system activity is to analyze system logs. Unfortunately, this can be a tedious and complex process. This article looks at several key features and debugging challenges that differentiate distributed systems from other kinds of software. The article presents several promising tools and ongoing research to help resolve these challenges.
Should You Upload or Ship Big Data to the Cloud?:
The accepted wisdom does not always hold true.
It is accepted wisdom that when the data you wish to move into the cloud is at terabyte scale and beyond, you are better off shipping it to the cloud provider, rather than uploading it. This article takes an analytical look at how shipping and uploading strategies compare, the various factors on which they depend, and under what circumstances you are better off shipping rather than uploading data, and vice versa. Such an analytical determination is important to make, given the increasing availability of gigabit-speed Internet connections, along with the explosive growth in data-transfer speeds supported by newer editions of drive interfaces such as SAS and PCI Express.
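A back-of-the-envelope comparison (the figures are illustrative assumptions, not the article's analysis) shows why the answer hinges on sustained link speed. Uploading 10 TB over a 1-Gbps connection takes

t_{upload} = \frac{10 \times 8 \times 10^{12}\ \mathrm{bits}}{10^{9}\ \mathrm{bits/s}} = 8 \times 10^{4}\ \mathrm{s} \approx 22\ \mathrm{hours},

which is already competitive with overnight shipping. At a sustained 100 Mbps the same transfer takes roughly nine days and shipping drives wins easily; at 10 Gbps it takes a little over two hours and shipping rarely makes sense.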
The Flame Graph:
This visualization of software execution is a new necessity for performance profiling and debugging.
An everyday problem in our industry is understanding how software is consuming resources, particularly CPUs. What exactly is consuming how much, and how did this change since the last software version? These questions can be answered using software profilers, tools that help direct developers to optimize their code and operators to tune their environment. The output of profilers can be verbose, however, making it laborious to study and comprehend. The flame graph provides a new visualization for profiler output and can make for much faster comprehension, reducing the time for root cause analysis.
Why Logical Clocks are Easy:
Sometimes all you need is the right language.
Any computing system can be described as executing sequences of actions, with an action being any relevant change in the state of the system. For example, reading a file to memory, modifying the contents of the file in memory, or writing the new contents to the file are relevant actions for a text editor. In a distributed system, actions execute in multiple locations; in this context, actions are often called events. Examples of events in distributed systems include sending or receiving messages, or changing some state in a node. Not all events are related, but some events can cause and influence how other, later events occur.
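As a minimal sketch of one such logical clock (a plain Lamport clock, offered as an illustration of the idea that causally related events receive ordered timestamps, not as the article's own construction):

#include <stdio.h>

/* Each node keeps a counter; local events bump it, and a message carries the
 * sender's counter so the receiver can jump ahead of it. If event A happened
 * before event B, then clock(A) < clock(B). */
typedef struct {
    unsigned long time;
} lamport_clock;

static unsigned long local_event(lamport_clock *c) {
    return ++c->time;                 /* tick on every local event */
}

static unsigned long send_event(lamport_clock *c) {
    return ++c->time;                 /* the timestamp travels with the message */
}

static unsigned long recv_event(lamport_clock *c, unsigned long msg_time) {
    if (msg_time > c->time)
        c->time = msg_time;           /* adopt the larger clock... */
    return ++c->time;                 /* ...then tick past it */
}

int main(void) {
    lamport_clock a = {0}, b = {0};

    unsigned long t1 = local_event(&a);        /* a: 1 */
    unsigned long t2 = send_event(&a);         /* a: 2, message stamped 2 */
    unsigned long t3 = recv_event(&b, t2);     /* b: 3, ordered after the send */

    printf("a.local=%lu a.send=%lu b.recv=%lu\n", t1, t2, t3);
    return 0;
}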
Use-Case 2.0:
The Hub of Software Development
Use cases have been around for almost 30 years as a requirements approach and have been part of the inspiration for more recent techniques such as user stories. Now the inspiration has flown in the other direction. Use-Case 2.0 is the new generation of use-case-driven development - light, agile, and lean - inspired by user stories and the agile methodologies Scrum and Kanban.
Statistics for Engineers:
Applying statistical techniques to operations data
Modern IT systems collect an increasing wealth of data from network gear, operating systems, applications, and other components. This data needs to be analyzed to derive vital information about the user experience and business performance. For instance, faults need to be detected, service quality needs to be measured, and resource usage for the coming days and months needs to be forecast.
Borg, Omega, and Kubernetes:
Lessons learned from three container-management systems over a decade
Though widespread interest in software containers is a relatively recent phenomenon, at Google we have been managing Linux containers at scale for more than ten years and built three different container-management systems in that time. Each system was heavily influenced by its predecessors, even though they were developed for different reasons. This article describes the lessons we’ve learned from developing and operating them.
The Verification of a Distributed System:
A practitioner’s guide to increasing confidence in system correctness
Leslie Lamport, known for his seminal work in distributed systems, famously said, "A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable." Given this bleak outlook and the large set of possible failures, how do you even begin to verify and validate that the distributed systems you build are doing the right thing?
Accountability in Algorithmic Decision-making:
A view from computational journalism
Every fiscal quarter automated writing algorithms churn out thousands of corporate earnings articles for the AP (Associated Press) based on little more than structured data. Companies such as Automated Insights, which produces the articles for AP, and Narrative Science can now write straight news articles in almost any domain that has clean and well-structured data: finance, sure, but also sports, weather, and education, among others. The articles aren’t cardboard either; they have variability, tone, and style, and in some cases readers even have difficulty distinguishing the machine-produced articles from human-written ones.
Immutability Changes Everything:
We need it, we can afford it, and the time is now.
There is an inexorable trend toward storing and sending immutable data. We need immutability to coordinate at a distance, and we can afford immutability as storage gets cheaper. This article is an amuse-bouche sampling the repeated patterns of computing that leverage immutability. Climbing up and down the compute stack really does yield a sense of déjà vu all over again.
Time is an Illusion.:
Lunchtime doubly so. - Ford Prefect to Arthur Dent in "The Hitchhiker’s Guide to the Galaxy", by Douglas Adams
One of the more surprising things about digital systems - and, in particular, modern computers - is how poorly they keep time. When most programs ran on a single system this was not a significant issue for the majority of software developers, but once software moved into the distributed-systems realm this inaccuracy became a significant challenge.
Non-volatile Storage:
Implications of the Datacenter’s Shifting Center
For the entire careers of most practicing computer scientists, a fundamental observation has consistently held true: CPUs are significantly more performant and more expensive than I/O devices. The fact that CPUs can process data at extremely high rates, while simultaneously servicing multiple I/O devices, has had a sweeping impact on the design of both hardware and software for systems of all sizes, for pretty much as long as we’ve been building them.
Schema.org: Evolution of Structured Data on the Web:
Big data makes common schemas even more necessary.
Separation between content and presentation has always been one of the important design aspects of the Web. Historically, however, even though most Web sites were driven off structured databases, they published their content purely in HTML. Services such as Web search, price comparison, reservation engines, etc. that operated on this content had access only to HTML. Applications requiring access to the structured data underlying these Web pages had to build custom extractors to convert plain HTML into structured data. These efforts were often laborious and the scrapers were fragile and error-prone, breaking every time a site changed its layout.
A Purpose-built Global Network: Google’s Move to SDN:
A discussion with Amin Vahdat, David Clark, and Jennifer Rexford
Everything about Google is at scale, of course -- a market cap of legendary proportions, an unrivaled talent pool, enough intellectual property to keep armies of attorneys in Guccis for life, and, oh yeah, a private WAN (wide area network) bigger than you can possibly imagine that also happens to be growing substantially faster than the Internet as a whole.
It Probably Works:
Probabilistic algorithms are all around us--not only are they acceptable, but some programmers actually seek out chances to use them.
Probabilistic algorithms exist to solve problems that are either impossible or unrealistic (too expensive, too time-consuming, etc.) to solve precisely. In an ideal world, you would never actually need to use probabilistic algorithms. To programmers who are not familiar with them, the idea can be positively nerve-wracking: "How do I know that it will actually work? What if it’s inexplicably wrong? How can I debug it? Maybe we should just punt on this problem, or buy a whole lot more servers..."
Challenges of Memory Management on Modern NUMA Systems:
Optimizing NUMA systems applications with Carrefour
Modern server-class systems are typically built as several multicore chips put together in a single system. Each chip has a local DRAM (dynamic random-access memory) module; together they are referred to as a node. Nodes are connected via a high-speed interconnect, and the system is fully coherent. This means that, transparently to the programmer, a core can issue requests to its node’s local memory as well as to the memories of other nodes. The key distinction is that remote requests will take longer, because they are subject to longer wire delays and may have to jump several hops as they traverse the interconnect.
Componentizing the Web:
We may be on the cusp of a new revolution in web development.
There is no task in software engineering today quite as herculean as web development. A typical specification for a web application might read: The app must work across a wide variety of browsers. It must run animations at 60 fps. It must be immediately responsive to touch. It must conform to a specific set of design principles and specs. It must work on just about every screen size imaginable, from TVs and 30-inch monitors to mobile phones and watch faces. It must be well-engineered and maintainable in the long term.
Fail at Scale:
Reliability in the face of rapid change
Failure is part of engineering any large-scale system. One of Facebook’s cultural values is embracing failure. This can be seen in the posters hung around the walls of our Menlo Park headquarters: "What Would You Do If You Weren’t Afraid?" and "Fortune Favors the Bold."
How to De-identify Your Data:
Balancing statistical accuracy and subject privacy in large social-science data sets
Big data is all the rage; using large data sets promises to give us new insights into questions that have been difficult or impossible to answer in the past. This is especially true in fields such as medicine and the social sciences, where large amounts of data can be gathered and mined to find insightful relationships among variables. Data in such fields involves humans, however, and thus raises issues of privacy that are not faced by fields such as physics or astronomy.
Crash Consistency:
Rethinking the Fundamental Abstractions of the File System
The reading and writing of data, one of the most fundamental aspects of any Von Neumann computer, is surprisingly subtle and full of nuance. For example, consider access to a shared memory in a system with multiple processors. While a simple and intuitive approach known as strong consistency is easiest for programmers to understand, many weaker models are in widespread use (e.g., x86 total store ordering); such approaches improve system performance, but at the cost of making reasoning about system behavior more complex and error-prone.
Testing a Distributed System:
Testing a distributed system can be trying even under the best of circumstances.
Distributed systems can be especially difficult to program, for a variety of reasons. They can be difficult to design, difficult to manage, and, above all, difficult to test. Testing a normal system can be trying even under the best of circumstances, and no matter how diligent the tester is, bugs can still get through. Now take all of the standard issues and multiply them by multiple processes written in multiple languages running on multiple boxes that could potentially all be on different operating systems, and there is potential for a real disaster.
Natural Language Translation at the Intersection of AI and HCI:
Old questions being answered with both AI and HCI
The fields of artificial intelligence (AI) and human-computer interaction (HCI) are influencing each other like never before. Widely used systems such as Google Translate, Facebook Graph Search, and RelateIQ hide the complexity of large-scale AI systems behind intuitive interfaces. But relations were not always so auspicious. The two fields emerged at different points in the history of computer science, with different influences, ambitions, and attendant biases. AI aimed to construct a rival, and perhaps a successor, to the human intellect. Early AI researchers such as McCarthy, Minsky, and Shannon were mathematicians by training, so theorem-proving and formal models were attractive research directions.
Beyond Page Objects: Testing Web Applications with State Objects:
Use states to drive your tests
End-to-end testing of Web applications typically involves tricky interactions with Web pages by means of a framework such as Selenium WebDriver. The recommended method for hiding such Web-page intricacies is to use page objects, but there are questions to answer first: Which page objects should you create when testing Web applications? What actions should you include in a page object? Which test scenarios should you specify, given your page objects?
Dismantling the Barriers to Entry:
We have to choose to build a web that is accessible to everyone.
A war is being waged in the world of web development. On one side is a vanguard of toolmakers and tool users, who thrive on the destruction of bad old ideas ("old," in this milieu, meaning anything that debuted on Hacker News more than a month ago) and raucous debates about transpilers and suchlike.
Hadoop Superlinear Scalability:
The perpetual motion of parallel performance
"We often see more than 100 percent speedup efficiency!" came the rejoinder to the innocent reminder that you can’t have more than 100 percent of anything. But this was just the first volley from software engineers during a presentation on how to quantify computer system scalability in terms of the speedup metric. In different venues, on subsequent occasions, that retort seemed to grow into a veritable chorus that not only was superlinear speedup commonly observed, but also the model used to quantify scalability for the past 20 years failed when applied to superlinear speedup data.
Evolution and Practice: Low-latency Distributed Applications in Finance:
The finance industry has unique demands for low-latency distributed systems.
Virtually all systems have some requirements for latency, defined here as the time required for a system to respond to input. Latency requirements appear in problem domains as diverse as aircraft flight controls, voice communications, multiplayer gaming, online advertising, and scientific experiments. Distributed systems present special latency considerations. In recent years the automation of financial trading has driven requirements for distributed systems with challenging latency requirements and global geographic distribution. Automated trading provides a window into the engineering challenges of ever-shrinking latency requirements, which may be useful to software engineers in other fields.
The Science of Managing Data Science:
Lessons learned managing a data science research team
What are they doing all day? When I first took over as VP of Engineering at a startup doing data mining and machine learning research, this was what the other executives wanted to know. They knew the team was super smart, and they seemed like they were working really hard, but the executives had lots of questions about the work itself. How did they know that the work they were doing was the "right" work? Were there other projects they could be doing instead? And how could we get this research into the hands of our customers faster?
Using Free and Open Source Tools to Manage Software Quality:
An agile process implementation
The principles of agile software development place more emphasis on individuals and interactions than on processes and tools. They steer us away from heavy documentation requirements and guide us along a path of reacting efficiently to change rather than sticking rigidly to a pre-defined plan. To support this flexible method of operation, it is important to have suitable applications to manage the team’s activities. It is also essential to implement effective frameworks to ensure quality is being built into the product early and at all levels.
From the EDVAC to WEBVACs:
Cloud computing for computer scientists
By now everyone has heard of cloud computing and realized that it is changing how both traditional enterprise IT and emerging startups are building solutions for the future. Is this trend toward the cloud just a shift in the complicated economics of the hardware and software industry, or is it a fundamentally different way of thinking about computing? Having worked in the industry, I can confidently say it is both.
Spicing Up Dart with Side Effects:
A set of extensions to the Dart programming language, designed to support asynchrony and generator functions
The Dart programming language has recently incorporated a set of extensions designed to support asynchrony and generator functions. Because Dart is a language for Web programming, latency is an important concern. To avoid blocking, developers must make methods asynchronous when computing their results requires nontrivial time. Generator functions ease the task of computing iterable sequences.
Reliable Cron across the Planet:
...or How I stopped worrying and learned to love time
This article describes Google’s implementation of a distributed Cron service, serving the vast majority of internal teams that need periodic scheduling of compute jobs. During its existence, we have learned many lessons on how to design and implement what might seem like a basic service. Here, we discuss the problems that distributed Crons face and outline some potential solutions.
There is No Now:
Problems with simultaneity in distributed systems
Now. The time elapsed between when I wrote that word and when you read it was at least a couple of weeks. That kind of delay is one that we take for granted and don’t even think about in written media. "Now." If we were in the same room and instead I spoke aloud, you might have a greater sense of immediacy. You might intuitively feel as if you were hearing the word at exactly the same time that I spoke it. That intuition would be wrong. If, instead of trusting your intuition, you thought about the physics of sound, you would know that time must have elapsed between my speaking and your hearing.
Parallel Processing with Promises:
A simple method of writing a collaborative system
In today’s world, there are many reasons to write concurrent software. The desire to improve performance and increase throughput has led to many different asynchronous techniques. The techniques involved, however, are generally complex and the source of many subtle bugs, especially if they require shared mutable state. If shared state is not required, then these problems can be solved with a better abstraction called promises. These allow programmers to hook asynchronous function calls together, waiting for each to return success or failure before running the next appropriate function in the chain.
META II: Digital Vellum in the Digital Scriptorium:
Revisiting Schorre’s 1962 compiler-compiler
Some people do living history -- reviving older skills and material culture by reenacting Waterloo or knapping flint knives. One pleasant rainy weekend in 2012, I set my sights on something a little more recent and settled in for a little meditative retro-computing, ca. 1962, following the ancient mode of transmission of knowledge: lecture and recitation -- or rather, grace of living in historical times, lecture (here, in the French sense, reading) and transcription (or even more specifically, grace of living post-Post, lecture and reimplementation).
Model-based Testing: Where Does It Stand?:
MBT has positive effects on efficiency and effectiveness, even if it only partially fulfills high expectations.
You have probably heard about MBT (model-based testing), but like many software-engineering professionals who have not used MBT, you might be curious about others’ experience with this test-design method. From mid-June 2014 to early August 2014, we conducted a survey to learn how MBT users view its efficiency and effectiveness. The 2014 MBT User Survey, a follow-up to a similar 2012 survey, was open to all those who have evaluated or used any MBT approach. Its 32 questions included some from a survey distributed at the 2013 User Conference on Advanced Automated Testing. Some questions focused on the efficiency and effectiveness of MBT, providing the figures that managers are most interested in.
Go Static or Go Home:
In the end, dynamic systems are simply less secure.
Most current and historic problems in computer and network security boil down to a single observation: letting other people control our devices is bad for us. At another time, I’ll explain what I mean by "other people" and "bad." For the purpose of this article, I’ll focus entirely on what I mean by control. One way we lose control of our devices is to external distributed denial of service (DDoS) attacks, which fill a network with unwanted traffic, leaving no room for real ("wanted") traffic. Other forms of DDoS are similar: an attack by the Low Orbit Ion Cannon (LOIC), for example, might not totally fill up a network, but it can keep a web server so busy answering useless attack requests that the server can’t answer any useful customer requests.
Securing the Network Time Protocol:
Crackers discover how to use NTP as a weapon for abuse.
In the late 1970s David L. Mills began working on the problem of synchronizing time on networked computers, and NTP (Network Time Protocol) version 1 made its debut in 1980. This was at a time when the net was a much friendlier place - the ARPANET days. NTP version 2 appeared approximately a year later, about the same time as CSNET (Computer Science Network). NSFNET (National Science Foundation Network) launched in 1986. NTP version 3 showed up in 1993.
Scalability Techniques for Practical Synchronization Primitives:
Designing locking primitives with performance in mind
In an ideal world, applications are expected to scale automatically when executed on increasingly large systems. In practice, however, not only does this scaling not occur, but it is common to see performance actually worsen on those larger systems.
Internal Access Controls:
Trust, but Verify
Every day seems to bring news of another dramatic and high-profile security incident, whether it is the discovery of longstanding vulnerabilities in widely used software such as OpenSSL or Bash, or celebrity photographs stolen and publicized. There seems to be an infinite supply of zero-day vulnerabilities and powerful state-sponsored attackers. In the face of such threats, is it even worth trying to protect your systems and data? What can systems security designers and administrators do?
Disambiguating Databases:
Use the database built for your access model.
The topic of data storage is one that doesn’t need to be well understood until something goes wrong (data disappears) or something goes really right (too many customers). Because databases can be treated as black boxes with an API, their inner workings are often overlooked. They’re often treated as magic things that just take data when offered and supply it when asked. Since these two operations are the only understood activities of the technology, they are often the only features presented when comparing different technologies.
A New Software Engineering:
What happened to the promise of rigorous, disciplined, professional practices for software development?
What happened to software engineering? What happened to the promise of rigorous, disciplined, professional practices for software development, like those observed in other engineering disciplines? What has been adopted under the rubric of "software engineering" is a set of practices largely adapted from other engineering disciplines: project management, design and blueprinting, process control, and so forth. The basic analogy was to treat software as a manufactured product, with all the real "engineering" going on upstream of that - in requirements analysis, design, modeling, etc.
There’s No Such Thing as a General-purpose Processor:
And the belief in such a device is harmful
There is an increasing trend in computer architecture to categorize processors and accelerators as "general purpose." Of the papers published at this year’s International Symposium on Computer Architecture (ISCA 2014), nine out of 45 explicitly referred to general-purpose processors; one additionally referred to general-purpose FPGAs (field-programmable gate arrays), and another referred to general-purpose MIMD (multiple instruction, multiple data) supercomputers, stretching the definition to the breaking point. This article presents the argument that there is no such thing as a truly general-purpose processor and that the belief in such a device is harmful.
The Responsive Enterprise: Embracing the Hacker Way:
Soon every company will be a software company.
As of July 2014, Facebook, founded in 2004, is in the top 20 of the most valuable companies in the S&P 500, putting the 10-year-old software company in the same league as IBM, Oracle, and Coca-Cola. Of the top five fastest-growing companies with regard to market capitalization in 2014 (table 1), three are software companies: Apple, Google, and Microsoft (in fact, one could argue that Intel is also driven by software, making it four out of five).
Evolution of the Product Manager:
Better education needed to develop the discipline
Software practitioners know that product management is a key piece of software development. Product managers talk to users to help figure out what to build, define requirements, and write functional specifications. They work closely with engineers throughout the process of building software. They serve as a sounding board for ideas, help balance the schedule when technical challenges occur - and push back to executive teams when technical revisions are needed. Product managers are involved from before the first code is written, until after it goes out the door.
Productivity in Parallel Programming: A Decade of Progress:
Looking at the design and benefits of X10
In 2002 DARPA (Defense Advanced Research Projects Agency) launched a major initiative in HPCS (high-productivity computing systems). The program was motivated by the belief that the utilization of the coming generation of parallel machines was gated by the difficulty of writing, debugging, tuning, and maintaining software at petascale.
JavaScript and the Netflix User Interface:
Conditional dependency resolution
In the two decades since its introduction, JavaScript has become the de facto official language of the Web. JavaScript trumps every other language when it comes to the number of runtime environments in the wild. Nearly every consumer hardware device on the market today supports the language in some way. While this is done most commonly through the integration of a Web browser application, many devices now also support Web views natively as part of the operating system UI (user interface).
Security Collapse in the HTTPS Market:
Assessing legal and technical solutions to secure HTTPS
HTTPS (Hypertext Transfer Protocol Secure) has evolved into the de facto standard for secure Web browsing. Through the certificate-based authentication protocol, Web services and Internet users first authenticate one another ("shake hands") using a TLS/SSL certificate, encrypt Web communications end-to-end, and show a padlock in the browser to signal that a communication is secure. In recent years, HTTPS has become an essential technology to protect social, political, and economic activities online.
Why Is It Taking So Long to Secure Internet Routing?:
Routing security incidents can still slip past deployed security defenses.
BGP (Border Gateway Protocol) is the glue that sticks the Internet together, enabling data communications between large networks operated by different organizations. BGP makes Internet communications global by setting up routes for traffic between organizations - for example, from Boston University’s network, through larger ISPs (Internet service providers) such as Level3, Pakistan Telecom, and China Telecom, then on to residential networks such as Comcast or enterprise networks such as Bank of America.
Certificate Transparency:
Public, verifiable, append-only logs
On August 28, 2011, a mis-issued wildcard HTTPS certificate for google.com was used to conduct a man-in-the-middle attack against multiple users in Iran. The certificate had been issued by a Dutch CA (certificate authority) known as DigiNotar, a subsidiary of VASCO Data Security International. Later analysis showed that DigiNotar had been aware of the breach in its systems for more than a month - since at least July 19. It also showed that at least 531 fraudulent certificates had been issued. The final count may never be known, since DigiNotar did not have records of all the mis-issued certificates.
Securing the Tangled Web:
Preventing script injection vulnerabilities through software design
Script injection vulnerabilities are a bane of Web application development: deceptively simple in cause and remedy, they are nevertheless surprisingly difficult to prevent in large-scale Web development.
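A minimal, hypothetical sketch of that cause and remedy (in Java, with a hand-rolled escape helper invented for this example; real projects should rely on a vetted escaping or templating library):

    public class Greeting {
        // Vulnerable: attacker-controlled input is concatenated straight into markup,
        // so a name like "<script>steal()</script>" becomes executable script.
        static String unsafe(String name) {
            return "<p>Hello, " + name + "</p>";
        }

        // Remedy: escape the characters that are significant to HTML before
        // embedding untrusted data in the page.
        static String escapeHtml(String s) {
            return s.replace("&", "&amp;").replace("<", "&lt;")
                    .replace(">", "&gt;").replace("\"", "&quot;");
        }

        static String safe(String name) {
            return "<p>Hello, " + escapeHtml(name) + "</p>";
        }

        public static void main(String[] args) {
            String attack = "<script>steal()</script>";
            System.out.println(unsafe(attack)); // the script reaches the browser
            System.out.println(safe(attack));   // rendered as inert text
        }
    }

The difficulty at large scale is less the escaping itself than ensuring that every output point applies it, which motivates the design-level approach the subtitle refers to.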
Privacy, Anonymity, and Big Data in the Social Sciences:
Quality social science research and the privacy of human subjects requires trust.
Open data has tremendous potential for science, but, in human subjects research, there is a tension between privacy and releasing high-quality open data. Federal law governing student privacy and the release of student records suggests that anonymizing student data protects student privacy. Guided by this standard, we de-identified and released a data set from 16 MOOCs (massive open online courses) from MITx and HarvardX on the edX platform. In this article, we show that these and other de-identification procedures necessitate changes to data sets that threaten replication and extension of baseline analyses. To balance student privacy and the benefits of open data, we suggest focusing on protecting privacy without anonymizing data by instead expanding policies that compel researchers to uphold the privacy of the subjects in open data sets.
The Network is Reliable:
An informal survey of real-world communications failures
"The network is reliable" tops Peter Deutsch’s classic list, "Eight fallacies of distributed computing", "all [of which] prove to be false in the long run and all [of which] cause big trouble and painful learning experiences." Accounting for and understanding the implications of network behavior is key to designing robust distributed programs; in fact, six of Deutsch’s "fallacies" directly pertain to limitations on networked communications.
Undergraduate Software Engineering:
Addressing the Needs of Professional Software Development
In the fall semester of 1996 RIT (Rochester Institute of Technology) launched the first undergraduate software engineering program in the United States. The culmination of five years of planning, development, and review, the program was designed from the outset to prepare graduates for professional positions in commercial and industrial software development.
Bringing Arbitrary Compute to Authoritative Data:
Many disparate use cases can be satisfied with a single storage system.
While the term "big data" is vague enough to have lost much of its meaning, today’s storage systems are growing more quickly and managing more data than ever before. Consumer devices generate large numbers of photos, videos, and other large digital assets. Machines are rapidly catching up to humans in data generation through extensive recording of system logs and metrics, as well as applications such as video capture and genome sequencing. Large data sets are now commonplace, and people increasingly want to run sophisticated analyses on the data.
Who Must You Trust?:
You must have some trust if you want to get anything done.
In his novel The Diamond Age, author Neal Stephenson describes a constructed society (called a phyle) based on extreme trust in one’s fellow members. Part of the membership requirements is that, from time to time, each member is called upon to undertake certain tasks to reinforce that trust. For example, a phyle member might be told to go to a particular location at the top of a cliff at a specific time, where he will find bungee cords with ankle harnesses attached. The other ends of the cords trail off into the bushes. At the appointed time he is to fasten the harnesses to his ankles and jump off the cliff.
Automated QA Testing at EA: Driven by Events:
A discussion with Michael Donat, Jafar Husain, and Terry Coatta
To millions of game geeks, the position of QA (quality assurance) tester at Electronic Arts must seem like a dream job. But from the company’s perspective, the overhead associated with QA can look downright frightening, particularly in an era of massively multiplayer games.
Design Exploration through Code-generating DSLs:
High-level DSLs for low-level programming
DSLs (domain-specific languages) make programs shorter and easier to write. They can be stand-alone - for example, LaTeX, Makefiles, and SQL - or they can be embedded in a host language. You might think that DSLs embedded in high-level languages would be abstract or mathematically oriented, far from the nitty-gritty of low-level programming. This is not the case. This article demonstrates how high-level EDSLs (embedded DSLs) really can ease low-level programming. There is no contradiction.
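As a tiny, hypothetical illustration of the embedding idea (not one of the DSLs from the article): in the sketch below, ordinary Java objects describe an arithmetic expression, and the "program" written in the embedded language is compiled into low-level C-like text by the host language itself.

    // A miniature code-generating EDSL: host-language objects form the DSL's
    // syntax tree, and emit() turns that tree into low-level source text.
    interface Expr { String emit(); }

    final class Lit implements Expr {
        final int value;
        Lit(int value) { this.value = value; }
        public String emit() { return Integer.toString(value); }
    }

    final class Var implements Expr {
        final String name;
        Var(String name) { this.name = name; }
        public String emit() { return name; }
    }

    final class Add implements Expr {
        final Expr left, right;
        Add(Expr left, Expr right) { this.left = left; this.right = right; }
        public String emit() { return "(" + left.emit() + " + " + right.emit() + ")"; }
    }

    public class MiniDsl {
        public static void main(String[] args) {
            // The "program" is built with ordinary host-language calls...
            Expr e = new Add(new Var("x"), new Add(new Var("y"), new Lit(1)));
            // ...and compiled to low-level code as plain text.
            System.out.println("result = " + e.emit() + ";");  // result = (x + (y + 1));
        }
    }

Because the embedded program is just host-language data, the host's abstraction, typing, and tooling remain available while the generated output stays low-level.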
Finding More Than One Worm in the Apple:
If you see something, say something.
In February Apple revealed and fixed an SSL (Secure Sockets Layer) vulnerability that had gone undiscovered since the release of iOS 6.0 in September 2012. It left users vulnerable to man-in-the-middle attacks thanks to a short circuit in the SSL/TLS (Transport Layer Security) handshake algorithm introduced by the duplication of a goto statement. Since the discovery of this very serious bug, many people have written about potential causes.
Domain-specific Languages and Code Synthesis Using Haskell:
Looking at embedded DSLs
There are many ways to give instructions to a computer: an electrical engineer might write a MATLAB program; a database administrator might write an SQL script; a hardware engineer might write in Verilog; and an accountant might write a spreadsheet with embedded formulas. Aside from the difference in language used in each of these examples, there is an important difference in form and idiom. Each uses a language customized to the job at hand, and each builds computational requests in a form both familiar and productive for programmers (although accountants may not think of themselves as programmers).
The NSA and Snowden: Securing the All-Seeing Eye:
How good security at the NSA could have stopped him
Edward Snowden, while an NSA (National Security Agency) contractor at Booz Allen Hamilton in Hawaii, copied up to 1.7 million top-secret and above documents, smuggling copies on a thumb drive out of the secure facility in which he worked, and later released many to the press. This has altered the relationship of the U.S. government with the American people, as well as with other countries. This article examines the computer security aspects of how the NSA could have prevented this, perhaps the most damaging breach of secrets in U.S. history.
The Curse of the Excluded Middle:
Mostly functional programming does not work.
There is a trend in the software industry to sell "mostly functional" programming as the silver bullet for solving problems developers face with concurrency, parallelism (manycore), and, of course, Big Data. Contemporary imperative languages could continue the ongoing trend, embrace closures, and try to limit mutation and other side effects. Unfortunately, just as "mostly secure" does not work, "mostly functional" does not work either. Instead, developers should seriously consider a completely fundamentalist option as well: embrace pure lazy functional programming with all effects explicitly surfaced in the type system using monads.
Don’t Settle for Eventual Consistency:
Stronger properties for low-latency geo-replicated storage
Geo-replicated storage provides copies of the same data at multiple, geographically distinct locations. Facebook, for example, geo-replicates its data (profiles, friends lists, likes, etc.) to data centers on the east and west coasts of the United States, and in Europe. In each data center, a tier of separate Web servers accepts browser requests and then handles those requests by reading and writing data from the storage system.
A Primer on Provenance:
Better understanding of data requires tracking its history and context.
Assessing the quality or validity of a piece of data is not usually done in isolation. You typically examine the context in which the data appears and try to determine its original sources or review the process through which it was created. This is not so straightforward when dealing with digital data, however: the result of a computation might have been derived from numerous sources and by applying complex successive transformations, possibly over long periods of time.
Multipath TCP:
Decoupled from IP, TCP is at last able to support multihomed hosts.
The Internet relies heavily on two protocols. In the network layer, IP (Internet Protocol) provides an unreliable datagram service and ensures that any host can exchange packets with any other host. Since its creation in the 1970s, IP has seen the addition of several features, including multicast, IPsec (IP security), and QoS (quality of service). The latest revision, IPv6 (IP version 6), supports 16-byte addresses.
Major-league SEMAT: Why Should an Executive Care?:
Becoming better, faster, cheaper, and happier
In today’s ever more competitive world, boards of directors and executives demand that CIOs and their teams deliver "more with less." Studies show, without any real surprise, that there is no one-size-fits-all method to suit all software initiatives, and that a practice-based approach with some light but effective degree of order and governance is the goal of most software-development departments.
Eventually Consistent: Not What You Were Expecting?:
Methods of quantifying consistency (or lack thereof) in eventually consistent storage systems
Storage systems continue to lay the foundation for modern Internet services such as Web search, e-commerce, and social networking. Pressures caused by rapidly growing user bases and data sets have driven system designs away from conventional centralized databases and toward more scalable distributed solutions, including simple NoSQL key-value storage systems, as well as more elaborate NewSQL databases that support transactions at scale.
Scaling Existing Lock-based Applications with Lock Elision:
Lock elision enables existing lock-based programs to achieve the performance benefits of nonblocking synchronization and fine-grain locking with minor software engineering effort.
Multithreaded applications take advantage of increasing core counts to achieve high performance. Such programs, however, typically require programmers to reason about data shared among multiple threads. Programmers use synchronization mechanisms such as mutual-exclusion locks to ensure correct updates to shared data in the presence of accesses from multiple threads. Unfortunately, these mechanisms serialize thread accesses to the data and limit scalability.
Rate-limiting State:
The edge of the Internet is an unruly place
By design, the Internet core is dumb, and the edge is smart. This design decision has enabled the Internet’s wildcat growth, since without complexity the core can grow at the speed of demand. On the downside, the decision to put all smartness at the edge means we’re at the mercy of scale when it comes to the quality of the Internet’s aggregate traffic load. Not all device and software builders have the skills and the quality assurance budgets that something the size of the Internet deserves.
The API Performance Contract:
How can the expected interactions between caller and implementation be guaranteed?
When you call functions in an API, you expect them to work correctly; sometimes this expectation is called a contract between the caller and the implementation. Callers also have performance expectations about these functions, and often the success of a software system depends on the API meeting these expectations. So there’s a performance contract as well as a correctness contract. The performance contract is usually implicit, often vague, and sometimes breached (by caller or implementation). How can this aspect of API design and documentation be improved?
Provenance in Sensor Data Management:
A cohesive, independent solution for bringing provenance to scientific research
In today’s information-driven workplaces, data is constantly being moved around and undergoing transformation. The typical business-as-usual approach is to use e-mail attachments, shared network locations, databases, and more recently, the cloud. More often than not, there are multiple versions of the data sitting in different locations, and users of this data are confounded by the lack of metadata describing its provenance - in other words, its lineage. The ProvDMS project at the Oak Ridge National Laboratory (ORNL), described in this article, aims to solve this issue in the context of sensor data.
Unikernels: Rise of the Virtual Library Operating System:
What if all the software layers in a virtual appliance were compiled within the same safe, high-level language framework?
Cloud computing has been pioneering the business of renting computing resources in large data centers to multiple (and possibly competing) tenants. The basic enabling technology for the cloud is operating-system virtualization such as Xen or VMware, which allows customers to multiplex VMs (virtual machines) on a shared cluster of physical machines. Each VM presents as a self-contained computer, booting a standard operating-system kernel and running unmodified applications just as if it were executing on a physical machine.
Toward Software-defined SLAs:
Enterprise computing in the public cloud
The public cloud has introduced new technology and architectures that could reshape enterprise computing. In particular, the public cloud is a new design center for enterprise applications, platform software, and services. API-driven orchestration of large-scale, on-demand resources is an important new design attribute, which differentiates public-cloud from conventional enterprise data-center infrastructure. Enterprise applications must adapt to the new public-cloud design center, but at the same time new software and system design patterns can add enterprise attributes and service levels to public-cloud services.
The Road to SDN:
An intellectual history of programmable networks
Designing and managing networks has become more innovative over the past few years with the aid of SDN (software-defined networking). This technology seems to have appeared suddenly, but it is actually part of a long history of trying to make computer networks more programmable.
The Software Inferno:
Dante’s tale, as experienced by a software architect
The Software Inferno is a tale that parallels The Inferno, Part One of The Divine Comedy written by Dante Alighieri in the early 1300s. That literary masterpiece describes the condemnation and punishment faced by a variety of sinners in their hell-spent afterlives as recompense for atrocities committed during their earthly existences. The Software Inferno is a similar account, describing a journey where "sinners against software" are encountered amidst their torment, within their assigned areas of eternal condemnation, and paying their penance.
Making the Web Faster with HTTP 2.0:
HTTP continues to evolve
HTTP (Hypertext Transfer Protocol) is one of the most widely used application protocols on the Internet. Since its publication, RFC 2616 (HTTP 1.1) has served as a foundation for the unprecedented growth of the Internet: billions of devices of all shapes and sizes, from desktop computers to the tiny Web devices in our pockets, speak HTTP every day to deliver news, video, and millions of other Web applications we have all come to depend on in our everyday lives.
Intermediate Representation:
The increasing significance of intermediate representations in compilers
Program compilation is a complicated process. A compiler is a software program that translates a high-level source language program into a form ready to execute on a computer. Early in the evolution of compilers, designers introduced IRs (intermediate representations, also commonly called intermediate languages) to manage the complexity of the compilation process. The use of an IR as the compiler’s internal representation of the program enables the compiler to be broken up into multiple phases and components, thus benefiting from modularity.
The Challenge of Cross-language Interoperability:
Interfacing between languages is increasingly important.
Interoperability between languages has been a problem since the second programming language was invented. Solutions have ranged from language-independent object models such as COM (Component Object Model) and CORBA (Common Object Request Broker Architecture) to VMs (virtual machines) designed to integrate languages, such as JVM (Java Virtual Machine) and CLR (Common Language Runtime). With software becoming ever more complex and hardware less homogeneous, the likelihood of a single language being the correct tool for an entire program is lower than ever. As modern compilers become more modular, there is potential for a new generation of interesting solutions.
Agile and SEMAT - Perfect Partners:
Combining agile and SEMAT yields more advantages than either one alone
Today, as always, many different initiatives are under way to improve the ways in which software is developed. The most popular and prevalent of these is the agile movement. One of the newer kids on the block is the SEMAT (Software Engineering Method and Theory) initiative. As with any new initiative, people are struggling to see how it fits into the world and relates to all the other things going on. For example, does it improve or replace their current ways of working?
Adopting DevOps Practices in Quality Assurance:
Merging the art and science of software development
Software life-cycle management was, for a very long time, a controlled exercise. The duration of product design, development, and support was predictable enough that companies and their employees scheduled their finances, vacations, surgeries, and mergers around product releases. When developers were busy, QA (quality assurance) had it easy. As the coding portion of a release cycle came to a close, QA took over while support ramped up. Then when the product released, the development staff exhaled, rested, and started the loop again while the support staff transitioned to busily supporting the new product.
Passively Measuring TCP Round-trip Times:
A close look at RTT measurements with TCP
Measuring and monitoring network RTT (round-trip time) is important for multiple reasons: it allows network operators and end users to understand their network performance and help optimize their environment, and it helps businesses understand the responsiveness of their services to sections of their user base.
Leaking Space:
Eliminating memory hogs
A space leak occurs when a computer program uses more memory than necessary. In contrast to memory leaks, where the leaked memory is never released, the memory consumed by a space leak is released, but later than expected. This article presents example space leaks and how to spot and eliminate them.
Barbarians at the Gateways:
High-frequency Trading and Exchange Technology
I am a former high-frequency trader. For a few wonderful years I led a group of brilliant engineers and mathematicians, and together we traded in the electronic marketplaces and pushed systems to the edge of their capability.
Online Algorithms in High-frequency Trading:
The challenges faced by competing HFT algorithms
HFT (high-frequency trading) has emerged as a powerful force in modern financial markets. Only 20 years ago, most of the trading volume occurred in exchanges such as the New York Stock Exchange, where humans dressed in brightly colored outfits would gesticulate and scream their trading intentions. Nowadays, trading occurs mostly in electronic servers in data centers, where computers communicate their trading intentions through network messages. This transition from physical exchanges to electronic platforms has been particularly profitable for HFT firms, which invested heavily in the infrastructure of this new environment.
The Balancing Act of Choosing Nonblocking Features:
Design requirements of nonblocking systems
What is nonblocking progress? Consider the simple example of incrementing a counter C shared among multiple threads. One way to do so is by protecting the steps of incrementing C by a mutual exclusion lock L (i.e., acquire(L); old := C; C := old+1; release(L)). If a thread P is holding L, then a different thread Q must wait for P to release L before Q can proceed to operate on C. That is, Q is blocked by P.
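A minimal sketch of the two approaches in Java (illustrative, not code from the article): the lock-based version blocks any thread that arrives while the lock is held, whereas the compare-and-swap loop never makes one thread wait on another; a thread whose update loses a race simply retries.

    import java.util.concurrent.atomic.AtomicInteger;
    import java.util.concurrent.locks.ReentrantLock;

    public class Counters {
        // Blocking version: while one thread holds L, every other thread waits.
        private static final ReentrantLock L = new ReentrantLock();
        private static int c = 0;

        static void lockedIncrement() {
            L.lock();                 // acquire(L)
            try {
                int old = c;          // old := C
                c = old + 1;          // C := old + 1
            } finally {
                L.unlock();           // release(L)
            }
        }

        // Nonblocking version: a thread whose compare-and-swap fails (because
        // another thread updated the counter first) re-reads the value and retries.
        private static final AtomicInteger atomicC = new AtomicInteger();

        static void casIncrement() {
            int old;
            do {
                old = atomicC.get();
            } while (!atomicC.compareAndSet(old, old + 1));
        }
    }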
NUMA (Non-Uniform Memory Access): An Overview:
NUMA becomes more common because memory controllers get close to execution units on microprocessors.
NUMA (non-uniform memory access) is the phenomenon that memory at various points in the address space of a processor has different performance characteristics. At current processor speeds, the signal path length from the processor to memory plays a significant role. Increased signal path length not only increases latency to memory but also quickly becomes a throughput bottleneck if the signal path is shared by multiple processors. The performance differences to memory were noticeable first on large-scale systems where data paths were spanning motherboards or chassis. These systems required modified operating-system kernels with NUMA support that explicitly understood the topological properties of the system’s memory (such as the chassis in which a region of memory was located) in order to avoid excessively long signal path lengths.
20 Obstacles to Scalability:
Watch out for these pitfalls that can prevent Web application scaling.
Web applications can grow in fits and starts. Customer numbers can increase rapidly, and application usage patterns can vary seasonally. This unpredictability necessitates an application that is scalable. What is the best way of achieving scalability?
Rules for Mobile Performance Optimization:
An overview of techniques to speed page loading
Performance has always been crucial to the success of Web sites. A growing body of research has proven that even small improvements in page-load times lead to more sales, more ad revenue, more stickiness, and more customer satisfaction for enterprises ranging from small e-commerce shops to megachains such as Walmart.
Best Practices on the Move: Building Web Apps for Mobile Devices:
Which practices should be modified or avoided altogether by developers for the mobile Web?
If it wasn’t your priority last year or the year before, it’s sure to be your priority now: bring your Web site or service to mobile devices in 2013 or suffer the consequences. Early adopters have been talking about mobile taking over since 1999 - anticipating the trend by only a decade or so. Today, mobile Web traffic is dramatically on the rise, and creating a slick mobile experience is at the top of everyone’s mind. Total mobile data traffic is expected to exceed 10 exabytes per month by 2017.
The Antifragile Organization:
Embracing Failure to Improve Resilience and Maximize Availability
Failure is inevitable. Disks fail. Software bugs lie dormant waiting for just the right conditions to bite. People make mistakes. Data centers are built on farms of unreliable commodity hardware. If you’re running in a cloud environment, then many of these factors are outside of your control. To compound the problem, failure is not predictable and doesn’t occur with uniform probability and frequency. The lack of a uniform frequency increases uncertainty and risk in the system.
Nonblocking Algorithms and Scalable Multicore Programming:
Exploring some alternatives to lock-based synchronization
Real-world systems with complicated quality-of-service guarantees may require a delicate balance between throughput and latency to meet operating requirements in a cost-efficient manner. The increasing availability and decreasing cost of commodity multicore and many-core systems make concurrency and parallelism increasingly necessary for meeting demanding performance requirements. Unfortunately, the design and implementation of correct, efficient, and scalable concurrent software is often a daunting task.
Proving the Correctness of Nonblocking Data Structures:
So you’ve decided to use a nonblocking data structure, and now you need to be certain of its correctness. How can this be achieved?
Nonblocking synchronization can yield astonishing results in terms of scalability and realtime response, but at the expense of verification state space.
Structured Deferral: Synchronization via Procrastination:
We simply do not have a synchronization mechanism that can enforce mutual exclusion.
Developers often take a proactive approach to software design, especially those from cultures valuing industriousness over procrastination. Lazy approaches, however, have proven their value, with examples including reference counting, garbage collection, and lazy evaluation. This structured deferral takes the form of synchronization via procrastination, specifically reference counting, hazard pointers, and RCU (read-copy-update).
Realtime GPU Audio:
Finite difference-based sound synthesis using graphics processors
Today’s CPUs are capable of supporting realtime audio for many popular applications, but some compute-intensive audio applications require hardware acceleration. This article looks at some realtime sound-synthesis applications and shares the authors’ experiences implementing them on GPUs (graphics processing units).
There’s Just No Getting around It: You’re Building a Distributed System:
Building a distributed system requires a methodical approach to requirements.
Distributed systems are difficult to understand, design, build, and operate. They introduce exponentially more variables into a design than a single machine does, making the root cause of an application problem much harder to discover. It should be said that if an application does not have meaningful SLAs (service-level agreements) and can tolerate extended downtime and/or performance degradation, then the barrier to entry is greatly reduced. Most modern applications, however, have an expectation of resiliency from their users, and SLAs are typically measured by "the number of nines" (e.g., 99.9 or 99.99 percent availability per month).
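To make those nines concrete, a quick worked example (assuming a 30-day month):

    $(1 - 0.999) \times 30 \times 24 \times 60 \approx 43.2$ minutes of allowable downtime per month
    $(1 - 0.9999) \times 30 \times 24 \times 60 \approx 4.3$ minutes of allowable downtime per month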
A File System All Its Own:
Flash memory has come a long way. Now it’s time for software to catch up.
In the past five years, flash memory has progressed from a promising accelerator, whose place in the data center was still uncertain, to an established enterprise component for storing performance-critical data. Its rise to prominence followed its proliferation in the consumer world and the volume economics that followed (see figure 1). With SSDs (solid-state drives), flash arrived in a form optimized for compatibility - just replace a hard drive with an SSD for radically better performance. But the properties of the NAND flash memory used by SSDs differ significantly from those of the magnetic media in the hard drives they often displace.
Eventual Consistency Today: Limitations, Extensions, and Beyond:
How can applications be built on eventually consistent infrastructure given no guarantee of safety?
In a July 2000 conference keynote, Eric Brewer, now VP of engineering at Google and a professor at the University of California, Berkeley, publicly postulated the CAP (consistency, availability, and partition tolerance) theorem, which would change the landscape of how distributed storage systems were architected. Brewer’s conjecture--based on his experiences building infrastructure for some of the first Internet search engines at Inktomi--states that distributed systems requiring always-on, highly available operation cannot guarantee the illusion of coherent, consistent single-system operation in the presence of network partitions, which cut communication between active servers.
Discrimination in Online Ad Delivery:
Google ads, black names and white names, racial discrimination, and click advertising
Do online ads suggestive of arrest records appear more often with searches of black-sounding names than white-sounding names? What is a black-sounding name or white-sounding name, anyway? How many more times would an ad have to appear adversely affecting one racial group for it to be considered discrimination? Is online activity so ubiquitous that computer scientists have to think about societal consequences such as structural racism in technology design? If so, how is this technology to be built? Let’s take a scientific dive into online ad delivery to find answers.
How Fast is Your Web Site?:
Web site performance data has never been more readily available.
The overwhelming evidence indicates that a Web site’s performance (speed) correlates directly to its success, across industries and business metrics. With such a clear correlation (and even proven causation), it is important to monitor how your Web site performs. So, how fast is your Web site?
FPGA Programming for the Masses:
The programmability of FPGAs must improve if they are to be part of mainstream computing.
When looking at how hardware influences computing performance, we have GPPs (general-purpose processors) on one end of the spectrum and ASICs (application-specific integrated circuits) on the other. Processors are highly programmable but often inefficient in terms of power and performance. ASICs implement a dedicated and fixed function and provide the best power and performance characteristics, but any functional change requires a complete (and extremely expensive) re-spinning of the circuits.
The Evolution of Web Development for Mobile Devices:
Building Web sites that perform well on mobile devices remains a challenge.
The biggest change in Web development over the past few years has been the remarkable rise of mobile computing. Mobile phones used to be extremely limited devices that were best used for making phone calls and sending short text messages. Today’s mobile phones are more powerful than the computers that took Apollo 11 to the moon, and they can send data to and from nearly anywhere.
The Story of the Teapot in DHTML:
It’s easy to do amazing things, such as rendering the classic teapot in HTML and CSS.
Before there was SVG (Scalable Vector Graphics), WebGL (Web Graphics Library), Canvas, or much of anything for graphics in the browser, it was possible to do quite a lot more than was initially obvious. To demonstrate, we created a JavaScript program that renders polygonal 3D graphics using nothing more than HTML and CSS. Our proof-of-concept is fast enough to support physics-based small-game content, but we started with the iconic 3D "Utah teapot" because it tells the whole story in one picture. It’s feasible to render this classic object using just regular DIV elements, CSS styles, and a little bit of JavaScript code.
Making the Mobile Web Faster:
Mobile performance issues? Fix the back end, not just the client.
Mobile clients have been on the rise and will only continue to grow. This means that if you are serving clients over the Internet, you cannot ignore the customer experience on a mobile device. There are many informative articles on mobile performance, and just as many on general API design, but you’ll find few discussing the design considerations needed to optimize the back-end systems for mobile clients. Whether you have an app, mobile Web site, or both, it is likely that these clients are consuming APIs from your back-end systems.
Hazy: Making it Easier to Build and Maintain Big-data Analytics:
Racing to unleash the full potential of big data with the latest statistical and machine-learning techniques.
The rise of big data presents both big opportunities and big challenges in domains ranging from enterprises to sciences. The opportunities include better-informed business decisions, more efficient supply-chain management and resource allocation, more effective targeting of products and advertisements, better ways to "organize the world’s information," faster turnaround of scientific discoveries, etc.
A Decade of OS Access-control Extensibility:
Open source security foundations for mobile and embedded devices
To discuss operating system security is to marvel at the diversity of deployed access-control models: Unix and Windows NT multiuser security; Type Enforcement in SELinux; anti-malware products; app sandboxing in Apple OS X, Apple iOS, and Google Android; and application-facing systems such as Capsicum in FreeBSD. This diversity is the result of a stunning transition from the narrow 1990s Unix and NT status quo to "security localization" - the adaptation of operating-system security models to site-local or product-specific requirements.
Rethinking Passwords:
Our authentication system is lacking. Is improvement possible?
There is an authentication plague upon the land. We have to claim and assert our identity repeatedly to a host of authentication trolls, each jealously guarding an Internet service of some sort. Each troll has specific rules for passwords, and the rules vary widely and incomprehensibly.
Thinking Methodically about Performance:
The USE method addresses shortcomings in other commonly used methodologies.
Performance issues can be complex and mysterious, providing little or no clue to their origin. In the absence of a starting point, performance issues are often analyzed randomly: guessing where the problem may be and then changing things until it goes away. While this can deliver results, it can also be time-consuming and disruptive, and it may ultimately overlook certain issues. This article describes system-performance issues and the methodologies in use today for analyzing them, and it proposes a new methodology for approaching and solving a class of issues.
Splinternet Behind the Great Firewall of China:
Once China opened its door to the world, it could not close it again.
What if you could not access YouTube, Facebook, Twitter, and Wikipedia? How would you feel if Google informed you that your connection had been reset during a search? What if Gmail was only periodically available, and Google Docs, which was used to compose this article, was completely unreachable? What a mess!
Condos and Clouds:
Constraints in an environment empower the services.
Living in a condominium has its constraints and its services. By defining the lifestyle and limits on usage patterns, it is possible to pack many homes close together and to provide the residents with many conveniences. Condo living can offer a great value to those interested and willing to live within its constraints and enjoy the sharing of common services.
The Web Won’t Be Safe or Secure until We Break It:
Unless you’ve taken very particular precautions, assume every Web site you visit knows exactly who you are.
The Internet was designed to deliver information, but few people envisioned the vast amounts of information that would be involved or the personal nature of that information. Similarly, few could have foreseen the potential flaws in the design of the Internet that would expose this personal information, compromising the data of individuals and companies.
The Essence of Software Engineering: The SEMAT Kernel:
A thinking framework in the form of an actionable kernel
Everyone who develops software knows that it is a complex and risky business, and its participants are always on the lookout for new ideas that will lead to better software. Fortunately, software engineering is still a young and growing profession that sees innovations and improvements in best practices every year. Just look, for example, at the improvements and benefits that lean and agile thinking have brought to software-development teams.
Anatomy of a Solid-state Drive:
While the ubiquitous SSD shares many features with the hard-disk drive, under the surface they are completely different.
Over the past several years, a new type of storage device has entered laptops and data centers, fundamentally changing expectations regarding the power, size, and performance dynamics of storage. The SSD (solid-state drive) is a technology that has been around for more than 30 years but remained too expensive for broad adoption.
Sender-side Buffers and the Case for Multimedia Adaptation:
A proposal to improve the performance and availability of streaming video and other time-sensitive media
The Internet/Web architecture has developed to the point where it is common for the most popular sites to operate at a virtually unlimited scale, and many sites now cater to hundreds of millions of unique users. Performance and availability are generally essential to attract and sustain such user bases. As such, the network and server infrastructure plays a critical role in the fierce competition for users. Web pages should load in tens to a few hundred milliseconds at most.
Weathering the Unexpected:
Failures happen, and resilience drills help organizations prepare for them.
Whether it is a hurricane blowing down power lines, a volcanic-ash cloud grounding all flights for a continent, or a humble rodent gnawing through underground fibers -- the unexpected happens. We cannot do much to prevent it, but there is a lot we can do to be prepared for it. To this end, Google runs an annual, company-wide, multi-day Disaster Recovery Testing event -- DiRT -- the objective of which is to ensure that Google’s services and internal business operations continue to run following a disaster.
Disks from the Perspective of a File System:
Disks lie. And the controllers that run them are partners in crime.
Most applications do not deal with disks directly, instead storing their data in files in a file system, which protects us from those scoundrel disks. After all, a key task of the file system is to ensure that the file system can always be recovered to a consistent state after an unplanned system crash (for example, a power failure). While a good file system will be able to beat the disks into submission, the required effort can be great and the reduced performance annoying.
Toward Higher Precision:
An introduction to PTP and its significance to NTP practitioners
It is difficult to overstate the importance of synchronized time to modern computer systems. Our lives today depend on the financial transactions, telecommunications, power generation and delivery, high-speed manufacturing, and discoveries in "big physics," among many other things, that are driven by fast, powerful computing devices coordinated in time with each other.
Fault Injection in Production:
Making the case for resilience testing
When we build Web infrastructures at Etsy, we aim to make them resilient. This means designing them carefully so that they can sustain their (increasingly critical) operations in the face of failure. Thankfully, there have been a couple of decades and reams of paper spent on researching how fault tolerance and graceful degradation can be brought to computer systems. That helps the cause.
All Your Database Are Belong to Us:
In the big open world of the cloud, highly available distributed objects will rule.
In the database world, the raw physical data model is at the center of the universe, and queries freely assume intimate details of the data representation (indexes, statistics, metadata). This closed-world assumption and the resulting lack of abstraction have the pleasant effect of allowing the data to outlive the application. On the other hand, this makes it hard to evolve the underlying model independently from the queries over the model.
Software Needs Seatbelts and Airbags:
Finding and fixing bugs in deployed software is difficult and time-consuming. Here are some alternatives.
Like death and taxes, buggy code is an unfortunate fact of life. Nearly every program ships with known bugs, and probably all of them end up with bugs that are discovered only post-deployment. There are many reasons for this sad state of affairs.
A New Objective-C Runtime: from Research to Production:
Backward compatibility always trumps new features.
The path from the research prototype (Étoilé runtime) to the shipping version (GNUstep runtime) involved a complete rewrite and redesign. This isn’t necessarily a bad thing: part of the point of building a prototype is to learn what makes sense and what doesn’t, and to investigate what is feasible in a world where you control the entire system, but not necessarily in production.
Multitier Programming in Hop:
A first step toward programming 21st-century applications
The Web is becoming the richest platform on which to create computer applications. Its power comes from three elements: (1) modern Web browsers enable highly sophisticated GUIs with 3D, multimedia, fancy typesetting, etc.; (2) calling existing services through Web APIs makes it possible to develop sophisticated applications from independently available components; and (3) open data availability allows applications to access a wide set of information that was unreachable or that simply did not exist before. The combination of these three elements has already given birth to revolutionary applications such as Google Maps, radio podcasts, and social networks.
OpenFlow: A Radical New Idea in Networking:
An open standard that enables software-defined networking
Computer networks have historically evolved box by box, with individual network elements occupying specific ecological niches as routers, switches, load balancers, NATs (network address translations), or firewalls. Software-defined networking proposes to overturn that ecology, turning the network as a whole into a platform and the individual network elements into programmable entities. The apps running on the network platform can optimize traffic flows to take the shortest path, just as the current distributed protocols do, but they can also optimize the network to maximize link utilization, create different reachability domains for different users, or make device mobility seamless.
Extending the Semantics of Scheduling Priorities:
Increasing parallelism demands new paradigms.
Application performance is directly affected by the hardware resources that the application requires, the degree to which such resources are available, and how the operating system addresses its requirements with regard to the other processes in the system. Ideally, an application would have access to all the resources it could use and be allowed to complete its work without competing with any other activity in the system. In a world of highly shared hardware resources and general-purpose, time-share-based operating systems, however, no guarantees can be made as to how well resourced an application will be.
Getting What You Measure:
Four common pitfalls in using software metrics for project management
Software metrics - helpful tools or a waste of time? For every developer who treasures these mathematical abstractions of software systems there is a developer who thinks software metrics are invented just to keep project managers busy. Software metrics can be very powerful tools that help achieve your goals, but it is important to use them correctly, as they also have the power to demotivate project teams and steer development in the wrong direction.
Modeling People and Places with Internet Photo Collections:
Understanding the world from the sea of online photos
This article describes our work in using online photo collections to reconstruct information about the world and its inhabitants at both global and local scales. This work has been driven by the dramatic growth of social content-sharing Web sites, which have created immense online collections of user-generated visual data. Flickr.com alone currently hosts more than 6 billion images taken by more than 40 million unique users, while Facebook.com has said it grows by nearly 250 million photos every day.
Controlling Queue Delay:
A modern AQM is just one piece of the solution to bufferbloat.
Nearly three decades after it was first diagnosed, the "persistently full buffer problem," recently exposed as part of "bufferbloat," is still with us and is made increasingly critical by two trends. First, cheap memory and a "more is better" mentality have led to the inflation and proliferation of buffers. Second, dynamically varying path characteristics are much more common today and are the norm at the consumer Internet edge. Reasonably sized buffers become extremely oversized when link rates and path delays fall below nominal values.
A Guided Tour through Data-center Networking:
A good user experience depends on predictable performance within the data-center network.
The magic of the cloud is that it is always on and always available from anywhere. Users have come to expect that services are there when they need them. A data center (or warehouse-scale computer) is the nexus from which all the services flow. It is often housed in a nondescript warehouse-sized building bearing no indication of what lies inside. Amidst the whirring fans and refrigerator-sized computer racks is a tapestry of electrical cables and fiber optics weaving everything together -- the data-center network.
Realtime Computer Vision with OpenCV:
Mobile computer-vision technology will soon become as ubiquitous as touch interfaces.
Computer vision is a rapidly growing field devoted to analyzing, modifying, and achieving high-level understanding of images. Its objective is to determine what is happening in front of a camera and use that understanding to control a computer or robotic system, or to provide people with new images that are more informative or aesthetically pleasing than the original camera images. Application areas for computer-vision technology include video surveillance, biometrics, automotive, photography, movie production, Web search, medicine, augmented reality gaming, new user interfaces, and many more.
Idempotence Is Not a Medical Condition:
An essential property for reliable systems
The definition of distributed computing can be confusing. Sometimes, it refers to a tightly coupled cluster of computers working together to look like one larger computer. More often, however, it refers to a bunch of loosely related applications chattering together without a lot of system-level support. This lack of support in distributed computing environments makes it difficult to write applications that work together. Messages sent between systems do not have crisp guarantees for delivery. They can get lost, and so, after a timeout, they are retried. The application on the other side of the communication may see multiple messages arrive where one was intended.
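One standard defense, sketched minimally below in Java (the names and the dedup scheme are illustrative, not from the article), is to make the receiving application idempotent: remember which message IDs have already been applied so that a retried duplicate has no further effect.

    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;

    public class IdempotentHandler {
        // IDs of messages whose effects have already been applied.
        private final Set<String> applied = ConcurrentHashMap.newKeySet();
        private long balance = 0;

        // Each message carries a unique ID chosen by the sender; a retry reuses it.
        public synchronized void credit(String messageId, long amount) {
            if (!applied.add(messageId)) {
                return; // duplicate delivery of a retried message: ignore it
            }
            balance += amount; // applying the same message twice would double-credit
        }

        public synchronized long balance() { return balance; }

        public static void main(String[] args) {
            IdempotentHandler account = new IdempotentHandler();
            account.credit("msg-42", 100);
            account.credit("msg-42", 100); // the sender timed out and retried
            System.out.println(account.balance()); // prints 100, not 200
        }
    }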
CPU DB: Recording Microprocessor History:
With this open database, you can mine microprocessor trends over the past 40 years.
In November 1971, Intel introduced the world’s first single-chip microprocessor, the Intel 4004. It had 2,300 transistors, ran at a clock speed of up to 740 KHz, and delivered 60,000 instructions per second while dissipating 0.5 watts. The following four decades witnessed exponential growth in compute power, a trend that has enabled applications as diverse as climate modeling, protein folding, and computing real-time ballistic trajectories of angry birds.
Your Mouse is a Database:
Web and mobile applications are increasingly composed of asynchronous and realtime streaming services and push notifications.
Among the hottest buzzwords in the IT industry these days is "big data," but the "big" is something of a misnomer: big data is not just about volume, but also about velocity and variety. The volume of data ranges from a small number of items stored in the closed world of a conventional RDBMS (relational database management system) to a large number of items spread out over a large cluster of machines or across the entire World Wide Web.
Managing Technical Debt:
Shortcuts that save money and time today can cost you down the road.
In 1992, Ward Cunningham published a report at OOPSLA (Object-oriented Programming, Systems, Languages, and Applications) in which he proposed the concept of technical debt. He defines it in terms of immature code: "Shipping first-time code is like going into debt." Technical debt isn’t limited to first-time code, however. There are many ways and reasons (not all bad) to take on technical debt.
Interactive Dynamics for Visual Analysis:
A taxonomy of tools that support the fluent and flexible use of visualizations
The increasing scale and availability of digital data provides an extraordinary resource for informing public policy, scientific discovery, business strategy, and even our personal lives. To get the most out of such data, however, users must be able to make sense of it: to pursue questions, uncover patterns of interest, and identify (and potentially correct) errors. In concert with data-management systems and statistical algorithms, analysis requires contextualized human judgments regarding the domain-specific significance of the clusters, trends, and outliers discovered in data.
Why LINQ Matters: Cloud Composability Guaranteed:
The benefits of composability are becoming clear in software engineering.
In this article we use LINQ (Language-integrated Query) as the guiding example of composability. LINQ is a specification of higher-order operators designed specifically to be composable. This specification is broadly applicable over anything that fits a loose definition of "collection," from objects in memory to asynchronous data streams to resources distributed in the cloud. With such a design, developers build up complexity by chaining together transforms and filters in various orders and by nesting the chains--that is, by building expression trees of operators.
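The shape of that composability is easy to see even outside .NET. The sketch below, in plain Python rather than LINQ itself, chains filter and transform operators over an ordinary in-memory collection; the operator names and the sample data are invented for illustration.

```python
# A toy chainable query wrapper: each operator returns a new Query, so
# complexity is built up by composing transforms and filters in any order.
# This illustrates the composability idea, not LINQ's actual API.

class Query:
    def __init__(self, items):
        self.items = items

    def where(self, predicate):
        return Query(x for x in self.items if predicate(x))

    def select(self, projection):
        return Query(projection(x) for x in self.items)

    def to_list(self):
        return list(self.items)

orders = [{"customer": "ada", "total": 120}, {"customer": "bob", "total": 30}]

big_spenders = (Query(orders)
                .where(lambda o: o["total"] > 100)   # filter
                .select(lambda o: o["customer"])     # transform
                .to_list())
print(big_spenders)   # ['ada']
```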
Revisiting Network I/O APIs: The netmap Framework:
It is possible to achieve huge performance improvements in the way packet processing is done on modern operating systems.
Today 10-gigabit interfaces are used more and more in datacenters and servers. On these links, packets flow as fast as one every 67.2 nanoseconds, yet modern operating systems can take 10-20 times longer just to move one packet between the wire and the application. We can do much better, not with more powerful hardware but by revising architectural decisions made long ago regarding the design of device drivers and network stacks.
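The 67.2-nanosecond figure follows directly from the size of a minimum Ethernet frame as it appears on the wire: 64 bytes of frame plus 8 bytes of preamble and a 12-byte inter-frame gap is 84 bytes, or 672 bits, and 672 bits at 10 Gbit/s take 67.2 ns. A quick check:

```python
# Minimum-size Ethernet frame as it appears on the wire:
frame = 64          # bytes, minimum Ethernet frame
preamble = 8        # bytes, preamble + start-of-frame delimiter
ifg = 12            # bytes, inter-frame gap
bits_on_wire = (frame + preamble + ifg) * 8      # 672 bits

link_rate = 10e9    # 10 Gbit/s
print(bits_on_wire / link_rate * 1e9)            # 67.2 (nanoseconds per packet)
```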
SAGE: Whitebox Fuzzing for Security Testing:
SAGE has had a remarkable impact at Microsoft.
Most ACM Queue readers might think of "program verification research" as mostly theoretical with little impact on the world at large. Think again. If you are reading these lines on a PC running some form of Windows (like 93-plus percent of PC users--that is, more than a billion people), then you have been affected by this line of work--without knowing it, which is precisely the way we want it to be.
You Don’t Know Jack about Shared Variables or Memory Models:
Data races are evil.
A Google search for "Threads are evil" generates 18,000 hits, but threads are ubiquitous. Almost all of the processes running on a modern Windows PC use them. Software threads are typically how programmers get machines with multiple cores to work together to solve problems faster. And often they are what allow user interfaces to remain responsive while the application performs a background calculation.
Advances and Challenges in Log Analysis:
Logs contain a wealth of information for help in managing systems.
Computer-system logs provide a glimpse into the states of a running system. Instrumentation occasionally generates short messages that are collected in a system-specific log. The content and format of logs can vary widely from one system to another and even among components within a system. A printer driver might generate messages indicating that it had trouble communicating with the printer, while a Web server might record which pages were requested and when.
Bufferbloat: Dark Buffers in the Internet:
Networks without effective AQM may again be vulnerable to congestion collapse.
Today’s networks are suffering from unnecessary latency and poor system performance. The culprit is bufferbloat, the existence of excessively large and frequently full buffers inside the network. Large buffers have been inserted all over the Internet without sufficient thought or testing. They damage or defeat the fundamental congestion-avoidance algorithms of the Internet’s most common transport protocol. Long delays from bufferbloat are frequently attributed incorrectly to network congestion, and this misinterpretation of the problem leads to the wrong solutions being proposed.
I/O Virtualization:
Decoupling a logical device from its physical implementation offers many compelling advantages.
The term virtual is heavily overloaded, evoking everything from virtual machines running in the cloud to avatars running across virtual worlds. Even within the narrower context of computer I/O, virtualization has a long, diverse history, exemplified by logical devices that are deliberately separate from their physical instantiations.
Creating Languages in Racket:
Sometimes you just have to make a better mousetrap.
Choosing the right tool for a simple job is easy: a screwdriver is usually the best option when you need to change the battery in a toy, and grep is the obvious choice to check for a word in a text document. For more complex tasks, the choice of tool is rarely so straightforward--all the more so for a programming task, where programmers have an unparalleled ability to construct their own tools. Programmers frequently solve programming problems by creating new tool programs, such as scripts that generate source code from tables of data.
Coding Guidelines: Finding the Art in the Science:
What separates good code from great code?
Computer science is both a science and an art. Its scientific aspects range from the theory of computation and algorithmic studies to code design and program architecture. Yet, when it comes time for implementation, there is a combination of artistic flare, nuanced style, and technical prowess that separates good code from great code.
How Will Astronomy Archives Survive the Data Tsunami?:
Astronomers are collecting more data than ever. What practices can keep them ahead of the flood?
Astronomy is already awash with data: currently 1 PB of public data is electronically accessible, and this volume is growing at 0.5 PB per year. The availability of this data has already transformed research in astronomy, and the STScI now reports that more papers are published with archived data sets than with newly acquired data. This growth in data size and anticipated usage will accelerate in the coming few years as new projects such as the LSST, ALMA, and SKA move into operation. These new projects will use much larger arrays of telescopes and detectors or much higher data acquisition rates than are now used.
Postmortem Debugging in Dynamic Environments:
Modern dynamic languages lack tools for understanding software failures.
Despite the best efforts of software engineers to produce high-quality software, inevitably some bugs escape even the most rigorous testing process and are first encountered by end users. When this happens, such failures must be understood quickly, the underlying bugs fixed, and deployments patched to avoid another user (or the same one) running into the same problem again.
OCaml for the Masses:
Why the next language you learn should be functional
Functional programming is an old idea with a distinguished history. Lisp, a functional language inspired by Alonzo Church’s lambda calculus, was one of the first programming languages developed at the dawn of the computing age. Statically typed functional languages such as OCaml and Haskell are newer, but their roots go deep.
Java Security Architecture Revisited:
Hard technical problems and tough business challenges
This article looks back at a few of the hardest technical problems from a design and engineering perspective, as well as some tough business challenges for which research scientists are rarely trained. Li Gong offers a retrospective here culled from four previous occasions when he had the opportunity to dig into old notes and refresh his memory.
The World According to LINQ:
Big data is about more than size, and LINQ is more than up to the task.
Programmers building Web- and cloud-based applications wire together data from many different sources such as sensors, social networks, user interfaces, spreadsheets, and stock tickers. Most of this data does not fit in the closed and clean world of traditional relational databases. It is too big, unstructured, denormalized, and streaming in realtime. Presenting a unified programming model across all these disparate data models and query languages seems impossible at first. By focusing on the commonalities instead of the differences, however, most data sources will accept some form of computation to filter and transform collections of data.
Verification of Safety-critical Software:
Avionics software safety certification is achieved through objective-based standards.
Avionics software has become a keystone in today’s aircraft design. Advances in avionics systems have reduced aircraft weight, thereby reducing fuel consumption, enabled precision navigation, improved engine performance, and provided a host of other benefits. These advances have turned modern aircraft into flying data centers with computers controlling or monitoring many of the critical systems onboard. The software that runs these aircraft systems must be as safe as we can make it.
Abstraction in Hardware System Design:
Applying lessons from software languages to hardware languages using Bluespec SystemVerilog
The history of software engineering is one of continuing development of abstraction mechanisms designed to tackle ever-increasing complexity. Hardware design, however, has not kept pace. For example, the two most commonly used HDLs date back to the 1980s. Updates to the standards lag behind modern programming languages in structural abstractions such as types, encapsulation, and parameterization. Their behavioral semantics lag even further: they are specified in terms of event-driven simulators running on uniprocessor von Neumann machines.
Arrogance in Business Planning:
Technology business plans that assume no competition (ever)
In the Internet addressing and naming market there’s a lot of competition, margins are thin, and the premiums on good planning and good execution are nowhere higher. To survive, investors and entrepreneurs have to be bold. Some entrepreneurs, however, go beyond "bold" and enter the territory of "arrogant" by making the wild assumption that they will have no competitors if they create a new and profitable niche. So it is with those who would unilaterally supplant or redraw the existing Internet resource governance or allocation systems.
The Pain of Implementing LINQ Providers:
It’s no easy task for NoSQL
I remember sitting on the edge of my seat watching the 2005 PDC (Professional Developers Conference) videos that first showed LINQ (Language Integrated Query). I wanted LINQ: it offered just about everything that I could hope for to make working with data easy. The impetus for building queries into the language is quite simple: querying is something that is done all the time, and the promise of a unified querying model is compelling even before you add all the language goodies that were dropped on us. Being able to write in C# and have the database magically understand what I am doing?
Computing without Processors:
Heterogeneous systems allow us to target our programming to the appropriate environment.
From the programmer’s perspective the distinction between hardware and software is being blurred. As programmers struggle to meet the performance requirements of today’s systems, they will face an ever increasing need to exploit alternative computing elements such as GPUs (graphics processing units), which are graphics cards subverted for data-parallel computing, and FPGAs (field-programmable gate arrays), or soft hardware.
The Robustness Principle Reconsidered:
Seeking a middle ground
In 1981, Jon Postel formulated the Robustness Principle, also known as Postel’s Law, as a fundamental implementation guideline for the then-new TCP. The intent of the Robustness Principle was to maximize interoperability between network service implementations, particularly in the face of ambiguous or incomplete specifications. If every implementation of some service that generates some piece of protocol did so using the most conservative interpretation of the specification and every implementation that accepted that piece of protocol interpreted it using the most generous interpretation, then the chance that the two services would be able to talk with each other would be maximized.
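In code, the principle amounts to pairing a strict generator with a tolerant parser. The sketch below applies it to a made-up key/value header line (the format is an assumption for illustration, not a protocol discussed in the article): the sender always emits the canonical form, while the receiver accepts reasonable variations in case and whitespace.

```python
# Strict in what you send: always emit the canonical form.
def emit_header(name: str, value: str) -> str:
    return f"{name.strip()}: {value.strip()}\r\n"

# Liberal in what you accept: tolerate case and whitespace variations.
def parse_header(line: str) -> tuple[str, str]:
    name, _, value = line.partition(":")
    return name.strip().lower(), value.strip()

print(emit_header("Content-Type", "text/plain"))        # 'Content-Type: text/plain\r\n'
print(parse_header("content-type :   text/plain  "))    # ('content-type', 'text/plain')
```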
DSL for the Uninitiated:
Domain-specific languages bridge the semantic gap in programming.
One of the main reasons why software projects fail is the lack of communication between the business users, who actually know the problem domain, and the developers who design and implement the software model. Business users understand the domain terminology, and they speak a vocabulary that may be quite alien to the software people; it’s no wonder that the communication model can break down right at the beginning of the project life cycle.
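One way to shrink that gap is an embedded (internal) DSL: a thin layer over a general-purpose language that lets domain rules read almost like the business vocabulary. The sketch below is a hypothetical example of the style in Python; the domain terms (account rule, minimum balance) are invented for illustration.

```python
# A tiny internal DSL: domain rules read close to the business vocabulary,
# but each phrase is ordinary code that builds up a rule object.

class AccountRule:
    def __init__(self, name):
        self.name, self.checks = name, []

    def requires_minimum_balance(self, amount):
        self.checks.append(lambda acct: acct["balance"] >= amount)
        return self                      # return self so phrases chain fluently

    def applies_to(self, acct) -> bool:
        return all(check(acct) for check in self.checks)

premium = AccountRule("premium").requires_minimum_balance(10_000)
print(premium.applies_to({"balance": 12_500}))   # True
```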
If You Have Too Much Data, then "Good Enough" Is Good Enough:
In today’s humongous database systems, clarity may be relaxed, but business needs can still be met.
Classic database systems offer crisp answers for a relatively small amount of data. These systems hold their data in one or a relatively small number of computers. With a tightly defined schema and transactional consistency, the results returned from queries are crisp and accurate. New systems have humongous amounts of data, high change rates, and high querying rates, and they take lots of computers to hold and process that data. The data quality and meaning are fuzzy. The schema, if present, is likely to vary across the data. The origin of the data may be suspect, and its staleness may vary.
Passing a Language through the Eye of a Needle:
How the embeddability of Lua impacted its design
Scripting languages are an important element in the current landscape of programming languages. A key feature of a scripting language is its ability to integrate with a system language. This integration takes two main forms: extending and embedding. In the first form, you extend the scripting language with libraries and functions written in the system language and write your main program in the scripting language. In the second form, you embed the scripting language in a host program (written in the system language) so that the host can run scripts and call functions defined in the scripts; the main program is the host program.
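The embedding form is the less familiar of the two, so a small sketch may help. The example below stays entirely in Python (rather than Lua embedded in a C host, which is what the article is about) purely to show the shape of the pattern: a host program runs a script supplied at run time and then calls functions the script defines.

```python
# Shape of the "embedding" pattern, sketched in Python only for illustration:
# a host program runs a script supplied at run time, then calls into it.

host_script = """
def on_damage(health, amount):      # function defined by the embedded script
    return max(0, health - amount)
"""

script_env = {}                     # the environment handed to the script
exec(host_script, script_env)       # host runs the script...
new_health = script_env["on_damage"](100, 35)   # ...and calls its functions
print(new_health)                   # 65
```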
Scalable SQL:
How do large-scale sites and applications remain SQL-based?
One of the leading motivators for NoSQL innovation is the desire to achieve very high scalability to handle the vagaries of Internet-size workloads. Yet many big social Web sites and many other Web sites and distributed tier 1 applications that require high scalability reportedly remain SQL-based for their core data stores and services. The question is, how do they do it?
Mobile Application Development: Web vs. Native:
Web apps are cheaper to develop and deploy than native apps, but can they match the native user experience?
A few short years ago, most mobile devices were, for want of a better word, "dumb." Sure, there were some early smartphones, but they were either entirely e-mail focused or lacked sophisticated touch screens that could be used without a stylus. Even fewer shipped with a decent mobile browser capable of displaying anything more than simple text, links, and maybe an image. This meant if you had one of these devices, you were either a businessperson addicted to e-mail or an alpha geek hoping that this would be the year of the smartphone.
Weapons of Mass Assignment:
A Ruby on Rails app highlights some serious, yet easily avoided, security vulnerabilities.
In May 2010, during a news cycle dominated by users’ widespread disgust with Facebook privacy policies, a team of four students from New York University published a request for $10,000 in donations to build a privacy-aware Facebook alternative. The software, Diaspora, would allow users to host their own social networks and own their own data. The team promised to open-source all the code they wrote, guaranteeing the privacy and security of users’ data by exposing the code to public scrutiny. With the help of front-page coverage from the New York Times, the team ended up raising more than $200,000.
A co-Relational Model of Data for Large Shared Data Banks:
Contrary to popular belief, SQL and noSQL are really just two sides of the same coin.
Fueled by their promise to solve the problem of distilling valuable information and business insight from big data in a scalable and programmer-friendly way, noSQL databases have been one of the hottest topics in our field recently. With a plethora of open source and commercial offerings and a surrounding cacophony of technical terms, however, it is hard for businesses and practitioners to see the forest for the trees.
Successful Strategies for IPv6 Rollouts. Really.:
Knowing where to begin is half the battle.
The design of TCP/IP began in 1973 when Robert Kahn and I started to explore the ramifications of interconnecting different kinds of packet-switched networks. We published a concept paper in May 1974, and a fairly complete specification for TCP was published in December 1974. By the end of 1975, several implementations had been completed and many problems were identified. Iteration began, and by 1977 it was concluded that TCP (by now called Transmission Control Protocol) should be split into two protocols: a simple Internet Protocol that carried datagrams end to end through packet networks interconnected through gateways; and a TCP that managed the flow and sequencing of packets exchanged between hosts on the contemplated Internet.
Returning Control to the Programmer:
Exposing SIMD units within interpreted languages could simplify programs and unleash floods of untapped processor power.
Server and workstation hardware architecture is continually improving, yet interpreted languages have failed to keep pace with the proper utilization of modern processors. SIMD (single instruction, multiple data) units are available in nearly every current desktop and server processor and are greatly underutilized, especially with interpreted languages. If multicore processors continue their current growth pattern, interpreted-language performance will begin to fall behind, since current native compilers and languages offer better automated SIMD optimization and direct SIMD mapping support.
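Today, the closest most interpreted-language programmers get to those units is through vectorized libraries. The sketch below contrasts a scalar Python loop with a NumPy whole-array operation, which dispatches to compiled kernels that are typically auto-vectorized with SIMD instructions; this is only an analogy for the article's argument, not the direct SIMD mapping it advocates.

```python
import numpy as np

a = np.arange(1_000_000, dtype=np.float32)
b = np.arange(1_000_000, dtype=np.float32)

# Scalar, element-at-a-time multiplication in the interpreter:
out_scalar = [x * y for x, y in zip(a, b)]

# One whole-array operation: NumPy hands the loop to compiled code,
# which on most CPUs is vectorized with SIMD instructions.
out_vector = a * b
```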
Testable System Administration:
Models of indeterminism are changing IT management.
The methods of system administration have changed little in the past 20 years. While core IT technologies have improved in a multitude of ways, for many if not most organizations system administration is still based on production-line build logistics (aka provisioning) and reactive incident handling. As we progress into an information age, humans will need to work less like the machines they use and embrace knowledge-based approaches. That means exploiting simple (hands-free) automation that leaves us unencumbered to discover patterns and make decisions.
National Internet Defense - Small States on the Skirmish Line:
Attacks in Estonia and Georgia highlight key vulnerabilities in national Internet infrastructure.
Despite the global and borderless nature of the Internet’s underlying protocols and driving philosophy, there are significant ways in which it remains substantively territorial. Nations have policies and laws that govern and attempt to defend "their Internet". This is far less palpable than a nation’s physical territory or even than "its air" or "its water". Cyberspace is still a much wilder frontier, hard to define and measure. Where its effects are noted and measurable, all too often they are hard to attribute to responsible parties.
Finding Usability Bugs with Automated Tests:
Automated usability tests can be valuable companions to in-person tests.
Ideally, all software should be easy to use and accessible for a wide range of people; however, even software that appears to be modern and intuitive often falls short of the most basic usability and accessibility goals. Why does this happen? One reason is that sometimes our designs look so appealing that we skip the step of testing their usability and accessibility, all in the interest of speed, cost reduction, and competitive advantage.
System Administration Soft Skills:
How can system administrators reduce stress and conflict in the workplace?
System administration can be both stressful and rewarding. Stress generally comes from outside factors such as conflict between SAs (system administrators) and their colleagues, a lack of resources, a high-interrupt environment, conflicting priorities, and SAs being held responsible for failures outside their control. What can SAs and their managers do to alleviate the stress? There are some well-known interpersonal and time-management techniques that can help, but these can be forgotten in times of crisis or just through force of habit.
A Plea to Software Vendors from Sysadmins - 10 Do’s and Don’ts:
What can software vendors do to make the lives of sysadmins a little easier?
A friend of mine is a grease monkey: the kind of auto enthusiast who rebuilds engines for fun on a Saturday night. He explained to me that certain brands of automobiles were designed in ways to make the mechanic’s job easier. Others, however, were designed as if the company had a pact with the aspirin industry to make sure there are plenty of mechanics with headaches. He said those car companies hate mechanics. I understood completely because, as a system administrator, I can tell when software vendors hate me. It shows in their products.
Collaboration in System Administration:
For sysadmins, solving problems usually involves collaborating with others. How can we make it more effective?
George was in trouble. A seemingly simple deployment was taking all morning, and there seemed no end in sight. His manager kept coming in to check on his progress, as the customer was anxious to have the deployment done. He was supposed to be leaving for a goodbye lunch for a departing co-worker, adding to the stress. He had called in all kinds of help, including colleagues, an application architect, technical support, and even one of the system developers. He used e-mail, instant messaging, face-to-face contacts, his phone, and even his office mate’s phone to communicate with everyone. And George was no novice.
Virtualization: Blessing or Curse?:
Managing virtualization at a large scale is fraught with hidden challenges.
Virtualization is often touted as the solution to many challenging problems, from resource underutilization to data-center optimization and carbon emission reduction. The hidden costs of virtualization, largely stemming from the complex and difficult system administration challenges it poses, are often overlooked, however. Reaping the fruits of virtualization requires the enterprise to navigate scalability limitations, revamp traditional operational practices, manage performance, and achieve unprecedented cross-silo collaboration. Virtualization is not a curse: it can bring material benefits, but only to the prepared.
The Case Against Data Lock-in:
Want to keep your users? Just make it easy for them to leave.
Engineers employ many different tactics to focus on the user when writing software: for example, listening to user feedback, fixing bugs, and adding features that their users are clamoring for. Since Web-based services have made it easier for users to move to new applications, it’s becoming even more important to focus on building and retaining user trust. We’ve found that an incredibly effective way to earn and maintain user trust is to make it easy for users to leave your product with their data in tow. This not only prevents lock-in and engenders trust, but also forces your team to innovate and compete on technical merit.
Keeping Bits Safe: How Hard Can It Be?:
As storage systems grow larger and larger, protecting their data for long-term storage is becoming more and more challenging.
These days, we are all data pack rats. Storage is cheap, so if there’s a chance the data could possibly be useful, we keep it. We know that storage isn’t completely reliable, so we keep backup copies as well. But the more data we keep, and the longer we keep it, the greater the chance that some of it will be unrecoverable when we need it.
Tackling Architectural Complexity with Modeling:
Component models can help diagnose architectural problems in both new and existing systems.
The ever-increasing might of modern computers has made it possible to solve problems once thought too difficult to tackle. Far too often, however, the systems for these functionally complex problem spaces have overly complicated architectures. In this article I use the term architecture to refer to the overall macro design of a system rather than the details of how the individual parts are implemented. The system architecture is what is behind the scenes of usable functionality, including internal and external communication mechanisms, component boundaries and coupling, and how the system will make use of any underlying infrastructure (databases, networks, etc.).
Thinking Clearly about Performance:
Improving the performance of complex software is difficult, but understanding some fundamental principles can make it easier.
When I joined Oracle Corporation in 1989, performance was difficult. Only a few people claimed they could do it very well, and those people commanded high consulting rates. When circumstances thrust me into the "Oracle tuning" arena, I was quite unprepared. Recently, I’ve been introduced to the world of "MySQL tuning," and the situation seems very similar to what I saw in Oracle more than 20 years ago.
Computers in Patient Care: The Promise and the Challenge:
Information technology has the potential to radically transform health care. Why has progress been so slow?
A 29-year-old female from New York City comes in at 3 a.m. to an ED (emergency department) in California, complaining of severe acute abdominal pain that woke her up. She reports that she is in California attending a wedding and that she has suffered from similar abdominal pain in the recent past, most recently resulting in an appendectomy. The emergency physician performs an abdominal CAT scan and sees what he believes to be an artifact from the appendectomy in her abdominal cavity. He has no information about the patient’s past history other than what she is able to tell him; he has no access to any images taken before or after the appendectomy, nor does he have any other vital information about the surgical operative note or follow-up.
Injecting Errors for Fun and Profit:
Error-detection and correction features are only as good as our ability to test them.
It is an unfortunate fact of life that anything with moving parts eventually wears out and malfunctions, and electronic circuitry is no exception. In this case, of course, the moving parts are electrons. In addition to the wear-out mechanisms of electromigration (the moving electrons gradually push the metal atoms out of position, causing wires to thin, thus increasing their resistance and eventually producing open circuits) and dendritic growth (the voltage difference between adjacent wires causes the displaced metal atoms to migrate toward each other, just as magnets will attract each other, eventually causing shorts), electronic circuits are also vulnerable to background radiation.
Lessons from the Letter:
Security flaws in a large organization
I recently received a letter in which a company notified me that they had exposed some of my personal information. While it is now quite common for personal data to be stolen, this letter amazed me because of how well it pointed out two major flaws in the systems of the company that lost the data. I am going to insert three illuminating paragraphs here and then discuss what they actually can teach us.
Software Development with Code Maps:
Could those ubiquitous hand-drawn code diagrams become a thing of the past?
To better understand how professional software developers use visual representations of their code, we interviewed nine developers at Microsoft to identify common scenarios, and then surveyed more than 400 developers to understand the scenarios more deeply.
The Ideal HPC Programming Language:
Maybe it’s Fortran. Or maybe it just doesn’t matter.
The DARPA HPCS program sought a tenfold productivity improvement in trans-petaflop systems for HPC. This article describes programmability studies undertaken by Sun Microsystems in its HPCS participation. These studies were distinct from Sun’s ongoing development of a new HPC programming language (Fortress) and the company’s broader HPCS productivity studies, though there was certainly overlap with both activities.
Visualizing System Latency:
Heat maps are a unique and powerful way to visualize latency data. Explaining the results, however, is an ongoing challenge.
When I/O latency is presented as a visual heat map, some intriguing and beautiful patterns can emerge. These patterns provide insight into how a system is actually performing and what kinds of latency end-user applications experience. Many characteristics seen in these patterns are still not understood, but so far their analysis is revealing systemic behaviors that were previously unknown.
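A latency heat map is essentially a two-dimensional histogram: time on the x-axis, latency buckets on the y-axis, and color encoding how many I/O operations fell in each cell. A minimal sketch of the binning step (the sample data and bucket sizes are invented for illustration):

```python
from collections import Counter

# samples: (timestamp_seconds, latency_ms) pairs for individual I/O operations
samples = [(0.2, 1.1), (0.4, 9.8), (1.1, 1.3), (1.5, 250.0), (2.3, 1.0)]

TIME_BUCKET = 1.0     # seconds per column
LAT_BUCKET = 10.0     # milliseconds per row

heat = Counter()
for t, lat in samples:
    col = int(t // TIME_BUCKET)
    row = int(lat // LAT_BUCKET)
    heat[(col, row)] += 1         # cell darkness = number of I/Os in that cell

print(heat)   # e.g. Counter({(0, 0): 2, (1, 0): 1, (1, 25): 1, (2, 0): 1})
```

An outlier such as the 250 ms operation lands far above the dense band of fast I/Os, which is exactly the kind of pattern the heat maps make visible.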
A Tour through the Visualization Zoo:
A survey of powerful visualization techniques, from the obvious to the obscure
Thanks to advances in sensing, networking, and data management, our society is producing digital information at an astonishing rate. According to one estimate, in 2010 alone we will generate 1,200 exabytes -- 60 million times the content of the Library of Congress. Within this deluge of data lies a wealth of valuable information on how we conduct our businesses, governments, and personal lives. To put the information to good use, we must find ways to explore, relate, and communicate the data meaningfully.
Securing Elasticity in the Cloud:
Elastic computing has great potential, but many security challenges remain.
As somewhat of a technology-hype curmudgeon, I was until very recently in the camp that believed cloud computing was not much more than the latest marketing-driven hysteria for an idea that has been around for years. Outsourced IT infrastructure services, aka IaaS (Infrastructure as a Service), has been around since at least the 1980s, delivered by the telecommunication companies and major IT outsourcers. Hosted applications, aka PaaS (Platform as a Service) and SaaS (Software as a Service), were in vogue in the 1990s in the form of ASPs (application service providers).
Principles of Robust Timing over the Internet:
The key to synchronizing clocks over networks is taming delay variability.
Everyone, and most everything, needs a clock, and computers are no exception. Clocks tend to drift off if left to themselves, however, so it is necessary to bring them to heel periodically through synchronizing to some other reference clock of higher accuracy. An inexpensive and convenient way to do this is over a computer network.
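The core of such synchronization is a timestamp exchange: the client notes when it sends a request and when the reply returns, and the server notes when it received the request and when it replied. From those four timestamps, the standard NTP-style formulas estimate the clock offset and the round-trip delay that corrupts it; favoring exchanges with the smallest delay is one common way of taming the delay variability the article discusses.

```python
def offset_and_delay(t1: float, t2: float, t3: float, t4: float) -> tuple[float, float]:
    """Standard four-timestamp estimate.

    t1: client sends request     t2: server receives request
    t3: server sends reply       t4: client receives reply
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0   # how far the client clock lags the server
    delay = (t4 - t1) - (t3 - t2)            # round-trip time spent on the network
    return offset, delay

# One exchange: the client clock is ~0.5 s behind, with ~40 ms of network delay.
print(offset_and_delay(t1=10.000, t2=10.520, t3=10.521, t4=10.041))
```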
Why Cloud Computing Will Never Be Free:
The competition among cloud providers may drive prices downward, but at what cost?
The last time the IT industry delivered outsourced shared-resource computing to the enterprise was with timesharing in the 1980s, when it evolved to a high art, delivering the reliability, performance, and service the enterprise demanded. Today, cloud computing is poised to address the needs of the same market, based on a revolution of new technologies, significant unused computing capacity in corporate data centers, and the development of a highly capable Internet data communications infrastructure. The economies of scale of delivering computing from a centralized, shared infrastructure have set the expectation among customers that cloud-computing costs will be significantly lower than those incurred from providing their own computing.
Simplicity Betrayed:
Emulating a video system shows how even a simple interface can be more complex—and capable—than it appears.
An emulator is a program that runs programs built for a computer architecture different from that of the host platform on which the emulator runs. Approaches differ, but most emulators simulate the original hardware in some way. At a minimum the emulator interprets the original CPU instructions and provides simulated hardware-level devices for input and output. For example, keyboard input is taken from the host platform and translated into the original hardware format, resulting in the emulated program "seeing" the same sequence of keystrokes. Conversely, the emulator will translate the original hardware screen format into an equivalent form on the host machine.
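At its core, the interpretation step is a fetch-decode-execute loop over the original machine's instruction set. The sketch below runs a made-up three-instruction machine; it is meant only to show the shape of that loop, not any particular hardware the article emulates.

```python
# Minimal fetch-decode-execute loop for an invented toy machine:
# each instruction is (opcode, operand); acc is the single accumulator.

program = [("LOAD", 5), ("ADD", 7), ("HALT", 0)]

pc, acc, running = 0, 0, True
while running:
    opcode, operand = program[pc]     # fetch
    pc += 1
    if opcode == "LOAD":              # decode + execute
        acc = operand
    elif opcode == "ADD":
        acc += operand
    elif opcode == "HALT":
        running = False

print(acc)   # 12
```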
Enhanced Debugging with Traces:
An essential technique used in emulator development is a useful addition to any programmer’s toolbox.
Creating an emulator to run old programs is a difficult task. You need a thorough understanding of the target hardware and the correct functioning of the original programs that the emulator is to execute. In addition to being functionally correct, the emulator must hit a performance target of running the programs at their original realtime speed. Reaching these goals inevitably requires a considerable amount of debugging. The bugs are often subtle errors in the emulator itself but could also be a misunderstanding of the target hardware or an actual known bug in the original program. (It is also possible the binary data for the original program has become subtly corrupted or is not the version expected.)
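One common form of such a trace is a fixed-size ring buffer of recently executed instructions (or other events), cheap enough to leave enabled and dumped only when something goes wrong. A minimal sketch of the idea, not the article's specific implementation:

```python
from collections import deque

class Trace:
    """Keep only the most recent events; dump them when a bug is hit."""
    def __init__(self, capacity: int = 8):
        self.events = deque(maxlen=capacity)   # old entries fall off automatically

    def record(self, pc: int, opcode: str, state) -> None:
        self.events.append((pc, opcode, state))

    def dump(self) -> None:
        for pc, opcode, state in self.events:
            print(f"pc={pc:04x} {opcode:<6} state={state}")

trace = Trace(capacity=4)
for pc in range(10):
    trace.record(pc, "NOP", {"acc": pc})
trace.dump()       # shows only the last four instructions executed
```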
Cooling the Data Center:
What can be done to make cooling systems in data centers more energy efficient?
Power generation accounts for about 40 to 45 percent of the primary energy supply in the US and the UK, and a good fraction is used to heat, cool, and ventilate buildings. A new and growing challenge in this sector concerns computer data centers and the equipment used to cool them. On the order of 61 billion kilowatt-hours of energy was used in data centers in the US in 2006, representing about 1.5 percent of the country’s electricity consumption.
Toward Energy-Efficient Computing:
What will it take to make server-side computing more energy efficient?
By now, most everyone is aware of the energy problem at its highest level: our primary sources of energy are running out, while the demand for energy in both commercial and domestic environments is increasing, and the side effects of energy use have important global environmental considerations. The emission of greenhouse gases such as CO2, now seen by most climatologists to be linked to global warming, is only one issue.
Managing Contention for Shared Resources on Multicore Processors:
Contention for caches, memory controllers, and interconnects can be alleviated by contention-aware scheduling algorithms.
Modern multicore systems are designed to allow clusters of cores to share various hardware structures, such as LLCs (last-level caches; for example, L2 or L3), memory controllers, and interconnects, as well as prefetching hardware. We refer to these resource-sharing clusters as memory domains, because the shared resources mostly have to do with the memory hierarchy.
Power-Efficient Software:
Power-manageable hardware can help save energy, but what can software developers do to address the problem?
The rate at which power-management features have evolved is nothing short of amazing. Today almost every size and class of computer system, from the smallest sensors and handheld devices to the "big iron" servers in data centers, offers a myriad of features for reducing, metering, and capping power consumption. Without these features, fan noise would dominate the office ambience, and untethered laptops would remain usable for only a few short hours (and then only if one could handle the heat), while data-center power and cooling costs and capacity would become unmanageable.
Triple-Parity RAID and Beyond:
As hard-drive capacities continue to outpace their throughput, the time has come for a new level of RAID.
How much longer will current RAID techniques persevere? The RAID levels were codified in the late 1980s; double-parity RAID, known as RAID-6, is the current standard for high-availability, space-efficient storage. The incredible growth of hard-drive capacities, however, could impose serious limitations on the reliability even of RAID-6 systems. Recent trends in hard drives show that triple-parity RAID must soon become pervasive. In 2005, Scientific American reported on Kryder’s law, which predicts that hard-drive density will double annually. While the rate of doubling has not quite maintained that pace, it has been close.
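The capacity-versus-throughput gap matters because it sets a floor on how long a failed drive takes to reconstruct, and therefore how long an array runs with reduced redundancy. A back-of-the-envelope sketch (the capacities and transfer rates are illustrative, not figures from the article):

```python
def min_rebuild_hours(capacity_tb: float, throughput_mb_s: float) -> float:
    """Lower bound on rebuild time: every byte of the replacement drive
    must be written at least once, at best at the drive's full throughput."""
    bytes_total = capacity_tb * 1e12
    return bytes_total / (throughput_mb_s * 1e6) / 3600

# Capacity has grown far faster than throughput, so rebuild windows stretch:
for tb, mb_s in [(0.5, 70), (2, 120), (8, 200)]:
    print(f"{tb:>4} TB at {mb_s} MB/s -> at least {min_rebuild_hours(tb, mb_s):5.1f} hours")
```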
Data in Flight:
How streaming SQL technology can help solve the Web 2.0 data crunch.
Web applications produce data at colossal rates, and those rates compound every year as the Web becomes more central to our lives. Other data sources such as environmental monitoring and location-based services are a rapidly expanding part of our day-to-day experience. Even as throughput is increasing, users and business owners expect to see their data with ever-decreasing latency. Advances in computer hardware (cheaper memory, cheaper disks, and more processing cores) are helping somewhat, but not enough to keep pace with the twin demands of rising throughput and decreasing latency.
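Streaming query engines address this by computing over moving windows as events arrive, instead of storing first and querying later. The sketch below is a plain-Python stand-in for such a window: a count over the last minute of an unbounded event stream (the event shape and window size are assumptions for the example, not the article's SQL syntax).

```python
from collections import deque

def counts_per_window(events, window_s: float = 60.0):
    """Yield (event_time, count) for a sliding count over a timestamped stream."""
    window = deque()                     # timestamps currently inside the window
    for ts in events:
        window.append(ts)
        while window and window[0] <= ts - window_s:
            window.popleft()             # expire events that slid out of the window
        yield ts, len(window)

clicks = [1.0, 2.5, 30.0, 65.0, 66.0, 140.0]       # seconds since start
for ts, n in counts_per_window(clicks):
    print(f"t={ts:6.1f}s  events in last minute: {n}")
```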
Maximizing Power Efficiency with Asymmetric Multicore Systems:
Asymmetric multicore systems promise to use a lot less energy than conventional symmetric processors. How can we develop software that makes the most out of this potential?
In computing systems, a CPU is usually one of the largest consumers of energy. For this reason, reducing CPU power consumption has been a hot topic in the past few years in both the academic community and the industry. In the quest to create more power-efficient CPUs, several researchers have proposed an asymmetric multicore architecture that promises to save a significant amount of power while delivering similar performance to conventional symmetric multicore processors.
Other People’s Data:
Companies have access to more types of external data than ever before. How can they integrate it most effectively?
Every organization bases some of its critical decisions on external data sources. In addition to traditional flat file data feeds, Web services and Web pages are playing an increasingly important role in data warehousing. The growth of Web services has made data feeds easily consumable at the departmental and even end-user levels. There are now more than 1,500 publicly available Web services and thousands of data mashups ranging from retail sales data to weather information to United States census data. These mashups are evidence that when users need information, they will find a way to get it.
What DNS Is Not:
DNS is many things to many people - perhaps too many things to too many people.
DNS (Domain Name System) is a hierarchical, distributed, autonomous, reliable database. The first and only of its kind, it offers realtime performance levels to a global audience with global contributors. Every TCP/IP traffic flow including every World Wide Web page view begins with at least one DNS transaction. DNS is, in a word, glorious.
You Don’t Know Jack About Software Maintenance:
Long considered an afterthought, software maintenance is easiest and most effective when built into a system from the ground up.
Everyone knows maintenance is hard and boring, and avoids doing it. Besides, their pointy-haired bosses say things like: "No one needs to do maintenance - that’s a waste of time."
Metamorphosis: the Coming Transformation of Translational Systems Biology:
In the future computers will mine patient data to deliver faster, cheaper healthcare, but how will we design them to give informative causal explanations? Ideas from philosophy, model checking, and statistical testing can pave the way for the needed translational systems biology.
One morning, as Gregorina Samsa was waking up from anxious dreams, she discovered that she had become afflicted with certain mysterious flu-like symptoms that appeared without any warning. Equally irritating, this capricious metamorphosis seemed impervious to a rational explanation in terms of causes and effects. "What’s happened to me?" she thought. Before seeing a doctor, she decided to find out more about what might ail her. She logged on to a Web site where she annotated a timeline with what she could remember. Since March, she’d had more headaches than usual, and then in April she had begun to experience more fatigue after exercise, and as of July she had also experienced occasional lapses in memory.
Probing Biomolecular Machines with Graphics Processors:
The evolution of GPU processors and programming tools is making advanced simulation and analysis techniques accessible to a growing community of biomedical scientists.
Computer simulation has become an integral part of the study of the structure and function of biological molecules. For years, parallel computers have been used to conduct these computationally demanding simulations and to analyze their results. These simulations function as a "computational microscope," allowing the scientist to observe details of molecular processes too small, fast, or delicate to capture with traditional instruments. Over time, commodity GPUs (graphics processing units) have evolved into massively parallel computing devices, and more recently it has become possible to program them in dialects of the popular C/C++ programming languages.
Unifying Biological Image Formats with HDF5:
The biosciences need an image format capable of high performance and long-term maintenance. Is HDF5 the answer?
The biological sciences need a generic image format suitable for long-term storage and capable of handling very large images. Images convey profound ideas in biology, bridging across disciplines. Digital imagery began 50 years ago as an obscure technical phenomenon. Now it is an indispensable computational tool. It has produced a variety of incompatible image file formats, most of which are already obsolete.
A Threat Analysis of RFID Passports:
Do RFID passports make us vulnerable to identity theft?
It’s a beautiful day when your plane touches down at the airport. After a long vacation, you feel rejuvenated, refreshed, and relaxed. When you get home, everything is how you left it. Everything, that is, but a pile of envelopes on the floor that jammed the door as you tried to swing it open. You notice a blinking light on your answering machine and realize you’ve missed dozens of messages. As you click on the machine and pick up the envelopes, you find that most of the messages and letters are from debt collectors. Most of the envelopes are stamped "urgent," and as you sift through the pile you can hear the messages from angry creditors demanding that you call them immediately.
Communications Surveillance: Privacy and Security at Risk:
As the sophistication of wiretapping technology grows, so too do the risks it poses to our privacy and security.
We all know the scene: It is the basement of an apartment building and the lights are dim. The man is wearing a trench coat and a fedora pulled down low to hide his face. Between the hat and the coat we see headphones, and he appears to be listening intently to the output of a set of alligator clips attached to a phone line. He is a detective eavesdropping on a suspect’s phone calls. This is wiretapping. It doesn’t have much to do with modern electronic eavesdropping, which is about bits, packets, switches, and routers.
Four Billion Little Brothers? Privacy, mobile phones, and ubiquitous data collection:
Participatory sensing technologies could improve our lives and our communities, but at what cost to our privacy?
They place calls, surf the Internet, and there are close to 4 billion of them in the world. Their built-in microphones, cameras, and location awareness can collect images, sound, and GPS data. Beyond chatting and texting, these features could make phones ubiquitous, familiar tools for quantifying personal patterns and habits. They could also be platforms for thousands to document a neighborhood, gather evidence to make a case, or study mobility and health. This data could help you understand your daily carbon footprint, exposure to air pollution, exercise habits, and frequency of interactions with family and friends.
Making Sense of Revision-control Systems:
Whether distributed or centralized, all revision-control systems come with complicated sets of tradeoffs. How do you find the best match between tool and team?
Modern software is tremendously complicated, and the methods that teams use to manage its development reflect this complexity. Though many organizations use revision-control software to track and manage the complexity of a project as it evolves, the topic of how to make an informed choice of revision-control tools has received scant attention. Until fairly recently, the world of revision control was moribund, so there was simply not much to say on this subject.
Monitoring and Control of Large Systems with MonALISA:
MonALISA developers describe how it works, the key design principles behind it, and the biggest technical challenges in building it.
The HEP (high energy physics) group at the California Institute of Technology started developing the MonALISA (Monitoring Agents using a Large Integrated Services Architecture) framework in 2002, aiming to provide a distributed service system capable of controlling and optimizing large-scale, data-intensive applications. Its initial target field of applications is the grid systems and the networks supporting data processing and analysis for HEP collaborations. Our strategy in trying to satisfy the demands of data-intensive applications was to move to more synergetic relationships between the applications, computing, and storage facilities and the network infrastructure.
The Pathologies of Big Data:
Scale up your datasets enough and all your apps will come undone. What are the typical problems and where do the bottlenecks generally surface?
What is "big data" anyway? Gigabytes? Terabytes? Petabytes? A brief personal memory may provide some perspective. In the late 1980s at Columbia University I had the chance to play around with what at the time was a truly enormous "disk": the IBM 3850 MSS (Mass Storage System). The MSS was actually a fully automatic robotic tape library and associated staging disks to make random access, if not exactly instantaneous, at least fully transparent. In Columbia’s configuration, it stored a total of around 100 GB. It was already on its way out by the time I got my hands on it, but in its heyday, the early to mid-1980s, it had been used to support access by social scientists to what was unquestionably "big data" at the time: the entire 1980 U.S. Census.
Browser Security: Lessons from Google Chrome:
Google Chrome developers focused on three key problems to shield the browser from attacks.
The Web has become one of the primary ways people interact with their computers, connecting people with a diverse landscape of content, services, and applications. Users can find new and interesting content on the Web easily, but this presents a security challenge: malicious Web-site operators can attack users through their Web browsers. Browsers face the challenge of keeping their users safe while providing a rich platform for Web applications.
Whither Sockets?:
High bandwidth, low latency, and multihoming challenge the sockets API.
One of the most pervasive and longest-lasting interfaces in software is the sockets API. Developed by the Computer Systems Research Group at the University of California at Berkeley, the sockets API was first released as part of the 4.1c BSD operating system in 1982. While there are longer-lived APIs, it is quite impressive for an API to have remained in use and largely unchanged for 27 years. The only major update to the sockets API has been the extension of ancillary routines to accommodate the larger addresses used by IPv6.
Network Front-end Processors, Yet Again:
The history of NFE processors sheds light on the tradeoffs involved in designing network stack software.
The history of the NFE (network front-end) processor, currently best known as a TOE (TCP offload engine), extends all the way back to the Arpanet IMP (interface message processor) and possibly before. The notion is beguilingly simple: partition the work of executing communications protocols from the work of executing the "applications" that require the services of those protocols. That way, the applications and the network machinery can achieve maximum performance and efficiency, possibly taking advantage of special hardware performance assistance. While this looks utterly compelling on the whiteboard, architectural and implementation realities intrude, often with considerable force.
Fighting Physics: A Tough Battle:
Thinking of doing IPC over the long haul? Think again. The laws of physics say you’re hosed.
Over the past several years, SaaS (software as a service) has become an attractive option for companies looking to save money and simplify their computing infrastructures. SaaS is an interesting group of techniques for moving computing from the desktop to the cloud; however, as it grows in popularity, engineers should be aware of some of the fundamental limitations they face when developing these kinds of distributed applications - in particular, the finite speed of light.
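The physics sets a hard floor: light in optical fiber travels at roughly two-thirds of c, about 200,000 km per second, so a transcontinental round trip costs tens of milliseconds before a single byte of processing happens. A quick sketch with illustrative distances (real fiber routes are longer than great-circle paths):

```python
SPEED_IN_FIBER_KM_S = 200_000     # roughly 2/3 the speed of light in vacuum

def min_rtt_ms(distance_km: float) -> float:
    """Lower bound on round-trip time imposed by propagation alone."""
    return 2 * distance_km / SPEED_IN_FIBER_KM_S * 1000

# Illustrative point-to-point distances:
for name, km in [("San Francisco - New York", 4_100), ("New York - London", 5_600)]:
    print(f"{name}: >= {min_rtt_ms(km):.0f} ms per round trip")
```

No amount of hardware removes that 40-plus milliseconds; a chatty protocol that needs many round trips multiplies it.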
Cybercrime 2.0: When the Cloud Turns Dark:
Web-based malware attacks are more insidious than ever. What can be done to stem the tide?
As the Web has become vital for day-to-day transactions, it has also become an attractive avenue for cybercrime. Financially motivated, the crime we see on the Web today is quite different from the more traditional network attacks. A few years ago Internet attackers relied heavily on remotely exploiting servers identified by scanning the Internet for vulnerable network services. Autonomously spreading computer worms such as Code Red and SQLSlammer were examples of such scanning attacks. Their huge scale put even the Internet at large at risk; for example, SQLSlammer generated traffic sufficient to melt down backbones.
How Do I Model State? Let Me Count the Ways:
A study of the technology and sociology of Web services specifications
There is nothing like a disagreement concerning an arcane technical matter to bring out the best (and worst) in software architects and developers. As every reader knows from experience, it can be hard to get to the bottom of what exactly is being debated. One reason for this lack of clarity is often that different people care about different aspects of the problem. In the absence of agreement concerning the problem, it can be difficult to reach an agreement about the solutions.
Security in the Browser:
Web browsers leave users vulnerable to an ever-growing number of attacks. Can we make them secure while preserving their usability?
Sealed in a depleted uranium sphere at the bottom of the ocean. That’s the often-mentioned description of what it takes to make a computer reasonably secure. Obviously, in the Internet age or any other, such a machine would be fairly useless.
The Obama Campaign:
The Obama campaign has been praised for its innovative use of technology. What was the key to its success?
On January 3, 2008, I sat in the boiler room waiting for the caucus to commence. At 7 p.m. the doors had been open for about an hour: months of preparation were coming to fruition. The phone calls had been made, volunteers had been canvassing, and now the moment had come. Could Barack Obama win the Iowa caucus? Doors closed and the first text message came from a precinct: it looked like a large attendance. Then came the second, the third, the fourth. Each was typed into our model, and a projection was starting to form. The fifth, the sixth, and now the seventh.
Purpose-Built Languages:
While often breaking the rules of traditional language design, the growing ecosystem of purpose-built "little" languages is an essential part of systems development.
In my college computer science lab, two eternal debates flourished during breaks from long nights of coding and debugging: "emacs versus vi?"; and "what is the best programming language?" Later, as I began my career in industry, I noticed that the debate over programming languages was also going on in the hallways of Silicon Valley campuses. It was the ’90s, and at Sun many of us were watching Java claim significant mindshare among developers, particularly those previously developing in C or C++.
Code Spelunking Redux:
Is this subject important enough to warrant two articles in five years? I believe it is.
It has been five years since I first wrote about code spelunking, and though systems continue to grow in size and scope, the tools we use to understand those systems are not growing at the same rate. In fact, I believe we are steadily losing ground. So why should we go over the same ground again? Is this subject important enough to warrant two articles in five years? I believe it is.
Better Scripts, Better Games:
Smarter, more powerful scripting languages will improve game performance while making gameplay development more efficient.
The video game industry earned $8.85 billion in revenue in 2007, almost as much as movies made at the box office. Much of this revenue was generated by blockbuster titles created by large groups of people. Though large development teams are not unheard of in the software industry, game studios tend to have unique collections of developers. Software engineers make up a relatively small portion of the game development team, while the majority of the team consists of content creators such as artists, musicians, and designers.
Scaling in Games & Virtual Worlds:
Online games and virtual worlds have familiar scaling requirements, but don’t be fooled: everything you know is wrong.
I used to be a systems programmer, working on infrastructure used by banks, telecom companies, and other engineers. I worked on operating systems. I worked on distributed middleware. I worked on programming languages. I wrote tools. I did all of the things that hard-core systems programmers do.
XML Fever:
Don’t let delusions about XML develop into a virulent strain of XML fever.
XML (Extensible Markup Language), which just celebrated its 10th birthday, is one of the big success stories of the Web. Apart from basic Web technologies (URIs, HTTP, and HTML) and the advanced scripting driving the Web 2.0 wave, XML is by far the most successful and ubiquitous Web technology. With great power, however, comes great responsibility, so while XML’s success is well earned as the first truly universal standard for structured data, it must now deal with numerous problems that have grown up around it.
High Performance Web Sites:
Want to make your Web site fly? Focus on front-end performance.
Google Maps, Yahoo! Mail, Facebook, MySpace, YouTube, and Amazon are examples of Web sites built to scale. They access petabytes of data sending terabits per second to millions of users worldwide. The magnitude is awe-inspiring. Users view these large-scale Web sites from a narrower perspective. The typical user has megabytes of data that are downloaded at a few hundred kilobits per second. Users are not so interested in the massive number of requests per second being served; they care more about their individual requests.
Improving Performance on the Internet:
Given the Internet’s bottlenecks, how can we build fast, scalable content-delivery systems?
When it comes to achieving performance, reliability, and scalability for commercial-grade Web applications, where is the biggest bottleneck? In many cases today, we see that the limiting bottleneck is the middle mile, or the time data spends traveling back and forth across the Internet, between origin server and end user.
Eventually Consistent:
Building reliable distributed systems at a worldwide scale demands trade-offs between consistency and availability.
At the foundation of Amazon’s cloud computing are infrastructure services such as Amazon’s S3 (Simple Storage Service), SimpleDB, and EC2 (Elastic Compute Cloud) that provide the resources for constructing Internet-scale computing platforms and a great variety of applications. The requirements placed on these infrastructure services are very strict; they need to score high marks in the areas of security, scalability, availability, performance, and cost effectiveness, and they need to meet these requirements while serving millions of customers around the globe, continuously.
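One concrete way to see the consistency-versus-availability trade-off the article describes is standard quorum reasoning (the notation below is conventional and is not quoted from this excerpt): with $N$ replicas of an item, $W$ replicas that must acknowledge a write, and $R$ replicas consulted on a read,

    \[ W + R > N \;\Rightarrow\; \text{every read quorum overlaps the latest write (strong consistency)} \]
    \[ W + R \le N \;\Rightarrow\; \text{a read may return stale data (eventual consistency, in exchange for availability and latency)} \]

Services can tune $W$ and $R$ per workload, which is exactly the kind of trade-off such infrastructure must make continuously.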
Building Scalable Web Services:
Build only what you really need.
In the early days of the Web we severely lacked tools and frameworks, and in retrospect it seems noteworthy that those early Web services scaled at all. Nowadays, while the tools have progressed, so too have expectations with respect to richness of interaction, performance, and scalability. In view of these raised expectations it is advisable to build only what you really need, relying on other people’s work where possible. Above all, be cautious in choosing when, what, and how to optimize.
Software Transactional Memory: Why Is It Only a Research Toy?:
The promise of STM may well be undermined by its overheads and limited workload applicability.
TM (transactional memory) is a concurrency control paradigm that provides atomic and isolated execution for regions of code. TM is considered by many researchers to be one of the most promising solutions to address the problem of programming multicore processors. Its most appealing feature is that most programmers only need to reason locally about shared data accesses, mark the code region to be executed transactionally, and let the underlying system ensure the correct concurrent execution. This model promises to provide the scalability of fine-grain locking, while avoiding common pitfalls of lock composition such as deadlock.
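To make the programming model concrete, here is a minimal Java sketch (illustrative only: the classes are invented, and the atomic-block syntax shown in a comment is hypothetical, since standard Java has no TM construct). It shows the lock-composition hazard that TM is meant to remove: each account is individually thread-safe, yet composing two of them into a transfer forces the programmer to reason globally about lock order.

    // Lock composition by hand: each Account is thread-safe on its own, but
    // combining two of them safely is the programmer's problem.
    public class TransferDeadlockSketch {
        static final class Account {
            private long balance;
            Account(long balance) { this.balance = balance; }
            synchronized void deposit(long amount)  { balance += amount; }
            synchronized void withdraw(long amount) { balance -= amount; }
        }

        // If one thread runs transfer(a, b, ...) while another runs
        // transfer(b, a, ...), each can hold one lock and wait forever
        // for the other: the classic composition deadlock.
        static void transfer(Account from, Account to, long amount) {
            synchronized (from) {
                synchronized (to) {
                    from.withdraw(amount);
                    to.deposit(amount);
                }
            }
        }

        // Under a (hypothetical) TM system the body would instead be marked
        //   atomic { from.withdraw(amount); to.deposit(amount); }
        // and the runtime, not the programmer, would guarantee atomicity and
        // isolation without a global lock-ordering convention.
        public static void main(String[] args) {
            Account a = new Account(100), b = new Account(100);
            transfer(a, b, 10);  // fine single-threaded; the hazard appears under contention
            System.out.println("transfer complete");
        }
    }

Run concurrently in opposite directions, the locked version can deadlock; a transactional version would simply detect the conflict and retry one of the transactions.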
Parallel Programming with Transactional Memory:
While sometimes even writing regular, single-threaded programs can be quite challenging, trying to split a program into multiple pieces that can be executed in parallel adds a whole dimension of additional problems. Drawing upon the transaction concept familiar to most programmers, transactional memory was designed to solve some of these problems and make parallel programming easier. Ulrich Drepper from Red Hat shows us how it’s done.
With the speed of individual cores no longer increasing at the rate we came to love over the past decades, programmers have to look for other ways to increase the speed of their ever-more-complicated applications. What CPU manufacturers offer instead is an increased number of execution units, or CPU cores.
Erlang for Concurrent Programming:
What role can programming languages play in dealing with concurrency? One answer can be found in Erlang, a language designed for concurrency from the ground up.
Erlang is a language developed to let mere mortals write, test, deploy, and debug fault-tolerant concurrent software. Developed at the Swedish telecom company Ericsson in the late 1980s, it started as a platform for developing soft realtime software for managing phone switches. It has since been open-sourced and ported to several common platforms, finding a natural fit not only in distributed Internet server applications, but also in graphical user interfaces and ordinary batch applications.
Real-World Concurrency:
In this look at how concurrency affects practitioners in the real world, Cantrill and Bonwick argue that much of the anxiety over concurrency is unwarranted.
Software practitioners today could be forgiven if recent microprocessor developments have given them some trepidation about the future of software. While Moore’s law continues to hold (that is, transistor density continues to double roughly every 18 months), as a result of both intractable physical limitations and practical engineering considerations, that increasing density is no longer being spent on boosting clock rate. Instead, it is being used to put multiple CPU cores on a single CPU die.
The Five-Minute Rule 20 Years Later: and How Flash Memory Changes the Rules:
The old rule continues to evolve, while flash memory adds two new rules.
In 1987, Jim Gray and Gianfranco Putzolu published their now-famous five-minute rule for trading off memory and I/O capacity. Their calculation compares the cost of holding a record (or page) permanently in memory with the cost of performing disk I/O each time the record (or page) is accessed, using appropriate fractions of prices for RAM chips and disk drives. The name of their rule refers to the break-even interval between accesses. If a record (or page) is accessed more often, it should be kept in memory; otherwise, it should remain on disk and read when needed.
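For readers who want the calculation itself, the break-even interval is usually written in the following form (this is the standard formulation of the rule; the descriptive variable names are mine, not quoted from the 1987 paper):

    \[ \text{BreakEvenInterval (seconds)} = \frac{\text{PagesPerMBofRAM}}{\text{AccessesPerSecondPerDisk}} \times \frac{\text{PricePerDiskDrive}}{\text{PricePerMBofRAM}} \]

With the prices and access rates of 1987 the interval came out to roughly five minutes, hence the name; as memory gets cheaper relative to the cost of a disk access, the break-even interval grows.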
Enterprise SSDs:
Solid-state drives are finally ready for the enterprise. But beware, not all SSDs are created alike.
For designers of enterprise systems, ensuring that hardware performance keeps pace with application demands is a mind-boggling exercise. The most troubling performance challenge is storage I/O. Spinning media, while exceptional in scaling areal density, will unfortunately never keep pace with I/O requirements. The most cost-effective way to break through these storage I/O limitations is by incorporating high-performance SSDs (solid-state drives) into the systems.
Flash Storage Today:
Can flash memory become the foundation for a new tier in the storage hierarchy?
The past few years have been an exciting time for flash memory. The cost has fallen dramatically as fabrication has become more efficient and the market has grown; the density has improved with the advent of better processes and additional bits per cell; and flash has been adopted in a wide array of applications. The flash ecosystem has expanded, and continues to expand, especially for thumb drives, cameras, ruggedized laptops, and phones in the consumer space.
Flash Disk Opportunity for Server Applications:
Future flash-based disks could provide breakthroughs in IOPS, power, reliability, and volumetric capacity when compared with conventional disks.
NAND flash densities have been doubling each year since 1996. Samsung announced that its 32-gigabit NAND flash chips would be available in 2007. This is consistent with Chang-gyu Hwang’s flash memory growth model [1], which predicts that NAND flash densities will double each year until 2010. Hwang recently extended that 2003 prediction to 2012, suggesting 64 times the current density, or 250 GB per chip. This is hard to credit, but Hwang and Samsung have delivered a 16-fold increase since his 2003 article, when 2-GB chips were just emerging. So, we should be prepared for the day when a flash drive is a terabyte(!).
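As a rough back-of-the-envelope check of that projection (taking the 32-gigabit parts mentioned above, about 4 GB per chip, as the baseline; the arithmetic is mine, not the article's): six more annual doublings give

    \[ 4\ \text{GB} \times 2^{6} = 256\ \text{GB} \approx 250\ \text{GB per chip by 2012} \]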
A Pioneer’s Flash of Insight:
Jim Gray’s vision of flash-based storage anchors this issue’s theme.
In the May/June issue of Queue, Eric Allman wrote a tribute to Jim Gray, mentioning that Queue would be running some of Jim’s best works in the months to come. I’m embarrassed to confess that when this idea was first discussed, I assumed these papers would consist largely of Jim’s seminal work on databases, which only shows that I (unlike everyone else on the Queue editorial board) never knew Jim. In an attempt to learn more about both his work and Jim himself, I attended the tribute held for him at UC Berkeley in May.
Distributed Computing Economics:
Computing economics are changing. Today there is rough price parity between: (1) one database access; (2) 10 bytes of network traffic; (3) 100,000 instructions; (4) 10 bytes of disk storage; and (5) a megabyte of disk bandwidth. This has implications for how one structures Internet-scale distributed computing: one puts computing as close to the data as possible in order to avoid expensive network traffic.
Computing is free. The world’s most powerful computer is free (SETI@Home is a 54-teraflop machine). Google freely provides a trillion searches per year to the world’s largest online database (two petabytes). Hotmail freely carries a trillion e-mail messages per year. Amazon.com offers a free book-search tool. Many sites offer free news and other free content. Movies, sports events, concerts, and entertainment are freely available via television.
Ode to a Sailor:
sailor, fleeting mood image of you; all sailor in bear grace, rough hands and poetic dream;
In memory of Jim Gray
A Tribute to Jim Gray:
Computer science attracts many very smart people, but a few stand out above the others, somehow blessed with a kind of creativity that most of us are denied. Names such as Alan Turing, Edsger Dijkstra, and John Backus come to mind. Jim Gray is another.
BASE: An Acid Alternative:
In partitioned databases, trading some consistency for availability can lead to dramatic improvements in scalability.
Web applications have grown in popularity over the past decade. Whether you are building an application for end users or application developers (i.e., services), your hope is most likely that your application will find broad adoption and with broad adoption will come transactional growth. If your application relies upon persistence, then data storage will probably become your bottleneck.
Exposing the ORM Cache:
Familiarity with ORM caching issues can help prevent performance problems and bugs.
In the early 1990s, when object-oriented languages emerged into the mainstream of software development, a noticeable surge in productivity occurred as developers saw new and better ways to create software programs. Although the new and efficient object programming paradigm was hailed and accepted by a growing number of organizations, relational database management systems remained the preferred technology for managing enterprise data. Thus was born ORM (object-relational mapping), out of necessity, and the complex challenge of saving the persistent state of an object environment in a relational database subsequently became known as the object-relational impedance mismatch.
ORM in Dynamic Languages:
O/R mapping frameworks for dynamic languages such as Groovy provide a different flavor of ORM that can greatly simplify application code.
A major component of most enterprise applications is the code that transfers objects in and out of a relational database. The easiest solution is often to use an ORM (object-relational mapping) framework, which allows the developer to declaratively define the mapping between the object model and database schema and express database-access operations in terms of objects. This high-level approach significantly reduces the amount of database-access code that needs to be written and boosts developer productivity.
Bridging the Object-Relational Divide:
ORM technologies can simplify data access, but be aware of the challenges that come with introducing this new layer of abstraction.
Modern applications are built using two very different technologies: object-oriented programming for business logic; and relational databases for data storage. Object-oriented programming is a key technology for implementing complex systems, providing benefits of reusability, robustness, and maintainability. Relational databases are repositories for persistent data. ORM (object-relational mapping) is a bridge between the two that allows applications to access relational data in an object-oriented way.
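As a concrete illustration of what that bridge looks like in code, here is a minimal JPA-style sketch (the Customer entity, its fields, and the lookup method are invented for illustration and are not taken from the article):

    import javax.persistence.Entity;
    import javax.persistence.EntityManager;
    import javax.persistence.Id;

    // The mapping is declared on the class: this object corresponds to a row,
    // and its fields to columns, with the ORM layer generating the SQL.
    @Entity
    public class Customer {
        @Id
        private Long id;       // primary-key column
        private String name;   // mapped to a NAME column by convention

        public Customer() {}   // persistence providers require a no-arg constructor

        public Long getId()     { return id; }
        public String getName() { return name; }

        // Database access expressed in terms of objects rather than hand-written SQL.
        // The EntityManager would come from a configured persistence unit.
        public static Customer byId(EntityManager em, long id) {
            return em.find(Customer.class, id);
        }
    }

Declaring the mapping once and letting the framework generate the SQL is the productivity win; the generated SQL and the caching behind it are the new layer of abstraction whose challenges the article discusses.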
The Yin and Yang of Software Development:
How infrastructure elements allow development teams to increase productivity without restricting creativity
The C/C++ Solution Manager at Parasoft explains how infrastructure elements allow development teams to increase productivity without restricting creativity.
Managing Collaboration:
Jeff Johnstone of TechExcel explains why there is a need for a new approach to application lifecycle management that better reflects the business requirements and challenges facing development teams.
I think that, fundamentally, development has come to be thought of as more of a business process than simply a set of tools. In the past, like you said, developers and development organizations were kind of on their own. They were fairly autonomous, and they would do things that were appropriate for each piece of the process and adopt technologies that were appropriate at a technology and tool level, but they didn’t really think of themselves as an integral part of any higher business process.
Getting Bigger Reach Through Speech:
Developers have a chance to significantly expand the appeal and reach of their applications by voice-enabling their applications, but is that going to be enough?
Mark Ericson, vice president of product strategy for BlueNote Networks, argues that in order to take advantage of new voice technologies you have to have a plan for integrating that capability directly into the applications that drive your existing business processes.
From Liability to Advantage: A Conversation with John Graham-Cumming and John Ousterhout:
Software production has become a bottleneck in many development organizations.
Software production (the back-end of software development, including tasks such as build, test, package and deploy) has become a bottleneck in many development organizations. In this interview Electric Cloud founder John Ousterhout explains how you can turn software production from a liability to a competitive advantage.
Arm Your Applications for Bulletproof Deployment: A Conversation with Tom Spalthoff:
Companies can achieve a reliable desktop environment while reducing the time and cost spent preparing high-quality application packages.
The deployment of applications, updates, and patches is one of the most common - and risky - functions of any IT department. Deploying any application that isn’t properly configured for distribution can disrupt or crash critical applications and cost companies dearly in lost productivity and help-desk expenses - and companies do it every day. In fact, Gartner reports that even after 10 years of experience, most companies cannot automatically deploy software with a success rate of 90 percent or better.
Intellectual Property and Software Piracy:
The Power of IP Protection and Software Licensing, an interview with Aladdin vice president Gregg Gronowski
We’re here today to talk about intellectual property and the whole issue of software piracy. Our friends at Aladdin are considered one of the de facto standards today for protecting software IP, preventing software piracy, and enabling software licensing and compliance. So joining us today to discuss that topic is Aladdin vice president Gregg Gronowski.
Reconfigurable Future:
The ability to produce cheaper, more compact chips is a double-edged sword.
Predicting the future is notoriously hard. Sometimes I feel that the only real guarantee is that the future will happen, and that someone will point out how it’s not like what was predicted. Nevertheless, we seem intent on trying to figure out what will happen, and worse yet, recording these views so they can be later used against us. So here I go... Scaling has been driving the whole electronics industry, allowing it to produce chips with more transistors at a lower cost.
The Emergence of iSCSI:
Modern SCSI, as defined by the SCSI-3 Architecture Model, or SAM, really considers the cable and physical interconnections to storage as only one level in a larger hierarchy.
When most IT pros think of SCSI, images of fat cables with many fragile pins come to mind. Certainly, that’s one manifestation. But modern SCSI, as defined by the SCSI-3 Architecture Model, or SAM, really considers the cable and physical interconnections to storage as only one level in a larger hierarchy. By separating the instructions or commands sent to and from devices from the physical layers and their protocols, you arrive at a more generic approach to storage communication.
DAFS: A New High-Performance Networked File System:
This emerging file-access protocol dramatically enhances the flow of data over a network, making life easier in the data center.
The Direct Access File System (DAFS) is a remote file-access protocol designed to take advantage of new high-throughput, low-latency network technology.
Future Graphics Architectures:
GPUs continue to evolve rapidly, but toward what?
Graphics architectures are in the midst of a major transition. In the past, these were specialized architectures designed to support a single rendering algorithm: the standard Z buffer. Realtime 3D graphics has now advanced to the point where the Z-buffer algorithm has serious shortcomings for generating the next generation of higher-quality visual effects demanded by games and other interactive 3D applications. There is also a desire to use the high computational capability of graphics architectures to support collision detection, approximate physics simulations, scene management, and simple artificial intelligence.
Scalable Parallel Programming with CUDA:
Is CUDA the parallel programming model that application developers have been waiting for?
The advent of multicore CPUs and manycore GPUs means that mainstream processor chips are now parallel systems. Furthermore, their parallelism continues to scale with Moore’s law. The challenge is to develop mainstream application software that transparently scales its parallelism to leverage the increasing number of processor cores, much as 3D graphics applications transparently scale their parallelism to manycore GPUs with widely varying numbers of cores.
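CUDA itself is expressed in C/C++, but the underlying idea the article describes (write the computation once as a data-parallel operation and let the runtime spread it across however many cores are present) can be sketched in plain Java with parallel streams. This is an analogy for illustration on CPU cores, not CUDA code:

    import java.util.stream.IntStream;

    // SAXPY (y = a*x + y), expressed once as a data-parallel operation.
    // The runtime distributes the index range across the available cores,
    // so the same source scales from a 2-core laptop to a many-core server.
    public class SaxpySketch {
        public static void main(String[] args) {
            int n = 1 << 20;
            float a = 2.0f;
            float[] x = new float[n], y = new float[n];

            IntStream.range(0, n)
                     .parallel()
                     .forEach(i -> y[i] = a * x[i] + y[i]);

            System.out.println("ran on up to " +
                Runtime.getRuntime().availableProcessors() + " cores");
        }
    }

CUDA applies the same pattern at a much larger scale, launching thousands of lightweight threads across a GPU's cores rather than a handful of CPU worker threads.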
Data-Parallel Computing:
Data parallelism is a key concept in leveraging the power of today’s manycore GPUs.
Users always care about performance. Although often it’s just a matter of making sure the software is doing only what it should, there are many cases where it is vital to get down to the metal and leverage the fundamental characteristics of the processor.
GPUs: A Closer Look:
As the line between GPUs and CPUs begins to blur, it’s important to understand what makes GPUs tick.
A gamer wanders through a virtual world rendered in near-cinematic detail. Seconds later, the screen fills with a 3D explosion, the result of unseen enemies hiding in physically accurate shadows. Disappointed, the user exits the game and returns to a computer desktop that exhibits the stylish 3D look-and-feel of a modern window manager. Both of these visual experiences require hundreds of gigaflops of computing performance, a demand met by the GPU (graphics processing unit) present in every consumer PC.
How OSGi Changed My Life:
The promises of the Lego hypothesis have yet to materialize fully, but they remain a goal worth pursuing.
In the early 1980s I discovered OOP (object-oriented programming) and fell in love with it, head over heels. As usual, this kind of love meant convincing management to invest in this new technology, and most important of all, send me to cool conferences. So I pitched the technology to my manager. I sketched him the rosy future, how one day we would create applications from ready-made classes. We would get those classes from a repository, put them together, and voila, a new application would be born. Today we take objects more or less for granted, but if I am honest, the pitch I gave to my manager in 1985 never really materialized.
Network Virtualization: Breaking the Performance Barrier:
Shared I/O in virtualization platforms has come a long way, but performance concerns remain.
The recent resurgence in popularity of virtualization has led to its use in a growing number of contexts, many of which require high-performance networking. Consider server consolidation, for example. The efficiency of network virtualization directly impacts the number of network servers that can effectively be consolidated onto a single physical machine. Unfortunately, modern network virtualization techniques incur significant overhead, which limits the achievable network performance. We need new network virtualization techniques to realize the full benefits of virtualization in network-intensive domains.
The Cost of Virtualization:
Software developers need to be aware of the compromises they face when using virtualization technology.
Virtualization can be implemented in many different ways. It can be done with and without hardware support. The virtualized operating system can be expected to be changed in preparation for virtualization, or it can be expected to work unchanged. Regardless, software developers must strive to meet the three goals of virtualization spelled out by Gerald Popek and Robert Goldberg: fidelity, performance, and safety.
Beyond Server Consolidation:
Server consolidation helps companies improve resource utilization, but virtualization can help in other ways, too.
Virtualization technology was developed in the late 1960s to make more efficient use of hardware. Hardware was expensive, and there was not that much available. Processing was largely outsourced to the few places that did have computers. On a single IBM System/360, one could run in parallel several environments that maintained full isolation and gave each of its customers the illusion of owning the hardware. Virtualization was time sharing implemented at a coarse-grained level, and isolation was the key achievement of the technology.
Meet the Virts:
Virtualization technology isn’t new, but it has matured a lot over the past 30 years.
When you dig into the details of supposedly overnight success stories, you frequently discover that they’ve actually been years in the making. Virtualization has been around for more than 30 years, since the days when some of you were feeding stacks of punch cards into very physical machines, yet in 2007 it tipped. VMware was the IPO sensation of the year; in November 2007 no fewer than four major operating system vendors (Microsoft, Oracle, Red Hat, and Sun) announced significant new virtualization capabilities; and among fashionable technologists it seems virtual has become the new black.
Big Games, Small Screens:
Developing 3D games for mobile devices is full of challenges, but the rich, evolving toolset enables some stunning results.
One thing that becomes immediately apparent when creating and distributing mobile 3D games is that there are fundamental differences between the cellphone market and the more traditional games markets, such as consoles and handheld gaming devices. The most striking of these are the number of delivery platforms; the severe constraints of the devices, including small screens whose orientation can be changed; limited input controls; the need to deal with other tasks; the nonphysical delivery mechanism; and the variations in handset performance and input capability.
Understanding DRM:
Recognizing the tradeoffs associated with different DRM systems can pave the way for a more flexible and capable DRM.
The explosive growth of the Internet and digital media has created both tremendous opportunities and new threats for content creators. Advances in digital technology offer new ways of marketing, disseminating, interacting with, and monetizing creative works, giving rise to expanding markets that did not exist just a few years ago. At the same time, however, the technologies have created major challenges for copyright holders seeking to control the distribution of their works and protect against piracy.
Document & Media Exploitation:
The DOMEX challenge is to turn digital bits into actionable intelligence.
A computer used by Al Qaeda ends up in the hands of a Wall Street Journal reporter. A laptop from Iran is discovered that contains details of that country’s nuclear weapons program. Photographs and videos are downloaded from terrorist Web sites. As evidenced by these and countless other cases, digital documents and storage devices hold the key to many ongoing military and criminal investigations. The most straightforward approach to using these media and documents is to explore them with ordinary tools—open the Word files with Microsoft Word, view the Web pages with Internet Explorer, and so on.
Powering Down:
Smart power management is all about doing more with the resources we have.
Power management is a topic of interest to everyone. In the beginning there was the desktop computer. It ran at a fixed speed and consumed less power than the monitor it was plugged into. Where computers were portable, their sheer size and weight meant that you were more likely to be limited by physical strength than battery life. It was not a great time for power management. Now consider the present. Laptops have increased in speed by more than 5,000 times. Battery capacity, sadly, has not. With hardware becoming increasingly mobile, however, users are demanding that battery life start matching the way they work.
Storage Virtualization Gets Smart:
The days of overprovisioned, underutilized storage resources might soon become a thing of the past.
Over the past 20 years we have seen the transformation of storage from a dumb resource with fixed reliability, performance, and capacity to a much smarter resource that can actually play a role in how data is managed. In spite of the increasing capabilities of storage systems, however, traditional storage management models have made it hard to leverage these data management capabilities effectively. The net result has been overprovisioning and underutilization. In short, although the promise was that smart shared storage would simplify data management, the reality has been different.
Hard Disk Drives: The Good, the Bad and the Ugly!:
HDDs are like the bread in a peanut butter and jelly sandwich.
HDDs are like the bread in a peanut butter and jelly sandwich—sort of an unexciting piece of hardware necessary to hold the “software.” They are simply a means to an end. HDD reliability, however, has always been a significant weak link, perhaps the weak link, in data storage. In the late 1980s people recognized that HDD reliability was inadequate for large data storage systems so redundancy was added at the system level with some brilliant software algorithms, and RAID (redundant array of inexpensive disks) became a reality. RAID moved the reliability requirements from the HDD itself to the system of data disks.
Standardizing Storage Clusters:
Will pNFS become the new standard for parallel data access?
Data-intensive applications such as data mining, movie animation, oil and gas exploration, and weather modeling generate and process huge amounts of data. File-data access throughput is critical for good performance. To scale well, these HPC (high-performance computing) applications distribute their computation among numerous client machines. HPC clusters can range from hundreds to thousands of clients with aggregate I/O demands ranging into the tens of gigabytes per second.
Voyage in the Agile Memeplex:
In the world of agile development, context is key.
Agile processes are not a technology, not a science, not a product. They constitute a space somewhat hard to define. Agile methods, or more precisely agile software development methods or processes, are a family of approaches and practices for developing software systems. Any attempt to define them runs into egos and marketing posturing. For our purposes here, we can define this space in two ways. The first is by enumeration, pointing to recognizable members of the set: XP, Scrum, lean development, DSDM, Crystal, FDD, Agile RUP or OpenUP, etc.
Usability Testing for the Web:
Today’s sophisticated Web applications make tracking and listening to users more important than ever.
Today’s Internet user has more choices than ever before, with many competing sites offering similar services. This proliferation of options provides ample opportunity for users to explore different sites and find out which one best suits their needs for any particular service. Users are further served by the latest generation of Web technologies and services, commonly dubbed Web 2.0, which enables a better, more personalized user experience and encourages user-generated content.
Phishing Forbidden:
Current anti-phishing technologies prevent users from taking the bait.
Phishing is a significant risk facing Internet users today. Through e-mails or instant messages, users are led to counterfeit Web sites designed to trick them into divulging usernames, passwords, account numbers, and personal information. It is up to the user to ensure the authenticity of the Web site. Browsers provide some tools, but these are limited by at least three issues.
Building Secure Web Applications:
Believe it or not, it’s not a lost cause.
In these days of phishing and near-daily announcements of identity theft via large-scale data losses, it seems almost ridiculous to talk about securing the Web. At this point most people seem ready to throw up their hands at the idea or to lock down one small component that they can control in order to keep the perceived chaos at bay.
Toward a Commodity Enterprise Middleware:
Can AMQP enable a new era in messaging middleware? A look inside standards-based messaging with AMQP
AMQP was born out of my own experience and frustrations in developing front- and back-office processing systems at investment banks. It seemed to me that we were living in integration Groundhog Day - the same problems of connecting systems together would crop up with depressing regularity. Each time the same discussions about which products to use would happen, and each time the architecture of some system would be curtailed to allow for the fact that the chosen middleware was reassuringly expensive. From 1996 through to 2003 I was waiting for the solution to this obvious requirement to materialize as a standard, and thereby become a commodity.
The Seven Deadly Sins of Linux Security:
Avoid these common security risks like the devil.
The problem with security advice is that there is too much of it and that those responsible for security certainly have too little time to implement all of it. The challenge is to determine what the biggest risks are and to worry about those first and about others as time permits. Presented here are the seven common problems - the seven deadly sins of security - most likely to allow major damage to occur to your system or bank account.
API: Design Matters:
Why changing APIs might become a criminal offense.
After more than 25 years as a software engineer, I still find myself underestimating the time it will take to complete a particular programming task. Sometimes, the resulting schedule slip is caused by my own shortcomings: as I dig into a problem, I simply discover that it is a lot harder than I initially thought, so the problem takes longer to solve—such is life as a programmer. Just as often I know exactly what I want to achieve and how to achieve it, but it still takes far longer than anticipated. When that happens, it is usually because I am struggling with an API that seems to do its level best to throw rocks in my path and make my life difficult.
Beyond Beowulf Clusters:
As clusters grow in size and complexity, it becomes harder and harder to manage their configurations.
In the early ’90s, the Berkeley NOW Project under David Culler posited that groups of less capable machines could be used to solve scientific and other computing problems at a fraction of the cost of larger computers. In 1994, Donald Becker and Thomas Sterling worked to drive the costs even lower by adopting the then-fledgling Linux operating system to build Beowulf clusters at NASA’s Goddard Space Flight Center. By tying desktop machines together with open source tools such as PVM, MPI, and PBS, early clusters—which were often PC towers stacked on metal shelves with a nest of wires interconnecting them—fundamentally altered the balance of scientific computing.
The Evolution of Security:
What can nature tell us about how best to manage our risks?
Security people are never in charge unless an acute embarrassment has occurred. Otherwise, their advice is tempered by “economic reality,” which is to say that security is a means, not an end. This is as it should be. Since means are about trade-offs, security is about trade-offs, but you knew all that. Our trade-off decisions can be hard to make, and these hard-to-make decisions come in two varieties. One type occurs when the uncertainty of the alternatives is so great that they can’t be sorted in terms of probable effect; in that case, other factors such as familiarity or convenience will drive the decision.
DNS Complexity:
Although it contains just a few simple rules, DNS has grown into an enormously complex system.
DNS is a distributed, coherent, reliable, autonomous, hierarchical database, the first and only one of its kind. Created in the 1980s when the Internet was still young but overrunning its original system for translating host names into IP addresses, DNS is one of the foundation technologies that made the worldwide Internet possible. Yet this did not all happen smoothly, and DNS technology has been periodically refreshed and refined. Though it’s still possible to describe DNS in simple terms, the underlying details are by now quite sublime.
Unified Communications with SIP:
SIP can provide realtime communications as a network service.
Communications systems based on the SIP (Session Initiation Protocol) standard have come a long way over the past several years. SIP is now largely complete and covers even advanced telephony and multimedia features and feature interactions. Interoperability between solutions from different vendors is repeatedly demonstrated at events such as the SIPit (interoperability test) meetings organized by the SIP Forum, and several manufacturers have proven that proprietary extensions to the standard are no longer driven by technical needs but rather by commercial considerations.
Making SIP Make Cents:
P2P payments using SIP could enable new classes of micropayment applications and business models.
The Session Initiation Protocol (SIP) is used to set up realtime sessions in IP-based networks. These sessions might be for audio, video, or IM communications, or they might be used to relay presence information. SIP service providers are mainly focused on providing a service that copies that provided by the PSTN (public switched telephone network) or the PLMN (public land mobile network) to the Internet-based environment.
Decentralizing SIP:
If you’re looking for a low-maintenance IP communications network, peer-to-peer SIP might be just the thing.
SIP (Session Initiation Protocol) is the most popular protocol for VoIP in use today [1]. It is widely used by enterprises, consumers, and even carriers in the core of their networks. Since SIP is designed for establishing media sessions of any kind, it is also used for a variety of multimedia applications beyond VoIP, including IPTV, videoconferencing, and even collaborative video gaming.
SIP: Basics and Beyond:
More than just a simple telephony application protocol, SIP is a framework for developing communications systems.
Chances are you’re already using SIP (Session Initiation Protocol). It is one of the key innovations driving the current evolution of communications systems. Its first major use has been signaling in Internet telephony. Large carriers have been using SIP inside their networks for interconnect and trunking across long distances for several years. If you’ve made a long-distance call, part of that call probably used SIP.
Realtime Garbage Collection:
It’s now possible to develop realtime systems using Java.
Traditional computer science deals with the computation of correct results. Realtime systems interact with the physical world, so they have a second correctness criterion: they have to compute the correct result within a bounded amount of time. Simply building functionally correct software is hard enough. When timing is added to the requirements, the cost and complexity of building the software increase enormously.
Open vs. Closed:
Which source is more secure?
There is no better way to start an argument among a group of developers than proclaiming Operating System A to be "more secure" than Operating System B. I know this from first-hand experience, as previous papers I have published on this topic have led to reams of heated e-mails directed at me - including some that were, quite literally, physically threatening. Despite the heat (not light!) generated from attempting to investigate the relative security of different software projects, investigate we must.
One Step Ahead:
Security vulnerabilities abound, but a few simple steps can minimize your risk.
Every day IT departments are involved in an ongoing struggle against hackers trying to break into corporate networks. A break-in can carry a hefty price: loss of valuable information, tarnishing of the corporate image and brand, service interruption, and hundreds of resource hours of recovery time. Unlike other aspects of information technology, security is adversarial. It pits IT departments against hackers.
Better, Faster, More Secure:
Who’s in charge of the Internet’s future?
Since I started a stint as chair of the IETF in March 2005, I have frequently been asked, “What’s coming next?” but I have usually declined to answer. Nobody is in charge of the Internet, which is a good thing, but it makes predictions difficult. The reason the lack of central control is a good thing is that it has allowed the Internet to be a laboratory for innovation throughout its life—and it’s a rare thing for a major operational system to serve as its own development lab.
The Virtualization Reality:
Are hypervisors the new foundation for system software?
A number of important challenges are associated with the deployment and configuration of contemporary computing infrastructure. Given the variety of operating systems and their many versions—including the often-specific configurations required to accommodate the wide range of popular applications—it has become quite a conundrum to establish and manage such systems.
Unlocking Concurrency:
Multicore programming with transactional memory
Multicore architectures are an inflection point in mainstream software development because they force developers to write parallel programs. In a previous article in Queue, Herb Sutter and James Larus pointed out, “The concurrency revolution is primarily a software revolution. The difficult problem is not building multicore hardware, but programming it in a way that lets mainstream applications benefit from the continued exponential growth in CPU performance.” In this new multicore world, developers must write explicitly parallel applications that can take advantage of the increasing number of cores that each successive multicore generation will provide.
Playing for Keeps:
Will security threats bring an end to general-purpose computing?
Inflection points come at you without warning and quickly recede out of reach. We may be nearing one now. If so, we are now about to play for keeps, and “we” doesn’t mean just us security geeks. If anything, it’s because we security geeks have not worked the necessary miracles already that an inflection point seems to be approaching at high velocity.
Criminal Code: The Making of a Cybercriminal:
A fictional account of malware creators and their experiences
This is a fictional account of malware creators and their experiences. Although the characters are made up, the techniques and events are patterned on real activities of many different groups developing malicious software. “Make some money!” Misha’s father shouted. “You spent all that time for a stupid contest and where did it get you? Nowhere! You have no job and you didn’t even win! You need to stop playing silly computer games and earn some money!”
E-mail Authentication: What, Why, How?:
Perhaps we should have figured out what was going to happen when Usenet started to go bad.
Internet e-mail was conceived in a different world than we live in today. It was a small, tightly knit community, and we didn’t really have to worry too much about miscreants. Generally, if someone did something wrong, the problem could be dealt with through social means; “shunning” is very effective in small communities. Perhaps we should have figured out what was going to happen when Usenet started to go bad. Usenet was based on an inexpensive network called UUCP, which was fairly easy to join, so it gave us a taste of what happens when the community becomes larger and more distributed—and harder to manage.
Cybercrime: An Epidemic:
Who commits these crimes, and what are their motivations?
Painted in the broadest of strokes, cybercrime essentially is the leveraging of information systems and technology to commit larceny, extortion, identity theft, fraud, and, in some cases, corporate espionage. Who are the miscreants who commit these crimes, and what are their motivations? One might imagine they are not the same individuals committing crimes in the physical world. Bank robbers and scam artists garner a certain public notoriety after only a few occurrences of their crimes, yet cybercriminals largely remain invisible and unheralded. Based on sketchy news accounts and a few public arrests, such as Mafiaboy, accused of paralyzing Amazon, CNN, and other Web sites, the public may infer these miscreants are merely a subculture of teenagers.
Breaking the Major Release Habit:
Can agile development make your team more productive?
Keeping up with the rapid pace of change can be a daunting task. Just as you finally get your software working with a new technology to meet yesterday’s requirements, a newer technology is introduced or a new business trend comes along to upset the apple cart. Whether your new challenge is Web services, SOA (service-oriented architecture), ESB (enterprise service bus), AJAX, Linux, the Sarbanes-Oxley Act, distributed development, outsourcing, or competitive pressure, there is an increasing need for development methodologies that help to shorten the development cycle time, respond to user needs faster, and increase quality all at the same time.
The Heart of Eclipse:
A look inside an extensible plug-in architecture
Eclipse is both an open, extensible development environment for building software and an open, extensible application framework upon which software can be built. Considered the most popular Java IDE, it provides a common UI model for working with tools and promotes rapid development of modular features based on a plug-in component model. The Eclipse Foundation designed the platform to run natively on multiple operating systems, including Macintosh, Windows, and Linux, providing robust integration with each and rich clients that support the GUI interactions everyone is familiar with: drag and drop, cut and paste (clipboard), navigation, and customization.
The Long Road to 64 Bits:
Double, double, toil and trouble
Shakespeare’s words often cover circumstances beyond his wildest dreams. Toil and trouble accompany major computing transitions, even when people plan ahead. To calibrate “tomorrow’s legacy today,” we should study “tomorrow’s legacy yesterday.” Much of tomorrow’s software will still be driven by decades-old decisions. Past decisions have unanticipated side effects that last decades and can be difficult to undo.
Keeping Score in the IT Compliance Game:
ALM can help organizations meet tough IT compliance requirements.
Achieving developer acceptance of standardized procedures for managing applications from development to release is one of the largest hurdles facing organizations today. Establishing a standardized development-to-release workflow, often referred to as the ALM (application lifecycle management) process, is particularly critical for organizations in their efforts to meet tough IT compliance mandates. This is much easier said than done, as different development teams have created their own unique procedures that are undocumented, unclear, and nontraceable.
Compliance Deconstructed:
When you break it down, compliance is largely about ensuring that business processes are executed as expected.
The topic of compliance becomes increasingly complex each year. Dozens of regulatory requirements can affect a company’s business processes. Moreover, these requirements are often vague and confusing. When those in charge of compliance are asked if their business processes are in compliance, it is understandably difficult for them to respond succinctly and with confidence. This article looks at how companies can deconstruct compliance, dealing with it in a systematic fashion and applying technology to automate compliance-related business processes. It also looks specifically at how Microsoft approaches compliance with SOX.
Box Their SOXes Off:
Being proactive with SAS 70 Type II audits helps both parties in a vendor relationship.
Data is a precious resource for any large organization. The larger the organization, the more likely it will rely to some degree on third-party vendors and partners to help it manage and monitor its mission-critical data. In the wake of new regulations for public companies, such as Section 404 of SOX, the folks who run IT departments for Fortune 1000 companies have an ever-increasing need to know that when it comes to the 24/7/365 monitoring of their critical data transactions, they have business partners with well-planned and well-documented procedures. In response to a growing need to validate third-party controls and procedures, some companies are insisting that certain vendors undergo SAS 70 Type II audits.
Complying with Compliance:
Blowing it off is not an option.
“Hey, compliance is boring. Really, really boring. And besides, I work neither in the financial industry nor in health care. Why should I care about SOX and HIPAA?” Yep, you’re absolutely right. You write payroll applications, or operating systems, or user interfaces, or (heaven forbid) e-mail servers. Why should you worry about compliance issues?
A Requirements Primer:
A short primer that provides background on four of the most important compliance challenges that organizations face today.
Many software engineers and architects are exposed to compliance through the growing number of rules, regulations, and standards with which their employers must comply. Some of these requirements, such as HIPAA, focus primarily on one industry, whereas others, such as SOX, span many industries. Some apply to only one country, while others cross national boundaries.
Too Much Information:
Two applications reveal the key challenges in making context-aware computing a reality.
As mobile computing devices and a variety of sensors become ubiquitous, new resources for applications and services - often collectively referred to under the rubric of context-aware computing - are becoming available to designers and developers. In this article, we consider the potential benefits and issues that arise from leveraging context awareness in new communication services that include the convergence of VoIP (voice over IP) and traditional information technology.
The Invisible Assistant:
One lab’s experiment with ubiquitous computing
Ubiquitous computing seeks to place computers everywhere around us—into the very fabric of everyday life [1]—so that our lives are made better. Whether it is improving our job productivity, our ability to stay connected with family and friends, or our entertainment, the goal is to find ways to put technology to work for us by getting all those computers—large and small, visible and invisible—to work together.
Social Perception:
Modeling human interaction for the next generation of communication services
Bob manages a team that designs and builds widgets. Life would be sweet, except that Bob’s team is distributed over three sites, located in three different time zones. Bob used to collect lots of frequent flyer miles traveling to attend meetings. Lately, however, business travel has evolved into a humanly degrading, wasteful ordeal. So Bob has invested in a high-bandwidth video communications system to cut down on business travel. Counting direct costs, the system was supposed to pay for itself within three months. There is a problem, however.
The Future of Human-Computer Interaction:
Is an HCI revolution just around the corner?
Personal computing launched with the IBM PC. But popular computing—computing for the masses—launched with the modern WIMP (windows, icons, mouse, pointer) interface, which made computers usable by ordinary people. As popular computing has grown, the role of HCI (human-computer interaction) has increased. Most software today is interactive, and code related to the interface is more than half of all code. HCI also has a key role in application design. In a consumer market, a product’s success depends on each user’s experience with it.
ASPs: The Integration Challenge:
The promise of software as a service is becoming a reality with many ASPs.
Organizations using ASPs and third-party vendors that provide value-added products to ASPs need to integrate with them. ASPs enable this integration by providing Web service-based APIs. There are significant differences between integrating with ASPs over the Internet and integrating with a local application. When integrating with ASPs, users have to consider a number of issues, including latency, unavailability, upgrades, performance, load limiting, and lack of transaction support.
Untangling Enterprise Java:
A new breed of framework helps eliminate crosscutting concerns.
Separation of concerns is one of the oldest concepts in computer science. The term was coined by Dijkstra in 1974 [1]. It is important because it simplifies software, making it easier to develop and maintain. Separation of concerns is commonly achieved by decomposing an application into components. There are, however, crosscutting concerns, which span (or cut across) multiple components. These kinds of concerns cannot be handled by traditional forms of modularization and can make the application more complex and difficult to maintain.
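A small invented example (not from the article) of what "crosscutting" means in practice: the same auditing and timing code ends up duplicated across otherwise unrelated components, which is exactly the kind of logic the newer frameworks factor out into a single, declaratively applied module.

    // Two unrelated services, each repeating the same audit/timing boilerplate.
    class OrderService {
        void placeOrder(String id) {
            long start = System.nanoTime();                   // crosscutting: timing
            System.out.println("AUDIT placeOrder " + id);     // crosscutting: auditing
            // ... business logic ...
            System.out.println("placeOrder took " + (System.nanoTime() - start) + " ns");
        }
    }

    class InventoryService {
        void reserveStock(String sku) {
            long start = System.nanoTime();                   // the same code again
            System.out.println("AUDIT reserveStock " + sku);
            // ... business logic ...
            System.out.println("reserveStock took " + (System.nanoTime() - start) + " ns");
        }
    }

    public class CrosscuttingDemo {
        public static void main(String[] args) {
            new OrderService().placeOrder("o-1");
            new InventoryService().reserveStock("sku-7");
        }
    }

An AOP-style framework lets the audit and timing logic be written once and applied to both methods declaratively, instead of being copied into every component.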
The Rise and Fall of CORBA:
There’s a lot we can learn from CORBA’s mistakes.
Depending on exactly when one starts counting, CORBA is about 10-15 years old. During its lifetime, CORBA has moved from being a bleeding-edge technology for early adopters, to being a popular middleware, to being a niche technology that exists in relative obscurity. It is instructive to examine why CORBA—despite once being heralded as the “next-generation technology for e-commerce”—suffered this fate. CORBA’s history is one that the computing industry has seen many times, and it seems likely that current middleware efforts, specifically Web services, will reenact a similar history.
From COM to Common:
Component software’s 10-year journey toward ubiquity
Ten years ago, the term component software meant something relatively specific and concrete. A small number of software component frameworks more or less defined the concept for most people. Today, few terms in the software industry are less precise than component software. There are now many different forms of software componentry for many different purposes. The technologies and methodologies of 10 years ago have evolved in fundamental ways and have been joined by an explosion of new technologies and approaches that have redefined our previously held notions of component software.
The Network’s New Role:
Application-oriented networks can help bridge the gap between enterprises.
Companies have always been challenged with integrating systems across organizational boundaries. With the advent of Internet-native systems, this integration has become essential for modern organizations, but it has also become more and more complex, especially as next-generation business systems depend on agile, flexible, interoperable, reliable, and secure cross-enterprise systems.
Search Considered Integral:
A combination of tagging, categorization, and navigation can help end-users leverage the power of enterprise search.
Most corporations must leverage their data for competitive advantage. The volume of data available to a knowledge worker has grown dramatically over the past few years, and, while a good amount lives in large databases, an important subset exists only as unstructured or semi-structured data. Without the right systems, this leads to a continuously deteriorating signal-to-noise ratio, creating an obstacle for busy users trying to locate information quickly. Three flavors of enterprise search solutions help improve knowledge discovery.
AI Gets a Brain:
New technology allows software to tap real human intelligence.
In the 50 years since John McCarthy coined the term artificial intelligence, much progress has been made toward identifying, understanding, and automating many classes of symbolic and computational problems that were once the exclusive domain of human intelligence. Much work remains in the field because humans still significantly outperform the most powerful computers at completing such simple tasks as identifying objects in photographs - something children can do even before they learn to speak.
Java in a Teacup:
Programming Bluetooth-enabled devices using J2ME
Few technology sectors evolve as fast as the wireless industry. As the market and devices mature, the need (and potential) for mobile applications grows. More and more mobile devices are delivered with the Java platform installed, enabling a large base of Java programmers to try their hand at embedded programming. Unfortunately, not all Java mobile devices are created equal, presenting many challenges to the new J2ME (Java 2 Platform, Micro Edition) programmer. Using a sample game application, this article illustrates some of the challenges associated with J2ME and Bluetooth programming.
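To give a flavor of the programming model, here is a bare-bones MIDlet that starts a Bluetooth device inquiry through the JSR-82 API (a minimal sketch assuming the handset actually implements JSR-82; the class name and console output are invented for illustration):

    import javax.bluetooth.DeviceClass;
    import javax.bluetooth.DiscoveryAgent;
    import javax.bluetooth.DiscoveryListener;
    import javax.bluetooth.LocalDevice;
    import javax.bluetooth.RemoteDevice;
    import javax.bluetooth.ServiceRecord;
    import javax.microedition.midlet.MIDlet;

    // Starts a general Bluetooth inquiry when the MIDlet is launched and logs
    // each device found. Real handsets differ in which optional APIs they ship,
    // which is the fragmentation problem described above.
    public class DiscoveryMIDlet extends MIDlet implements DiscoveryListener {
        protected void startApp() {
            try {
                DiscoveryAgent agent = LocalDevice.getLocalDevice().getDiscoveryAgent();
                agent.startInquiry(DiscoveryAgent.GIAC, this);  // general inquiry access code
            } catch (Exception e) {
                notifyDestroyed();  // no usable Bluetooth stack on this handset
            }
        }
        protected void pauseApp() {}
        protected void destroyApp(boolean unconditional) {}

        public void deviceDiscovered(RemoteDevice device, DeviceClass cod) {
            System.out.println("found " + device.getBluetoothAddress());
        }
        public void inquiryCompleted(int discType) {}
        public void servicesDiscovered(int transID, ServiceRecord[] records) {}
        public void serviceSearchCompleted(int transID, int respCode) {}
    }

Even this much varies in practice: whether the code compiles against a given vendor's SDK and how the inquiry behaves differ from handset to handset, which is exactly the kind of challenge the article works through with its sample game.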
TiVo-lution:
The challenges of delivering a reliable, easy-to-use DVR service to the masses
One of the greatest challenges of designing a computer system is in making sure the system itself is "invisible" to the user. The system should simply be a conduit to the desired result. There are many examples of such purpose-built systems, ranging from modern automobiles to mobile phones.
The (not so) Hidden Computer:
The growing complexity of purpose-built systems is making it difficult to conceal the computers within.
Ubiquitous computing may not have arrived yet, but ubiquitous computers certainly have. The sustained improvements wrought by the fulfillment of Moore’s law have led to the use of microprocessors in a vast array of consumer products. A typical car contains 50 to 100 processors. Your microwave has one or maybe more. They’re in your TV, your phone, your refrigerator, your kids’ toys, and in some cases, your toothbrush.
Under New Management:
Autonomic computing is revolutionizing the way we manage complex systems.
In an increasingly competitive global environment, enterprises are under extreme pressure to reduce operating costs. At the same time they must have the agility to respond to business opportunities offered by volatile markets.
Best Practice (BPM):
In business process management, finding the right tool suite is just the beginning.
Just as BPM (business process management) technology is markedly different from conventional approaches to application support, the methodology of BPM development is markedly different from traditional software implementation techniques. With CPI (continuous process improvement) as the core discipline of BPM, the models that drive work through the company evolve constantly. Indeed, recent studies suggest that companies fine-tune their BPM-based applications at least once a quarter (and sometimes as often as eight times per year). The point is that there is no such thing as a "finished" process; it takes multiple iterations to produce highly effective solutions. Every working BPM-based process is just a starting point for the future.
People and Process:
Minimizing the pain of business process change
When Mike Hammer and I published Reengineering the Corporation in 1992, we understood the impact that real business process change would have on people. I say "real" process change, because managers have used the term reengineering to describe any and all corporate change programs. One misguided executive told me that his company did not know how to do real reengineering; so it just downsized large departments and business units, and expected that the people who were left would figure out how to get their work done. Sadly, this is how some companies still practice process redesign: leaving people overworked and demoralized, while customers experience bad service and poor quality.
Going with the Flow:
Workflow systems can provide value beyond automating business processes.
An organization consists of two worlds. The real world contains the organization’s structure, physical goods, employees, and other organizations. The virtual world contains the organization’s computerized infrastructure, including its applications and databases. Workflow systems bridge the gap between these two worlds. They provide both a model of the organization’s design and a runtime to execute the model.
Modern Performance Monitoring:
Today’s diverse and decentralized computer world demands new thinking about performance monitoring and analysis.
The modern Unix server floor can be a diverse universe of hardware from several vendors and software from several sources. Often, the personnel needed to resolve server floor performance issues are not available or, for security reasons, not allowed to be present at the very moment of occurrence. Even when, as luck might have it, the right personnel are actually present to witness a performance "event," the tools to measure and analyze the performance of the hardware and software have traditionally been sparse and vendor-specific.
Performance Anti-Patterns:
Want your apps to run faster? Here’s what not to do.
Performance pathologies can be found in almost any software, from user applications and libraries to drivers and the kernel itself. At Sun we’ve spent the last several years applying state-of-the-art tools to a Unix kernel, system libraries, and user applications, and have found that many apparently disparate performance problems in fact have the same underlying causes. Since software patterns are considered abstractions of positive experience, we can talk about the various approaches that led to these performance problems as anti-patterns: something to be avoided rather than emulated.
A High-Performance Team:
From design to production, performance should be part of the process.
You work in the product development group of a software company, where the product is often compared with the competition on performance grounds. Performance is an important part of your business; but so is adding new functionality, fixing bugs, and working on new projects. So how do you lead your team to develop high-performance software, as well as doing everything else? And how do you keep that performance high throughout cycles of maintenance and enhancement?
Hidden in Plain Sight:
Improvements in the observability of software can help you diagnose your most crippling performance problems.
In December 1997, Sun Microsystems had just announced its new flagship machine: a 64-processor symmetric multiprocessor supporting up to 64 gigabytes of memory and thousands of I/O devices. As with any new machine launch, Sun was working feverishly on benchmarks to prove the machine’s performance. While the benchmarks were generally impressive, there was one in particular that was exhibiting unexpectedly low performance. The benchmark machine would occasionally become mysteriously distracted: Benchmark activity would practically cease, but the operating system kernel remained furiously busy. After some number of minutes spent on unknown work, the operating system would suddenly right itself: Benchmark activity would resume at full throttle and run to completion.
Coding for the Code:
Can models provide the DNA for software development?
Despite the considerable effort invested by industry and academia in modeling standards such as UML (Unified Modeling Language), software modeling has long played a subordinate role in commercial software development. Although modeling is generally perceived as state of the art and thus as something that ought to be done, its appreciation seems to pale along with the progression from the early, more conceptual phases of a software project to those where the actual handcrafting is done.
Monitoring, at Your Service:
Automated monitoring can increase the reliability and scalability of today’s online software services.
Internet services are becoming more and more a part of our daily lives. We derive value from them, depend on them, and are now beginning to assume their ubiquity as we do the phone system and electricity grid. The implementation of Internet services, though, is an unsolved problem, and Internet services remain far from fulfilling their potential in our world.
Lessons from the Floor:
The manufacturing industry can teach us a lot about measuring performance in large-scale Internet services.
The January monthly service quality meeting started normally. Around the table were representatives from development, operations, marketing, and product management, and the agenda focused on the prior month’s performance. As usual, customer-impacting incidents and quality of service were key topics, and I was armed with the numbers showing the average uptime for the part of the service that I represent: MSN, the Microsoft family of services that includes e-mail, Instant Messenger, news, weather and sports, etc.
Information Extraction:
Distilling structured data from unstructured text
In 2001 the U.S. Department of Labor was tasked with building a Web site that would help people find continuing education opportunities at community colleges, universities, and organizations across the country. The department wanted its Web site to support fielded Boolean searches over locations, dates, times, prerequisites, instructors, topic areas, and course descriptions. Ultimately it was also interested in mining its new database for patterns and educational trends. This was a major data-integration project, aiming to automatically gather detailed, structured information from tens of thousands of individual institutions every three months.
Threads without the Pain:
Multithreaded programming need not be so angst-ridden.
Much of today’s software deals with multiple concurrent tasks. Web browsers support multiple concurrent HTTP connections, graphical user interfaces deal with multiple windows and input devices, and Web and DNS servers handle concurrent connections or transactions from large numbers of clients. The number of concurrent tasks that need to be handled keeps increasing as software grows more complex. Structuring concurrent software in a way that meets the increasing scalability requirements while remaining simple, structured, and safe enough to allow mortal programmers to construct ever-more complex systems is a major engineering challenge.
Fighting Spam with Reputation Systems:
User-submitted spam fingerprints
Spam is everywhere, clogging the inboxes of e-mail users worldwide. Not only is it an annoyance, it erodes the productivity gains afforded by the advent of information technology. Workers plowing through hours of legitimate e-mail every day also must contend with removing a significant amount of illegitimate e-mail. Automated spam filters have dramatically reduced the amount of spam seen by the end users who employ them, but the amount of training required rivals the amount of time needed simply to delete the spam without the assistance of a filter.
Social Bookmarking in the Enterprise:
Can your organization benefit from social bookmarking tools?
One of the greatest challenges facing people who use large information spaces is to remember and retrieve items that they have previously found and thought to be interesting. One approach to this problem is to allow individuals to save particular search strings to re-create the search in the future. Another approach has been to allow people to create personal collections of material. Collections of citations can be created manually by readers or through execution of (and alerting to) a saved search.
Why Your Data Won’t Mix:
New tools and techniques can help ease the pain of reconciling schemas.
When independent parties develop database schemas for the same domain, they will almost always be quite different from each other. These differences are referred to as semantic heterogeneity, which also appears in the presence of multiple XML documents, Web services, and ontologies—or more broadly, whenever there is more than one way to structure a body of data. The presence of semi-structured data exacerbates semantic heterogeneity, because semi-structured schemas are much more flexible to start with. For multiple data systems to cooperate with each other, they must understand each other’s schemas.
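To make the problem concrete, here is a minimal sketch (mine, not the authors'): two independently designed record layouts for the same domain carry the same facts under different names and shapes, and reconciling them takes an explicit, hand-written mapping. All field names below are hypothetical.

```python
# Minimal sketch (not from the article): two independently designed schemas
# for the same "book order" domain, plus a hand-written field mapping.
# All names here are hypothetical, chosen only to illustrate semantic heterogeneity.

schema_a = {"cust_name": "Ada Lovelace", "isbn": "978-0131103627", "qty": 2}
schema_b = {"buyer": {"first": "Ada", "last": "Lovelace"},
            "item_code": "978-0131103627", "quantity": 2}

def a_to_b(rec_a):
    """Translate a schema-A record into schema-B form."""
    first, _, last = rec_a["cust_name"].partition(" ")
    return {"buyer": {"first": first, "last": last},
            "item_code": rec_a["isbn"],
            "quantity": rec_a["qty"]}

print(a_to_b(schema_a) == schema_b)  # True: same facts, different structure
```

Multiply this by hundreds of fields and dozens of sources, and the appeal of tools that help discover such mappings becomes obvious.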
Order from Chaos:
Will ontologies help you structure your semi-structured data?
There is probably little argument that the past decade has brought the “big bang” in the amount of online information available for processing by humans and machines. Two of the trends that it spurred (among many others) are: first, there has been a move to more flexible and fluid (semi-structured) models than the traditional centralized relational databases that stored most of the electronic data before; second, today there is simply too much information available to be processed by humans, and we really need help from machines.
XML <and Semi-Structured Data>:
XML provides a natural representation for hierarchical structures and repeating fields or structures.
Vocabulary designers can require XML data to be perfectly regular, or they can allow a little variation, or a lot. In the extreme case, an XML vocabulary can effectively say that there are no rules at all beyond those required of all well-formed XML. Because XML syntax records only what is present, not everything that might be present, sparse data does not make the XML representation awkward; XML storage systems are typically built to handle sparse data gracefully.
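As a rough illustration of that point (my own sketch, using Python's standard library rather than anything from the article; the element names are made up), a sparse record simply omits its optional elements, and the consumer handles their absence gracefully:

```python
# Sketch of the point above: optional fields are simply absent, so sparse
# records stay compact rather than carrying explicit nulls or empty columns.
import xml.etree.ElementTree as ET

records = """
<people>
  <person><name>Ada</name><email>ada@example.org</email><phone>555-0100</phone></person>
  <person><name>Alan</name></person>  <!-- sparse: only a name is recorded -->
</people>
"""

for person in ET.fromstring(records):
    name = person.findtext("name")
    phone = person.findtext("phone")   # None when the element is absent
    print(name, phone or "(no phone on record)")
```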
Learning from the Web:
The Web has taught us many lessons about distributed computing, but some of the most important ones have yet to fully take hold.
In the past decade we have seen a revolution in computing that transcends anything seen to date in terms of scope and reach, but also in terms of how we think about what makes up “good” and “bad” computing.
Managing Semi-Structured Data:
I vividly remember the fascination with the relational database that I felt during my first college class.
In that class I learned how to build a schema for my information, and I learned that to obtain an accurate schema there must be a priori knowledge of the structure and properties of the information to be modeled. I also learned the ER (entity-relationship) model as a basic tool for all further data modeling, as well as the need for an a priori agreement on both the general structure of the information and the vocabularies used by all communities producing, processing, or consuming this information.
Software and the Concurrency Revolution:
Leveraging the full power of multicore processors demands new tools and new thinking from the software industry.
Concurrency has long been touted as the "next big thing" and "the way of the future," but for the past 30 years, mainstream software development has been able to ignore it. Our parallel future has finally arrived: new machines will be parallel machines, and this will require major changes in the way we develop software. The introductory article in this issue describes the hardware imperatives behind this shift in computer architecture from uniprocessors to multicore processors, also known as CMPs.
The Price of Performance:
An Economic Case for Chip Multiprocessing
In the late 1990s, our research group at DEC was one of a growing number of teams advocating the CMP (chip multiprocessor) as an alternative to highly complex single-threaded CPUs. We were designing the Piranha system,1 which was a radical point in the CMP design space in that we used very simple cores (similar to the early RISC designs of the late ’80s) to provide a higher level of thread-level parallelism. Our main goal was to achieve the best commercial workload performance for a given silicon budget. Today, in developing Google’s computing infrastructure, our focus is broader than performance alone. The merits of a particular architecture are measured by answering the following question: Are you able to afford the computational capacity you need?
Extreme Software Scaling:
Chip multiprocessors have introduced a new dimension in scaling for application developers, operating system designers, and deployment specialists.
The advent of SMP (symmetric multiprocessing) added a new degree of scalability to computer systems. Rather than deriving additional performance from an incrementally faster microprocessor, an SMP system leverages multiple processors to obtain large gains in total system performance. Parallelism in software allows multiple jobs to execute concurrently on the system, increasing system throughput accordingly. Given sufficient software parallelism, these systems have proved to scale to several hundred processors.
The Future of Microprocessors:
Chip multiprocessors’ promise of huge performance gains is now a reality.
The performance of microprocessors that power modern computers has continued to increase exponentially over the years for two main reasons. First, the transistors that are the heart of the circuits in all processors and memory chips have simply become faster over time on a course described by Moore’s law, and this directly affects the performance of processors built with those transistors. Moreover, actual processor performance has increased faster than Moore’s law would predict, because processor designers have been able to harness the increasing numbers of transistors available on modern chips to extract more parallelism from software.
Enterprise Grid Computing:
Grid computing holds great promise for the enterprise data center, but many technical and operational hurdles remain.
I have to admit a great measure of sympathy for the IT populace at large, when it is confronted by the barrage of hype around grid technology, particularly within the enterprise. Individual vendors have attempted to plant their flags in the notionally virgin technological territory and proclaim it as their own, using terms such as grid, autonomic, self-healing, self-managing, adaptive, utility, and so forth. Analysts, well, analyze and try to make sense of it all, and in the process each independently creates his or her own map of this terra incognita, naming it policy-based computing, organic computing, and so on. Unfortunately, this serves only to further muddy the waters for most people.
Web Services and IT Management:
Web services aren’t just for application integration anymore.
Platform and programming language independence, coupled with industry momentum, has made Web services the technology of choice for most enterprise integration projects. Their close relationship with SOA (service-oriented architecture) has also helped them gain mindshare. Consider this definition of SOA: "An architectural style whose goal is to achieve loose coupling among interacting software agents. A service is a unit of work done by a service provider to achieve desired end results for a service consumer."
Enterprise Software as Service:
Online services are changing the nature of software.
While the practice of outsourcing business functions such as payroll has been around for decades, its realization as online software services has only recently become popular. In the online service model, a provider develops an application and operates the servers that host it. Customers access the application over the Internet using industry-standard browsers or Web services clients. A wide range of online applications, including e-mail, human resources, business analytics, CRM (customer relationship management), and ERP (enterprise resource planning), are available.
Describing the Elephant: The Different Faces of IT as Service:
Terms such as grid, on-demand, and service-oriented architecture are mired in confusion, but there is an overarching trend behind them all.
In a well-known fable, a group of blind men are asked to describe an elephant. Each encounters a different part of the animal and, not surprisingly, provides a different description. We see a similar degree of confusion in the IT industry today, as terms such as service-oriented architecture, grid, utility computing, on-demand, adaptive enterprise, data center automation, and virtualization are bandied about. As when listening to the blind men, it can be difficult to know what reality lies behind the words, whether and how the different pieces fit together, and what we should be doing about the animal(s) that are being described.
Programmers Are People, too:
Programming language and API designers can learn a lot from the field of human-factors design.
I would like to start out this article with an odd, yet surprisingly uncontroversial assertion, which is this: programmers are human. I wish to use this as a premise to explore how to improve the programmer’s lot. So, please, no matter your opinion on the subject, grant me this assumption for the sake of argument.
Attack Trends: 2004 and 2005:
Hacking has moved from a hobbyist pursuit with a goal of notoriety to a criminal pursuit with a goal of money.
Counterpane Internet Security Inc. monitors more than 450 networks in 35 countries, in every time zone. In 2004 we saw 523 billion network events, and our analysts investigated 648,000 security “tickets.” What follows is an overview of what’s happening on the Internet right now, and what we expect to happen in the coming months.
Security - Problem Solved?:
Solutions to many of our security problems already exist, so why are we still so vulnerable?
There are plenty of security problems that have solutions. Yet, our security problems don’t seem to be going away. What’s wrong here? Are consumers being offered snake oil and rejecting it? Are they not adopting solutions they should be adopting? Or is there something else at work entirely? We’ll look at a few places where the world could easily be a better place, but isn’t, and build some insight as to why.
The Answer is 42 of Course:
If we want our networks to be sufficiently difficult to penetrate, we’ve got to ask the right questions.
Why is security so hard? As a security consultant, I’m glad that people feel that way, because that perception pays my mortgage. But is it really so difficult to build systems that are impenetrable to the bad guys?
You Don’t Know Jack about Network Performance:
Bandwidth is only part of the problem.
Why does an application that works just fine over a LAN come to a grinding halt across the wide-area network? You may have experienced this firsthand when trying to open a document from a remote file share or remotely logging in over a VPN to an application running in headquarters. Why is it that an application that works fine in your office can become virtually useless over the WAN? If you think it’s simply because there’s not enough bandwidth in the WAN, then you don’t know jack about network performance.
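A back-of-envelope sketch makes the point; the numbers below are illustrative assumptions of mine, not measurements from the article. A chatty application that needs hundreds of round trips barely notices them on a LAN, but over a WAN the round-trip term swamps the bandwidth term:

```python
# Back-of-envelope sketch (numbers are illustrative assumptions, not measurements):
# a protocol that needs 200 round trips to open a document is fast on a LAN
# but painful over a WAN, almost regardless of how much bandwidth is available.
def transfer_time(size_bytes, round_trips, bandwidth_bps, rtt_s):
    return size_bytes * 8 / bandwidth_bps + round_trips * rtt_s

doc = 1_000_000           # 1 MB document
chatty_round_trips = 200  # e.g., many small metadata requests

lan = transfer_time(doc, chatty_round_trips, bandwidth_bps=1e9,  rtt_s=0.0005)
wan = transfer_time(doc, chatty_round_trips, bandwidth_bps=10e6, rtt_s=0.08)
print(f"LAN: {lan:.2f} s   WAN: {wan:.2f} s")   # the latency term dominates the WAN case
```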
Streams and Standards: Delivering Mobile Video:
The era of video served up to mobile phones has arrived and threatens to be the next “killer app” after wireless calling itself.
Don’t believe me? Follow along… Mobile phones are everywhere. Everybody has one. Think about the last time you were on an airplane and the flight was delayed on the ground. Immediately after the dreaded announcement, you heard everyone reach for their phones and start dialing.
Mobile Media: Making It a Reality:
Two prototype apps reveal the challenges in delivering mobile media services.
Many future mobile applications are predicated on the existence of rich, interactive media services. The promise and challenge of such services is to provide applications under the most hostile conditions - and at low cost to a user community that has high expectations. Context-aware services require information about who, where, when, and what a user is doing and must be delivered in a timely manner with minimum latency. This article reveals some of the current state-of-the-art "magic" and the research challenges.
Enterprise-Grade Wireless:
Wireless technology has come a long way, but is it robust enough for today’s enterprise?
We have been working in the wireless space in one form or another for more than 10 years and have participated in every phase of its maturation process. We saw wireless progress from a toy technology before the dot-com boom, to something truly promising during the boom, only to be left wanting after the bubble, when the technology was found to be not yet ready for prime time. Fortunately, it appears that we have finally reached the point where the technology and the enterprise’s expectations have converged.
Beyond Relational Databases:
There is more to data access than SQL.
The number and variety of computing devices in the environment are increasing rapidly. Real computers are no longer tethered to desktops or locked in server rooms. PDAs, highly mobile tablet and laptop devices, palmtop computers, and mobile telephony handsets now offer powerful platforms for the delivery of new applications and services. These devices are, however, only the tip of the iceberg. Hidden from sight are the many computing and network elements required to support the infrastructure that makes ubiquitous computing possible.
Databases of Discovery:
Open-ended database ecosystems promote new discoveries in biotech. Can they help your organization, too?
The National Center for Biotechnology Information is responsible for massive amounts of data. A partial list includes the largest public bibliographic database in biomedicine; the U.S. national DNA sequence database; an online free full-text research article database; assembly, annotation, and distribution of a reference set of genes, genomes, and chromosomes; online text search and retrieval systems; and specialized molecular biology data search engines. At this writing, NCBI receives about 50 million Web hits per day, at peak rates of about 1,900 hits per second, and about 400,000 BLAST searches per day from about 2.5 million users.
A Call to Arms:
Long anticipated, the arrival of radically restructured database architectures is now finally at hand.
We live in a time of extreme change, much of it precipitated by an avalanche of information that otherwise threatens to swallow us whole. Under the mounting onslaught, our traditional relational database constructs—always cumbersome at best—are now clearly at risk of collapsing altogether. In fact, rarely do you find a DBMS anymore that doesn’t make provisions for online analytic processing. Decision trees, Bayes nets, clustering, and time-series analysis have also become part of the standard package, with allowances for additional algorithms yet to come. Also, text, temporal, and spatial data access methods have been added—along with associated probabilistic logic, since a growing number of applications call for approximated results.
UML Fever: Diagnosis and Recovery:
Acknowledgment is only the first step toward recovery from this potentially devastating affliction.
The Institute of Infectious Diseases has recently published research confirming that the many and varied strains of UML Fever1 continue to spread worldwide, indiscriminately infecting software analysts, engineers, and managers alike. One of the fever’s most serious side effects has been observed to be a significant increase in both the cost and duration of developing software products. This increase is largely attributable to a decrease in productivity resulting from fever-stricken individuals investing time and effort in activities that are of little or no value to producing deliverable products. For example, afflictees of Open Loop Fever continue to create UML (Unified Modeling Language) diagrams for unknown stakeholders.
On Plug-ins and Extensible Architectures:
Extensible application architectures such as Eclipse offer many advantages, but one must be careful to avoid “plug-in hell.”
In a world of increasingly complex computing requirements, we as software developers are continually searching for that ultimate, universal architecture that allows us to productively develop high-quality applications. This quest has led to the adoption of many new abstractions and tools. Some of the most promising recent developments are the new pure plug-in architectures. What began as a callback mechanism to extend an application has become the very foundation of applications themselves. Plug-ins are no longer just add-ons to applications; today’s applications are made entirely of plug-ins.
Patching the Enterprise:
Organizations of all sizes are spending considerable efforts on getting patch management right - their businesses depend on it.
Software patch management has grown to be a business-critical issue—from both a risk and a financial management perspective. According to a recent Aberdeen Group study, corporations spent more than $2 billion in 2002 on patch management for operating systems.1 Gartner research further notes the cost of operating a well-managed PC was approximately $2,000 less annually than that of an unmanaged PC.2 You might think that with critical mass and more sophisticated tools, the management cost per endpoint in large organizations would be lower, though in reality this may not be the case.
Understanding Software Patching:
Developing and deploying patches is an increasingly important part of the software development process.
Software patching is an increasingly important aspect of today’s computing environment as the volume, complexity, and number of configurations under which a piece of software runs have grown considerably. Software architects and developers do everything they can to build secure, bug-free software products. To ensure quality, development teams leverage all the tools and techniques at their disposal. For example, software architects incorporate security threat models into their designs, and QA engineers develop automated test suites that include sophisticated code-defect analysis tools.
A Passage to India:
Pitfalls that the outsourcing vendor forgot to mention
Most American IT employees take a dim view of offshore outsourcing. It’s considered unpatriotic and it drains valuable intellectual capital and jobs from the United States to destinations such as India or China. Online discussion forums on sites such as isyourjobgoingoffshore.com are headlined with titles such as “How will you cope?” and “Is your career in danger?” A cover story in BusinessWeek magazine a couple of years ago summed up the angst most people suffer when faced with offshoring: “Is your job next?”
Orchestrating an Automated Test Lab:
Composing a score can help us manage the complexity of testing distributed apps.
Networking and the Internet are encouraging increasing levels of interaction and collaboration between people and their software. Whether users are playing games or composing legal documents, their applications need to manage the complex interleaving of actions from multiple machines over potentially unreliable connections. As an example, Silicon Chalk is a distributed application designed to enhance the in-class experience of instructors and students. Its distributed nature requires that we test with multiple machines. Manual testing is too tedious, expensive, and inconsistent to be effective. While automating our testing, however, we have found it very labor intensive to maintain a set of scripts describing each machine’s portion of a given test.
Sifting Through the Software Sandbox: SCM Meets QA:
Source control—it’s not just for tracking changes anymore.
Thanks to modern SCM (software configuration management) systems, when developers work on a codeline they leave behind a trail of clues that can reveal what parts of the code have been modified, when, how, and by whom. From the perspective of QA (quality assurance) and test engineers, is this all just “data,” or is there useful information that can improve the test coverage and overall quality of a product?
Too Darned Big to Test:
Testing large systems is a daunting task, but there are steps we can take to ease the pain.
The increasing size and complexity of software, coupled with concurrency and distributed systems, has made apparent the ineffectiveness of using only handcrafted tests. The misuse of code coverage and avoidance of random testing has exacerbated the problem. We must start again, beginning with good design (including dependency analysis), good static checking (including model property checking), and good unit testing (including good input selection). Code coverage can help select and prioritize tests to make you more efficient, as can the all-pairs technique for controlling the number of configurations.
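As a rough sketch of the all-pairs idea mentioned above (my own toy example over a hypothetical configuration space; production pairwise tools use much smarter algorithms), even a simple greedy cover shows how few configurations are needed to exercise every pair of parameter values:

```python
# Minimal sketch of the all-pairs (pairwise) technique: a greedy cover over a
# small, hypothetical configuration space. Illustrative only.
from itertools import combinations, product

params = {                      # hypothetical test parameters
    "os":      ["linux", "windows", "macos"],
    "db":      ["postgres", "mysql"],
    "browser": ["firefox", "chrome", "safari"],
}

names = list(params)
all_configs = [dict(zip(names, values)) for values in product(*params.values())]

def pairs(config):
    """Every (parameter, value) pair combination exercised by one configuration."""
    items = sorted(config.items())
    return set(combinations(items, 2))

uncovered = set().union(*(pairs(c) for c in all_configs))
chosen = []
while uncovered:
    best = max(all_configs, key=lambda c: len(pairs(c) & uncovered))
    chosen.append(best)
    uncovered -= pairs(best)

print(f"{len(chosen)} configurations cover all pairs "
      f"(exhaustive testing would need {len(all_configs)})")
```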
Quality Assurance: Much More than Testing:
Good QA is not only about technology, but also methods and approaches.
Quality assurance isn’t just testing, or analysis, or wishful thinking. Although it can be boring, difficult, and tedious, QA is nonetheless essential. Ensuring that a system will work when delivered requires much planning and discipline. Convincing others that the system will function properly requires even more careful and thoughtful effort. QA is performed through all stages of the project, not just slapped on at the end. It is a way of life.
Self-Healing in Modern Operating Systems:
A few early steps show there’s a long (and bumpy) road ahead.
Driving the stretch of Route 101 that connects San Francisco to Menlo Park each day, billboard faces smilingly reassure me that all is well in computerdom in 2004. Networks and servers, they tell me, can self-defend, self-diagnose, self-heal, and even have enough computing power left over from all this introspection to perform their owner-assigned tasks.
How Not to Write Fortran in Any Language:
There are characteristics of good coding that transcend all programming languages.
There’s no obfuscated Perl contest because it’s pointless.
Extensible Programming for the 21st Century:
Is an open, more flexible programming environment just around the corner?
In his keynote address at OOPSLA ’98, Sun Microsystems Fellow Guy L. Steele Jr. said, “From now on, a main goal in designing a language should be to plan for growth.” Functions, user-defined types, operator overloading, and generics (such as C++ templates) are no longer enough: tomorrow’s languages must allow programmers to add entirely new kinds of information to programs, and control how it is processed. This article argues that next-generation programming systems can accomplish this by combining three specific technologies.
Fuzzy Boundaries: Objects, Components, and Web Services:
It’s easy to transform objects into components and Web services, but how do we know which is right for the job?
If you are an object-oriented programmer, you will understand the code snippet, even if you are not familiar with the language (C#, not that it matters). You will not be surprised to learn that this program will print out the following line to the console: woof.
Languages, Levels, Libraries, and Longevity:
New programming languages are born every day. Why do some succeed and some fail?
In 50 years, we’ve already seen numerous programming systems come and (mostly) go, although some have remained a long time and will probably continue to do so - for decades? Centuries? Millennia? The questions about language designs, levels of abstraction, libraries, and resulting longevity are numerous. Why do new languages arise? Why is it sometimes easier to write new software than to adapt old software that works? How many different levels of languages make sense? Why do some languages last in the face of “better” ones?
Lack of Priority Queuing Considered Harmful:
We’re in sore need of critical Internet infrastructure protection.
Most modern routers consist of several line cards that perform packet lookup and forwarding, all controlled by a control plane that acts as the brain of the router, performing essential tasks such as management functions, error reporting, control functions including route calculations, and adjacency maintenance. This control plane has many names; in this article it is the route processor, or RP. The route processor calculates the forwarding table and downloads it to the line cards using a control-plane bus. The line cards perform the actual packet lookup and forwarding.
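The remedy implied by the title can be sketched with a toy model (mine, not the authors'; real routers implement this in hardware and firmware): if traffic destined for the route processor is drained from a strict-priority queue, control-plane packets are served before a flood of ordinary traffic can starve them.

```python
# Toy model of strict-priority queuing for control-plane protection
# (illustrative only, not actual router code).
import heapq
from itertools import count

CONTROL, DATA = 0, 1          # lower number = higher priority
seq = count()                 # tie-breaker keeps FIFO order within a class
queue = []

def enqueue(kind, packet):
    heapq.heappush(queue, (kind, next(seq), packet))

for i in range(3):
    enqueue(DATA, f"flood packet {i}")
enqueue(CONTROL, "BGP keepalive")      # arrives last, served first

while queue:
    kind, _, packet = heapq.heappop(queue)
    print("RP handles:", packet)
```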
Outsourcing: Devising a Game Plan:
What types of projects make good candidates for outsourcing?
Your CIO just summoned you to duty by handing off the decision-making power about whether to outsource next year’s big development project to rewrite the internal billing system. That’s quite a daunting task! How can you possibly begin to decide if outsourcing is the right option for your company? There are a few strategies that you can follow to help you avoid the pitfalls of outsourcing and make informed decisions. Outsourcing is not exclusively a technical issue, but it is a decision that architects or development managers are often best qualified to make because they are in the best position to know what technologies make sense to keep in-house.
Error Messages:
What’s the Problem?
Computer users spend a lot of time chasing down errors - following the trail of clues that starts with an error message and that sometimes leads to a solution and sometimes to frustration. Problems with error messages are particularly acute for system administrators (sysadmins) - those who configure, install, manage, and maintain the computational infrastructure of the modern world - as they spend a lot of effort to keep computers running amid errors and failures.
Automating Software Failure Reporting:
We can only fix those bugs we know about.
There are many ways to measure quality before and after software is released. For commercial and internal-use-only products, the most important measurement is the user’s perception of product quality. Unfortunately, perception is difficult to measure, so companies attempt to quantify it through customer satisfaction surveys and failure/behavioral data collected from their customer bases. This article focuses on the problems of capturing failure data from customer sites.
Oops! Coping with Human Error in IT Systems:
Errors Happen. How to Deal.
Human operator error is one of the most insidious sources of failure and data loss in today’s IT environments. In early 2001, Microsoft suffered a nearly 24-hour outage in its Web properties as a result of a human error made while configuring a name resolution system. Later that year, an hour of trading on the Nasdaq stock exchange was disrupted because of a technician’s mistake while testing a development system. More recently, human error has been blamed for outages in instant messaging networks, for security and privacy breaches, and for banking system failures.
Trials and Tribulations of Debugging Concurrency:
You can run, but you can’t hide.
We now sit firmly in the 21st century, where the grand challenge to the modern-day programmer is neither memory leaks nor type issues (both of those problems are now effectively solved), but rather issues of concurrency. How does one write increasingly complex programs where concurrency is a first-class concern? Or, even more treacherous, how does one debug such a beast? These questions bring fear into the hearts of even the best programmers.
Thread Scheduling in FreeBSD 5.2:
To help get a better handle on thread scheduling, we take a look at how FreeBSD 5.2 handles it.
A busy system makes thousands of scheduling decisions per second, so the speed with which scheduling decisions are made is critical to the performance of the system as a whole. This article - excerpted from the forthcoming book, “The Design and Implementation of the FreeBSD Operating System“ - uses the example of the open source FreeBSD system to help us understand thread scheduling. The original FreeBSD scheduler was designed in the 1980s for large uniprocessor systems. Although it continues to work well in that environment today, the new ULE scheduler was designed specifically to optimize multiprocessor and multithread environments. This article first studies the original FreeBSD scheduler, then describes the new ULE scheduler.
Integrating RFID:
Data management and inventory control are about to get a whole lot more interesting.
RFID (radio frequency identification) has received a great deal of attention in the commercial world over the past couple of years. The excitement stems from a confluence of events. First, through the efforts of the former Auto-ID Center and its sponsor companies, the prospects of low-cost RFID tags and a networked supply chain have come within reach of a number of companies. Second, several commercial companies and government bodies, such as Wal-Mart and Target in the United States, Tesco in Europe, and the U.S. Department of Defense, have announced RFID initiatives in response to technology improvements.
The Magic of RFID:
Just how do those little things work anyway?
Many modern technologies give the impression they work by magic, particularly when they operate automatically and their mechanisms are invisible. A technology called RFID (radio frequency identification), which is relatively new to the mass market, has exactly this characteristic and for many people seems a lot like magic. RFID is an electronic tagging technology that allows an object, place, or person to be automatically identified at a distance without a direct line-of-sight, using an electromagnetic challenge/response exchange.
A Time and a Place for Standards:
History shows how abuses of the standards process have impeded progress.
Over the next decade, we will encounter at least three major opportunities where success will hinge largely on our ability to define appropriate standards. That’s because intelligently crafted standards that surface at just the right time can do much to nurture nascent industries and encourage product development simply by creating a trusted and reliable basis for interoperability. From where I stand, the three specific areas I see as particularly promising are: (1) all telecommunications and computing capabilities that work together to facilitate collaborative work; (2) hybrid computing/home entertainment products providing for the online distribution of audio and/or video content; and (3) wireless sensor and network platforms (the sort that some hope the 802.15.4 and ZigBee Alliance standards will ultimately enable).
VoIP Security: Not an Afterthought:
DDOS takes on a whole new meaning.
Voice over IP (VoIP) promises to up-end a century-old model of voice telephony by breaking the traditional monolithic service model of the public switched telephone network (PSTN) and changing the point of control and provision from the central office switch to the end user’s device.
VoIP: What is it Good for?:
If you think VoIP is just an IP version of telecom-as-usual, think again. A host of applications are changing the phone call as we know it.
VoIP (voice over IP) technology is a rapidly expanding field. More and more VoIP components are being developed, while existing VoIP technology is being deployed at a rapid and still increasing pace. This growth is fueled by two goals: decreasing costs and increasing revenues.
Not Your Father’s PBX?:
Integrating VoIP into the enterprise could mean the end of telecom business-as-usual.
Perhaps no piece of office equipment is more taken for granted than the common business telephone. The technology behind this basic communication device, however, is in the midst of a major transformation. Businesses are now converging their voice and data networks in order to simplify their network operations and take advantage of the new functional benefits and capabilities that a converged network delivers - from greater productivity and cost savings to enhanced mobility.
You Don’t Know Jack About VoIP:
The Communications they are a-changin’.
Telecommunications worldwide has experienced a significant revolution over recent years. The long-held promise of network convergence is occurring at an increasing pace. This convergence of data, voice, and video using IP-based networks is delivering advanced services at lower cost across the spectrum, including residential users, business customers of varying sizes, and service providers.
Leveraging Application Frameworks:
Why frameworks are important and how to apply them effectively
In today’s competitive, fast-paced computing industry, successful software must increasingly be: (1) extensible to support successions of quick updates and additions to address new requirements and take advantage of emerging markets; (2) flexible to support a growing range of multimedia data types, traffic flows, and end-to-end QoS (quality of service) requirements; (3) portable to reduce the effort required to support applications on heterogeneous operating-system platforms and compilers; (4) reliable to ensure that applications are robust and tolerant to faults; (5) scalable to enable applications to handle larger numbers of clients simultaneously; and (6) affordable to ensure that the total ownership costs of software acquisition and evolution are not prohibitively high.
Security is Harder than You Think:
It’s not just about the buffer overflow.
Many developers see buffer overflows as the biggest security threat to software and believe that there is a simple two-step process to secure software: switch from C or C++ to Java, then start using SSL (Secure Sockets Layer) to protect data communications. It turns out that this naïve tactic isn’t sufficient. In this article, we explore why software security is harder than people expect, focusing on the example of SSL.
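One concrete way the naïve tactic fails, offered here as my own illustration rather than the authors' example: TLS encrypts the channel, but if certificate verification is switched off for convenience, the client has no idea who is actually on the other end. The sketch below uses Python's standard ssl module.

```python
# Illustration (not from the article): "using SSL" is not the same as using it
# safely. The "convenient" settings below silence certificate errors and leave
# the client open to man-in-the-middle attacks even though traffic is encrypted.
import socket, ssl

insecure = ssl.create_default_context()
insecure.check_hostname = False          # don't check the name on the certificate
insecure.verify_mode = ssl.CERT_NONE     # don't check the certificate at all

secure = ssl.create_default_context()    # default: verify the chain and hostname

def fetch_header(host, context):
    with socket.create_connection((host, 443), timeout=5) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            tls.sendall(b"HEAD / HTTP/1.0\r\nHost: " + host.encode() + b"\r\n\r\n")
            return tls.recv(200)

# fetch_header("example.org", insecure)  # "works" even against a spoofed peer
# fetch_header("example.org", secure)    # fails loudly unless the certificate checks out
```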
Simulators: Virtual Machines of the Past (and Future):
Has the time come to kiss that old iron goodbye?
Simulators are a form of “virtual machine” intended to address a simple problem: the absence of real hardware. Simulators for past systems address the loss of real hardware and preserve the usability of software after real hardware has vanished. Simulators for future systems address the variability of future hardware designs and facilitate the development of software before real hardware exists.
Building Systems to Be Shared, Securely:
Want to securely partition VMs? One option is to put ’em in Jail.
The history of computing has been characterized by continuous transformation resulting from the dramatic increases in performance and drops in price described by Moore’s law. Computing power has migrated from centralized mainframes/servers to distributed systems and the commodity desktop. Despite these changes, system sharing remains an important tool for computing. From the multitasking, file-sharing, and virtual machines of the desktop environment to the large-scale sharing of server-class ISP hardware in collocation centers, safely sharing hardware between mutually untrusting parties requires addressing critical concerns of accidental and malicious damage.
The Reincarnation of Virtual Machines:
Virtualization makes a comeback.
The term virtual machine initially described a 1960s operating system concept: a software abstraction with the looks of a computer system’s hardware (real machine). Forty years later, the term encompasses a large range of abstractions - for example, Java virtual machines that don’t match an existing real machine. Despite the variations, in all definitions the virtual machine is a target for a programmer or compilation system. In other words, software is written to run on the virtual machine.
The Hitchhiker’s Guide to Biomorphic Software:
The natural world may be the inspiration we need for solving our computer problems.
While it is certainly true that "the map is not the territory," most visitors to a foreign country do prefer to take with them at least a guidebook to help locate themselves as they begin their explorations. That is the intent of this article. Although there will not be enough time to visit all the major tourist sites, with a little effort and using the information in the article as signposts, the intrepid explorer can easily find numerous other, interesting paths to explore.
The Insider, Naivety, and Hostility: Security Perfect Storm?:
Keeping nasties out is only half the battle.
Every year corporations and government installations spend millions of dollars fortifying their network infrastructures. Firewalls, intrusion detection systems, and antivirus products stand guard at network boundaries, and individuals monitor countless logs and sensors for even the subtlest hints of network penetration. Vendors and IT managers have focused on keeping the wily hacker outside the network perimeter, but very few technological measures exist to guard against insiders - those entities that operate inside the fortified network boundary. The 2002 CSI/FBI survey estimates that 70 percent of successful attacks come from the inside. Several other estimates place those numbers even higher.
Network Forensics:
Good detective work means paying attention before, during, and after the attack.
The dictionary defines forensics as “the use of science and technology to investigate and establish facts in criminal or civil courts of law.” I am more interested, however, in the usage common in the computer world: using evidence remaining after an attack on a computer to determine how the attack was carried out and what the attacker did. The standard approach to forensics is to see what can be retrieved after an attack has been made, but this leaves a lot to be desired. The first and most obvious problem is that successful attackers often go to great lengths to ensure that they cover their trails.
Security: The Root of the Problem:
Why is it we can’t seem to produce secure, high-quality code?
Security bug? My programming language made me do it! It doesn’t seem that a day goes by without someone announcing a critical flaw in some crucial piece of software or other. Is software that bad? Are programmers so inept? What the heck is going on, and why is the problem getting worse instead of better?
Blaster Revisited:
A second look at the cost of Blaster sheds new light on today’s blended threats.
What lessons can we learn from the carnage the Blaster worm created? The following tale is based upon actual circumstances from corporate enterprises that were faced with confronting and eradicating the Blaster worm, which hit in August 2003. The story provides views from many perspectives, illustrating the complexity and sophistication needed to combat new blended threats.
From IR to Search, and Beyond:
Searching has come a long way since the 60s, but have we only just begun?
It’s been nearly 60 years since Vannevar Bush’s seminal article, ’As We May Think,’ portrayed the image of a scholar aided by a machine, “a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility.”
TCP Offload to the Rescue:
Getting a toehold on TCP offload engines—and why we need them
In recent years, TCP/IP offload engines, known as TOEs, have attracted a good deal of industry attention and a sizable share of venture capital dollars. A TOE is a specialized network device that implements a significant portion of the TCP/IP protocol in hardware, thereby offloading TCP/IP processing from software running on a general-purpose CPU. This article examines the reasons behind the interest in TOEs and looks at challenges involved in their implementation and deployment.
Desktop Linux: Where Art Thou?:
Catching up, meeting new challenges, moving ahead
Linux on the desktop has come a long way - and it’s been a roller-coaster ride. At the height of the dot-com boom, around the time of Red Hat’s initial public offering, people expected Linux to take off on the desktop in short order. A few years later, after the stock market crash and the failure of a couple of high-profile Linux companies, pundits were quick to proclaim the stillborn death of Linux on the desktop.
There’s No Such Thing as a Free (Software) Lunch:
What every developer should know about open source licensing
The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software to make sure the software is free for all its users. So begins the GNU General Public License, or GPL, which has become the most widely used of open source software licenses. Freedom is the watchword; it’s no coincidence that the organization that wrote the GPL is called the Free Software Foundation and that open source developers everywhere proclaim, “Information wants to be free.”
Is Open Source Right for You?:
A fictional case study of open source in a commercial software shop
The media often present open source software as a direct competitor to commercial software. This depiction, usually pitting David (Linux) against Goliath (Microsoft), makes for fun reading in the weekend paper. However, it mostly misses the point of what open source means to a development organization. In this article, I use the experiences of GizmoSoft (a fictitious software company) to present some perspectives on the impact of open source software usage in a software development shop.
Open Source to the Core:
Using open source in real-world software products: The good, the bad and the ugly
The open source development model is not exactly new. Individual engineers have been using open source as a collaborative development methodology for decades. Now that it has come to the attention of upper and middle management, however, it’s finally being openly acknowledged as a commercial engineering force-multiplier and important option for avoiding significant software development costs.
Instant Messaging or Instant Headache?:
IM has found a home within the enterprise, but it’s far from secure.
It’s a reality. You have IM (instant messaging) clients in your environment. You have already recognized that it is eating up more and more of your network bandwidth and with Microsoft building IM capability into its XP operating system and applications, you know this will only get worse. Management is also voicing concerns over the lost user productivity caused by personal conversations over this medium. You have tried blocking these conduits for conversation, but it is a constant battle.
Gaming Graphics: The Road to Revolution:
From laggard to leader, game graphics are taking us in new directions.
It has been a long journey from the days of multicolored sprites on tiled block backgrounds to the immersive 3D environments of modern games. What used to be a job for a single game creator is now a multifaceted production involving staff from every creative discipline. The next generation of console and home computer hardware is going to bring a revolutionary leap in available computing power; a teraflop (trillion floating-point operations per second) or more will be on tap from commodity hardware.
Building Nutch: Open Source Search:
A case study in writing an open source search engine
Search engines are as critical to Internet use as any other part of the network infrastructure, but they differ from other components in two important ways. First, their internal workings are secret, unlike, say, the workings of the DNS (domain name system). Second, they hold political and cultural power, as users increasingly rely on them to navigate online content.
Why Writing Your Own Search Engine Is Hard:
Big or small, proprietary or open source, Web or intranet, it’s a tough job.
There must be 4,000 programmers typing away in their basements trying to build the next “world’s most scalable” search engine. It has been done only a few times. It has never been done by a big group; always one to four people did the core work, and the big team came on to build the elaborations and the production infrastructure. Why is it so hard? We are going to delve a bit into the various issues to consider when writing a search engine. This article is aimed at those individuals or small groups that are considering this endeavor for their Web site or intranet.
Enterprise Search: Tough Stuff:
Why is it that searching an intranet is so much harder than searching the Web?
The last decade has witnessed the growth of information retrieval from a boutique discipline in information and library science to an everyday experience for billions of people around the world. This revolution has been driven in large measure by the Internet, with vendors focused on search and navigation of Web resources and Web content management. Simultaneously, enterprises have invested in networking all of their information together to the point where it is increasingly possible for employees to have a single window into the enterprise.
Searching vs. Finding:
Why systems need knowledge to find what you really want
Finding information and organizing it so that it can be found are two key aspects of any company’s knowledge management strategy. Nearly everyone is familiar with the experience of searching with a Web search engine and using a search interface to search a particular Web site once you get there. (You may have even noticed that the latter often doesn’t work as well as the former.) After you have a list of hits, you typically spend a significant amount of time following links, waiting for pages to download, reading through a page to see if it has what you want, deciding that it doesn’t, backing up to try another link, deciding to try another way to phrase your request, et cetera.
BPM: The Promise and the Challenge:
It’s all about closing the loop from conception to execution and back.
Over the last decade, businesses and governments have been giving increasing attention to business processes - to their description, automation, and management. This interest grows out of the need to streamline business operations, consolidate organizations, and save costs, reflecting the fact that the process is the basic unit of business value within an organization.
Death by UML Fever:
Self-diagnosis and early treatment are crucial in the fight against UML Fever.
A potentially deadly illness, clinically referred to as UML (Unified Modeling Language) fever, is plaguing many software-engineering efforts today. This fever has many different strains that vary in levels of lethality and contagion. A number of these strains are symptomatically related, however. Rigorous laboratory analysis has revealed that each is unique in origin and makeup. A particularly insidious characteristic of UML fever, common to most of its assorted strains, is the difficulty individuals and organizations have in self-diagnosing the affliction. A consequence is that many cases of the fever go untreated and often evolve into more complex and lethal strains.
Digitally Assisted Analog Integrated Circuits:
Closing the gap between analog and digital
In past decades, “Moore’s law”1 has governed the revolution in microelectronics. Through continuous advancements in device and fabrication technology, the industry has maintained exponential progress rates in transistor miniaturization and integration density. As a result, microchips have become cheaper, faster, more complex, and more power efficient.
Stream Processors: Programmability and Efficiency:
Will this new kid on the block muscle out ASIC and DSP?
Many signal processing applications require both efficiency and programmability. Baseband signal processing in 3G cellular base stations, for example, requires hundreds of GOPS (giga, or billions, of operations per second) with a power budget of a few watts, an efficiency of about 100 GOPS/W (GOPS per watt), or 10 pJ/op (picoJoules per operation). At the same time programmability is needed to follow evolving standards, to support multiple air interfaces, and to dynamically provision processing resources over different air interfaces. Digital television, surveillance video processing, automated optical inspection, and mobile cameras, camcorders, and 3G cellular handsets have similar needs.
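As a quick unit check of the figures quoted above (a trivial calculation added for clarity): 100 GOPS per watt is indeed the same thing as 10 pJ per operation.

```python
# Unit check: 100 GOPS/W  <=>  10 pJ per operation.
gops_per_watt = 100
ops_per_joule = gops_per_watt * 1e9      # 1 W = 1 J/s, so this is ops per joule
joules_per_op = 1 / ops_per_joule
print(joules_per_op * 1e12, "pJ/op")     # -> 10.0 pJ/op
```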
DSPs: Back to the Future:
To understand where DSPs are headed, we must look at where they’ve come from.
From the dawn of the DSP (digital signal processor), an old quote still echoes: "Oh, no! We’ll have to use state-of-the-art 5µm NMOS!" The speaker’s name is lost in the fog of history, as are many things from the ancient days of 5µm chip design. This quote refers to the first Bell Labs DSP whose mask set in fact underwent a 10 percent linear lithographic shrink to 4.5µm NMOS (N-channel metal oxide semiconductor) channel length and taped out in late 1979 with an aggressive full-custom circuit design.
On Mapping Algorithms to DSP Architectures:
Knowledge of both the algorithm and target architecture is crucial.
Our complex world is characterized by representation, transmission, and storage of information - and information is mostly processed in digital form. With the advent of DSPs (digital signal processors), engineers are able to implement complex algorithms with relative ease. Today we find DSPs all around us - in cars, digital cameras, MP3 and DVD players, modems, and so forth. Their widespread use and deployment in complex systems has triggered a revolution in DSP architectures, which in turn has enabled engineers to implement algorithms of ever-increasing complexity.
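As a rough illustration (not drawn from the article itself) of the kind of kernel that maps naturally onto a DSP, consider a direct-form FIR filter: its inner loop is a chain of multiply-accumulate operations, exactly the pattern that single-cycle MAC units and zero-overhead loop hardware are built to exploit. The function below is a minimal sketch; the names are illustrative.

#include <stddef.h>

/* Direct-form FIR filter: y[n] = sum over k of coeff[k] * x[n-k].
 * The multiply-accumulate in the inner loop is the operation that
 * DSP architectures accelerate with dedicated MAC units. */
void fir_filter(const float *x, float *y, size_t len,
                const float *coeff, size_t taps)
{
    for (size_t n = 0; n < len; n++) {
        float acc = 0.0f;
        for (size_t k = 0; k < taps && k <= n; k++)
            acc += coeff[k] * x[n - k];   /* multiply-accumulate */
        y[n] = acc;
    }
}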
Of Processors and Processing:
There’s more than one way to DSP
Digital signal processing is a stealth technology. It is the core enabling technology in everything from your cellphone to the Mars Rover. It goes much further than just enabling a one-time breakthrough product. It provides ever-increasing capability; compare the performance gains made by dial-up modems with the recent performance gains of DSL and cable modems. Remarkably, digital signal processing has become ubiquitous with little fanfare, and most of its users are not even aware of what it is.
People in Our Software:
A person-centric approach could make software come alive, but at what cost?
People are not well represented in today’s software. With the exception of IM (instant messaging) clients, today’s applications offer few clues that people are actually living beings. Static strings depict things associated with people like e-mail addresses, phone numbers, and home-page URLs. Applications also tend to show the same information about a person, no matter who is viewing it.
Sensible Authentication:
According to the author of Beyond Fear, it’s not enough to know who you are; you’ve got to prove it.
The problem with securing assets and their functionality is that, by definition, you don’t want to protect them from everybody. It makes no sense to protect assets from their owners, or from other authorized individuals (including the trusted personnel who maintain the security system). In effect, then, all security systems need to allow people in, even as they keep people out. Designing a security system that accurately identifies, authenticates, and authorizes trusted individuals is highly complex and filled with nuance, but critical to security.
The Scalability Problem:
The coexistence of high-end systems and value PCs can make life hell for game developers.
Back in the mid-1990s, I worked for a company that developed multimedia kiosk demos. Our biggest client was Intel, and we often created demos that appeared in new PCs on the end-caps of major computer retailers such as CompUSA. At that time, performance was in demand for all application classes from business to consumer. We created demos that showed, for example, how much faster a spreadsheet would recalculate (you had to do that manually back then) on a new processor as compared with the previous year’s processor. The differences were immediately noticeable to even a casual observer - and it mattered.
AI in Computer Games:
Smarter games are making for a better user experience. What does the future hold?
If you’ve been following the game development scene, you’ve probably heard many remarks such as: "The main role of graphics in computer games will soon be over; artificial intelligence is the next big thing!" Although you should hardly buy into such statements, there is some truth in them. The quality of AI (artificial intelligence) is a high-ranking feature for game fans in making their purchase decisions and an area with incredible potential to increase players’ immersion and fun.
Fun and Games: Multi-Language Development:
Game development can teach us much about the common practice of combining multiple languages in a single project.
Computer games (or "electronic games" if you encompass those games played on console-class hardware) comprise one of the fastest-growing application markets in the world. Within the development community that creates these entertaining marvels, multi-language development is becoming more commonplace as games become more and more complex. Today, asking a development team to construct a database-enabled Web site with the requirement that it be written entirely in C++ would earn scornful looks and rolled eyes, but not long ago the idea that multiple languages were needed to accomplish a given task was scoffed at.
Massively Multiplayer Middleware:
Building scalable middleware for ultra-massive online games teaches a lesson we all can use: Big project, simple design.
Wish is a multiplayer, online, fantasy role-playing game being developed by Mutable Realms. It differs from similar online games in that it allows tens of thousands of players to participate in a single game world. Allowing such a large number of players requires distributing the processing load over a number of machines and raises the problem of choosing an appropriate distribution technology.
Game Development: Harder Than You Think:
Ten or twenty years ago it was all fun and games. Now it’s blood, sweat, and code.
The hardest part of making a game has always been the engineering. In times past, game engineering was mainly about low-level optimization—writing code that would run quickly on the target computer, leveraging clever little tricks whenever possible. But in the past ten years, games have ballooned in complexity. Now the primary technical challenge is simply getting the code to work to produce an end result that bears some semblance to the desired functionality. To the extent that we optimize, we are usually concerned with high-level algorithmic choices.
Black Box Debugging:
It’s all about what takes place at the boundary of an application.
Modern software development practices build applications as a collection of collaborating components. Unlike older practices that linked compiled components into a single monolithic application, modern executables are made up of any number of executable components that exist as separate binary files. This design means that as an application component needs resources from another component, calls are made to transfer control or data from one component to another. Thus, we can observe externally visible application behaviors by watching the activity that occurs across the boundaries of the application’s constituent components.
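One illustrative way to watch that boundary activity on a Unix-like system - offered here as a sketch, not as the article's prescribed technique - is library-call interposition with LD_PRELOAD: a small shared object logs each call as it crosses into the C library and then forwards it to the real implementation.

/* boundary_spy.c - log calls to open() as they cross the libc boundary.
 * Build:  gcc -shared -fPIC -o boundary_spy.so boundary_spy.c -ldl
 * Run:    LD_PRELOAD=./boundary_spy.so ./some_application
 */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>
#include <sys/types.h>

int open(const char *path, int flags, ...)
{
    /* Look up the real open() the first time we are called. */
    static int (*real_open)(const char *, int, ...);
    if (!real_open)
        real_open = (int (*)(const char *, int, ...))dlsym(RTLD_NEXT, "open");

    mode_t mode = 0;
    if (flags & O_CREAT) {        /* a mode argument is present only with O_CREAT */
        va_list ap;
        va_start(ap, flags);
        mode = (mode_t)va_arg(ap, int);
        va_end(ap);
    }

    fprintf(stderr, "[boundary] open(\"%s\", 0x%x)\n", path, flags);
    return real_open(path, flags, mode);
}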
Sink or Swim: Know When It’s Time to Bail:
A diagnostic to help you measure organizational dysfunction and take action
Newly created businesses face endless survival challenges. The degree to which a business successfully meets these challenges depends largely on the nature of the organization and the culture that evolves within it. That is to say, while market size, technical quality, and product design are obviously crucial factors, company failures are typically rooted in some form of organizational dysfunction.
Culture Surprises in Remote Software Development Teams:
When in Rome doesn’t help when your team crosses time zones, and your deadline doesn’t.
Technology has made it possible for organizations to construct teams of people who are not in the same location, adopting what one company calls "virtual collocation." Worldwide groups of software developers, financial analysts, automobile designers, consultants, pricing analysts, and researchers are examples of teams that work together from disparate locations, using a variety of collaboration technologies that allow communication across space and time.
Building Collaboration into IDEs:
Edit>Compile>Run>Debug>Collaborate?
Software development is rarely a solo coding effort. More often, it is a collaborative process, with teams of developers working together to design solutions and produce quality code. The members of these close-knit teams often look at one another’s code, collectively make plans about how to proceed, and even fix each other’s bugs when necessary. Teamwork does not stop there, however. An extended team may include project managers, testers, architects, designers, writers, and other specialists, as well as other programming teams.
The Sun Never Sits on Distributed Development:
People around the world can work around the clock on a distributed project, but the real challenge lies in taming the social dynamics.
More and more software development is being distributed across greater and greater distances. The motives are varied, but one of the most predominant is the effort to keep costs down. As talent is where you find it, why not use it where you find it, rather than spending the money to relocate it to some ostensibly more "central" location? The increasing ubiquity of the Internet is making far-flung talent ever-more accessible.
Distributed Development: Lessons Learned:
Why repeat the mistakes of the past if you don’t have to?
Delivery of a technology-based project is challenging, even under well-contained, familiar circumstances. And a tight-knit team can be a major factor in success. It is no mystery, therefore, why most small, new technology teams opt to work in a garage (at times literally). Keeping the focus of everyone’s energy on the development task at hand means a minimum of non-engineering overhead.
Uprooting Software Defects at the Source:
Source code analysis is an emerging technology in the software industry that allows critical source code defects to be detected before a program runs.
Although the concept of detecting programming errors at compile time is not new, the technology to build effective tools that can process millions of lines of code and report substantive defects with only a small amount of noise has long eluded the market. At the same time, a different type of solution is needed to combat current trends in the software industry that are steadily diminishing the effectiveness of conventional software testing and quality assurance.
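To make concrete the kind of defect such tools report, here is a small hypothetical example (mine, not the article's): a path-sensitive analyzer would flag that the NULL check below admits a path on which the pointer is nevertheless dereferenced.

#include <stdio.h>
#include <string.h>

/* Defect: when name == NULL we log a warning but fall through
 * and dereference it anyway - a NULL dereference on that path. */
size_t greeting_length(const char *name)
{
    if (name == NULL)
        fprintf(stderr, "warning: no name supplied\n");   /* missing return */

    return strlen("Hello, ") + strlen(name);
}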
Sentient Data Access via a Diverse Society of Devices:
Today’s ubiquitous computing environment cannot benefit from the traditional understanding of a hierarchical file system.
It has been more than ten years since such "information appliances" as ATMs and grocery store UPC checkout counters were introduced. For the office environment, Mark Weiser began articulating the notion of UbiComp in 1991 and identified some of the salient features of the trend. Embedded computation is also becoming widespread. Microprocessors, for example, are finding their way into seemingly conventional pens that remember what they have written, and anti-lock brake systems in cars are controlled by fuzzy logic.
Nine IM Accounts and Counting:
The key word with instant messaging today is interoperability.
Instant messaging has become nearly as ubiquitous as e-mail, in some cases far surpassing e-mail in popularity. But it has gone far beyond teenagers’ insular world to business, where it is becoming a useful communication tool. The problem, unlike e-mail, is that no common standard exists for IM, so users feel compelled to maintain multiple accounts, for example, AOL, Jabber, Yahoo, and MSN.
Broadcast Messaging: Messaging to the Masses:
This powerful form of communication has social implications as well as technical challenges.
We have instantaneous access to petabytes of stored data through Web searches. With respect to messaging, we have an unprecedented number of communication tools that provide both synchronous and asynchronous access to people. E-mail, message boards, newsgroups, IRC (Internet relay chat), and IM (instant messaging) are just a few examples. These tools are all particularly significant because they have become essential productivity entitlements. They have caused a fundamental shift in the way we communicate. Many readers can attest to feeling disconnected when a mail server goes down or when access to IM is unavailable.
Beyond Instant Messaging:
Platforms and standards for these services must anticipate and accommodate future developments.
The recent rise in popularity of IM (instant messaging) has driven the development of platforms and the emergence of standards to support IM. Especially as the use of IM has migrated from online socializing at home to business settings, there is a need to provide robust platforms with the interfaces that business customers use to integrate with other work applications. Yet, in the rush to develop a mature IM infrastructure, it is also important to recognize that IM features and uses are still evolving. For example, popular press stories have raised the concern that IM interactions may be too distracting in the workplace.
Reading, Writing, and Code:
The key to writing readable code is developing good coding style.
Forty years ago, when computer programming was an individual experience, the need for easily readable code wasn’t on any priority list. Today, however, programming usually is a team-based activity, and writing code that others can easily decipher has become a necessity. Creating and developing readable code is not as easy as it sounds.
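As a small illustration of what is at stake (an example of mine, not one from the article), compare two versions of the same routine; the second costs nothing at runtime, yet it tells the next reader what the code is for.

/* Hard to read: terse names, a magic number, no statement of intent. */
int f(int *a, int n) { int s = 0; for (int i = 0; i < n; i++) if (a[i] > 100) s++; return s; }

/* Readable: the names and the named constant carry the meaning. */
#define OVERDUE_THRESHOLD_DAYS 100

int count_overdue_accounts(const int *days_outstanding, int num_accounts)
{
    int overdue = 0;
    for (int i = 0; i < num_accounts; i++) {
        if (days_outstanding[i] > OVERDUE_THRESHOLD_DAYS)
            overdue++;
    }
    return overdue;
}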
The Big Bang Theory of IDEs:
Pondering the vastness of the ever-expanding universe of IDEs, you might wonder whether a usable IDE is too much to ask for.
Remember the halcyon days when development required only a text editor, a compiler, and some sort of debugger (in cases where the odd printf() or two alone didn’t serve)? During the early days of computing, these were independent tools used iteratively in development’s golden circle. Somewhere along the way we realized that a closer integration of these tools could expedite the development process. Thus was born the integrated development environment (IDE), a framework and user environment for software development that’s actually a toolkit of instruments essential to software creation. At first, IDEs simply connected the big three (editor, compiler, and debugger), but nowadays most go well beyond those minimum requirements.
Modern System Power Management:
Increasing demands for more power and increased efficiency are pressuring software and hardware developers to ask questions and look for answers.
The Advanced Configuration and Power Interface (ACPI) is the most widely used power and configuration interface for laptops, desktops, and server systems. It is also very complex, and its current specification weighs in at more than 500 pages. Needless to say, operating systems that choose to support ACPI require significant additional software support, up to and including fundamental OS architecture changes. The effort that ACPI’s definition and implementation has entailed is worth the trouble because of how much flexibility it gives to the OS (and ultimately the user) to control power management policy and implementation.
Making a Case for Efficient Supercomputing:
It is time for the computing community to use alternative metrics for evaluating performance.
A supercomputer evokes images of “big iron” and speed; it is the Formula 1 racecar of computing. As we venture forth into the new millennium, however, I argue that efficiency, reliability, and availability will become the dominant issues by the end of this decade, not only for supercomputing, but also for computing in general.
Energy Management on Handheld Devices:
Whatever their origin, all handheld devices share the same Achilles heel: the battery.
Handheld devices are becoming ubiquitous, and as their capabilities increase, they are starting to displace laptop computers - much as laptop computers have displaced desktop computers in many roles. Handheld devices are evolving from today’s PDAs, organizers, cellular phones, and game machines into a variety of new forms, but three problems loom. First, although partially offset by improvements in low-power electronics, this increased functionality carries a corresponding increase in energy consumption. Second, as a consequence of displacing other pieces of equipment, handheld devices are seeing more use between battery charges. Finally, battery technology is not improving at the same pace as the energy requirements of handheld electronics.
The Inevitability of Reconfigurable Systems:
The transition from instruction-based to reconfigurable circuits will not be easy, but has its time come?
The introduction of the microprocessor in 1971 marked the beginning of a 30-year stall in design methods for electronic systems. The industry is coming out of the stall by shifting from programmed to reconfigurable systems. In programmed systems, a linear sequence of configuration bits, organized into blocks called instructions, configures fixed hardware to mimic custom hardware. In reconfigurable systems, the physical connections among logic elements change with time to mimic custom hardware. The transition to reconfigurable systems will be wrenching, but this is inevitable as the design emphasis shifts from cost performance to cost performance per watt. Here’s the story.
Getting Gigascale Chips:
Challenges and Opportunities in Continuing Moore’s Law
Processor performance has increased by five orders of magnitude in the last three decades, made possible by following Moore’s law - that is, continued technology scaling, improved transistor performance to increase frequency, additional integration capacity to realize complex architectures, and reduced energy consumed per logic operation to keep power dissipation within limits. Advances in software technology, such as rich multimedia applications and runtime systems, exploited this performance explosion, delivering to end users higher productivity, seamless Internet connectivity, and even multimedia and entertainment.
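As a rough check on that figure, five orders of magnitude over three decades corresponds to a doubling roughly every year and a half to two years - the familiar Moore's-law cadence:

\[
10^{5} = 2^{x} \;\Rightarrow\; x = 5\log_{2}10 \approx 16.6 \text{ doublings},
\qquad
\frac{30\ \text{years}}{16.6} \approx 1.8\ \text{years per doubling}.
\]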
Spam, Spam, Spam, Spam, Spam, the FTC, and Spam:
A forum sponsored by the FTC highlights just how bad spam is, and how it’s only going to get worse without some intervention.
The Federal Trade Commission held a forum on spam in Washington, D.C., April 30 to May 2. Rather to my surprise, it was a really good, content-full event. The FTC folks had done their homework and had assembled panelists that ran the gamut from ardent anti-spammers all the way to hard-core spammers and everyone in between: lawyers, legitimate marketers, and representatives from vendor groups.
Another Day, Another Bug:
We asked our readers which tools they use to squash bugs. Here’s what they said.
As part of this issue on programmer tools, we at Queue decided to conduct an informal Web poll on the topic of debugging. We asked you to tell us about the tools that you use and how you use them. We also collected stories about those hard-to-track-down bugs that sometimes make us think of taking up another profession.
No Source Code? No Problem!:
What if you have to port a program, but all you have is a binary?
Typical software development involves one of two processes: the creation of new software to fit particular requirements or the modification (maintenance) of old software to fix problems or fit new requirements. These transformations happen at the source-code level. But what if the problem is not the maintenance of old software but the need to create a functional duplicate of the original? And what if the source code is no longer available?
Code Spelunking: Exploring Cavernous Code Bases:
Code diving through unfamiliar source bases is something we do far more often than write new code from scratch--make sure you have the right gear for the job.
Try to remember your first day at your first software job. Do you recall what you were asked to do, after the human resources people were done with you? Were you asked to write a piece of fresh code? Probably not. It is far more likely that you were asked to fix a bug, or several, and to try to understand a large, poorly documented collection of source code. Of course, this doesn’t just happen to new graduates; it happens to all of us whenever we start a new job or look at a new piece of code. With experience we all develop a set of techniques for working with large, unfamiliar source bases.
Coding Smart: People vs. Tools:
Tools can help developers be more productive, but they’re no replacement for thinking.
Cool tools are seductive. When we think about software productivity, tools naturally come to mind. When we see pretty new tools, we tend to believe that their amazing features will help us get our work done much faster. Because every software engineer uses software productivity tools daily, and all team managers have to decide which tools their members will use, the latest and greatest look appealing.
Debugging in an Asynchronous World:
Hard-to-track bugs can emerge when you can’t guarantee sequential execution. The right tools and the right techniques can help.
Pagers, cellular phones, smart appliances, and Web services - these products and services are almost omnipresent in our world, and are stimulating the creation of a new breed of software: applications that must deal with inputs from a variety of sources, provide real-time responses, deliver strong security - and do all this while providing a positive user experience. In response, a new style of application programming is taking hold, one that is based on multiple threads of control and the asynchronous exchange of data, and results in fundamentally more complex applications.
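A minimal sketch (mine, not the article's) of the kind of defect that appears once sequential execution can no longer be guaranteed: two threads update a shared counter without synchronization, and the lost updates show up only intermittently - which is exactly what makes such bugs hard to track.

/* race.c - build with: cc -pthread race.c */
#include <pthread.h>
#include <stdio.h>

static long counter = 0;          /* shared and unsynchronized: a data race */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 1000000; i++)
        counter++;                /* read-modify-write is not atomic */
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);

    /* Expected 2000000; the observed value changes from run to run. */
    printf("counter = %ld\n", counter);
    return 0;
}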
Closed Source Fights Back:
SCO vs. The World-What Were They Thinking?
In May 2003, the SCO Group, a vendor of the Linux operating system, sent a letter to its customers. Among other things, it stated, "We believe that Linux is, in material part, an unauthorized derivative of Unix." What would make SCO do that?
Commercializing Open Source Software:
Many have tried, a few are succeeding, but challenges abound.
The use of open source software has become increasingly popular in production environments, as well as in research and software development. One obvious attraction is the low cost of acquisition. Commercial software has a higher initial cost, though it usually has advantages such as support and training. A number of business models designed by users and vendors combine open source and commercial software; they use open source as much as possible, adding commercial software as needed.
The Age of Corporate Open Source Enlightenment:
Like it or not, zealots and heretics are finding common ground in the open source holy war.
It’s a bad idea, mixing politics and religion. Conventional wisdom tells us to keep them separate - and to discuss neither at a dinner party. The same has been said about the world of software. When it comes to mixing the open source church with the proprietary state (or is it the other way around?), only one rule applies: Don’t do it.
From Server Room to Living Room:
How open source and TiVo became a perfect match
The open source movement, exemplified by the growing acceptance of Linux, is finding its way not only into corporate environments but also into a home near you. For some time now, high-end applications such as software development, computer-aided design and manufacturing, and heavy computational applications have been implemented using Linux and generic PC hardware.
Storage Systems: Not Just a Bunch of Disks Anymore:
The sheer size and scope of data available today puts tremendous pressure on storage systems to perform in ways never imagined.
The concept of a storage device has changed dramatically, from the first magnetic disk drive, introduced with the IBM RAMAC in 1956, to today’s server rooms with detached and fully networked storage servers. Storage has expanded in both large and small directions; all of these systems use the same underlying technology, but they quickly diverge from there. Here we will focus on the larger storage systems that are typically detached from the server hosts, and we will introduce the layers of protocols and translations that bits pass through as they make their way from the magnetic domains on the disk drives, through the interfaces, to your desktop.
You Don’t Know Jack about Disks:
Whatever happened to cylinders and tracks?
Traditionally, the programmer’s working model of disk storage has consisted of a set of uniform cylinders, each with a set of uniform tracks, which in turn hold a fixed number of 512-byte sectors, each with a unique address. A cylinder is the set of vertically aligned tracks - one concentric circle per platter surface at a given radius - in a multiplatter drive. Each track is divided, like pie slices, into sectors.
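Under that traditional model, an address translates mechanically between cylinder/head/sector (CHS) coordinates and a linear block address; here is a sketch of the classic conversion (the parameter names are mine):

#include <stdint.h>

/* Classic CHS-to-LBA translation for the uniform-geometry model:
 * sectors are numbered from 1 within a track; heads and cylinders from 0. */
uint64_t chs_to_lba(uint32_t cylinder, uint32_t head, uint32_t sector,
                    uint32_t heads_per_cylinder, uint32_t sectors_per_track)
{
    return ((uint64_t)cylinder * heads_per_cylinder + head) * sectors_per_track
           + (sector - 1);
}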
Open Spectrum:
A Path to Ubiquitous Connectivity
Just as open standards and open software rocked the networking and computing industry, open spectrum is poised to be a disruptive force in the use of radio spectrum for communications. At the same time, open spectrum will be a major element that helps continue the Internet’s march to integrate and facilitate all electronic communications with open standards and commodity hardware.
Self-Healing Networks:
Wireless networks that fix their own broken communication links may speed up their widespread acceptance.
The obvious advantage to wireless communication over wired is, as they say in the real estate business, location, location, location. Individuals and industries choose wireless because it allows flexibility of location--whether that means mobility, portability, or just ease of installation at a fixed point. The challenge of wireless communication is that, unlike the mostly error-free transmission environments provided by cables, the environment that wireless communications travel through is unpredictable. Environmental radio-frequency (RF) "noise" produced by powerful motors, other wireless devices, microwaves--and even the moisture content in the air--can make wireless communication unreliable.
Designing Portable Collaborative Networks:
A middleware solution to keep pace with the ever-changing ways in which mobile workers collaborate.
Peer-to-peer technology and wireless networking offer great potential for working together away from the desk - but they also introduce unique software and infrastructure challenges. The traditional idea of the work environment is anchored to a central location - the desk and office - where the resources needed for the job are located.
The Family Dynamics of 802.11:
The 802.11 family of standards is helping to move wireless LANs into promising new territory.
Three trends are driving the rapid growth of wireless LAN (WLAN): The increased use of laptops and personal digital assistants (PDAs); rapid advances in WLAN data rates (from 2 megabits per second to 108 Mbps in the past four years); and precipitous drops in WLAN prices (currently under $50 for a client and under $100 for an access point).
Caching XML Web Services for Mobility:
In the face of unreliable connections and low bandwidth, caching may offer reliable wireless access to Web services.
Web services are emerging as the dominant application on the Internet. The Web is no longer just a repository of information but has evolved into an active medium for providers and consumers of services: Individuals provide peer-to-peer services to access personal contact information or photo albums for other individuals; individuals provide services to businesses for accessing personal preferences or tax information; Web-based businesses provide consumer services such as travel arrangement (Orbitz), shopping (eBay), and e-mail (Hotmail); and several business-to-business (B2B) services such as supply chain management form important applications of the Internet.
The Future of WLAN:
Overcoming the Top Ten Challenges in wireless networking--will it allow wide-area mesh networks to become ubiquitous?
Since James Clerk Maxwell first mathematically described electromagnetic waves almost a century and a half ago, the world has seen steady progress toward using them in better and more varied ways. Voice has been the killer application for wireless for the past century. As performance in all areas of engineering has improved, wireless voice has migrated from a mass broadcast medium to a peer-to-peer medium. The ability to talk to anyone on the planet from anywhere on the planet has fundamentally altered the way society works and the speed with which it changes.
Putting It All Together:
Component integration is one of the tough challenges in embedded system design. Designers search for conservative design styles and reliable techniques for interfacing and verification.
With the growing complexity of embedded systems, more and more parts of a system are reused or supplied, often from external sources. These parts range from single hardware components or software processes to hardware-software (HW-SW) subsystems. They must cooperate and share resources with newly developed parts such that all of the design constraints are met. This, simply speaking, is the integration task, which ideally should be a plug-and-play procedure. This does not happen in practice, however, not only because of incompatible interfaces and communication standards but also because of specialization.
Blurring Lines Between Hardware and Software:
Software development for embedded systems clearly transcends traditional "programming" and requires intimate knowledge of hardware, as well as deep understanding of the underlying application that is to be implemented.
Motivated by technology leading to the availability of many millions of gates on a chip, a new design paradigm is emerging. This new paradigm allows the integration and implementation of entire systems on one chip.
Division of Labor in Embedded Systems:
You can choose among several strategies for partitioning an embedded application over incoherent processor cores. Here’s practical advice on the advantages and pitfalls of each.
Increasingly, embedded applications require more processing power than can be supplied by a single processor, even a heavily pipelined one that uses a high-performance architecture such as very long instruction word (VLIW) or superscalar. Simply driving up the clock is often prohibitive in the embedded world because higher clocks require proportionally more power, a commodity often scarce in embedded systems. Multiprocessing, where the application is run on two or more processors concurrently, is the natural route to ever more processor cycles within a fixed power budget.
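The power argument can be made precise with the usual first-order model of dynamic power in CMOS: switching power grows linearly with frequency, but because supply voltage must also rise to sustain a higher clock, power grows roughly with the cube of frequency. Two cores at half the clock therefore deliver the same aggregate throughput for roughly a quarter of the power:

\[
P \approx \alpha C V^{2} f, \qquad V \propto f \;\Rightarrow\; P \propto f^{3},
\qquad
\frac{2\,(f/2)^{3}}{f^{3}} = \frac{1}{4}.
\]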
SoC: Software, Hardware, Nightmare, Bliss:
System-on-a-chip design offers great promise by shrinking an entire computer to a single chip. But with the promise come challenges that need to be overcome before SoC reaches its full potential.
System-on-a-chip (SoC) design methodology allows a designer to create complex silicon systems from smaller working blocks, or systems. By providing a method for easily supporting proprietary functionality in a larger context that includes many existing design pieces, SoC design opens the craft of silicon design to a much broader audience.
Programming Without a Net:
Embedded systems programming presents special challenges to engineers unfamiliar with that environment.
Embedded systems programming presents special challenges to engineers unfamiliar with that environment. In some ways it is closer to working inside an operating system kernel than writing an application for use on the desktop. Here’s what to look out for.
Web Services: Promises and Compromises:
Much of web services’ initial promise will be realized via integration within the enterprise.
Much of web services’ initial promise will be realized via integration within the enterprise, either with legacy applications or new business processes that span organizational silos. Enterprises need organizational structures that support this new paradigm.
An Open Web Services Architecture:
The name of the game is web services.
The name of the game is web services - sophisticated network software designed to bring us what we need, when we need it, through any device we choose. We are getting closer to this ideal, as in recent years the client/server model has evolved into web-based computing, which is now evolving into the web services model. In this article, I will discuss Sun Microsystems’ take on web services, specifically Sun ONE: an open, standards-based web services framework. I’ll share with you Sun’s decision-making rationales regarding web services, and discuss directions we are moving in.
The Deliberate Revolution:
Transforming Integration With XML Web Services
While detractors snub XML web services as CORBA with a weight problem, industry cheerleaders say these services are ushering in a new age of seamless integrated computing. But for those of us whose jobs don’t involve building industry excitement, what do web services offer?