
How Flexible is CXL's Memory Protection?

Replacing a sledgehammer with a scalpel

Samuel W. Stark, A. Theodore Markettos, Simon W. Moore

In the beginning, there was PCIe. Well, really there was PCI and PCI-X, which were superseded by PCIe in 2003, and many others, such as ISA and VME, before them, but PCIe is roughly a superset of them all. They are all interconnects, allowing a host (e.g., the main system CPU) to configure and manipulate connected peripheral devices and map their memory into a shared address space.

As time passed, computations became bigger and more complicated, and peripheral devices became whole systems unto themselves. GPUs (graphics processing units) are the best example, going from hardwired graphics offload devices to full-blown general-purpose processors that cooperate and communicate with the host to solve problems.

Cooperative processing between the host and device is complicated by PCIe's lack of coherent memory sharing. When CPU cores share memory, they use a cache coherency protocol to ensure they can have a fast local copy (a cache) while keeping a coherent view of memory—even when other cores write to it. PCIe doesn't support this kind of sharing; it only allows block transfers between host and device. Various companies created successor protocols—CCIX, OpenCAPI, and Gen-Z—to support this, but they have all died or been subsumed by Intel's CXL (Compute Express Link).

CXL provides new protocols on top of PCIe for accelerator devices to cache host memory (CXL.cache) and for hosts to cache device memory (CXL.mem). The industry is currently focused on CXL.mem memory expansion devices. The first CXL-compatible CPUs (released in November 2022) support "CXL 1.1+ for memory expansion,"1 and CXL accelerators haven't been announced; the only announced products are CXL.mem devices, such as Samsung's 512GB RAM expansion.19 CXL 3.0, released in August 2022, adds support for fabric topologies connecting many hosts to many shared GFAM (Global Fabric Attached Memory) devices. This facilitates disaggregated memory, where an arbitrary number of endpoints connected in an arbitrary topology can request, use, and coherently share arbitrary amounts of memory.

If disaggregated memory is the future, our biggest question is that of protection. With so many endpoints all connecting to and sharing the same memory, how can they be restricted to accessing only the memory they need? They may be running untrusted software or themselves be untrusted hardware. How can memory protection work in this threat environment? The CHERI (Capability Hardware Enhanced RISC Instructions) project has shown that architectural capabilities can provide flexible, fine-grained memory protection.21 How does CXL's current memory protection compare? Could a capability system work in CXL's distributed setting with malicious actors? To start, let's examine CXL's protection mechanisms and see how well they handle real-world security problems.

 

Protection Systems

In most cases, software uses physical resources through multiple layers of abstraction. Each layer translates incoming requests to a format expected by the next layer down, and can also provide protection. A simple example is the MMU (memory management unit), which translates memory requests from virtual to physical memory.13 The OS gives each process a different mapping of virtual to physical addresses, and the MMU ensures processes can access only the physical memory that the OS has mapped in. To generalize, protection systems ensure that actors can only access valid resources.
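
The MMU example can be sketched in a few lines of Python. This is a minimal model, not how any real MMU is implemented: the single-level `page_table` dictionary, the 4KiB `PAGE_SIZE`, and the `TranslationFault` name are all assumptions made for illustration.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class TranslationFault(Exception):
    """Raised when a process touches an address the OS never mapped."""


def translate(page_table, virtual_addr):
    """Translate a virtual address using one process's page table.

    page_table maps virtual page numbers to physical page numbers.
    """
    page, offset = divmod(virtual_addr, PAGE_SIZE)
    if page not in page_table:
        raise TranslationFault(hex(virtual_addr))
    return page_table[page] * PAGE_SIZE + offset


# Two processes with different mappings, as a hypothetical OS might set up.
process_a = {0: 7, 1: 3}
process_b = {0: 9}
```

The same virtual address resolves to different physical memory for each process, and `process_b` faults on any address in virtual page 1 because nothing is mapped there. The protection comes entirely from what the OS chose to leave out of the table.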

The protection a system can provide is limited by the granularity of its actors and resources; therefore, protection at multiple layers of abstraction is important. For example, the MMU only has insight at the process level. The software inside the process has tighter definitions of valid (e.g., "I will not access out-of-bounds array elements") that the MMU doesn't understand (it doesn't know or care where the array is) and thus cannot help with. Instead, another layer can be added above the MMU, such as a language runtime (JVM, .NET) or hardware-based checks (CHERI21), which have more information and ensure validity at a finer-grained level.

Different levels of abstraction can add different sets of actors and resources. For example, an operating system is responsible for ensuring that its processes access files correctly—and for actually performing those accesses through the file system driver. If those files are on a networked file system, the server may have to handle multiple clients at once and check that they access files correctly. The individual OS doesn't know about the other clients, and the server doesn't know about the processes running inside the OS, so having protections and checks at both levels is necessary.

 

CXL and the flaws therein

CXL, like PCIe, uses a host-device model. Each CXL host controls a set of connected peripheral devices, and maps all the memory they expose into an HPA (host physical address space). The host may also map its own memory into the HPA, and accelerator devices like GPUs can access it over CXL.cache, but current devices just expose RAM to the host over CXL.mem. CXL 3.0 upgraded CXL.mem to allow hosts to share memory regions through both multi-headed and GFAM devices.

Multi-headed CXL.mem devices connect to multiple hosts and can map the same regions of physical memory into all of their HPAs at the same time. Those hosts can all cache parts of those regions, and the device is responsible for ensuring coherency (see figure 1). For example, if host 1 tries to write to a cache line in region A, the device realizes that hosts 2 and 3 share A and tells them to invalidate that cache line. Unfortunately, each of those hosts can only access 16 regions8 (Sec 2.5), so they will necessarily be large—on the order of gigabytes or hundreds of megabytes.
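
The invalidation step in this example can be modeled roughly as follows. This is a toy directory, assuming the device tracks sharers per cache line; the class and method names are hypothetical, and the real CXL.mem message flows are considerably more involved.

```python
from collections import defaultdict


class MultiHeadedDevice:
    """Toy model of a device that keeps host caches coherent."""

    def __init__(self):
        self.sharers = defaultdict(set)  # cache line address -> hosts holding a copy
        self.invalidations = []          # log of (host, line) messages sent

    def read(self, host, line):
        # The host now caches this line, so remember it as a sharer.
        self.sharers[line].add(host)

    def write(self, host, line):
        # Every other sharer must drop its stale copy before the write proceeds.
        for other in sorted(self.sharers[line] - {host}):
            self.invalidations.append((other, line))
        self.sharers[line] = {host}
```

If hosts 1, 2, and 3 have all read the same line and host 1 then writes it, the device logs one invalidation each for hosts 2 and 3 and leaves host 1 as the only sharer.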

[Figure 1]

GFAM devices take this a step further by not being attached to specific hosts. Any host can map GFAM memory into its HPA, and any endpoint (host or device) in that HPA can talk to the GFAM directly and access that memory. The GFAM is configured with separate translation tables for each endpoint, so each endpoint can access eight regions of physical memory8 (Sec 7.7.2.4). These regions may overlap, allowing memory sharing, or they may be isolated. As shown in figure 2, 10GiB of GFAM is mapped, but the host and accelerator are configured so they see only 6GiB each, with a 2GiB shared region. Again, because each endpoint has few ranges, they will be large. Memory groups8 (Sec 7.7.2.5) can punch holes in these ranges and hide specific blocks, but the holes will be at least 64MB8 (Table 7-67, Min. Block Size).
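
To make the arithmetic concrete, here is a sketch of per-endpoint translation tables for a layout like the one described. The exact placement of the regions is an assumption (the host sees device addresses 0 to 6GiB, the accelerator 4 to 10GiB), as are the `Region` and `EndpointTable` names and the HPA base addresses.

```python
from dataclasses import dataclass

GiB = 1 << 30
MAX_REGIONS = 8  # per-endpoint limit quoted above


@dataclass(frozen=True)
class Region:
    hpa_base: int  # where the endpoint sees the region
    dpa_base: int  # where the region lives inside the GFAM device
    size: int


class EndpointTable:
    def __init__(self):
        self.regions = []

    def map(self, region):
        if len(self.regions) >= MAX_REGIONS:
            raise ValueError("no free translation table entries")
        self.regions.append(region)

    def translate(self, hpa):
        for r in self.regions:
            if r.hpa_base <= hpa < r.hpa_base + r.size:
                return r.dpa_base + (hpa - r.hpa_base)
        return None  # translation fails: nothing is mapped here


HOST_BASE, ACCEL_BASE = 64 * GiB, 128 * GiB  # hypothetical HPA bases
host, accel = EndpointTable(), EndpointTable()
host.map(Region(HOST_BASE, 0 * GiB, 4 * GiB))               # host-only
host.map(Region(HOST_BASE + 4 * GiB, 4 * GiB, 2 * GiB))     # shared
accel.map(Region(ACCEL_BASE, 4 * GiB, 2 * GiB))             # shared
accel.map(Region(ACCEL_BASE + 2 * GiB, 6 * GiB, 4 * GiB))   # accelerator-only
```

Each endpoint sees 4GiB + 2GiB = 6GiB, and together they cover 4 + 2 + 4 = 10GiB of device memory. `host.translate(HOST_BASE + 5 * GiB)` and `accel.translate(ACCEL_BASE + 1 * GiB)` land on the same device address inside the shared region, while the host's lookup at `HOST_BASE + 6 * GiB` returns `None`.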

[Figure 2]

Both kinds of memory provide protection through nonexhaustive translation: Endpoints request addresses in their HPA, which get translated to local device addresses, and that translation may fail (i.e., the endpoint may not have memory mapped at that address). These mechanisms, similar to an MMU, provide inflexible coarse-grained protection. At most, each endpoint can access 16 memory ranges per device. The only way to change the mappings and transfer access rights is to convince the Fabric Manager, which has no defined interface for this8 (Sec 7.6.1).

CXL 3.0 also introduced Unordered I/O requests, which allow accelerators to access other devices' memory, but there is no standardized way to protect those accesses. It may be possible to prevent specific devices from interacting at all (e.g., through PCIe Access Control Services) or add MMU-like protection (e.g., through PCIe Address Translation Services), but these, like CXL's other protection models, are inflexible and coarse-grained.

CXL's protection isn't great. Endpoints can be configured to access and share large memory regions, but can't share many small ones. Endpoints can't grant each other access to memory; they have to go through an intermediary. Device-to-device access has to rely on vendor-defined protection, if any. How does that stack up against real-world threats?

 

Threats in the datacenter

First, we can understand the datacenter threat model from a whitepaper published in November 2022 by AWS (Amazon Web Services) about their Nitro platform.2 Cloud systems have to run workloads from many clients, who don't trust each other, on the same hardware. Before Nitro, AWS would run all client workloads as VMs (virtual machines) atop a hypervisor, which exposed isolated virtualized resources to each VM. For example, the hypervisor would implement a software model of a network card for each VM, so it could control which networks the VMs could access. The key impact of the Nitro system is moving this virtualization out of the hypervisor and into the hardware.

Each Nitro system is controlled by a custom Nitro Controller PCIe card. This is the hardware root of trust, responsible for configuring the System Main Board (i.e., the CPU, motherboard, and RAM) and other peripherals before running client workloads. Networks and storage are accessed through other AWS-designed Nitro PCIe cards, which the Nitro Controller can split into Virtual Functions using PCIe SR-IOV (single-root I/O virtualization)16 to provide isolated resources for each VM.

When running many VMs, a minimal hypervisor is still necessary to configure the MMU and link each VM to its dedicated virtual functions. A Nitro system can also run bare-metal (a single client workload without a hypervisor). Even though the client workload is untrusted, the Nitro cards still virtualize access to networks and storage.

AWS trusts the Nitro Controller to bring up the system, the Nitro Cards to virtualize networks/storage, and the hypervisor/MMU to enforce isolation between VMs. Client workloads cannot be trusted, and if they're running bare-metal, then any communication from the System Main Board cannot be trusted either. From CXL's perspective, this means a host could be malicious (running bare-metal) or be responsible for many malicious workloads (running VMs). In the latter case, CXL doesn't have any constructs that can help with virtualization. In fact, CXL doesn't consider virtualization at all—literally, virtualization and similar terms are not in the specification.

Datacenters have further complications. Accelerator devices, such as GPUs, sometimes rely on directly sharing memory for high performance. Nvidia's Magnum I/O APIs18 allow GPUs to directly access NVMe storage devices (GPUDirect Storage), share memory with other GPUs (NVSHMEM), and expose their memory to other peripherals (GPUDirect RDMA), including InfiniBand adapters (nvidia-peermem).

While some GPUs nominally support virtualization through SR-IOV, AWS does not take advantage of this—client workloads are given whole numbers of GPUs and control them directly (clients even control the GPU drivers3). This expands the threat model. Not only are GFAMs sharing memory across HPAs, but also individual devices (including accelerators) may expose their memory to endpoints controlled by malicious clients.

CXL does not handle this use case. It implicitly assumes that hosts and devices are trustworthy. Hosts may be trustworthy if they have, for example, a hypervisor keeping them in check, and devices may be carefully chosen for trust, but if any device or host is untrustworthy (e.g., running bare-metal client workloads), better protection is needed.

 

Threats in the consumer space

The threat of malicious devices is not exclusive to the datacenter—in fact, it's much worse for consumers! Desktops and laptops have a plethora of external ports for connecting arbitrary hardware, including high-performance accelerators, such as external GPUs. Accelerators take advantage of high-speed Thunderbolt connections that wrap PCIe, giving external hardware access to the internal PCIe memory map. Attacks on PCIe-based systems through Thunderbolt have already been demonstrated,15 showing that malicious hardware can access sensitive memory intended for other devices, even with protections such as IOMMU enabled.

Worse, direct device-to-device memory accesses are making their way to consumer systems as well. Modern game consoles depend on high-speed transfers from storage to GPU-accessible memory, and Microsoft's DirectStorage API brings this closer to reality on PCs. At the time of writing, DirectStorage still copies data through a buffer in system RAM, but it seems inevitable that high-performance rendering systems (e.g., games and video editing) will eventually take advantage of direct access, especially because it's already possible in the datacenter.

CXL is coming to the consumer market, so it needs to handle this. In an AMD "Meet the Experts" webinar4 in October 2022, an AMD representative said it might come to consumer devices within five years, initially with a focus on connecting persistent storage and RAM. Loading from persistent storage is currently the big use case for device-to-device transfer, so CXL needs to consider malicious devices sooner rather than later.

As it stands now, CXL's memory protection is inflexible at best. It is capable of isolating endpoints in large memory regions, but not much more than that. It has no capacity for virtualization for workloads running on the same endpoint, and cannot protect devices from each other.

 

Capability-based Protection for CXL

CHERI21 is a capability-based protection system that has proven useful both for flexible, fine-grained (tens of bytes) memory protection and for compartmentalization, by sandboxing programs and libraries from each other.22 This seems to address all of CXL's security issues. Could CXL adopt a capability-based system?

Capabilities are unforgeable tokens that encode the authority to access a resource. Given a capability, an actor can access the resource, derive new capabilities for that resource with reduced permissions, transfer them to other actors, and potentially revoke them if those actors no longer need access. Because access rights are encoded directly in the token, capabilities are very flexible: It's easy to derive new capabilities with extremely specific access rights for new situations. Deriving lots of capabilities does have a downside: Revoking a capability—recursively deleting all derivations—can be more difficult. Let's examine a few examples.
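
The derive-and-revoke behavior can be sketched as a tree of tokens. This is only an illustration of the semantics: Python cannot make objects unforgeable, and the `Capability` class, its fields, and the permission strings are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Capability:
    resource: str
    perms: frozenset
    valid: bool = True
    children: list = field(default_factory=list)

    def derive(self, perms):
        """Create a child capability; it may only narrow the parent's permissions."""
        perms = frozenset(perms)
        if not self.valid or not perms <= self.perms:
            raise PermissionError("derivation must not add authority")
        child = Capability(self.resource, perms)
        self.children.append(child)
        return child

    def revoke(self):
        """Invalidate this capability and, recursively, everything derived from it."""
        self.valid = False
        for child in self.children:
            child.revoke()
```

Revoking a read-only capability derived from a read-write root also kills anything derived from the read-only one, while the root stays valid. The recursion is cheap here only because every parent keeps a list of its children; the difficulty mentioned above arises when a system does not track that parentage.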

 

Central-trust systems

Capabilities must be unforgeable. When a capability is used, the system needs some way to verify it hasn't been forged. The simplest way to enforce this is to store all capabilities and perform all capability modifications in a centralized trusted base, or a central-trust system.

One example is FreeBSD Capsicum,20 which protects files from processes by replacing Unix file descriptors with capabilities. A process can open the files it needs, limit its access with more granular permissions, and then enter capability mode to sandbox itself with those files. Like file descriptors, capabilities are stored in tables in OS memory. Userspace programs have to use syscalls to ask the OS to manipulate them, instead of creating or modifying them directly. The OS trusts itself to correctly modify capabilities (e.g., never adding permissions, only taking them away) so capabilities cannot be forged. Although Capsicum doesn't perform revocation, in principle it would simply require searching the tables or even tracking parentage in capability metadata. This provides better security than plain Unix, but syscalls and context-switching to the OS can be slow.
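
The central-trust pattern can be illustrated with a toy kernel that owns the capability table, loosely inspired by Capsicum's rights-limiting call (`cap_rights_limit` on FreeBSD). The `Kernel` class, its methods, and the rights names below are hypothetical stand-ins, not the real API.

```python
class Kernel:
    """Toy stand-in for the OS: the only holder of the capability table."""

    def __init__(self):
        self._table = {}  # handle -> (path, rights)
        self._next = 3

    def open(self, path, rights):
        handle = self._next
        self._next += 1
        self._table[handle] = (path, frozenset(rights))
        return handle  # the process only ever holds this integer

    def limit(self, handle, rights):
        """Shrink a handle's rights; requests to grow them are refused."""
        path, current = self._table[handle]
        rights = frozenset(rights)
        if not rights <= current:
            raise PermissionError("rights can only be removed")
        self._table[handle] = (path, rights)

    def write(self, handle, data):
        path, rights = self._table[handle]
        if "write" not in rights:
            raise PermissionError(f"handle {handle} may not write {path}")
        return len(data)  # pretend the bytes were written
```

Because the process holds only an integer handle, it has nothing to forge; every change goes through the trusted table. The price is that each `limit` or `write` stands in for a syscall, which is where the context-switching cost comes from.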

CHERI takes a different approach. Instead of implementing the trusted base in software, CHERI implements it in the hardware and adds machine instructions for fast capability manipulation. CHERI replaces pointers with capabilities—fat pointers that include the range of addresses the pointer may point to. This range ensures pointers stay within their original provenances6,12 and can be limited further (e.g., you can allocate an array, derive a capability for one element, and pass that to a function without exposing the rest of the array).

Registers and memory use tag bits to mark valid capabilities, and the hardware controls the tag bits to prevent forgery. Because all pointers have this metadata, including code pointers, even the smallest software components (e.g., individual functions) can be sandboxed with just the memory they need. Larger libraries, even ones compiled without CHERI support, can also be sandboxed using compartments.22 The cost of storing capabilities anywhere is that revocation needs to search everywhere,23 although the overheads are lower than you might expect.10 CHERI ensures that logical software components can access only the virtual memory ranges they have been explicitly given access to.
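
A rough software model of bounds and tag bits follows. It assumes a simplified capability with only a base and a length, and a memory that keeps one tag per slot; real CHERI capabilities also carry permissions and a compressed bounds encoding, and the names `Cap`, `TaggedMemory`, and `CapabilityFault` are made up for this sketch.

```python
from dataclasses import dataclass


class CapabilityFault(Exception):
    pass


@dataclass(frozen=True)
class Cap:
    base: int
    length: int

    def restrict(self, base, length):
        """Derive a capability for a sub-range; bounds may only shrink."""
        if base < self.base or base + length > self.base + self.length:
            raise CapabilityFault("bounds may only shrink")
        return Cap(base, length)

    def check(self, addr):
        """Fault if addr lies outside this capability's bounds."""
        if not (self.base <= addr < self.base + self.length):
            raise CapabilityFault(f"{addr:#x} is out of bounds")


class TaggedMemory:
    def __init__(self):
        self.slots = {}  # address -> (value, tag bit)

    def store_cap(self, addr, cap):
        self.slots[addr] = (cap, True)

    def store_data(self, addr, value):
        # Plain data writes always clear the tag, so raw bytes never become a capability.
        self.slots[addr] = (value, False)

    def load_cap(self, addr):
        value, tag = self.slots[addr]
        if not tag:
            raise CapabilityFault("tag bit is clear")
        return value
```

Given `array = Cap(0x1000, 64)`, the call `array.restrict(0x1010, 8)` yields a capability for a single 8-byte element: `check(0x1010)` passes and `check(0x1018)` faults. Overwriting a stored capability with `store_data` clears its tag, so a later `load_cap` from that slot faults instead of returning a forged pointer.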

How could this help CXL? Eagle-eyed readers might notice that GFAM already uses a system similar to Capsicum—each endpoint (i.e., actor) has up to eight translation table entries (i.e., capabilities) that grant access to memory. This demonstrates the flaws with a centralized system in this context: The number of capabilities (and implicitly their granularity) can be limited by hardware resources. This is more suitable for protecting host memory from a limited number of devices, for example,14 but GFAM tries to track all capabilities granted to thousands of actors. To alleviate this, one could store the capabilities in the memory exposed over CXL.mem or give each endpoint some dedicated capability memory, such as Capsicum's tables. Both cases would require trust to be distributed among the endpoints.

 

Distributed-trust systems

Barrelfish5,17 and SemperOS11 are distributed operating systems, implemented as separate instances running on separate cores and communicating with message passing. Barrelfish uses capabilities to protect OS resources, such as message passing and threading primitives, physical memory ranges, etc. SemperOS uses capabilities for an in-memory file system.

The trusted base for capability operations is distributed across the OS cores but aims to provide identical semantics to central-trust systems. Most importantly, any core can derive from a capability in any other core, and thus revocation may need to touch all cores. This requires all actors to trust each other. It is more complicated to reason about than central-trust systems, but it scales better—particularly if cross-actor operations are uncommon.

For CXL, this may be suitable if all endpoints are trustworthy. If, for example, all endpoints in a datacenter use CHERI-like hardware to manipulate capabilities, this could work. At scale, however, revocation may become a bigger issue, and CXL can't rely solely on this model anyway—the threat of malicious endpoints is too great.

 

Decentralized systems

Even if calling out to a centralized trusted base to manipulate capabilities is impossible or impractical, and the actors can't be trusted to manipulate capabilities correctly, there is still hope. Decentralized capabilities, such as macaroons,7 can be passed to untrusted actors and have their validity checked when those actors try to use them.

Macaroons provide access to a resource that is reduced through an append-only list of caveats. A macaroon begins with an identifier, such as "access transaction details," and a signature, made by hashing the identifier with a secret key. When a caveat (such as " for Alice's account," or " until 5 p.m. EST") is added, that caveat is hashed with the current signature to make a new signature. The old signature is thrown away and cannot be reconstructed—the hash cannot be undone. Given a macaroon with a set of caveats, it's impossible to remove a caveat and recalculate the correct signature without the secret key. Therefore, it's impossible for a hostile user to forge a macaroon with fewer caveats (i.e., more permissions).
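
The signature chain can be sketched with Python's standard `hmac` module. This is a minimal illustration, assuming HMAC-SHA256 as the keyed hash and a simple tuple as the token format; it is not the macaroon wire format, and the function names are hypothetical.

```python
import hashlib
import hmac


def _mac(key, msg):
    return hmac.new(key, msg, hashlib.sha256).digest()


def mint(secret, identifier):
    """Create a macaroon: (identifier, caveats, signature)."""
    return (identifier, [], _mac(secret, identifier))


def add_caveat(macaroon, caveat):
    """Append a caveat; the old signature becomes the key for the new one."""
    identifier, caveats, sig = macaroon
    return (identifier, caveats + [caveat], _mac(sig, caveat))


def verify(secret, macaroon):
    """Recompute the chain from the secret key and compare signatures."""
    identifier, caveats, sig = macaroon
    expected = _mac(secret, identifier)
    for caveat in caveats:
        expected = _mac(expected, caveat)
    return hmac.compare_digest(expected, sig)
```

Anyone holding a macaroon can call `add_caveat` without the secret, but stripping a caveat and keeping the final signature makes `verify` fail. Note that this `verify` checks only the integrity of the chain; a real verifier must also confirm that each caveat's condition actually holds for the request.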

Decentralized capabilities haven't yet been integrated into low-level software or hardware. Macaroons were originally designed for the web, so they have a text-based wire format and third-party authentication features, which a binary-based interconnect doesn't need. This is fine for the network layer (e.g., Michael Dodson combined macaroons with CHERI for fine-grained memory-mapped I/O access over an insecure network9), but domain-optimized representations would be more space-efficient.

Revocation is also interesting. Capabilities could come with timeout caveats and require refreshing, or groups of capabilities (and all their derivations) could be revoked by throwing away their secret key. This would allow CXL endpoints to store and (attempt to) manipulate their capabilities themselves, and let the CXL.mem device revoke them, all without trusting them. Decentralized capabilities are robust to hostile actors, don't require centralized resources, and are ripe for further investigation.
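
Both revocation options can be sketched on top of an HMAC chain. The sketch assumes one secret key per group of capabilities, a single caveat form `expires=<integer time>`, and a caller-supplied clock; `MemDevice`, `attenuate`, and the token layout are hypothetical names chosen for illustration.

```python
import hashlib
import hmac
import os


def _mac(key, text):
    return hmac.new(key, text.encode(), hashlib.sha256).digest()


def attenuate(token, caveat):
    """Run by the (untrusted) endpoint: append a caveat without knowing the key."""
    group, caveats, sig = token
    return (group, caveats + [caveat], _mac(sig, caveat))


class MemDevice:
    def __init__(self):
        self.keys = {}  # group name -> secret key

    def mint(self, group):
        key = self.keys.setdefault(group, os.urandom(32))
        return (group, [], _mac(key, group))

    def verify(self, token, now):
        group, caveats, sig = token
        key = self.keys.get(group)
        if key is None:
            return False  # key was thrown away: the whole group is revoked
        expected = _mac(key, group)
        for caveat in caveats:
            expected = _mac(expected, caveat)
        if not hmac.compare_digest(expected, sig):
            return False
        for caveat in caveats:
            kind, _, value = caveat.partition("=")
            # Unknown or malformed caveats are rejected rather than ignored.
            if kind != "expires" or not value.isdigit() or now >= int(value):
                return False
        return True

    def revoke_group(self, group):
        self.keys.pop(group, None)
```

A token attenuated with `expires=100` verifies at time 50 and fails at time 100 without any action by the device. Calling `revoke_group` invalidates every token minted for that group, including all derivations, because their signatures can no longer be recomputed.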

 

Conclusion

Physical memory is accessed through many layers of abstraction. Applying protection at different layers, which are aware of different actors and use resources at different granularities, is essential. CHERI and MMUs offer great protection at the software and process level, but CXL's protection model has issues. It allows memory sharing, but only of a few large ranges instead of many small ones. It doesn't give actors a way to share new memory ranges with each other, instead relying on a central, underspecified Fabric Manager. Capabilities are inherently flexible—they can protect large and small memory ranges, and can be transferred directly between actors without a centralized authority—so they should be able to address these problems.

CXL initially targets the datacenter, with many endpoints sharing disaggregated memory. The protection is coarse-grained, and doesn't consider virtualization. VMs running on the same host have to rely on similarly coarse hypervisor- and MMU-based isolation. Fine-grained capabilities could allow individual VMs to share small memory regions directly. Capabilities for large memory regions could also enforce VM compartmentalization at the CXL layer, similarly to CHERI.

In datacenter and consumer systems, device-to-device memory sharing is becoming essential for high performance. CXL doesn't try to protect devices from each other at all, which is especially scary considering how powerful malicious devices already can be. Capabilities would provide a consistent interface for securely exposing regions of device memory. Decentralized capabilities are robust against malicious actors and could keep the peace in the Wild West of untrustworthy hardware. In a datacenter with trusted components, distributed-trust systems could even forgo the cryptography associated with decentralized capabilities for lower overheads.

Decentralized and distributed capabilities have a lot of potential, but they haven't been used in this context yet and need further investigation. Even so, they could greatly benefit CXL. CXL is a new interconnect standard that provides the opportunity to build in better security from the start instead of retrofitting it later. A domain-optimized decentralized capability system could work wonders, giving CXL fine-grained memory sharing and improving virtualization and device-to-device security. Interconnects need to take security more seriously, and we believe capabilities can provide flexible and robust security for CXL and beyond.

 

Acknowledgments

We would like to thank the CHERI project team led by Robert Watson for demonstrating the potential of capabilities for memory security, without which this work could not exist. The CHERI team also provided essential feedback while developing this article, for which we are extremely thankful. This work was supported by the University of Cambridge Harding Distinguished Postgraduate Scholars Programme, and by EPSRC grant EP/V000381/1 (CAPcelerate).

 

References

1. Advanced Micro Devices, Inc. 2022. Offering unmatched performance, leadership energy efficiency, and next-generation architecture, AMD brings 4th gen AMD EPYC processors to the modern data center; https://www.amd.com/en/pressreleases/2022-11-10-offering-unmatched-performanceleadership-energy-efficiency-and-next.

2. Amazon Web Services. 2023. The security design of the AWS Nitro System. AWS whitepaper; https://docs.aws.amazon.com/whitepapers/latest/security-design-of-aws-nitro-system/security-design-of-aws-nitro-system.html.

3. Amazon Web Services; https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/install-nvidia-driver.html.

4. AMD Meet the Experts Webinars. How AM5, DDR5 memory, and PCIe 5.0 support pave the way for next-gen gaming experiences; https://webinars.amd.com/wcc/eh/3751456/lp/3915027/how-am5-ddr5-memory-and-pcie-50-support-pave-the-way-for-next-gen-gaming-experiences.

5. Baumann, A., et al. 2009. The Multikernel: a new OS architecture for scalable multicore systems. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, 29–44; https://dl.acm.org/doi/10.1145/1629575.1629579.

6. Beingessner, A. 2022. Rust's unsafe pointer types need an overhaul. Faultlore; https://faultlore.com/blah/fix-rust-pointers/.

7. Birgisson, A., et al. 2014. Macaroons: cookies with contextual caveats for decentralized authorization in the cloud. In Network and Distributed System Security Symposium; https://www.ndss-symposium.org/ndss2014/programme/macaroons-cookies-contextual-caveats-decentralized-authorization-cloud/.

8. CXL Consortium. 2022. Compute Express Link (CXL) Specification, revision 3.0, version 1.0; https://www.computeexpresslink.org/download-the-specification.

9. Dodson, M. G. 2021. Capability-based access control for cyber physical systems. Thesis, University of Cambridge, Computer Laboratory; https://www.repository.cam.ac.uk/items/784cbc11-00ad-448d-96c0-a9ccaa451f4b.

10. Filardo, N. W. 2020. Cornucopia: temporal safety for CHERI heaps. In IEEE Symposium on Security and Privacy, 608–625; https://ieeexplore.ieee.org/document/9152640.

11. Hille, M., Asmussen, N., Bhatotia, P., Härtig, H. 2019. SemperOS: a distributed capability system. In 2019 Usenix Annual Technical Conference, 709–722; https://www.usenix.org/conference/atc19/presentation/hille.

12. Jung, R. 2018. Pointers Are Complicated, or: What's in a Byte? Ralf's Ramblings; https://www.ralfj.de/blog/2018/07/24/pointers-and-bytes.html.

13. Kernel Development Community. Virtual memory primer; https://www.kernel.org/doc/html/latest/admin-guide/mm/concepts.html#virtual-memory-primer.

14. Markettos, A. T., Baldwin, J., Bukin, R., Neumann, P. G., Moore, S. W., Watson, R. N. M. 2020. Position paper: defending direct memory access with CHERI capabilities. In Hardware and Architectural Support for Security and Privacy. Article No. 7, 1–9; https://dl.acm.org/doi/10.1145/3458903.3458910.

15. Markettos, A. T., Rothwell, C., Gutstein, B. F., Pearce, A., Neumann, P. G., Moore, S. W., Watson, R. N. M. 2019. Thunderclap: exploring vulnerabilities in operating system IOMMU protection via DMA from untrustworthy peripherals. In Proceedings of the Network and Distributed System Security Symposium; 10/gjh62d.

16. Microsoft. 2021. An introduction to single root I/O virtualization (SR-IOV); https://learn.microsoft.com/en-us/windows-hardware/drivers/network/single-root-i-o-virtualization--sr-iov-.

17. Nevill, M. 2012. An evaluation of capabilities for a multikernel. Master's thesis, ETH Zurich; https://barrelfish.org/publications/nevill-mastercapabilities.pdf.

18. Nvidia Magnum IO; https://www.nvidia.com/en-us/data-center/magnum-io/.

19. Samsung Semiconductor. 2022. Samsung Electronics introduces industry's first 512GB CXL memory module; https://news.samsung.com/global/samsung-electronicsintroduces-industrys-first-512gb-cxl-memory-module.

20. Watson, R. N. M., Anderson, J., Laurie, B., Kennaway, K. 2012. Capsicum: practical capabilities for Unix. Communications of the ACM 55(3), 97–104; https://dl.acm.org/doi/10.1145/2093548.2093572.

21. Watson, R. N. M., Moore, S. W., Sewell, P., Neumann, P. G. 2019. An introduction to CHERI. University of Cambridge, 43; https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-941.pdf.

22. Watson, R. N. M., et al. 2015. CHERI: a hybrid capability-system architecture for scalable software compartmentalization. In IEEE Symposium on Security and Privacy, 20–37; https://ieeexplore.ieee.org/document/7163016.

23. Xia, Hongyan, et al. 2019. CHERIvoke: characterising pointer revocation using CHERI capabilities for temporal memory safety. In Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 545–557; https://dl.acm.org/doi/10.1145/3352460.3358288.

 

Samuel W. Stark is a Ph.D. student and Harding Scholar at the University of Cambridge, Department of Computer Science and Technology, studying the wider applications of capabilities for shared-memory systems under Simon Moore. His MPhil project, also at Cambridge, assessed the impact of CHERI on vectorized load/store instructions and won the UK RISE 2022 student competition. He has a particular interest in GPUs, piqued by his work in the games industry, and hopes to work on media with a positive emotional impact once he leaves academia.

A. Theodore Markettos is a Senior Research Associate at the University of Cambridge, Department of Computer Science and Technology. He co-leads the CAPcelerate project, which is researching the use of capabilities for securing distributed distrustful accelerators. He has a wide range of research interests, from operating systems and software to FPGA, hardware design, and electronics manufacturing.

Simon W. Moore is a Professor of Computer Engineering at the University of Cambridge, Department of Computer Science and Technology, where he conducts research and teaching in the general area of computer architecture, with particular interests in secure and rigorously-engineered processors and subsystems.

Copyright © 2023 held by owner/author. Publication rights licensed to ACM.


Originally published in Queue vol. 21, no. 3

© ACM, Inc. All Rights Reserved.