Interviews


A Conversation with Van Jacobson

Making the case for content-centric networking


To those with even a passing interest in the history of the Internet and TCP/IP networking, Van Jacobson will be a familiar name. During his 25 years at Lawrence Berkeley National Laboratory and subsequent leadership positions at Cisco Systems and Packet Design, Jacobson has helped invent and develop some of the key technologies on which the Internet is based. He is best known for his pioneering contributions to the TCP/IP networking stack, his seminal work on alleviating congestion on the Internet, his leadership in developing the MBone (multicast backbone), and his development of several widely used IP networking tools, such as traceroute, pathchar, and tcpdump.

Now a Research Fellow at PARC (Palo Alto Research Center), Jacobson continues to do groundbreaking work. His latest work on CCN (content-centric networking) took the networking community by storm when his seminal 2006 talk, "A New Way to Look at Networking," was released on the Web as a Google Tech Talk.

For our interview this month, we enlisted another networking heavyweight, Craig Partridge, chief scientist for networking research at BBN Technologies, to speak with Jacobson about CCN and what it means for the future of the Internet. Partridge is an ACM Fellow, a former chair of ACM SIGCOMM, and a former editor-in-chief of ACM SIGCOMM Computer Communication Review. Jacobson and Partridge first met in 1987 when they were both working on TCP-related problems and have been periodically bouncing research ideas off each other ever since. We are immensely grateful for their participation and hope that their discussion helps to enlighten software engineers about this exciting new direction for networking technology.

CRAIG PARTRIDGE In a paragraph or two, can you describe the gist of what content-centric networking is to somebody who knows a little bit about how the Internet works? What does CCN do to that model?

VAN JACOBSON The easiest place to start is the history. The networking that we use today grew out of work in the 1960s and 1970s. At that time, the problem that people wanted to solve was what we call a resource-sharing problem today. You had one computer that had a tape drive and another that didn't, or one computer that had a printer and another that didn't.

There were not all that many computers, and they were big and expensive. Everything that was attached to them was big and expensive, and it was a really interesting problem to be able to share this big, expensive gear among computers. The model that drove the networking development was, "Can we extend the I/O bus so that we can share these resources between machines?"

Move forward 50 years, and now my phone has four computers in it; the light switch has one. Every box that you buy today has four cores. Computers are just these commodity items, and we really don't care about sharing their resources. What we care about these days is moving content around, and because of the way the disk price has been going down, enormous amounts of content are being generated. The Web gave us the right model for moving that content around: you say the name of what you want, and you get the Web page that's associated with that. You can build up content by referring to other content, and it's transparently presented to you as one nice seamless thing. But this content abstraction that the users see, and what they perceive as the value of the Net, is built on this old resource-sharing, host-to-host communication model.

It's frustrating because we've got these almost universal standards for content driven by the Web. For nicely rendered rich text we've got HTML; we've got JPEG and PNG for images, and MP3s and MP4s for audio and video, and they will all run on just about anything because they have to in order to make the Web experience work.

But you go down just a bit below that and ask, "OK, how do I move this stuff around?" and you've got this huge diversity of protocols that mostly won't interoperate. If you've ever tried to get the pictures off a random camera phone onto your media server and display them on your TV, it's an incredible amount of pain, and mostly people give up because there's not a simple path. But those pictures are nice, simple JPEGs. It would be great if you could say to your TV, "Show me the pictures I took on my phone today"—a perfectly usable name for the content—and it would know how to render that content, but it can't actually get to it because the communication protocols don't work in terms of content.

CP We're trying to get past the host-to-host model and make content our major abstraction, and based on your explanation, the way it works is that I create some content I would like to share with other people, and I name it. Do I name it much as I would name a Web page, or do I name it some other way? What's the key binding of my content to the name, and how does that fit together?

VJ There are two parts to that answer. One is that at the mechanism level, the network had better not care. You should be able to use whatever form you want for names. At the deployment level, however, naming has a huge effect on usability and scalability, and that imposes constraints on the names of data you want to make globally available.

Many good lessons came from building the Internet and scaling it up. One was to use hierarchical, aggregatable identifiers. The thing that saved our butt many times as IP grew from its little initial six-node deployment to something that covers the planet was that we managed to deal with the state in the core of the network because the addresses in IP aggregate. We typically view addresses as having a net part, a subnet part, and a host-on-subnet part, and when you're away from that subnet, all the host-level detail goes away and you use the address just up to the subnet part. When you're out beyond the boundaries of that particular network, all the subnets go away and you deal with just the net-level prefix.

That ability to aggregate detail has meant that even though we grew to on the order of 4 billion hosts, we've got only about 300,000 routes in the transit core of the network. The lesson seems to be: if you want to be able to scale the state required for global any-to-any communication, you want to use a hierarchical identifier space so that you can aggregate out detail as you get far away.
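
As a rough sketch of what aggregation buys, a forwarding table that holds only a few coarse, hierarchical prefixes can still resolve any name that falls under them; the prefixes and next hops below are hypothetical, not real CCN state:

```python
# Longest-prefix match over hierarchical, "/"-delimited names.
# A handful of coarse entries summarizes an arbitrary number of finer names.

def longest_prefix_match(fib, name):
    """Return the next hop for the longest registered prefix of `name`."""
    parts = name.strip("/").split("/")
    for length in range(len(parts), 0, -1):        # try the longest prefix first
        prefix = "/" + "/".join(parts[:length])
        if prefix in fib:
            return fib[prefix]
    return None                                    # no route known here

fib = {
    "/parc.com":            "face-2",   # everything under PARC goes one way
    "/bbn.com":             "face-5",   # everything under BBN goes another
    "/bbn.com/dtn/reports": "face-7",   # one finer route where it matters
}

print(longest_prefix_match(fib, "/bbn.com/dtn/reports/2009/status"))  # face-7
print(longest_prefix_match(fib, "/parc.com/van/ccn/talk"))            # face-2
```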

CP This notion of "far away" implies that the data is somewhere. I thought part of the idea here was to make content king.

VJ You're faced with the tension between being able to locate and rendezvous scalably with information and allowing information to come from anywhere but not really having the location. Our one-liner for CCN was "content has a name, not a location." We want to allow it to be anywhere.

If you view a world that's evolving from today's world, you manage that evolution by taking the current URL and saying, "We're no longer going to interpret that as the location of the content." We're going to say, "That's the name of where this content was born," and we'll just use that as the name. The prefix—the part that is the server name—is a nice, globally unique prefix. To make it globally unique, you have to have a coordinating structure, so there are things such as IANA (Internet Assigned Numbers Authority) and the name registries that exist to hand out globally unique names. And it's a name that, because of its current use, maps nicely onto the Internet, so it gives you good communication at a distance almost for free.

For a lot of content, we're viewing at least the initial model as follows: if you want something, you ask for the content locally, hopefully via broadcast because broadcast protocols are very efficient at locating and retrieving information. That broadcast is going to work on a local wire or a wireless LAN. Across a campus Ethernet, broadcast is extremely cheap. It's unlikely to propagate beyond the boundaries of a campus because then broadcast starts to get more expensive.

If I wanted to get a report on your current DTN (disruption-tolerant networking) work, I would broadcast locally. Someone in our group probably has it because they're collaborating with you. If they don't, our local content routers are going to say, "Well, the name starts with BBN.com, so I'm going to package this up and send it off in the direction of BBN.com." It can either bubble one hop up to the content router on our ISP and have our interest in that report go toward BBN, or if there is no content router upstream, then you turn BBN into an IP address and just talk to it.
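
One plausible reading of that lookup flow, sketched in Python; every object and method here (the neighbor and router stand-ins, their calls) is hypothetical, not an actual CCN API:

```python
# Sketch of the interest-resolution flow described above: check locally,
# ask the neighborhood, then push the interest toward the name's origin
# prefix. The objects passed in are hypothetical stand-ins.

def resolve_interest(name, local_store, neighbors, content_routers):
    # 1. Cheapest case: we already hold the content.
    if name in local_store:
        return local_store[name]

    # 2. Broadcast on the local wire; anyone nearby may answer.
    for node in neighbors:
        data = node.ask(name)
        if data is not None:
            return data

    # 3. No local copy: send the interest toward the name's origin
    #    ("the name starts with BBN.com, so send it that way").
    origin = "/" + name.strip("/").split("/")[0]
    for router in content_routers:
        if router.handles(origin):
            return router.forward(name)

    # 4. Last resort: the caller resolves the origin to an IP address
    #    and just talks to it directly.
    return None
```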

CP If I think of the name at the top of the hierarchy as being a hint as to where your rendezvous server is, but that the content doesn't have to be at BBN.com, then it's got to get to somebody who says, "I know where it is and I will retrieve it or redirect it for you"—whatever the right answer is.

VJ Right, and that's the point at which you can disaggregate the state. As you get far away topologically—and by topologically I mean in any way that we might be able to communicate—what I know about are the things that are close to me that I can find via broadcast, or broadcast-like protocols, and the things that I can't find that way, for which I've got a top-level summarization that tells me who has more detailed state than I do.

CP So that's how we find content. What do you have to do to create content? That's a multilevel question, so let me dig down two levels. One is, mechanically, you've got some content, you want it bound to this name, and you want nobody else to be able to change what's attached to that name. But then the second thing is we've already got content servers and caches in various places. Do I as a creator have to worry about how big those caches are? If it's an HDTV movie, do I have to break it up into five-minute chunks, or do I create one big movie and—whomp!—let somebody downstream worry about caching it?

VJ Let me talk about the second part first. Right now, if you're not Google or YouTube—somebody who's big enough—there's a curse in creating popular content. If you make something that a lot of people look at and say, "Oooh, this is really cool!" you've just blown your Web site off the air because the only way that content can be distributed is from its original source.

That's again an artifact of this point-to-point communication model. If you change to the content model that we're talking about, then operationally what happens is content diffuses from where it was created to where the consumers are. In that diffusion, given that there's enough memory in the intermediaries, you guarantee pretty much maximum efficiency of the distribution, which is that any piece of the content will take, at most, one trip over any given link.

These are the same benefits we saw when we were working on multicast protocols many years ago. If you move to a content-centric model, then you stop disenfranchising creators, because they no longer pay a penalty for popularity, and you stop disenfranchising all the intermediaries, too.

Look at the New York Times. Several million people read the online edition. There are a couple of ISPs downstream, and each one sees several million copies of exactly the same content because it's being served from nytimes.com. It's the same content; it's just that the ISPs can't tell. What they're seeing are lots of different connections, lots of different conversations, and the fact that all of those conversations have exactly the same bits in them is hidden, so they've got to haul all of those same bits multiple times.

If they could deal at the content level instead, then they could pull down one copy and redistribute it. You save bandwidth across the internal infrastructure because it's diffusing out toward consumers.

CP When we were doing multicast stuff, one of the nasty problems we started running into was MTU (maximum transmission unit) discovery. If you tried to disseminate content and you picked the wrong MTU, then suddenly the content would start vanishing. Do I have a similar problem as a content creator?

VJ That actually takes us back to the first part of your question, which is tied up with the details of naming. When you start to do content networking, one of the first problems that you hit is that we usually don't name content; we name the containers that the content is in. We name the server that has the content. We name the file on the server that holds those bits, but we don't name the bits themselves.

Many years ago in his Turing Award lecture, Tony Hoare said that this is a serious mistake, and that you're opening yourself up to a world of grief if you don't have what he called referential transparency. If you can refer to only the container and not the thing that's contained, then contents can change on you. You have security issues; you have decidability issues; you have robustness issues. You don't really know what the bits are, and you can't reason about what the bits are, because all you can name is a container.

One of the first steps in going to content networking, then, is to move from container names to actual content names. We went with the structured names in part because they aggregate so that you can easily and scalably rendezvous with information. Another reason is that we generally don't want to deal with single pieces of information; we want to deal with collections of information. That's what the hierarchy is for in file systems: to let us collect information in meaningful ways.

The hierarchy that you've got in the names lets you represent collections. You have to do that—a fundamental property of information is that it doesn't exist in isolation; it exists in a context, and that context can be represented by an ontology, a hierarchical namespace. If you can identify collections in names, then splitting a big thing into pieces is just making a collection from a big, intractable chunk of content. Since a network is a shared medium, the size of the biggest piece you allow on the network is the granularity of your sharing. If you allow really big pieces, then you're really cutting down your opportunities for sharing.

It also cuts down the opportunities for control. Right now we can control at the granularity of a connection, which could be an arbitrarily large number of bits, so you open this fire hose and sit back and get hit in the face by multiple megabytes worth of God knows what's being streamed at you.

We wanted something with a much finer grain of control, roughly packet-level, on the order of a kilobyte—something that you can send across almost any net as an idempotent unit. If you've got something bigger than that, then you say, "OK, it's a collection of kilobyte things." No user has to deal with things at that granularity, however. Users give the collection name, and their libraries and their interface to the net supply the segmentation and versioning information. At the network level it's just names, and those names are always for small pieces.
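
A minimal sketch of that segmentation, assuming an invented "/v<version>/s<segment>" naming convention rather than the actual CCN wire format:

```python
# Split one large object into kilobyte-sized, individually named pieces.
# The version/segment suffix convention here is illustrative only.

CHUNK = 1024   # roughly a kilobyte, per the discussion above

def segment(name, data, version=0, size=CHUNK):
    """Yield (segment_name, bytes) pairs for one versioned object."""
    for i in range(0, len(data), size):
        yield f"{name}/v{version}/s{i // size}", data[i:i + size]

movie = b"\x00" * (10 * 1024 * 1024)                 # stand-in for a 10 MB file
pieces = dict(segment("/parc.com/van/movie", movie))
print(len(pieces))                                   # 10,240 named, kilobyte-sized chunks
```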

CP So, what the user sees is big but everything is magically created, so the movie would actually be a series of small chunks.

VJ Right, and they're all named. You can get them in any order, so you can get very efficient delivery paradigms. Unlike in the current networking where the network works in a completely different world from the users, everybody is working in the same space. The names you see may suppress some detail, but they're the same names that the network works on. You can communicate those names. You can embed them. They have completely well-defined meanings, always.

CP Can I create aliases? Can I say, "This is my thing with a little bit of my commentary; go look at those scenes in the movie"?

VJ The starting point of the content revolution was the Web, obviously, and the foundation of the Web was links. Inside of content, you could embed links, which are just names of other content. You would have to be foolish to do anything to damage links; our model is that links are first-class objects.

They're even known at a certain level in the network. When you ask for a name, and something that receives that question finds a piece of content that's identified as link content, the network can look inside and say, "Oh, this is a name for another name, and I have that other name as well. Why don't I just return them both?" You can get some efficiencies because the network knows what problem you're trying to solve, and suddenly it can help you, whereas today it can't help you because it lives in a different world from you.

CP The basic idea is that content is broken into pieces at its creation and named; if I wish to annotate it, I'm going to be encouraged to annotate the original. I could obviously copy it and rename it, in which case, from the network's perspective, it's now two different objects.

The idea is that because the network knows about the namespace and understands the problems you're trying to solve, the network will alias it if it's the same content. You can annotate it, you can embed links, but it's one copy so there isn't the danger that it will get replicated into 14 distinctly named copies in the world.

VJ We hope there's a lot of incentive not to do that. Linking actually adds information to the content. You can see it in blogging. Copying somebody else's content is just plagiarism. It hides the author's existence and really confines the reader's experience because all you can see is the excerpt. If instead you link to somebody else's content, then you get to say what you want to say but the reader gets to see the original context, its connections to the world, plus your additions to it.

You're providing much richer information to the reader via the act of linking than via the act of copying, and I think that's "first principles" true. It's not just a social media thing; it's the nature of the information.

There are arguments today in the networking community about "the one true name." There's a camp that says, "The one true name is some immutable proxy for the bits." You take the bits, compute a SHA-256 hash, and that 256-bit digest is the name of the content, and that's how you should always refer to the content.

There's another community that says, "The bits don't matter, only the names matter." What they do is link names to other names, and following that linkage—maybe as a side effect—you get at some bits.

We're square in the middle between those two camps. We say that what matters is the binding between the names that users give things—which reflect their view of the world and how they organize the information about the world—and the things that they name. In my mind, chunks of content are like words. They're building blocks, but they are not complete. It's the arrangement of words, their linkage, that turns them into sentences and paragraphs—things that have meaning. If you want to have a network that captures this emerging meaning, you have to have both the bits and their name; it's the linkage between them that's really the key.
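
Mechanically, one way to picture that binding is to hash the bits and authenticate the (name, digest) pair. In the sketch below an HMAC stands in for a real publisher's public-key signature purely to keep the example self-contained; the names and key are invented:

```python
import hashlib
import hmac

# Sketch of a name-to-content binding: hash the bits, then authenticate the
# (name, digest) pair. A real system would use a public-key signature; HMAC
# with a shared key is used here only to keep the example self-contained.

PUBLISHER_KEY = b"demo-key"          # hypothetical publisher credential

def bind(name, content):
    digest = hashlib.sha256(content).hexdigest()
    tag = hmac.new(PUBLISHER_KEY, f"{name}:{digest}".encode(), "sha256").hexdigest()
    return {"name": name, "digest": digest, "signature": tag}

def verify(binding, content):
    digest = hashlib.sha256(content).hexdigest()
    expected = hmac.new(PUBLISHER_KEY,
                        f"{binding['name']}:{digest}".encode(), "sha256").hexdigest()
    return digest == binding["digest"] and hmac.compare_digest(expected, binding["signature"])

b = bind("/bbn.com/dtn/report", b"some report bits")
print(verify(b, b"some report bits"))   # True: these bits really are what that name refers to
print(verify(b, b"tampered bits"))      # False: the binding no longer holds
```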

CP I'd like to talk about customized content for the user—that is, the ads. You read the New York Times and I read the New York Times, and we probably get very different ads (unless we're culturally compatible enough). How does that work in a content-centric network?

VJ Pretty much the same way. There are two models for producing that content. One is the model that Amazon uses, where you produce a user-specific base page. It's the content I want plus lots of stuff that's pulled in, based on what they know about me—which is entirely too much.

That base page pulls in a bunch of stuff that thousands of different people are also seeing—not combined in the way that I'm seeing it, but the elements that are being combined are very heavily shared. Amazon is producing something that's tailored, but it is producing it out of off-the-shelf pieces pulled out of the closest cache. The only unshared piece is the base page that was customized for my query at that particular time and can't be cached anywhere because it's generated on the fly by Amazon. That custom content is just a little bit of text and links. The bulk of the bits that I actually see are highly replicated content that is getting to me efficiently.

The other model is to send content that answers a question, or a set of content associated with a URL. Embedded in it are links with a level of indirection that tell the local Akamai servers they can insert the appropriate banner ad for the community they are feeding.

This is the way broadcast TV allows local affiliates to insert their own commercials. The network inserts some stuff that the affiliates can't tamper with, but there's a set of spots that the local redistribution center is allowed to replace with more relevant spots for its target audience. The parties agree on the size of the time slots, just as we agree on the size of a banner ad, so you know what sort of a chunk you can put in without disturbing the rest of the content. This is really easy in a name-based system because the important thing that's flowing through the network is interest in some name or some piece of data.

CP It's the thing that you as a recipient were told to go ask for, and there can be multiple instantiations, but the closest one geographically or topologically to you is the one you get.

VJ Right. An effect of that is you can target your audience much better because the thing that's hearing about these interests is always the closest thing to each recipient. If you don't satisfy it at the closest point, then it just bubbles up to the next point. It can bubble all the way back to the original content source, but you get lots of opportunities for the edge to fill in the appropriate thing.

CP You're creating a marketplace at some level, though, aren't you?

VJ That's the hope, yes.

CP What are the economic incentives here? One of the questions that Dave Clark (ACM Fellow) was concerned about when I was talking with him about this was if the storage is all in the routers, then we've suddenly handed a fair bit of market power—additional market power—to the ISPs, who are to some degree there by virtue of the fact that they own the wires. Is there a way to make this a marketplace in which others can compete for the caching opportunities, such as the one we just described in terms of injecting ads?

VJ There are a couple of different pieces to that answer. One is that in terms of opportunities for carriers and communication intermediaries, the decrease in storage cost over the past 30 years has just been phenomenal. The cost of disks has been falling 3 percent per week every week for the past 25 years, and, if anything, that is accelerating with new technologies, such as perpendicular recording and a bunch of nanotech stuff that's coming along. Nanotech excels at making large, regular structures, and, for data processing, the largest regular structure is a memory, so nanotech is going to give us a lot of really cheap nonvolatile storage. Right now, carriers can't leverage any of that cost reduction. A carrier is just something that moves bits through a wire.

Also, we use buffer memory in such a way that it's valuable only if it's empty, because otherwise it doesn't serve as a buffer. What we do is try to forget what we learned as soon as we possibly can; we have to do that to make our buffer memory empty.

For the Olympics (not the most recent, but the previous one), we got some data from the ISP downstream of NBC. That router was completely congested; it was falling over, dropping packets like crazy. If you looked inside its buffers, it had 4,000 copies of exactly the same data, but you couldn't tell that it was the same because it was 4,000 different connections. It was a horrible waste of memory, because the conversations were all different but what they were about was the same. You should be able to use that memory so you don't forget until you absolutely have to—that is, go to an LRU (least recently used) rather than MRU (most recently used) replacement policy. It's the same memory; you just change the way you replace things in it, and then you're able to use the content.

It wouldn't be necessary for carriers to put disks in routers. They could just start using the existing buffer memory in a more efficient way, and any time the data was requested more than once, they would see a bandwidth reduction. If they did put disks in so that you don't start to thrash that memory, then they can trade the fairly expensive bandwidth for this really cheap—and getting cheaper—disk storage by pulling this stuff off the disk.

So, the first part is there are lots of opportunities for carriers to have storage somehow be part of the communication model, and if you go to this "exchange-named content" view of the world, bits in a wire, bits on a disk, and bits in a memory are indistinguishable; they're all just storage. You get them all the same way: you give a name, you get back bits, and carriers can leverage that.

The second part is, because in the current conversational model the network is blind to the bits, we take this richly connected fabric of the world with multiple links going everywhere and put a spanning tree on it, so between any source and destination, there's exactly one path. We have to do that because we have to prevent loops, and we can't prevent loops at the content level because we don't see the content. But this is a global ordering on the network. It says no matter how rich your connectivity is, there's only one way to get to a particular source, and if some party in that one way decides to play games with you—if they decide not to carry your bits—then you're hosed. It's now a black hole: you can't get there; they can't get to you.

If you go to a content model, you completely stop caring about loops because when a piece of content shows up, you say, "Do I already have that? OK, duplicate, toss it." That means you can take this nice, rich network graph and use the whole thing, from anywhere.
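
A small sketch of that content store, assuming a simple LRU structure: forwarded chunks are kept rather than forgotten, repeated requests are answered from memory, and a duplicate arriving over a loop is recognized and absorbed. The names and capacity are illustrative:

```python
from collections import OrderedDict

# LRU content store: keep what has been forwarded, evict the least recently
# used chunk only when space runs out, and treat duplicates as no-ops.

class ContentStore:
    def __init__(self, capacity):
        self.capacity = capacity
        self.chunks = OrderedDict()            # name -> bits, oldest first

    def get(self, name):
        if name not in self.chunks:
            return None                        # miss: the interest goes upstream
        self.chunks.move_to_end(name)          # refresh recency on a hit
        return self.chunks[name]

    def put(self, name, data):
        if name in self.chunks:                # duplicate (e.g., arrived via a loop):
            self.chunks.move_to_end(name)      # we already have it, so just absorb it
            return
        self.chunks[name] = data
        if len(self.chunks) > self.capacity:
            self.chunks.popitem(last=False)    # evict the least recently used chunk

cs = ContentStore(capacity=2)
cs.put("/nytimes.com/front/s0", b"...")
cs.put("/nytimes.com/front/s1", b"...")
cs.get("/nytimes.com/front/s0")                # second reader: no upstream traffic needed
cs.put("/nytimes.com/front/s2", b"...")        # evicts s1, the least recently used chunk
```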

CP The caching gives you a suppression mechanism that lets you actually deal with the fact that, if there's a loop, it's going to go away.

VJ Right.

CP Because you're using the LRU instead of the MRU, it creates an interesting model of how you would attack it. An attack would involve a really complex hammering of a cache whose size you probably don't know.

VJ And you don't know its replacement policy. Typically when you design these things, you put in enough randomness so that adjacent nodes keep different things. Even if you can hammer one guy, the guy upstream, or immediately downstream, has a different policy and has still got it, so it's a local repair action. You make a robust fabric where bad actors can have very little effect because it's possible from the end nodes to route around them.

VJ Our first application was voice over CCN: we did IP telephony, but over CCN rather than IP. We basically took the linphone client, whacked out the UDP stack, and replaced it with our CCN stack. The little laptops that we first ran it on had Ethernet, 802.11, and Bluetooth interfaces. If you're doing voice over IP, the call is bound to a particular interface address, so it starts up and you bind it to the Ethernet. If you unplug the Ethernet, you've hung up the call.

In CCN, the call is bound to the user who's making the call. It's bound to a name, and we made our names include the user identity, so when you're trying to make a call, you broadcast out to all your interfaces, "Does the identity of the party that I'm trying to call exist? Is there a CCN phone that handles this identity?"

You spread your traffic over all the ways that you have to communicate. In the initial implementation the callee replied to those three. You take the first reply and say, "OK, I can talk to him and, since this reply got to me first, the interface it arrived on is the fastest way to talk to him." With all three interfaces, it would start on the Ethernet, adaptively figuring out what worked best.

If you unplugged the Ethernet, it would detect the link loss and go back to broadcasting, "OK, I want the next packet of the call," to the remaining two interfaces, then unicast to the interface that answered first. If you turned off the access points so it couldn't get Wi-Fi connectivity, it would move over to the Bluetooth.

If you plugged the Ethernet back in, it would go back to the Ethernet—it would just adapt to whatever it had. Bluetooth has no IP address, so it was just using the link-level broadcast for the rendezvous there because at the CCN level, you don't care; it's whatever works. That applies to your getting any kind of content. Your phone has got something like nine radios in it.
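
A hedged sketch of that strategy behavior: express the interest on every available interface, keep using whichever one answers, and go back to probing when it fails. The interface API is a hypothetical stand-in, and a real stack would probe the interfaces in parallel rather than in a loop:

```python
# Sketch of the adaptive interface selection described above.
# Each interface object is a hypothetical stand-in exposing request(name),
# which returns the data, or None if that path is not currently working.

class Strategy:
    def __init__(self, interfaces):
        self.interfaces = interfaces           # e.g., [ethernet, wifi, bluetooth]
        self.preferred = None

    def fetch(self, name):
        # Fast path: keep using the interface that has been working.
        if self.preferred is not None:
            data = self.preferred.request(name)
            if data is not None:
                return data
            self.preferred = None              # link died (Ethernet unplugged, etc.)

        # Probe: ask on every interface and adopt the first one that answers.
        for iface in self.interfaces:
            data = iface.request(name)
            if data is not None:
                self.preferred = iface
                return data
        return None                            # nothing reachable yet; retry later
```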

CP Yes, it's terrible. I give a flaming talk about the state of wireless in which I have four chipsets, all of which are operating in the same band and each of which has its own antenna—and they're all in my phone.

VJ They all have different ways of communicating, different cost profiles, and different distance profiles. In CCN the waist of our layering is these little chunks of content, so right above that waist is the thing that secures the content. Sitting right below that waist is what we call a strategy layer. The world today has lots of different ways of communicating, from dumping stuff into the USB memory in your phone and walking across the world—that's a perfectly viable way of communicating—to using all of the different wired and wireless connections we've got. At runtime you should be actively figuring out the best way to communicate.

Going forward, it looks like there's going to continue to be a multiplicity of ways to communicate because there's no one right answer for communication. You really want a networking infrastructure that deals with what it's got and is smart about it and has that pretty much built into the stack.

CP Let's talk a little more about that narrow waist—that is, the point of unification between the range of choices above it and the range of choices below it. I want to talk a little more about the choices below it because it sounded to me that you were saying something that I've been hearing from an entirely different perspective from folks like Jon Turner, who were looking at building connection devices in a world in which you have a large number of available processors and a lot of processing power, and the ability to virtualize middle boxes and allow people to run completely different protocol suites.

It's not just TCP/IP, but whatever makes sense for their particular applications and problems concurrently in the same infrastructure. It sounds to me like you're saying at some level we're headed that way already just in the multiplicity of technologies, and this is a technology completely consistent with that perspective.

VJ I think we're actually going in different directions. It's going toward a good end, but to my mind Jon is trying to pull the waist down. The waist in the protocol stack is something that [Internet pioneer] Vinton Cerf talked about many years ago, and I think Vint was the first to point out that in the ISO OSI stack, which is just a cylinder with layers that more or less look the same, most of the layers have a bilateral relationship with their peers—for example, in the link layer both ends of the link have to be using the same way of framing packets and making bits.

There's some agreement there, but it's only between the two ends of a link. The applications at both ends of the communication have to agree on the transport layer that they're using and the session layer and the things going above them, but it's an agreement that's happening between two applications on two hosts in two parts of the world—purely bilateral.

The network layer is the only one that touches everybody—it's the only multilateral agreement. Vint said you can have lots of innovation above and below it, but it has to be the same everywhere. You want to make it as simple as you possibly can so that it can be universal and not constrain what's above and below it. He expressed all this visually by drawing the protocol stack not as a cylinder but as an hourglass with the network layer as its narrow waist.

For those of us who bought heavily into the Internet religion, exactly what and where the waist is, is a really crucial thing; it's the linchpin of the architecture.

I see the work that Jon is doing as pulling the waist down toward Layer 2 and saying we want to be able to virtualize and specialize it, making it much more universal than Layer 2 is today.

I want to push the waist way up and hope to generalize the context of what a network is. In Jon's model—and I hate putting words in somebody else's mouth—the networks have wires. For content networking, a guy on a bicycle with a phone in his pocket is a networking element. He's doing a great job of moving bits.

In the late 1980s there was a post being forwarded around about never underestimating the bandwidth of a 747 filled with tapes, and at the time that was just an unobtainable bandwidth for our communication technology. I tried to redo the calculation recently for a 747 filled with Blu-ray disks and got 100-terabits-per-second transcontinental bandwidth, which is still better than you can do from any carrier infrastructure.
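
A back-of-the-envelope version of that calculation, with every input an assumption (freighter payload, disc weight and capacity, flight time) chosen only to show the order of magnitude:

```python
# Rough check of the "747 full of Blu-ray discs" figure. All inputs are assumptions.

cargo_kg       = 100_000        # very rough 747 freighter payload
disc_grams     = 16             # one bare Blu-ray disc
disc_bytes     = 50e9           # dual-layer disc, ~50 GB
flight_seconds = 5 * 3600       # transcontinental flight, ~5 hours

discs     = cargo_kg * 1000 / disc_grams
bits      = discs * disc_bytes * 8
bandwidth = bits / flight_seconds

print(f"{bandwidth / 1e12:.0f} Tbit/s")   # prints roughly 139 Tbit/s
```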

We should have a very broad definition of things that move bits, and all of them should be able to come into our networking tent. Particularly because of the decreasing cost of storage, bits today get moved a lot of ways that aren't on wires, and it's really unfortunate that we see the world split into these two camps—operating system and storage people deal with bits that are on memories or on disks or on tapes; and communications people deal with bits that are on wires—and we don't talk except when we're forced to.

These are two utterly disjoint communities with different views of the world, but the bits are the same. The bits don't know whether they're on a wire or a disk. And we shouldn't care. I would like to move us to a world where that distinction is simply not relevant anymore. The bits are bits. You give a name, you get the bits.



© 2009 ACM 1542-7730/09/0100 $5.00


Originally published in Queue vol. 7, no. 1




