Peerless P2P

Kode Vicious - @kode_vicious

December 28, 2006
Volume 4, issue 10

A koder with attitude, KV answers your questions. Miss Manners he ain’t.

Peer-to-peer networking (better known as P2P) has two faces: the illegal file-sharing face and the legitimate group collaboration face. While the former, illegal use is still quite prevalent, it gets an undue amount of attention, often hiding the fact that there are developers out there trying to write secure, legitimate P2P applications that provide genuine value in the workplace. While KV probably has a lot to say about file sharing’s dark side, it is to the legal, less controversial incarnation of P2P that he turns his attention this month. Take it away, Vicious…

Dear KV,
I’ve just started on a project working with P2P software, and I have a few questions. Now, I know what you’re thinking, and no, this isn’t some copyright-violating piece of kowboy kode. It’s a respectable corporate application for people to use to exchange data such as documents, presentations, and work-related information.

My biggest issue with this project is security—for example, accidentally exposing our users’ data or leaving them open to viruses. There must be more things to worry about, but those are the top two.

So, I want to ask, “What would KV do?”

Unclear Peer

Dear UP,
What would KV do? KV would run, not walk, to the nearest bar and find a lawyer. You can always find lawyers in bars, or at least I do; they’re the only ones drinking faster than I am. The fact that you believe your users will use your software only for your designated purpose makes you either naive or stupid, and since I’m feeling kind today, I’ll assume naive.

So let’s assume your company has lawyers to protect it from the usual charges of providing a system whereby people can exchange material that perhaps certain other people, who also have lawyers, consider it wrong to exchange. What else is there to worry about? Plenty.

At the crux of all file-sharing systems—whether they are peer-to-peer, client/server, or what have you—is the type of publish/subscribe paradigm they follow. The publish/subscribe model defines how users share data.

The models follow a spectrum from low to high risk. A high-risk model is one in which the application attempts to share as much data as possible, such as sharing all data on your disk with everyone as the basic default setting. Laugh if you like, but you’ll cry when you find out that lots of companies have built just such systems, or systems that are close to being as permissive as that.

Here are some suggestions for building a low-risk peer-to-peer file-sharing system.

First of all, the default mode of all such software should be to deny access. Immediately after installing the software, no new files should be available to anyone. There are several cases in which software did not obey this simple rule, so when a nefarious person wanted to steal data, he or she would trick someone into downloading and installing the file-sharing software. This is often referred to as a “drive-by install.” The attacker would then have free access to the victim’s computer or at least to the My Documents or similar folder.
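
As a rough illustration of deny-by-default, here is a minimal Python sketch. The `SharePolicy` class and its method names are hypothetical, invented for this example; the only point is that a freshly created policy shares nothing, and access exists only after an explicit grant.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class SharePolicy:
    """Hypothetical per-installation policy: nothing is shared until granted."""

    # Maps a file path to the set of peer IDs allowed to fetch it.
    # It starts empty, so a fresh install exposes no files at all.
    grants: Dict[str, Set[str]] = field(default_factory=dict)

    def grant(self, path: str, peer_id: str) -> None:
        """Record an explicit user decision to let one peer fetch one file."""
        self.grants.setdefault(path, set()).add(peer_id)

    def is_allowed(self, path: str, peer_id: str) -> bool:
        """Deny unless a grant exists for exactly this path and this peer."""
        return peer_id in self.grants.get(path, set())
```

With this shape, a drive-by install gains an attacker nothing on its own: `SharePolicy().is_allowed(...)` is false for every path and every peer until the user grants something.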

Second, the person sharing the files—that is, the sharer—should have the most control over the data. The person connecting to the sharer’s computer should be able to see and copy only the files that the sharer wishes that person to see and copy. In a reasonably low-risk system, the sharing of data would have a timeout such that unless the requester got the data by a certain time (say, 24 hours), the data would no longer be available. Such timeouts can be implemented by having the sharer’s computer generate a one-time use token containing a timeout that the requester’s computer must present to get a particular file.
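
One way the one-time-use token with a timeout might look is sketched below in Python. Everything here is an assumption made for illustration: the `TokenIssuer` class, the `|`-separated token layout (which assumes file and requester IDs contain no `|`), and the use of an HMAC so the requester cannot alter the expiry. A real system would also need to authenticate the requester and protect the token in transit.

```python
import hashlib
import hmac
import secrets
import time


class TokenIssuer:
    """Hypothetical sharer-side issuer of expiring, single-use download tokens."""

    def __init__(self, ttl_seconds: int = 24 * 60 * 60):
        self._key = secrets.token_bytes(32)  # secret known only to the sharer
        self._ttl = ttl_seconds
        self._used = set()                   # nonces that were already redeemed

    def _sign(self, payload: str) -> str:
        return hmac.new(self._key, payload.encode(), hashlib.sha256).hexdigest()

    def issue(self, file_id: str, requester_id: str, now=None) -> str:
        """Create a token bound to one file, one requester, and an expiry time."""
        now = time.time() if now is None else now
        expires = int(now) + self._ttl
        nonce = secrets.token_hex(16)
        payload = f"{file_id}|{requester_id}|{expires}|{nonce}"
        return f"{payload}|{self._sign(payload)}"

    def redeem(self, token: str, file_id: str, requester_id: str, now=None) -> bool:
        """Accept a token once, before it expires, for the file and peer it names."""
        now = time.time() if now is None else now
        payload, _, signature = token.rpartition("|")
        # Check the signature first so the fields below can be trusted.
        if not hmac.compare_digest(signature.encode(), self._sign(payload).encode()):
            return False
        parts = payload.split("|")
        if len(parts) != 4:
            return False
        tok_file, tok_peer, expires, nonce = parts
        if tok_file != file_id or tok_peer != requester_id:
            return False
        if now > int(expires) or nonce in self._used:
            return False
        self._used.add(nonce)  # burn the token so it cannot be replayed
        return True
```

The sharer's machine keeps the key and the list of used nonces, so the sharer stays in control: a token that is late, replayed, tampered with, or presented for a different file is simply refused.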

Third, the system should be slow to open up access. Although you don’t want the user to have to say OK to everything, because eventually the user will just click OK without thinking, you do want a system that requires user intervention to give more access.

Fourth, files should not be stored in a known or easily guessable default location. Sharing a well-known folder such as My Documents has gotten plenty of people into trouble. The best way to store downloaded or shared files is to have the file-sharing application create and track randomly named folders beneath a well-known location in the file system. Choosing a reasonably sized random string of letters and digits as a directory name is a good practice. This makes it harder for virus and malware writers to know where to go to steal important information.
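
A minimal Python sketch of the randomly named folder idea follows. The function name `create_share_dir` and the 16-byte default are assumptions chosen for the example; the application would still have to record the returned path somewhere, since nobody can be expected to remember it.

```python
import secrets
from pathlib import Path


def create_share_dir(base: Path, nbytes: int = 16) -> Path:
    """Create a randomly named folder beneath a well-known base directory.

    Hypothetical helper: the name is 2 * nbytes hexadecimal characters drawn
    from the operating system's cryptographic random source.
    """
    name = secrets.token_hex(nbytes)
    share_dir = base / name
    # exist_ok=False raises FileExistsError on the (very unlikely) collision;
    # mode 0o700 asks for owner-only access where the platform honors it.
    share_dir.mkdir(parents=True, mode=0o700, exist_ok=False)
    return share_dir
```

Using `secrets` rather than `random` matters here, because the whole point is that malware should not be able to predict the name.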

Fifth, and last for this particular letter, the sharing should be one-to-one, not one-to-many. Many systems share data one-to-many, including most file-swapping applications, such that anyone who can find your machine can get at the data you are willing to share. Global sharing should be the last option a user has, not the first. The first option should be to a single person, the second to a group of people, and the last, global.
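
The ordering of sharing options, together with the earlier point about being slow to open up access, could be modeled roughly as follows. This is a Python sketch under stated assumptions: the `Scope` and `Share` names are hypothetical, and `user_confirmed` stands in for whatever confirmation step the real user interface provides.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Set


class Scope(IntEnum):
    """Sharing scopes, ordered from narrowest to widest."""
    PERSON = 1
    GROUP = 2
    GLOBAL = 3


@dataclass
class Share:
    """Hypothetical record of one shared file; starts one-to-one."""
    path: str
    scope: Scope = Scope.PERSON
    recipients: Set[str] = field(default_factory=set)

    def add_recipient(self, peer_id: str) -> None:
        # A one-to-one share may name only a single recipient.
        if self.scope is Scope.PERSON and self.recipients - {peer_id}:
            raise ValueError("a one-to-one share has exactly one recipient")
        self.recipients.add(peer_id)

    def widen(self, new_scope: Scope, user_confirmed: bool) -> None:
        """Move to a wider scope, but only after explicit user confirmation."""
        if new_scope <= self.scope:
            raise ValueError("widen() only moves to a wider scope")
        if not user_confirmed:
            raise PermissionError("widening a share needs user confirmation")
        self.scope = new_scope

    def can_read(self, peer_id: str) -> bool:
        if self.scope is Scope.GLOBAL:
            return True
        return peer_id in self.recipients
```

Because the default is `Scope.PERSON` and every step outward goes through `widen`, global sharing ends up as the last option a user reaches rather than the first.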

You may note that a lot of this advice is in direct conflict with some of the more famous file-sharing, peer-to-peer systems that have been created in the past few years. This is because I have been trying to show you a system that allows for data protection while data is being shared. If you want to create an application that is as open—and as dangerous—as Napster or its errant children were and are, then that’s a different story. From the sound of your letter, however, that is not what you want.

Other things you will have to worry about include the security of the application itself. A program that is designed to take files from other computers is a perfect vector for attacks by virus writers. It would be unwise (well, actually, it would be incredibly stupid) to write such a program so that it executes or displays files immediately after transfer without asking the user first. I have to admit that answering yes to the question, “Would you like to run this .exe file?” on Windows is about the same as answering yes to, “Would you like me to pull the trigger?” in a game of Russian roulette.
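
A small Python sketch of the "ask first" rule is shown below. The names `is_risky` and `may_open`, the `ask_user` callback, and the suffix list are all assumptions made for illustration; the list in particular is nowhere near complete, and a real application should not rely on file extensions alone.

```python
from pathlib import Path
from typing import Callable

# Illustrative only: a few Windows file types that can run code when opened.
RISKY_SUFFIXES = {".exe", ".bat", ".cmd", ".com", ".scr", ".msi", ".js", ".vbs"}


def is_risky(filename: str) -> bool:
    """Report whether the final extension is on the illustrative risky list."""
    return Path(filename).suffix.lower() in RISKY_SUFFIXES


def may_open(filename: str, ask_user: Callable[[str], bool]) -> bool:
    """Never open a received file automatically; return the user's decision.

    ask_user is a hypothetical callback that shows the prompt and returns
    True only if the user explicitly agrees.
    """
    warning = " This file type can run code." if is_risky(filename) else ""
    return ask_user(f"Open '{filename}'?{warning}")
```

The transfer code would call `may_open` only when the user tries to open a file, never as a side effect of the download finishing.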

Another open research area, er, I mean, big headache, which I’ll not get into here, is the authentication system itself. Outside of all the other advice I just gave, this problem is itself quite thorny. How do I know that you are you? How do you know that I am me? Perhaps I am the Walrus, except, wait, the Walrus was Paul.

Well, I believe you have enough to think about now. I suggest you sleep on it and wake up screaming, just like...

KODE VICIOUS, known to mere mortals as George V. Neville-Neil, works on networking and operating system code for fun and profit. He also teaches courses on various subjects related to programming. His areas of interest are code spelunking, operating systems, and rewriting your bad code (OK, maybe not that last one). He earned his bachelor’s degree in computer science at Northeastern University in Boston, Massachusetts, and is a member of ACM, the Usenix Association, and IEEE. He is an avid bicyclist and traveler who has made San Francisco his home since 1990.

Originally published in Queue vol. 4, no. 10