
OSS Supply-chain Security

What Will It Take?

A discussion with Maya Kaczorowski, Falcon Momot, George Neville-Neil, and Chris McCubbin

While enterprise security teams naturally tend to turn their focus primarily to direct attacks on their own infrastructure, cybercrime exploits now are increasingly aimed at easier targets upstream—within the open-source software supply chains that enterprises and other organizations have come to rely upon.

This has led to a perfect storm, since virtually all significant codebase repositories at this point include at least some amount of open-source software, given that's where a wealth of innovation is available to be tapped. But opportunities also abound there for the authors of malware, since it's a setup they can leverage to spread the seeds of their exploits far and wide.

The broader cybercrime world, meanwhile, has noted that open-source supply chains are generally easy to penetrate, given an abundance of entry points and an inconsistent dedication to security.

What's being done at this point to address the apparent risks? What are the issues and questions developers and security experts ought to be considering?

To delve into this, we asked George Neville-Neil, who writes acmqueue's Kode Vicious column, to talk it over with a few people known for their work on the front lines: Maya Kaczorowski, who was the senior director of software supply-chain security at GitHub prior to turning her focus more recently to secure networking at a Canadian startup called Tailscale; Falcon Momot, who is responsible for managing quality standards and running a large penetration testing team at Leviathan Security; and Chris McCubbin, an applied scientist at Amazon Web Services who focuses on detecting external security risks and performing triage as necessary.

 

GEORGE NEVILLE-NEIL In thinking about this topic, I keep coming back to the degree to which open source is based on trust and just how hard it is to establish trust within any distributed community. Some projects succeed in achieving it because they're of manageable size and everybody knows everyone else. But when it comes to FreeBSD, my open-source home, we have 300 source developers. Even though I've been on the project for more than 20 years, I haven't met all those people in person. I can't even attest they're all humans. Some might be robots working for three-letter government agencies, for all I know. This raises an interesting question about what the basis for trust is in an open-source community and how that, in turn, influences the way a project looks at supply-chain security.

MAYA KACZOROWSKI Actually, I think one of my frustrations with this is the tendency to equate human trust with project trust. Why is it we think those are so related? Aren't there ways for us to trust projects even if we don't necessarily have trust in the whole team? Isn't it enough for people to have confidence in the controls that have been put in place for a project?

GNN What sorts of controls do you have in mind? Are you thinking in terms of signing commits, which I believe GitHub has support for now?

MK I believe GitHub has had commit-signing support for quite some time, but you're not necessarily verifying the signed commits when you pull a package.

FALCON MOMOT Leviathan has a lot of clients with internal repositories, but not many of them use code signing. Then there's the matter of trust as to what—X.509 identity by a CA [Certificate Authority] would not be enough, and what we have are self-signed OpenPGP keys.

A related challenge is that many developers take on dependencies by way of the toolchain, since they'll take in code associated with tools they've imported and then commit that to their own repository, thereby losing the thread on where that code originally came from. This poses a significant challenge because code of unknown origin will only build up over time. Then you won't have the ability to use any kind of tooling to track the dependencies in your project. When a committer signs, that covers all the code—even though much of it isn't their own work.

GNN All right, but I wonder how many projects will take on that problem.

FM It would help if there were some standard way to help people understand that the commit they're about to make will be the first signed one in this particular repository, and that all the commits that follow will also need to be signed. But, of course, the consequences for not doing so remain unclear. What happens if there's an unsigned commit? Nobody seems inclined to stop the development process whenever this proves to be the case.
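Editor's sketch: the policy Falcon describes, where every commit after the first signed one must also be signed, can at least be audited after the fact. The following is a minimal sketch, assuming git is installed and that only a good signature (status letter G from git's %G? log placeholder) counts as signed. The function names are illustrative and do not belong to any existing tool.

```python
import subprocess

# Status letters printed by git's %G? placeholder. Assumption for this
# sketch: only "G" (good signature) is accepted; everything else is flagged.
TRUSTED = {"G"}


def parse_signature_log(text):
    """Turn lines of '<hash> <status>' into (hash, status) pairs."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            commit, status = line.split(maxsplit=1)
            pairs.append((commit, status))
    return pairs


def unsigned_after_first_signed(pairs):
    """Given pairs ordered oldest first, return the commits that lack a
    good signature and come after the first well-signed commit."""
    seen_signed = False
    flagged = []
    for commit, status in pairs:
        if status in TRUSTED:
            seen_signed = True
        elif seen_signed:
            flagged.append(commit)
    return flagged


def audit_repository(path="."):
    """Run git in `path` and list commits that break the policy."""
    out = subprocess.run(
        ["git", "-C", path, "log", "--reverse", "--format=%H %G?"],
        check=True, capture_output=True, text=True,
    ).stdout
    return unsigned_after_first_signed(parse_signature_log(out))
```

Calling audit_repository() inside a checkout returns the offending commit hashes. What to do about them (block the merge, warn, or ignore) is exactly the open question raised above.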

GNN What about tools like Black Duck that supposedly let you track all your open-source dependencies—or at least how your dependency graph is evolving?

MK For the most part, these tools just let you know what's in your environment and offer limited information on security controls, rather than focusing on vulnerabilities. In the event of a vulnerability response, even just knowing what's in the environment is useful, but triage tools aren't generally going to take advantage of that information. While tools like Black Duck can also point out vulnerabilities that need to be patched, there's not necessarily updated software for dealing with those issues. Still, I think it's better to have the added visibility than not to have it.

FM Most of the people I know will either use grep to manage this or they'll use some of their own internally developed tooling to list all the things that have been pulled in. In my experience, the actionability of the security information coming out of this tends to be quite low.

In addition to the reasons Maya already cited, there's the possibility you might be employing only a very minor bit of the functionality offered by some of these very large packages your system is drawing from. For example, you might be using some small element of jQuery, which is forever being updated because of one vulnerability or another. The Java ecosystem and OpenSSL can also be very trying in this way. It's entirely possible that your application doesn't even expose any of the vulnerable functionality that led to an alert in the first place.

My concern is that when alerts fail to put these sorts of vulnerabilities into a proper context, it can really damage the credibility of the security industry in general.

MK Besides Black Duck, is there any tooling you've found to be particularly useful?

FM Although they're not glamorous, any static analyzer—even things like find-sec-bugs for Java—tends to be par for the course. That just goes to show that the tooling available for solving this type of problem isn't really all that different from the tooling you'd use to review first-party-authored code.

In terms of products that are specifically apropos to this area, you'd certainly have to count the various dependency-graph management tools that the SBOM [software bill of materials] folks are publishing. Those at least tell you what needs to be scanned. Beyond the dependency-graph software, however, I'm not sure what exists for unique tooling in this area.

GNN One thing that comes to mind is the license-to-license issue. That's why Black Duck came up in the first place, since it tracks down all your licensing issues, which can pose risks. It's certainly something that commercial companies have finally concluded they need to help manage the open-source packages they collect.

I agree that it's good to know about the licensing, but what does that tell you about what's in the package and who contributed to it? Otherwise, we don't have a way to learn about that.

Ideally, you'd also like to know something about the number of active contributors to the project and the sorts of controls they have at their disposal. That at least would give you a better idea about whether the project has already been compromised or is likely to be compromised.

Or, given some package, just check out its list of CVEs [common vulnerabilities and exposures]. You can search MITRE's CVE database, which I always do whenever someone on the project picks up a package. That isn't perfect, of course. Still, a tool that collects a list of the CVEs, figures out how many are of a certain severity, and then estimates how long it will take to fix those issues gives you a sense of the security of a package. That isn't fabulous, but at least it's something.
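Editor's sketch: the kind of summary George describes (count a package's CVEs by severity and get a feel for how long fixes take) might look roughly like this. It is a sketch under stated assumptions: the record layout is invented, the sample entries are made up and are not real advisories, and "time to fix" is approximated by the median number of days between publication and fix for issues that were fixed.

```python
from collections import Counter
from datetime import date
from statistics import median


def summarize(records):
    """Count records per severity and compute the median number of days
    from publication to fix, using only records that have a fix date."""
    counts = Counter(r["severity"] for r in records)
    days = [
        (r["fixed"] - r["published"]).days
        for r in records
        if r.get("fixed") is not None
    ]
    return {
        "by_severity": dict(counts),
        "median_days_to_fix": median(days) if days else None,
        "unfixed": sum(1 for r in records if r.get("fixed") is None),
    }


# Made-up records for illustration only; these are not real advisories.
records = [
    {"id": "EXAMPLE-1", "severity": "high",
     "published": date(2021, 1, 1), "fixed": date(2021, 1, 11)},
    {"id": "EXAMPLE-2", "severity": "low",
     "published": date(2021, 3, 1), "fixed": date(2021, 3, 31)},
    {"id": "EXAMPLE-3", "severity": "high",
     "published": date(2021, 6, 1), "fixed": None},
]

print(summarize(records))
```

As George notes, a number like this is only a rough signal. A package with few recorded CVEs may simply be one that nobody has examined closely.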

MK It would also be good to have a tool that could tell whether the code we're evaluating does anything—or even attempts to do anything—that might be considered particularly sensitive. For example, does this code attempt to perform any cryptography? If so, that's definitely going to get my attention. So far, I haven't seen anyone propose a good way of dealing with this. I certainly haven't seen any tools along these lines.

GNN The problem you're describing would, in general, be hard to address, but finding chunks of cryptographic code ought to be straightforward enough. I actually needed to screen for that back when I was part of the Paranoids team at Yahoo, since some coders thought it would be cute to work in their little bits of cryptography here and there. Then, I'd go tell them firmly (but politely), "No, don't write your own crypto. That's Rule #1. Rule #2 is: See Rule #1."
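Editor's sketch: a crude first pass at the screening George mentions could be a keyword scan over source files. The keyword list, file suffixes, and function names below are assumptions made for illustration. A match only means a human should take a look; it does not show that the code implements cryptography, and names such as my_encrypt would slip past the word-boundary match.

```python
import re
from pathlib import Path

# Assumed keyword list; a real screen would need tuning for each codebase.
CRYPTO_HINTS = re.compile(
    r"\b(aes|des|rsa|hmac|md5|sha1|sha256|cipher|encrypt|decrypt)\b",
    re.IGNORECASE,
)


def scan_text(text):
    """Return (line_number, keyword) pairs for every hint found in text."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in CRYPTO_HINTS.finditer(line):
            hits.append((number, match.group(0).lower()))
    return hits


def scan_tree(root, suffixes=(".py", ".c", ".java", ".js")):
    """Walk a directory and map each matching file path to its hits."""
    report = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            hits = scan_text(path.read_text(errors="ignore"))
            if hits:
                report[str(path)] = hits
    return report
```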

MK All right, but then I also have another longtime complaint. It should be easy to communicate to package managers that a particular package is no longer maintained, but there isn't a standard means for doing that. Instead, some people make clumsy efforts to edit the package description and that sort of thing. What would be far more useful would be a machine-readable format that simply communicates, "This thing is no longer being maintained."

GNN Speaking of features we'd like to see, I'd love it if GitHub had some way of informing you, "Look, this package you're checking out right now is actually abandonware." That way, everyone would also learn that this thing has been abandoned, which would be super-useful.
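Editor's sketch: to make the idea of a machine-readable marker concrete, here is one purely hypothetical shape it could take. No such standard exists according to the discussion above; the JSON field name, the allowed values, and the function names are all invented for illustration.

```python
import json

# Hypothetical vocabulary for a maintenance-status marker.
ALLOWED = {"active", "maintenance-only", "unmaintained"}


def read_status(raw):
    """Parse a hypothetical status document and validate its value."""
    doc = json.loads(raw)
    status = doc.get("status")
    if status not in ALLOWED:
        raise ValueError(f"unknown maintenance status: {status!r}")
    return status


def should_warn(raw):
    """True when a package declares itself unmaintained."""
    return read_status(raw) == "unmaintained"


marker = '{"status": "unmaintained", "since": "2021-06-01"}'
print(should_warn(marker))
```

A package manager or hosting site that read such a marker could show the kind of abandonware warning George asks for, instead of relying on edits to a free-text description.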

 

Many of the excuses and laments that have surfaced of late in explaining the shortcomings of open-source supply-chain security sound ever so familiar: "too little time," "not enough resources," "somebody else's problem," "no limit to the number of potential attack vectors," "not nearly as much fun as building features," ad nauseam.

While these complaints may not be unique to open-source projects, the scale of the supply-chain security challenge itself has proved to be unusually large. Then there's one more thing: Most of the work invested in open-source projects is performed by volunteers. That matters, it turns out.

 

GNN What supply-chain problems concern you most at this point?

MK Account takeover is definitely high on my list, since you can have complete trust in a user and the company that person works for, but if the company ends up getting compromised, all that is for naught. At root, this comes down to making sure there are good means for detecting account compromises.

Obviously, a project can be infiltrated in many other ways. A new maintainer, pretending to be accredited, could join the effort and take on more responsibility over time. That could lead to compromise of the code itself or the ways in which code is built and reviewed. All these risks are magnified, of course, if that person should also happen to commit to the project.

Another huge vulnerability has to do with the potential deletion of code. This is a particularly acute risk with smaller projects that are being maintained by a single individual. Should that person end up getting angry about something and start acting out, there's nothing to keep them from deleting or sabotaging their own code. That is essentially what happened in the notorious leftpad incident, which led to considerable disruption.

FM My biggest concern here is that there has long been a generalized sense that supply-chain security issues are somebody else's problem—meaning, why should you invest time, money, and effort in doing reviews of open-source libraries when you've got plenty of your own issues to deal with?

GNN I'm not sure how many people on the FreeBSD project are doing significant security reviews at this point. Probably not a lot. They'll keep up with the patches if they're not too busy, but this is open source, right? Which means all the work is being done by volunteers. There's always that question of whether somebody is going to have the time—or make the time—to do the updates.

FM This is the problem with software signing in general—namely, signed as to what, exactly? This can give rise to false confidence that's then shattered when some dependency later surfaces that proves to be malicious. And God forbid that went into your distribution, since there's almost certainly some volunteer out there who doesn't have the skills or the time to audit the whole chain.

If all you do is glance over the change log, its security properties aren't necessarily obvious. There can be some subtle changes in there that you would find only if you closely assessed everything in context, which would place far too much burden on what invariably proves to be the one or two individuals responsible for maintaining a package.

A distribution signing should not be taken as an assurance that the code contains no back doors. The fact the distribution is signed may mean its provenance is known. But that's all you get.

GNN Yes, the fact it's signed does not attest to the security of the thing you've received. The signing means only that it came from a known package creator as opposed to an unknown package creator. And I'm sure we're only scratching the surface here in terms of the sorts of frustrations that surface while trying to ride herd on OSS [open-source software] security.

FM That really does vary with the politics you encounter within each project. And yet, I've typically encountered pushback whenever I've attempted to fix things I've seen as potential risks, regardless of where those concerns have taken me. Mind you, I'm not talking here about problems that always prove to be risks, but instead about code-quality issues that often lead to problems, such as buffer overflows or SQL injections. To avoid the concerns those can lead to, I suggest implementing certain architectural changes or coding practices—like starting to use the strlcpy() function or prepared statements in SQL, for example.

When making suggestions along these lines, I've often encountered resistance—ranging from people simply ignoring my advice to opposing it because it was deemed unnecessary. In some cases, the maintainers simply said they considered the risks associated with change to be greater than the risks associated with discovery of their vulnerabilities.

It really all depends on the culture of each open-source project, which kind of sucks, since you can't really do anything about that short of installing a new maintainer group that actually cares about security.
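Editor's sketch: of the two coding practices Falcon names, prepared statements are the easier one to show in a few lines. The example below uses Python's built-in sqlite3 module and its ? placeholders, which bind user input as data rather than splicing it into the SQL text. The table, the names in it, and the function names are made up for illustration. The strlcpy() case is specific to C and is not shown here.

```python
import sqlite3

# In-memory database with made-up rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "admin"), ("bob", "user")],
)


def find_role_unsafe(name):
    # Vulnerable: attacker-controlled text becomes part of the SQL itself.
    query = f"SELECT role FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_role_safe(name):
    # The ? placeholder keeps the input as data, never as SQL syntax.
    return conn.execute(
        "SELECT role FROM users WHERE name = ?", (name,)
    ).fetchall()


attack = "nobody' OR '1'='1"
print(len(find_role_unsafe(attack)))  # every row leaks
print(find_role_safe(attack))         # no row matches the literal string
```

With the string-built query, the injected condition is true for every row, so the whole table comes back. With the bound parameter, the same input is compared as an ordinary string and matches nothing.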

 

Ambivalence about security has never boded well for a project's prospects, and it should absolutely alarm anyone who might end up actually using that software. But now, as the role of open-source software continues to grow in importance, so do the attendant security challenges, with respect to both scale and complexity.

Part of the challenge for organizations has to do with maintaining an awareness of packages within their own repositories that might be laced with vulnerabilities. This can be problematic, since even open-source packages that are knowingly downloaded commonly come with dependencies that are not initially recognized and examined. In fact, one great security challenge now has to do with simply maintaining an awareness of all the dependencies that reside within the organization's code repository. It doesn't help that the due diligence required to stay on top of this is not nearly as trivial as it might seem at first.

 

CHRIS McCUBBIN What can developers do to evaluate the security of a given package? And how should that be done? It's not as if there's some central repository.

MK Actually, the Open Source Security Foundation [OpenSSF] has a project called Security Scorecards that tracks a number of factors related to security [https://github.com/ossf/scorecard]. That's a quick way for anyone to discover, for example, whether a project has branch protection enabled, although this depends on a lot of information that's self-reported by each project. Project maintainers can use these scorecards to track whether any notable changes or flags start showing up right after they've accepted a new commit to their repo.

GNN I've recently been helping a friend teach a college course on software design, and that has involved explaining dependencies since, for modern software developers, dependencies are everything. You might write only 10 lines of code, but that could easily pull in 10 megabytes in libraries. You need to know that all of that now has become part of your codebase.

Happily, there are some products you can use to map your packages. We're applying one of those, called Package Graph, to the BSD package system. The first thing you need to do—if you know you're going to depend on open-source software—is to make sure you have some way of maintaining a map of all those dependencies, with the assurance you'll be alerted whenever that map changes.
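Editor's sketch: the "alert me whenever the map changes" idea reduces to comparing two snapshots of the dependency list. This is a minimal sketch, assuming each snapshot is a plain mapping of package name to version. The package names are invented, and the function is not taken from any tool mentioned in the discussion.

```python
def diff_dependencies(old, new):
    """Compare two {package: version} snapshots and report what moved."""
    added = {p: new[p] for p in new.keys() - old.keys()}
    removed = {p: old[p] for p in old.keys() - new.keys()}
    changed = {
        p: (old[p], new[p])
        for p in old.keys() & new.keys()
        if old[p] != new[p]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical snapshots taken before and after an update.
before = {"libalpha": "1.0", "libbeta": "2.3"}
after = {"libalpha": "1.1", "libgamma": "0.9"}

report = diff_dependencies(before, after)
if any(report.values()):
    print("dependency map changed:", report)
```

Run on every build, a check like this turns a silent change in the dependency set into something a person has to acknowledge.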

CM That's definitely helpful. Another thing that's good to know is that, if you're using something like Spark, the map is going to show that you're bringing in one project, which in turn will bring in literally hundreds of other dependencies that will continue to constantly change with versioning. That extends all the way down, basically. How can you maintain situational awareness of all that?

Also, of course, even if you're able to maintain a completely accurate list of all the things you depend on, that doesn't mean you know whether all those things are secure. Still, keeping on top of all your dependencies is a challenging problem in its own right.
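Editor's sketch: Chris's point that one direct dependency can bring in many others can be illustrated by walking a dependency graph. The graph below is a toy with invented package names, and the traversal is an ordinary breadth-first search rather than the algorithm of any particular package manager.

```python
from collections import deque


def transitive_dependencies(graph, root):
    """Return every package reachable from root, excluding root itself.

    graph maps a package name to the list of its direct dependencies.
    """
    seen = set()
    queue = deque(graph.get(root, ()))
    while queue:
        package = queue.popleft()
        if package in seen:
            continue
        seen.add(package)
        queue.extend(graph.get(package, ()))
    seen.discard(root)  # guard against cycles that lead back to root
    return seen


# Toy graph: the application declares a single direct dependency.
graph = {
    "app": ["framework"],
    "framework": ["parser", "logger"],
    "parser": ["logger", "strings"],
    "strings": [],
}

print(len(graph["app"]), "direct,",
      len(transitive_dependencies(graph, "app")), "transitive")
```

Even in this toy case the application declares one dependency and ends up with four. As Chris says, knowing the full list still says nothing about whether each entry is secure.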

GNN This serves to point out something that those of us with degrees in computer science tend to forget: Not every company is steeped in technology. Everything we've talked about so far points to the uncomfortable truth that it's the receiver of the software that needs to take on the responsibility. It's just a fact that there are lots of folks out there who are going to be completely lost when it comes to this.

FM OK, but I'm not comfortable with the sense emerging here that open-source software is somehow riskier than what comes out of first-party authors. That's simply not the case.

Even if a principal engineer or product manager in an organization has a working relationship with the actual authors of the code, there's still going to be a need for code review. As I see it, the only difference between code written by a developer you employ versus code written by a developer you don't employ is the amount of control you wield. That could lead to higher quality, but there's no assurance of that.

And there is absolutely no support for the idea that open-source software is either more or less secure than the software you've developed in-house. So far as I know, there is no demonstrated relationship—unless you're familiar with research that shows otherwise.

MK If I can generalize, I'd say it's actually the open-source software that's most widely used and is drawn from a broad community of contributors that tends to be the most secure. But even then, you don't really know. Still, you would think that having more people involved would tend to be a good thing, particularly when everyone brings their own expertise and genuinely cares about what they're working on. Of course, that can happen with proprietary software, as well as with open-source software.

FM I'd agree with that.

GNN I also agree. What's more, my view from working with clients is that people often treat open-source packages as if they were packaged commercial software. That is, they just expect these things to work. And, if not, "Hey, where's the number for customer support?" But with open-source software, you may get verification of the initial package but not necessarily all the added parts that come along over time. Should users be told they need to be verifying all those added parts?

MK I can see the call for verifying all the parts, but that means people will need to be able to see things in a way that's machine-readable, easily recognizable, shareable, and readily consumable downstream. It would also help if a description of the dependencies came along as part of the package or binary.

The problem here, for me, has always had to do with adoption. That is, something in open source only becomes truly useful if you can get 99.99 percent of the people to use it. In practice, this means getting some of the base projects to adopt something. But even the biggest projects can't do that unless a lot of their smaller dependencies are also on board. Which leads me to believe that what we really need to do is to step back and acknowledge that we should focus on improving the security of every project by default.

In terms of how open-source security ought to work, maybe projects should be required to enable branch protection if they have more than a certain number of users. And maybe automated scanning should also be enabled.

Oh, and another thing: Maybe we should go back to writing controls. We really do need to start moving toward a model where some of these protections are just built in automatically, since we can't expect hundreds of thousands of maintainers to change the controls for these packages on their own. Why isn't 2FA [two-factor authentication] a minimum requirement to publish a package?

FM If I put on my business-student hat for a minute, it occurs to me that asking your average enterprise (even a software development firm) to take on this Herculean task of building dependency graphs and all the rest of it... well, we simply can't do that. Still, the business ultimately is exposed to the risk anyway. This strikes me as a perfect scenario for what I believe is referred to as cyber risk insurance, where the covered business is insured against losses stemming from a hacking incident.

If I were in the business of writing that kind of insurance, I would offer my employers this advice: "We can significantly reduce our potential for future loss if, as part of our value-add to our clients, we figure out what the most critical dependencies are and then work to resolve the security issues there before breaches occur. By doing that, we would be not only helping our clients, but also significantly reducing the strain of claims against our risk pool."

Which is only to say that what's called for here is a concerted effort. Up until now, nobody has really wanted to engage in the sort of joint initiative required to achieve a decent level of cybersecurity coverage for broadly deployed open-source software packages. This is probably just one of those things where no one will bother to put a traffic light at the intersection until a horrible collision occurs. Until then, businesses will just need to continue making their own prudent investments to protect themselves.

If we had a market player that was focused on this issue—for whom the effort wasn't simply a bothersome distraction from normal operations—we might be able to make a bit more headway, since I really do believe the crux of the problem has to do with a fundamental lack of interest. Those of us who work for software developers are just too busy shipping features to put any time into better protecting OpenSSL.

Businesses ignore known vulnerabilities in their own software all the time. In fact, there's even a term for this: risk acceptance. The calculation seems to be that it would require an unacceptably massive amount of work to come to grips with the risk an organization takes on with the 200,000 or so transitive dependencies that come with each Node.js app that someone downloads. This head-in-the-sand approach will probably continue to prevail unless or until some operation takes a commercial interest in solving the problem and thus chooses to devote some actual resources to it.

GNN There's another possibility. Remember when the auto industry had its Ford Pinto moment? Pintos were cute and kind of quirky, but they also had this little design problem where, when they were rear-ended in a certain way, they exploded! Besides leading to some crippling lawsuits, that spawned government investigations that required Ford executives to stand and deliver explanations in front of Congressional panels.

It doesn't take much to imagine a series of security breaches that puts some number of Big Tech executives in an equally uncomfortable position—which might just lead to some corrective action. But the question is: How big does the catastrophe need to be for the industry to finally decide it's going to deal with this?

FM Well, it certainly would appear we haven't hit that point yet. And we probably never will since, if we look at the history of breaches, I can't think of a single large company that's been brought down by any of them. Instead of company-ending risks, we're probably talking about large losses that impact the balance sheet but are nevertheless sustainable. This is annoying, but at least with cyber risk insurance, there would be a way to price in the risks such that there's a tangible incentive to temper them as much as possible.

MK We may have already hit the critical point. Although the SolarWinds incident was more about the risks associated with relying on third-party vendors than with open-source supply chains, there are some very uncomfortable similarities. And you would definitely have to say the SolarWinds brand is tainted at this point.

FM But SolarWinds as a company is still a going concern. Its market cap may have taken a hit, but it's not as if SolarWinds is a penny stock now.

It's my view that, so long as these risks don't prove to be existential, there will remain a certain reluctance on the part of corporate management to invest significant resources into what they view as purely a cost center. Which is how we got to this point in the first place.

Copyright © 2022 held by owner/author. Publication rights licensed to ACM.

acmqueue

Originally published in Queue vol. 20, no. 5

© ACM, Inc. All Rights Reserved.