
Test Accounts: A Hidden Risk

You may decide the risks are acceptable. But, if not, here are some rules for avoiding them.

Phil Vachon

When it comes to the fundamental principle of never testing in production, nearly every software engineer is guilty of breaking the rule at some point in their career. But this isn't just a best practice. It's a keystone value meant to protect your reputation, your customers, and your sanity.

Tweaking a configuration on a live system or deploying a poorly tested fix could lead to downtime or data loss. A developer's compromised device that has access to your production environment could end up being a convenient vector for an attacker to steal data or introduce malware into your company's tech infrastructure. In general, access to your production environment must be guarded jealously, even among your own team of developers.

Still, sometimes developers do need to touch the end-user experience to reproduce an issue or test new functionality in situ. Some bugs will readily show up in your production environment once your software gets into the hands of end users, even though they remain stubbornly hidden within your own development and test environments. And we've all experienced what happens when a third-party UAT (user acceptance test) integration doesn't behave quite the same way as the production instance.

Of course, the telemetry and tracing data that would help you track down these issues without jumping into a production system inevitably prove not to exist when you really need them.

With the recent publication of its SSDF (Secure Software Development Framework), NIST (National Institute of Standards and Technology) underscored the criticality of this best practice. Whenever developers do need to access production, every protection imaginable must be in place: monitoring, data-loss controls, minimized scope of access, multiple eyes of approval, and multifactor authentication, just to name a few. This is not controversial, and most enterprises have put at least some controls in place to protect their production environment, or rely on a commercial offering that provides the same protection.

So, why would you treat production test accounts in your applications any differently? One common strategy is to create a handful of test user accounts and share them with developers as needed. However, you quickly lose track of who is using which account and what they're allowed to do and test with it. This means you're knowingly introducing a gap in your production protections, even if it only confers the same level of access your customers already have.

But sometimes the cost can be much higher. A shared credential has the potential to undermine the business controls you've built around your product. This may impact your ability to meet your regulatory obligations and could even expose your whole business to a great deal of risk.

 

What's the Real Risk?

The first thing to consider is what's involved in creating an account for your product. If the product is meant to be widely used by the masses and it's a snap for anyone to create an account, then (perhaps) your risks are mostly reputational. Of course, anything that could undermine or damage your brand's trustworthiness could be problematic. Third parties must not be able to impersonate your company or an employee. After all, brand is everything.

Some kinds of businesses have different risks to consider. If your product comes with complex regulatory requirements or legal obligations, such as those found in much of financial tech and the traditional finance space, then an improperly managed test account could be a convenient bypass for requisite KYC (know your customer) controls. A production test account could quickly find a new life as an enabler of money laundering or a means to bypass any sanctions-enforcement controls your company has in place. In such cases, you must carefully consider all the potential risks of allowing an employee to have unfettered access to any real account, and you must find ways to ensure sensitive capabilities are disabled after KYC checks to avoid sticky legal situations.

Not only must you consider what damage your developers could do with access to an account in your production environment, but also what sort of damage a malicious third party could do if they steal that account. This analysis should inform how strict an approach you need to take, as well as what investment you should make to protect your environment.

 

Who Goes There?

Companies of all sizes make efforts to know the identity of a principal (such as a user, developer, or executive) at all times. This is achieved through a number of mechanisms: usually either local accounts in a service or federated identities from an enterprise directory that your company manages. If done correctly, knowing that an identity exists within your directory gives you some assurance that the principal it represents is real. These identities are created at certain, well-defined times. For example, you create an identity for an employee before their first day of work or for a client when they sign up for a service and complete an onboarding process.

Simply stating an identity isn't enough, though. You must also authenticate the principal with a username, password, and additional factors to prove that claim. If these all match up, you have reasonable assurance that the person is who they claim to be. Authorization systems then map the principal's identity to a set of policies that determine the user's authority to perform some action or set of actions. These represent the types of activities the user is privileged to perform according to job role, employment status, or perhaps even time of day.

The principle of least privilege says that a user should have only enough authority to do the job they need to perform, and nothing more. An extension of this principle is to require additional proof of identity or approval from others to add higher-level privileges, decreasing the instances of ambient privilege. For example, requiring a more senior leader to approve access to a critical system ensures that a lone employee in an office halfway around the world can't just decide to access a sensitive asset.
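To make this concrete, here is a rough sketch, in Python, of what such a check might look like; the user, action, and approver names are invented for illustration and not taken from any particular product.

    # A minimal sketch of a least-privilege check with an approval gate for
    # elevated actions; all names here are purely illustrative.
    GRANTED = {"jdoe": {"prod.logs.read", "prod.db.write"}}
    NEEDS_APPROVAL = {"prod.db.write": "senior-lead"}   # actions a lone employee cannot take

    def is_authorized(user: str, action: str, approvals: set[str]) -> bool:
        if action not in GRANTED.get(user, set()):
            return False                                 # least privilege: not granted, not allowed
        approver = NEEDS_APPROVAL.get(action)
        return approver is None or approver in approvals # elevated actions need a recorded approval

    print(is_authorized("jdoe", "prod.logs.read", set()))            # True: routine work
    print(is_authorized("jdoe", "prod.db.write", set()))             # False: no approval yet
    print(is_authorized("jdoe", "prod.db.write", {"senior-lead"}))   # True: approved for this request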

Job mobility in larger organizations means that people's roles may change often. As their responsibilities change, so must the capabilities to which they have access. For any organization, big or small, terminating an employee should mean that all access is revoked immediately. SaaS (software as a service) makes this problem especially acute, since a terminated employee could abuse access to a service, clearing out sensitive materials before you even know it. Having a kill switch for employee access to SaaS products is a must, whether for a test account or otherwise.
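What such a kill switch could look like is sketched below; the per-service disable functions are placeholders for whatever deprovisioning API (SCIM, a vendor SDK, and so on) each SaaS product actually exposes.

    # A sketch of a kill switch: one call that revokes a departing employee's
    # access everywhere at once. Disable functions are placeholders.
    def disable_in_idp(user_id: str) -> None:
        print(f"disabled {user_id} in the identity provider")

    def disable_in_saas(service: str, user_id: str) -> None:
        print(f"disabled {user_id} in {service}")

    def kill_switch(user_id: str, saas_services: list[str]) -> None:
        disable_in_idp(user_id)                  # federated sessions and SSO die first
        for service in saas_services:
            try:
                disable_in_saas(service, user_id)
            except Exception as exc:             # keep going: partial revocation is worse than a noisy log
                print(f"FAILED to disable {user_id} in {service}: {exc}")

    kill_switch("jdoe", ["ticketing", "source-control", "test-account-vault"])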

Another important factor to consider is SoD (separation of duties). Structuring an individual's authority this way ensures they can't single-handedly bypass fraud protections, sanctions, or compliance controls. A toxic combination of privileges occurs when two different sources of authority enable an employee to bypass controls that protect the company from major business risk.

For example, a bank's policy might prohibit programmers from creating new accounts. This control alone could prevent many kinds of fraud. However, a programmer might still be able to access the databases that contain account information, and have the ability to modify their contents. A test account might be all a malicious programmer needs to bypass fraud controls, simply by tweaking the attributes associated with it in your back-end database.
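A simple way to catch such toxic combinations is to enumerate them and check each employee's entitlements against the list; the sketch below uses invented entitlement names and is only meant to illustrate the idea.

    # A sketch of a separation-of-duties check that flags "toxic" pairs of
    # entitlements. The entitlement names are made up for illustration.
    TOXIC_PAIRS = [
        ("accounts.create", "accounts.approve"),      # create and approve the same account
        ("prod.test-account.use", "prod.db.write"),   # use a test account and rewrite its attributes
    ]

    def toxic_combinations(entitlements: set[str]) -> list[tuple[str, str]]:
        return [pair for pair in TOXIC_PAIRS if set(pair) <= entitlements]

    programmer = {"prod.test-account.use", "prod.db.write", "code.deploy"}
    for a, b in toxic_combinations(programmer):
        print(f"separation-of-duties violation: {a} combined with {b}")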

 

Limit the Scope

PAM (privileged access management) is the marketing buzzword for infrastructure technology that provides an additional layer of controls to ensure that administrative access is hard to misuse. PAM provides extra assurance of the identity of whoever is using the administrative access at any point in time.

A typical PAM solution introduces additional authentication steps, multiparty approvals, and time bounding to requests for access to sensitive resources or data. A PAM solution can restrict what can be done with privileged access, scoping that access to particular resources or sites, in addition to allowing only certain actions to be performed on those resources. Most PAM solutions can also record the actions a privileged user performs while using a sensitive system. In this way, beyond proactively protecting your systems from abuse, you also have a paper trail of the actions someone took if and when they choose to abuse their access.

A simple policy would be that any given test account can be used only for a particular purpose by a particular developer. Access to the account must be controlled just as carefully as any other privileged production access. When this sort of policy is enforced, only those developers who need to test a given feature are able to use the account. Access to these accounts must also be time-bounded, restricted, and subject to approval by other employees to prevent abuse. PAM can help with each of these requirements.
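Expressed as data, such a checkout policy might look something like the following sketch; every field name here is illustrative rather than drawn from any particular PAM product.

    # A sketch of a test-account checkout expressed as data for a PAM tool
    # to enforce. Field names and values are invented for illustration.
    from datetime import datetime, timedelta, timezone

    checkout = {
        "account": "test-user-0042",
        "requested_by": "jdoe",
        "purpose": "reproduce ticket PAY-1234 in production",           # tied to a specific test plan
        "approved_by": ["team-lead"],                                   # approval by another employee
        "allowed_actions": ["login", "view-statement"],                 # scoped, not full account control
        "expires_at": datetime.now(timezone.utc) + timedelta(hours=4),  # time-bounded
        "record_session": True,                                         # keep the paper trail
    }

    def checkout_is_valid(req: dict) -> bool:
        return (
            bool(req["purpose"])
            and len(req["approved_by"]) >= 1
            and req["expires_at"] > datetime.now(timezone.utc)
        )

    print(checkout_is_valid(checkout))    # True while the approved window is still open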

Finally, test accounts should be easily revocable. One click should be all it takes to disable one of these accounts. What's more, if nobody is left who is privileged to access that account, it should be easy to delete, or, at the very least, to generate an alert that the account has been orphaned. Periodic recertification, confirming that at least one developer still needs access to the account, further helps keep the number of abandoned test accounts down.
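A periodic recertification sweep can be as simple as the following sketch, which flags any test account that no longer has an owner who attests to needing it; the account names and attributes are invented.

    # A sketch of a recertification sweep: any test account with no remaining
    # owner who still attests to needing it gets flagged for disablement.
    test_accounts = {
        "test-user-0042": {"owners": ["jdoe"], "recertified": True},
        "test-user-0099": {"owners": [], "recertified": False},   # nobody left who needs it
    }

    for name, meta in test_accounts.items():
        if not meta["owners"] or not meta["recertified"]:
            print(f"{name}: orphaned or unrecertified; disable it and raise an alert")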

 

Telemetry Never Stops

Another concern about test accounts has to do with understanding when and how they are used, and what has been done with them. While these accounts are a necessary evil, they must not be exempt from normal telemetry collection. The same telemetry you generate based on normal customer usage, including patterns of behavior or anomalies in that behavior and access, should always be in place. (They are in place, right?)

You should always know which user accounts are test accounts, and keep track of them. When a developer uses a test account, especially one that could bypass critical security or business controls, this should trigger some notification of elevated risk. In the context of a test plan and other protective controls, most test-account usage alerts can be considered benign. But the anomalies, such as when there is no test plan in place, could indicate something nefarious is happening. You can help your security and risk operations teams identify potential risks by giving them as much context as possible.
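One way to give them that context is to enrich each test-account event before it reaches them, as in this rough sketch; the event fields and test-plan registry are assumptions made for illustration.

    # A sketch of enriching a login event with test-account context before
    # it reaches the security team; the fields and registry are made up.
    known_test_accounts = {"test-user-0042", "test-user-0099"}
    active_test_plans = {"test-user-0042": "PAY-1234 regression check"}

    def annotate_login(event: dict) -> dict:
        user = event["user"]
        if user in known_test_accounts:
            event["is_test_account"] = True
            event["test_plan"] = active_test_plans.get(user)
            # Usage without a registered test plan is the anomaly worth a human look.
            event["severity"] = "info" if event["test_plan"] else "investigate"
        return event

    print(annotate_login({"user": "test-user-0099", "action": "login"}))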

Telemetry is also beneficial for auditing test accounts. Knowing which accounts are test accounts, as well as when they were last used and by whom, makes it easy to keep track of what's going on, as well as to proactively deactivate any test accounts that haven't seen use in a long time. Deploying such a system amounts to another easy-to-implement, proactive risk control for your test accounts.
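A staleness sweep driven by that telemetry could look roughly like this; the 90-day window and account names are arbitrary choices for the example.

    # A sketch of a staleness sweep: deactivate any test account that has
    # gone unused for longer than a chosen window.
    from datetime import datetime, timedelta, timezone

    MAX_IDLE = timedelta(days=90)                 # the window is a policy choice, not a magic number
    now = datetime.now(timezone.utc)
    last_used = {
        "test-user-0042": now - timedelta(days=3),
        "test-user-0099": now - timedelta(days=200),
    }

    for account, when in last_used.items():
        if now - when > MAX_IDLE:
            print(f"{account}: idle for {(now - when).days} days; deactivating")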

 

Never Impersonate a Customer

Letting developers take on the identity of a client might at times seem like the fastest path to success. When a developer can act as the user, all within the production environment, it can seem like the easiest way to reproduce certain kinds of bugs (and how is it that clients manage to find the strangest bugs?), since the developer can use the application in exactly the same way as the client. But then, how can you ensure an employee acting as a client won't do something the client wouldn't want to have done?

Impersonation is hard to get right, and improperly implemented, it can increase the risk of the confused deputy problem. This is a type of bug that crops up when a more-privileged system or service that less-privileged users can interact with is able to perform some action that circumvents other security controls. Even if you limit the scope of what can be done when impersonating customers (by, for example, enforcing a read-only state), there can still be side effects: bugs in how these controls work, or even just plain old implementation errors in authorization checks, can give a developer the ability to do a lot of damage, even if only accidentally. It's just not worth the risk.

 

So, What to Do?

It's tempting to create a test account, or a handful of test accounts, and then share them among your team. Create the account, set a password, put the password on a whiteboard (or in a shared password manager, at least), and your team is off to the races. Or perhaps your controls make it easy for your developers to create their own test accounts that they can use as needed.

Either case is a problem. A test account that's shared among many can be used by anyone who happens to have the password, and self-service creation leaves a trail of poorly managed or unmanaged accounts that only increases your attack surface. A test account could be a treasure trove, even revealing details about internal systems. If you really need to take this approach, give your developers their own test accounts and educate them about the risks of misusing these accounts. Also, if you can periodically expire, and easily renew, these accounts, all the better.
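If you do go the per-developer route, issuing accounts with a built-in expiry that is cheap to renew might look something like this sketch; the naming convention and TTL are illustrative only.

    # A sketch of per-developer test accounts that expire by default and are
    # easy to renew; the naming convention and TTL are invented.
    from datetime import datetime, timedelta, timezone

    DEFAULT_TTL = timedelta(days=30)

    def issue_test_account(developer: str) -> dict:
        return {
            "name": f"test-{developer}",          # attributable to exactly one developer
            "owner": developer,
            "expires_at": datetime.now(timezone.utc) + DEFAULT_TTL,
        }

    def renew(account: dict) -> None:
        account["expires_at"] = datetime.now(timezone.utc) + DEFAULT_TTL

    acct = issue_test_account("jdoe")
    print(acct["name"], "expires", acct["expires_at"].date())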

Remember that shared credentials are a violation of the principle of least privilege. Anyone with that credential could do a lot more than their job requires. What's more, they can do it under an identity that isn't attributable to who they actually are!

Ultimately, only you can decide whether these risks are acceptable for your business. But if you can avoid them, you should.

 

Phil Vachon (@pvachonnyc on Twitter) is the Head of Infrastructure in the CTO's Office at Bloomberg. In this role, he leads a department of talented architects, engineers and product managers in developing secure, scalable and reliable infrastructure technologies. In prior roles at Bloomberg, Phil built secure embedded biometric systems, as well as hardware security modules, among other things. Previously, he co-founded and was CTO of a startup that sold a high-speed packet capture and analytics platform. Earlier in his career he worked extensively on synthetic aperture radar data processing and analysis, as well as carrier router data plane engineering. Copyright © 2024 held by owner/author. Publication rights licensed to ACM.


Originally published in Queue vol. 22, no. 4




