The Future of Tag Management – Part 2: Subresource Integrity

Welcome Back!

When we last left off, we were discussing the changing Tag Management landscape, the impact of ‘Magecart’ and the security implications of Content Security Policies. We also briefly covered the workflow adjustments needed to live with those changes.

Surely that was the end of the security concerns?

Sadly, ‘tis not to be. Two words stand before us. Subresource Integrity. For those in the know, those words could be followed by the crash of thunder and lightning in the distance, such is the effect that it has on the world of Tag Management. This post explains what it is and why it can be problematic, so that you can have a meaningful conversation when your friendly neighborhood security staff or regulatory auditor comes around and insists that you implement it.

What is it that strikes terror into the hearts of those that manage tags on behalf of their company?

I’ll explain by continuing the guard analogy from part 1 of this series.

A Content Security Policy is like the building lobby guard: validating badges to make sure you belong, but not digging too deep into verification. If you say you work for the company and you are on the list, you are allowed access.

In contrast, Subresource Integrity is the guard sitting behind bulletproof glass, who previously conducted a background check on you covering all your bank accounts, previous work history and education, as well as references from friends and family, and a signed statement from everyone who saw you wearing a zebra outfit in Miss Apple’s 3rd grade play. They then proceed to validate that it is really you via voice verification, palm scanning and retina verification. Only if everything checks out are you allowed to proceed.

But really, how does this work?

Subresource Integrity explained

MDN defines it as:

Subresource Integrity (SRI) is a security feature that enables browsers to verify that resources they fetch (for example, from a CDN) are delivered without unexpected manipulation. It works by allowing you to provide a cryptographic hash that a fetched resource must match.

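When the resource is served from a 3rd party origin, as is the case with external files loaded via tags, the browser also requires Cross-Origin Resource Sharing (CORS): the script tag needs a crossorigin attribute and the host has to send CORS headers that allow your site. If either is missing, the integrity check cannot pass and the file is blocked.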

In the example below we will take a text file called example.js and figure out what the hash is.

example.js contents:

“This sentence is a SHA-256 Hash”

Evaluates to:

“825d54d634016243c900287afc0c2fbc4b3ae4f3d1a1a970b2efd2c60abae98d”

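So the resulting HTML would look like this (the hash is shown in hexadecimal to keep the example readable; a real integrity attribute carries the Base64 encoding of the same digest):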

<script src="https://example.com/example.js" integrity="sha256-825d54d634016243c900287afc0c2fbc4b3ae4f3d1a1a970b2efd2c60abae98d" crossorigin="anonymous"></script>
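
If you want to produce a real integrity value yourself, here is a minimal Node.js sketch. It assumes the file contents are available as a string, and sriValue is a hypothetical helper name of my own, not part of any tag manager. The point it illustrates is that the attribute takes the algorithm name, a dash, and the Base64 (not hex) digest.

```javascript
// Node.js sketch: build an SRI integrity value for a file's contents.
const crypto = require('crypto');

// Hypothetical helper for illustration only.
function sriValue(contents, algorithm = 'sha256') {
  // SRI expects "<algorithm>-<Base64 digest>", not a hex string.
  const digest = crypto
    .createHash(algorithm)
    .update(contents, 'utf8')
    .digest('base64');
  return `${algorithm}-${digest}`;
}

// Hash the example sentence and print a script tag that pins it.
const integrity = sriValue('This sentence is a SHA-256 Hash');
console.log(
  `<script src="https://example.com/example.js" integrity="${integrity}" crossorigin="anonymous"></script>`
);
```

Keep in mind that the hash covers every byte of the file, so a trailing newline or a different character encoding would yield a different value.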

Let’s say I broke into example.com and modified example.js to read:

“This sentence is not a SHA-256 Hash”

When the browser goes to execute

<script src="https://example.com/example.js" integrity="sha256-825d54d634016243c900287afc0c2fbc4b3ae4f3d1a1a970b2efd2c60abae98d" crossorigin="anonymous"></script>

it first runs the verification of the file. The SHA-256 hash of the modified code is

8bbec2eb0ac6b57dc9a82a650be9a0beae65c0108abc50e1f180993ce5602f74

Since (Original)

825d54d634016243c900287afc0c2fbc4b3ae4f3d1a1a970b2efd2c60abae98d

is not the same as (Modified)

8bbec2eb0ac6b57dc9a82a650be9a0beae65c0108abc50e1f180993ce5602f74

the file is rejected and not executed. So whatever that file was attempting to do is prevented. A security win!
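
To make the comparison concrete, here is a rough Node.js sketch of the check. The browser does this internally, so passesIntegrityCheck is only a hypothetical stand-in of mine, and it handles a single hash rather than everything the real attribute syntax allows.

```javascript
const crypto = require('crypto');

// Hypothetical stand-in for the browser's internal check.
function passesIntegrityCheck(body, integrity) {
  const dash = integrity.indexOf('-');
  const algorithm = integrity.slice(0, dash); // e.g. "sha256"
  const expected = integrity.slice(dash + 1); // Base64 digest from the attribute
  const actual = crypto
    .createHash(algorithm)
    .update(body, 'utf8')
    .digest('base64');
  return actual === expected;
}

const original = 'This sentence is a SHA-256 Hash';
const modified = 'This sentence is not a SHA-256 Hash';

// Pin the hash of the original file, as the page author would at deploy time.
const pinned =
  'sha256-' + crypto.createHash('sha256').update(original, 'utf8').digest('base64');

console.log(passesIntegrityCheck(original, pinned)); // true: file is unchanged
console.log(passesIntegrityCheck(modified, pinned)); // false: file would be blocked
```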

Impacts to Tag Management

Remember above we said this request would likely come from security staff or an auditor? That’s true. However, what they are asking for can be misunderstood. They are asking that all files loaded are verified. For tag management, that means both the tag manager itself and every tag within it that calls an external file.

The Tag Manager

It does not matter which client-side tag manager you use (Adobe, Google or Tealium), for they all load a library onto the page and execute tags. In the unlikely event that the tag manager code itself is corrupted or modified, SRI validation will prevent the manager from loading and, by extension, none of the tags will execute.

This seems like a good thing from a secure application point of view, and it is. Then you realize that every time you update the container, the hash of the file could change (this varies by tag manager). Now every container publish has to be paired with a code deployment of the entire site. Since tag managers are typically deployed to avoid that exact scenario, this is undesirable, but it is possible.

Even if the container doesn’t change, but the vendor patches or upgrades the externally hosted tag manager, the hash will change and all the tags will stop loading. The premise of SRI is that you are only loading ‘known’ files, and if you don’t know the file, you ignore it in the name of security. Sound reasoning to be sure, until you find you haven’t served tags for a week because you missed the tag manager’s upgrade notice while you were on vacation.

The Tags

More complicated than the manager issue above are the tags themselves. What tag managers can do is load and execute 3rd party JavaScript by appending it to the webpage. Since it’s being added via JavaScript injection, it’s either bypassing the need for SRI or failing SRI, depending on how SRI is being enforced on the webpage. As neither outcome is desirable, one solution is to modify the tag manager templates, manually calculate the hash of the resource, and add the appropriate script attributes so that SRI is enforced during the injection process. Then repeat that process every time a vendor changes their externally loaded files. If this sounds like more of a hassle than you want to deal with… then that brings me to my next point.
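
As a rough idea of what such a modified template could do, here is a minimal sketch using standard DOM properties. The function name injectWithSri and the pinnedIntegrity value are my own placeholders, and how you would wire this into a given tag manager's templates is an assumption that varies by product.

```javascript
// Hypothetical helper a custom tag template might call instead of a bare script injection.
// `doc` is the page's document object; `integrity` is a value you calculated ahead of time.
function injectWithSri(doc, src, integrity) {
  const script = doc.createElement('script');
  script.src = src;
  script.integrity = integrity;     // "sha256-<Base64 digest>" of the vendor file
  script.crossOrigin = 'anonymous'; // cross-origin files need CORS for the check to run
  script.async = true;
  doc.head.appendChild(script);
  return script;
}

// In a browser, with pinnedIntegrity holding the precomputed value:
// injectWithSri(document, 'https://example.com/example.js', pinnedIntegrity);
```

The catch described above still applies: the pinned value has to be recalculated and republished whenever the vendor changes the file.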

Working with SRI and Tag Management

Can they play together? Yes.

Do I recommend it? No.

If you are having to implement Subresource Integrity on the website, here are some things to consider:

Can you skip loading the tag manager and tags on specific pages to avoid having to deal with SRI, accepting the data loss in those areas?
Can you come up with another implementation to avoid having to tag everything with SRI? (This commonly involves hosting everything on origin, which has other problems as we discussed in part one, or moving to a server-side solution.)
Can you skip the tag manager and load the tags directly by hard coding them in the page? (Not recommended, as this still ties tag management to code deployments, but it removes the Tag Manager as a concern.)

If, however, you can’t get around it and have to continue to use a client-side tag manager, then that choice has to be made with the knowledge that data collection will break unpredictably as external code changes, and that a much higher level of communication and coordination is required to minimize breakage.

Conclusion

Subresource Integrity is good for sites that want to harden their application against tampered 3rd party files, the kind of script injection behind ‘Magecart’ style attacks. I do consider that a worthy goal, but I recognize it is at odds with the normal tagging workflow. The two workflows (security and tagging) become tightly intertwined with SRI, and alignment of tagging and code deployments becomes a matter of course. As one could expect, this can drastically slow down the tagging workflow.

Still, I advocate for avoiding the headache if you can, in order to retain agility in tag management. The best advice I can offer is to consider all possibilities and understand the risk should you decide to load tags without SRI enforcement, for you are betting the security of your customers’ data on your vendors maintaining secure infrastructure. If that is not a bet you’re willing to take (remember, the average cost of a data breach in 2019 was $3.92 million), then other options exist and you should work with security to find common ground between risk mitigation and operational efficiency.

Coming up next week, we will leave the quagmire of security and regulation and proceed to scale the mountains springing up before us as we explore the crags and summits of the Browser Change mountain range.