Simplifying Data Masking in Snowflake with Tags

In our last post, we discussed data masking and setting it up for specific columns. Now, I don’t know about y’all, but I deal with hundreds of tables with several columns in each on a daily basis that truly need data masking. If I have to go through and set up a masking policy one by one, I might go crazier than I normally am.

Let’s say that I have an existing table, Employee, in my database that I want to apply a policy. As I’m sure many of you know, any table with Employee data should have at least some of its data masked. The first thing that we need to do is to set up the tag that we’ll be using for, in our case, a column named Salary.

CREATE TAG salary_tag COMMENT 'Tag for Salary Data';

From there, we just have to specify which column we want to tag. Thankfully, we should now be used to configuring things at a column level.

ALTER TABLE Employee MODIFY COLUMN Salary SET TAG salary_tag = 'Employee'

Now, you may be wondering what the point of setting the tag to be “Employee” is. That, simply, is just to tell me, and any auditor, that the column Salary is the Employee’s Salary. Is it used for anything else? For better or worse, no. It’s just to let us know a bit more information if there is any, but it isn’t necessary.

Let’s say we use this tag on several more fields on several more tables. In our last post, we would have had to configure the masking policy on each column one at a time. This time? Heck, no!

ALTER TAG salary_tag SET MASKING POLICY salary_mask;

If you have configured your database columns and tables using a plethora of tags, then setting up policies should be made much easier.

Now, hold onto your hats! You don’t just have to do all that work of setting up tags to put a masking policy on it once and never again. Oh no! If you have to update the policies for whatever reason, then you’ll be glad you went this tag route to do all of your updates.

Now, if you have to update a masking policy via tags, this is definitely something to think about. You can update the policy in one of two ways:

The first way consists of unsetting it and setting it back up again.

ALTER TAG salary_tag UNSET MASKING POLICY salary_mask;

ALTER TAG salary_tag SET MASKING POLICY salary_mask_v2;

Now, if some nefarious person SELECTs the data after you UNSET the masking policy and before you SET the new masking policy then they will be able to see the data with no masks! Also, if the role you’re in allows you to see the data in the first policy, but not the second, then you will be able to see the data unmasked up until the policy is set in the second statement.

You can also set the policy the way that I recommend, just so that the first scenario above cannot happen.

ALTER TAG salary_tag SET MASKING POLICY salary_mask_v2 FORCE;

Yes, here the FORCE keyword sets the masking policy quickly and without any time for the previously mentioned nefarious person to do anything untoward.

Now, as always, there are some limitations.

  1. A single tag can only support one masking policy for each data type. For example, you cannot set the same policy for NUMBER and VARCHAR.
  2. Neither the masking policy nor the tag can be dropped if the masking policy is assigned to a tag. Also, the table and/or schema cannot be dropped if it has a tag or masking policy on it.
  3. You cannot set up a materialized view on a table that uses tag-based masking policies.
  4. You cannot use a masked column as a conditional column in a masking policy. In other words, you cannot use a masked column within your masking policy.

Now one of the errors that you might get is the following:

SQL Execution Error: Column Salary is mapped to a masking policy where the table has a column with different data-type for a secondary argument name. Please contact your local administrator to fix the issue.

The underlying problem is that the data type that the masking policy was originally set up to be used with and the data type of the column that you’re querying against does not match any longer. Simply assign a different policy to that column or revert the column to the previous data type. (Yes, you can consider this a huge QA fail if you’ve made it to Production and QA never caught this bug.)

And with that, we’ve reached the end of today’s post. There are lots of things that are just made easier with tag-based masking policies. And your data governance team, even if it’s just you, will appreciate all that you’re doing to help with masking your company’s data. Until next time my friends!

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.