Enhancing Data Privacy in Snowflake

The premise behind data masking is fairly simple – make it so that only certain people can see the data in a certain column. For example, social security numbers or credit card numbers probably shouldn’t be seen by everyone who has access to your database.

I like to set it up as follows: (Obviously, this can change depending on your business’ policies.)

  1. Admins have full access.
  2. Call Center Representatives have access to the last 4 digits of the social.
  3. All others see nothing useful.

To do so, I have to create SQL functions in Snowflake:

CREATE FUNCTION IsRoleAuthorizedForFull (arg1 VARCHAR)
RETURNS BOOLEAN
MEMOIZABLE
AS $$
     SELECT ARRAY_CONTAINS(arg1::VARIANT,
          (SELECT ARRAY_AGG(role)
           FROM auth_role
           WHERE FullAuth = TRUE))
$$;

CREATE FUNCTION IsRoleAuthorizedForPartial (arg1 VARCHAR)
RETURNS BOOLEAN
MEMOIZABLE
AS $$
     SELECT ARRAY_CONTAINS(arg1::VARIANT,
          (SELECT ARRAY_AGG(role)
           FROM auth_role
           WHERE PartialAuth = TRUE))
$$;

Yes, that’s correct. This is why we introduced/reviewed the term memoizable in our last post.

Now, we want to create the masking policy itself. This masking policy will use the memoizable function(s) we just created.

CREATE MASKING POLICY ssnMaskingPolicy AS
(val STRING) RETURNS STRING ->
CASE
   WHEN IsRoleAuthorizedForFull(current_role()) THEN val
   WHEN IsRoleAuthorizedForPartial(current_role()) THEN 'xxxx-xx-' + right(val, 4)
   ELSE 'xxx-xx-xxxx'
END;

Now we have to tell Snowflake what column should use this masking policy.

ALTER TABLE Patient MODIFY COLUMN SSN SET MASKING POLICY ssnMaskingPolicy;

Now, after all of that we have to list out all of the roles that we have created – at least those that we want to have access – and add them to a new table – called auth_role – that provides them the access they need to our data.

In this example, my auth_role table looks like the following:

CREATE TABLE auth_role
(   role varchar NOT NULL,
    FullAuth boolean NOT NULL,
    PartialAuth boolean NOT NULL);

After this, my happy fingers can be inserting data into the auth_role table to give all of my awesome users the correct authorization.

INSERT INTO Auth_role 
(role, FullAuth, PartialAuth)
SELECT 'Admin', TRUE, FALSE UNION ALL
SELECT 'CallCenter', FALSE, TRUE;

Now, please make sure that the only people who have access to insert/delete/update any data in this table are trusted. If this table gets into the wrong hands, you will lose all control over who has any access to your data.

Also, note that you will not need to create a new masking policy for each field. You can have one policy for all of your authorization needs.

You can also use a separate masking policy for many of your tables. Please I encourage you to do what feels right for your environment. There is no perfect solution, only ones that work right for you.

And with that, I hope you have learned about masking policies. The memoizable function is extremely helpful because it exponentially decreases the amount of time needed to determine if the role the user has is allowed. The masking policy itself can be used in such a way as to make the data unrecognizable in the wrong hands or fully viewed by the correct people. Hopefully, you too can find reasons to use masking policies that will help your company be better with your data today.

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.