Complete Guide to Snowflake’s Tag-Based Masking (Now With Auto-Tagging)

Introduction
If you’ve followed our site for a while, you’ll have seen in a previous post how powerful tag-based masking policies are in Snowflake. They let you enforce consistent data masking rules across columns without constantly rewriting logic. But Snowflake hasn’t stopped there: recent enhancements make it even easier to classify, tag, and mask data at scale. In this post, we’ll recap the essentials of tag-based masking, highlight the new functionality, and share some practical tips for rolling it out in your environment.


Quick Recap: How Tag-Based Masking Works

Traditionally, if you wanted to mask sensitive data in Snowflake, you had two main options:

  • Column-level masking policies — applied one by one, which gets messy in wide schemas.
  • Tag-based masking policies — assign a tag (e.g., pii_tag = 'SSN') to a column, then attach a masking policy to the tag. Snowflake automatically enforces that policy on every column that carries the tag and whose data type matches the policy.

Example:

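-- 1. Create the tag that will mark sensitive columns
CREATE TAG pii_tag;

-- 2. Create the masking policy: the ANALYST role sees a masked literal,
--    every other role sees the real value
CREATE MASKING POLICY ssn_masking_policy 
  AS (val STRING) 
  RETURNS STRING -> 
    CASE 
      WHEN CURRENT_ROLE() IN ('ANALYST') THEN '***MASKED***'
      ELSE val
    END;

-- 3. Attach the policy to the tag (not to any individual column)
ALTER TAG pii_tag SET MASKING POLICY ssn_masking_policy;

-- 4. Tag a column; the policy now applies to it
ALTER TABLE customers ALTER COLUMN ssn SET TAG pii_tag = 'SSN';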

The real beauty here is that every column tagged with pii_tag inherits the masking policy, as long as its data type matches the policy’s signature (STRING in this example). That saves time and enforces consistency.
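
To see that inheritance in action, here is a minimal sketch that tags a second column and then asks Snowflake which policies protect the table. The employees table and its national_id column are hypothetical, and you may need to fully qualify the table name depending on your session context.

```sql
-- Hypothetical second table: tagging the column is all it takes,
-- because the policy is attached to pii_tag, not to the column.
ALTER TABLE employees ALTER COLUMN national_id SET TAG pii_tag = 'SSN';

-- List the policies currently associated with the table's columns.
-- For a tag-based policy, tag_name shows the tag that brought it in.
SELECT policy_name, ref_column_name, tag_name, policy_status
FROM TABLE(information_schema.policy_references(
    ref_entity_name   => 'employees',
    ref_entity_domain => 'TABLE'));
```

If a tagged column is missing from the output, check the column's data type first: a tag can carry one masking policy per data type, so a NUMBER column would need its own policy on the same tag.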


What’s New: Classification + Automatic Tag Mapping

Until recently, the biggest pain point was coverage. You still had to identify and tag each sensitive column manually. For small schemas, this is okay – not great, but okay. For enterprise-scale data lakes? Painful.

Snowflake’s new Classification feature tackles that. It samples the data in your tables, classifies columns by the kind of sensitive data they appear to hold (like email, phone, or SSN), and can apply the corresponding system tags for you. Once tags are applied, any masking policies attached to those tags kick in with no manual effort.

For example:

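-- Run classification on a table and view the results
-- (the call returns the recommended categories as JSON; no tags are applied)
CALL SYSTEM$CLASSIFY('customers', null);

-- Run classification again and automatically apply the recommended system tags
-- (SNOWFLAKE.CORE.SEMANTIC_CATEGORY and SNOWFLAKE.CORE.PRIVACY_CATEGORY)
CALL SYSTEM$CLASSIFY('customers', {'auto_tag': true});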

Now, your customers.email column may automatically get tagged, for example with the system tag SEMANTIC_CATEGORY = 'EMAIL'. If that tag already has a masking policy attached, the column is masked without any further work. No more chasing down every field yourself – and probably missing a few because those pesky developers keep adding fields without talking to us.
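
How the policy gets onto the classification tag is worth spelling out. The sketch below is one way to do it, assuming your role is allowed to set masking policies on Snowflake's system tags; the policy name classified_string_mask and the role PII_READER are made up for illustration. It uses SYSTEM$GET_TAG_ON_CURRENT_COLUMN so that a single policy can react to the tag value the classifier assigned.

```sql
-- Hypothetical policy: mask a STRING column only when classification
-- tagged it as EMAIL and the querying session lacks the PII_READER role.
CREATE MASKING POLICY classified_string_mask
  AS (val STRING)
  RETURNS STRING ->
    CASE
      WHEN SYSTEM$GET_TAG_ON_CURRENT_COLUMN('snowflake.core.semantic_category') = 'EMAIL'
           AND NOT IS_ROLE_IN_SESSION('PII_READER')
        THEN '***MASKED***'
      ELSE val
    END;

-- Attach the policy to the system tag that classification applies.
ALTER TAG snowflake.core.semantic_category SET MASKING POLICY classified_string_mask;
```

Branching on the tag value keeps you at one policy per data type, which matters because a tag can only hold one masking policy for each data type. Treat this as a starting point and confirm the privileges and behavior in your own account before relying on it.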


Why This Matters

  • Speed: Classification accelerates tagging dramatically.
  • Consistency: Lower risk of missing sensitive columns when onboarding new data sources.
  • Scalability: Ideal for enterprises with hundreds of schemas and evolving workloads.
  • Auditability: You can now report on what’s masked and why with less manual tracking.

Practical Tips and Gotchas

  1. Classification ≠ Perfect. Automated classification works well for obvious fields (emails, phone numbers) but may miss business-specific PII. Always validate the results.
  2. Audit Regularly. Just because tags were auto-assigned once doesn’t mean coverage stays complete. Schema drift can introduce new columns that are untagged, and therefore unmasked, until classification runs again.
  3. Custom Tagging Still Matters. If your org has its own data sensitivity rules (e.g., “Employee Badge ID”), you’ll still need custom tags + masking policies.
  4. Role Awareness. Remember, masking policies are evaluated in the context of roles. Test thoroughly with all relevant roles before rolling out broadly.
  5. Performance Impact. Masking happens at query runtime. Generally minimal, but don’t be surprised if certain heavily-masked queries see small slowdowns.
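
For the role testing in tip 4, a quick check can be as simple as running the same query under each role. The sketch below assumes the customers table and ssn_masking_policy from the earlier example, that both roles have SELECT on the table, and that compliance_officer is a hypothetical unmasked role.

```sql
-- Role named in the policy: should see the masked literal.
USE ROLE analyst;
SELECT ssn FROM customers LIMIT 5;

-- Any role not named in the policy (hypothetical name): should see real values.
USE ROLE compliance_officer;
SELECT ssn FROM customers LIMIT 5;
```

One thing to watch while testing: CURRENT_ROLE() only looks at the session's primary role. If your policies need to respect the role hierarchy, IS_ROLE_IN_SESSION is the function to reach for in the policy body.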

Bonus: Metadata Query to Audit Masked Columns

Here’s a query you can run to check which columns have tags and masking policies:

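-- Tagged columns in the current database, plus any policy attached to each tag
SELECT DISTINCT
    c.table_schema,
    c.table_name,
    c.column_name,
    t.tag_name,
    t.tag_value,
    p.policy_name
FROM information_schema.columns c
JOIN snowflake.account_usage.tag_references t
    ON c.table_catalog = t.object_database
   AND c.table_schema = t.object_schema
   AND c.table_name = t.object_name
   AND c.column_name = t.column_name
-- Policies are matched through the tag they are set on, not by name
LEFT JOIN snowflake.account_usage.policy_references p
    ON t.tag_database = p.tag_database
   AND t.tag_schema = p.tag_schema
   AND t.tag_name = p.tag_name
WHERE t.domain = 'COLUMN'
ORDER BY c.table_schema, c.table_name, c.column_name;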

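This will help you confirm coverage and spot gaps, such as tagged columns with no policy behind the tag. Keep in mind that ACCOUNT_USAGE views lag behind real time (by up to a couple of hours), so very recent changes may not show up yet.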
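
The other kind of gap is a column with no tag at all. A minimal sketch for finding those in the current database, using the same ACCOUNT_USAGE view (so the same latency applies):

```sql
-- Columns in the current database that have no tag set directly on them.
SELECT
    c.table_schema,
    c.table_name,
    c.column_name,
    c.data_type
FROM information_schema.columns c
LEFT JOIN snowflake.account_usage.tag_references t
    ON  t.domain          = 'COLUMN'
    AND t.object_deleted IS NULL
    AND t.object_database = c.table_catalog
    AND t.object_schema   = c.table_schema
    AND t.object_name     = c.table_name
    AND t.column_name     = c.column_name
WHERE t.tag_name IS NULL
  AND c.table_schema <> 'INFORMATION_SCHEMA'
ORDER BY c.table_schema, c.table_name, c.column_name;
```

Most untagged columns are perfectly innocent, so treat the output as a review list rather than an alarm. It is most useful right after new tables land.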


Wrapping Up

Tag-based masking was already a strong tool for Snowflake users. With classification and auto-tagging, it’s becoming a near plug-and-play solution for sensitive data governance. That said, automation isn’t a replacement for oversight—think of it as your fast first pass, not the final word.

If you’ve read SherpaOfData’s earlier blog on tag-based masking, this update should give you the context you need to take advantage of Snowflake’s latest capabilities. The takeaway? Less grunt work, more governance.


Next Steps for You:

  • Try classification on one non-critical schema to see how accurate it is in your environment – the real test for any functionality.
  • Update your existing masking/tagging framework to incorporate auto-tagging.
  • Share lessons learned internally—Snowflake’s governance features shine most when consistently applied across teams.

As with all posts about data masking, we have to remember one of the greatest musicals of all time – Phantom of the Opera. If you truly don’t know why, then do yourself a favor and watch the movie or see the show. It’s more than worth your time.