Mask email address for GDPR reasons

By FoxLearn 1/14/2025 9:51:16 AM
With the introduction of the General Data Protection Regulation (GDPR), companies must be extra cautious about handling personal data, including email addresses.

One effective way to protect email addresses is by hashing them before storing them in logs or databases.

How to mask/hide an email address in C#?

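In this post, we will hash email addresses using the SHA256 algorithm, which converts each email into a fixed-length hash. If a log file containing hashed email addresses is exposed, the original addresses cannot be read from it directly. Keep in mind that a plain hash is pseudonymization rather than anonymization: anyone who already knows or can guess an address can hash it and compare the results.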

SHA256 Hashing Email Addresses for GDPR reasons

We’ll create an extension method to hash email addresses using SHA256 and append a fake domain name to make the result resemble an email address while ensuring privacy.

using System.Security.Cryptography;
using System.Text;

namespace FoxLearn
{
    public static class StringFormatter
    {
        // Extension method to hash the email address and append a fake domain
        public static string MaskEmail(this string email)
        {
            return ComputeSha256Hex(email) + "@domain.com";
        }

        // Computes the SHA256 hash of the input and returns it as a lowercase hex string
        private static string ComputeSha256Hex(string input)
        {
            // SHA256.Create() is used instead of SHA256Managed, which is obsolete in current .NET
            using (SHA256 sha256 = SHA256.Create())
            {
                StringBuilder hash = new StringBuilder();
                byte[] hashArray = sha256.ComputeHash(Encoding.UTF8.GetBytes(input));
                foreach (byte b in hashArray)
                {
                    // "x2" writes two hex digits per byte; plain "x" would drop leading zeros
                    hash.Append(b.ToString("x2"));
                }
                return hash.ToString();
            }
        }
    }
}

The following example shows how to use the MaskEmail extension method to hash an email address:

string email = "user@example.com"; // placeholder address
string maskedEmail = email.MaskEmail();

// Result: the hex-encoded SHA256 hash of the address, followed by "@domain.com"

Under the new GDPR rules, it's crucial to handle personal data, including email addresses, with care, especially in log files. Sharing log files containing email addresses with third parties, even for unrelated issues, is not advisable. While removing personal data from logs is ideal, it’s often impractical, so pseudonymization or data masking methods become important for ensuring compliance.
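Because a plain SHA256 hash of a known address can be recomputed by anyone, one common way to strengthen pseudonymization is a keyed hash. The following is a minimal sketch, assuming a secret key that is stored separately from the logs; the class name EmailPseudonymizer, the method Pseudonymize, and the trim-and-lowercase normalization step are hypothetical choices for illustration and are not part of the original code.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class EmailPseudonymizer
{
    // Hypothetical helper: returns a keyed pseudonym for an email address.
    // Assumption: secretKey is kept secret and is never written to the logs.
    public static string Pseudonymize(string email, byte[] secretKey)
    {
        // Assumption: normalize so that "User@Example.com " and
        // "user@example.com" map to the same pseudonym.
        string normalized = email.Trim().ToLowerInvariant();

        using (var hmac = new HMACSHA256(secretKey))
        {
            byte[] hash = hmac.ComputeHash(Encoding.UTF8.GetBytes(normalized));

            var hex = new StringBuilder(hash.Length * 2);
            foreach (byte b in hash)
            {
                // Two hex digits per byte, so leading zeros are kept
                hex.Append(b.ToString("x2"));
            }

            // Same fake domain as in the SHA256 example
            return hex.Append("@domain.com").ToString();
        }
    }
}
```

The same address always produces the same pseudonym for a given key, so log entries can still be correlated, while someone without the key cannot confirm a guessed address by hashing it.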

C# Mask email address for GDPR reasons

This C# extension method masks email addresses based on specific patterns:

  • If the input is not an email, the entire string is masked (e.g., "this string" becomes "***********").
  • If the email's local part (before the '@') is shorter than 4 characters, the whole email is replaced with the fixed string "*@*.*".
  • For other emails, only the first and last characters of the local part and of the domain name (the part before the first dot) are shown, with the rest masked (e.g., the result looks like "s****y@s*****e.com").

The following C# extension method implements these rules:

using System;
using System.Text.RegularExpressions;

namespace FoxLearn
{
    public static class EmailMasker
    {
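        // Regular expression pattern that matches the characters between the first and last
        // character of the local part, and of the domain name before the first dot
        private static readonly string _pattern = @"(?<=[\w]{1})[\w-\._\+%\\]*(?=[\w]{1}@)|(?<=@[\w]{1})[\w-_\+%]*(?=[\w]{1}\.)";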

        // Extension method to mask email addresses based on defined rules
        public static string MaskEmail(this string email)
        {
            // If the string doesn't contain an '@', mask the entire string
            if (!email.Contains("@"))
                return new string('*', email.Length);

            // If the local part of the email is less than 4 characters, mask the entire email
            if (email.Split('@')[0].Length < 4)
                return @"*@*.*";

            // Otherwise, apply regex to mask the email, except for the first and last characters of the local and domain parts
            return Regex.Replace(email, _pattern, m => new string('*', m.Length));
        }
    }
}

Usage

string email = "johndoe@example.com";
string maskedEmail = email.MaskEmail();
// Result: j*****e@e*****e.com

In this example, the MaskEmail method applies the following masking rules to the email address:

  • If the input is not an email (doesn’t contain '@'), it masks the entire string.
  • If the local part (before '@') is shorter than 4 characters, it returns the generic masked format "*@*.*".
  • For all other emails, it uses a regular expression to mask the email, preserving the first and last characters of both the local and domain parts.
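To make the three rules easy to check, here is a small sketch that expresses them with plain string operations instead of a regular expression. The names EmailMaskSketch, Mask, and MaskMiddle are hypothetical, and the sketch only approximates the regex version: it assumes the domain name is everything between '@' and the first dot, and it does not restrict which characters may appear.

```csharp
using System;

public static class EmailMaskSketch
{
    // Keeps the first and last character of a segment and stars out the rest.
    // Segments of one or two characters are returned unchanged.
    private static string MaskMiddle(string segment)
    {
        if (segment.Length <= 2)
            return segment;

        return segment[0]
            + new string('*', segment.Length - 2)
            + segment[segment.Length - 1];
    }

    public static string Mask(string input)
    {
        // Rule 1: not an email (no '@'), so mask every character
        int at = input.IndexOf('@');
        if (at < 0)
            return new string('*', input.Length);

        // Rule 2: local part shorter than 4 characters, so return the generic mask
        string local = input.Substring(0, at);
        if (local.Length < 4)
            return "*@*.*";

        // Rule 3: keep first and last character of the local part and domain name
        string rest = input.Substring(at + 1);
        int dot = rest.IndexOf('.');
        if (dot < 0)
            return MaskMiddle(local) + "@" + rest; // no dot: leave the domain as-is

        return MaskMiddle(local) + "@" + MaskMiddle(rest.Substring(0, dot)) + rest.Substring(dot);
    }
}
```

With placeholder inputs, Mask("this string") returns "***********", Mask("abc@example.com") returns "*@*.*", and Mask("johndoe@example.com") returns "j*****e@e*****e.com".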