Transforms
Transforms normalise a plaintext value before it is passed to the HMAC function. Normalisation ensures that logically equivalent inputs — differing only in case or whitespace — produce the same blind index fingerprint.
Without transforms, searching for "jane@example.com" would not match a record saved with "Jane@Example.com", even though they refer to the same address.
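The effect can be sketched in a few lines. This is an illustration only, not the library's implementation: the class `BlindIndexSketch`, its method names, and the choice of HMAC-SHA256 over UTF-8 bytes are all assumptions made for the example.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class BlindIndexSketch
{
    // Hypothetical helper: lowercase and trim the value, then HMAC the result.
    public static string Fingerprint(string value, byte[] key)
    {
        string normalised = value.ToLowerInvariant().Trim();
        return RawFingerprint(normalised, key);
    }

    // HMAC-SHA256 (an assumed algorithm) of the value exactly as given, with no normalisation.
    public static string RawFingerprint(string value, byte[] key)
    {
        byte[] mac = HMACSHA256.HashData(key, Encoding.UTF8.GetBytes(value));
        return Convert.ToHexString(mac);
    }
}
```

With any fixed key, `Fingerprint("Jane@Example.com", key)` and `Fingerprint("jane@example.com", key)` return the same string, while the two `RawFingerprint` calls on the same inputs do not.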
Configuring Transforms
Transforms are declared on the [BlindIndex] attribute as an ordered array of strings:
```csharp
[BlindIndex(
    CompanionProperty = nameof(EmailHash),
    Transforms = ["lowercase", "trim"])]
public string Email { get; set; } = "";
```

Transforms are applied left to right. The example above first converts to lowercase, then strips surrounding whitespace.
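Left-to-right application of named transforms can be pictured as a lookup followed by a loop. The sketch below assumes a simple name-to-function table. `TransformPipelineSketch`, `Known` and `Apply` are hypothetical names for illustration, and only two of the built-ins are included. How the library actually resolves transform names is not covered here.

```csharp
using System;
using System.Collections.Generic;

public static class TransformPipelineSketch
{
    // Hypothetical name-to-function table; only two built-ins are sketched.
    private static readonly Dictionary<string, Func<string, string>> Known = new()
    {
        ["lowercase"] = v => v.ToLowerInvariant(),
        ["trim"] = v => v.Trim(),
    };

    // Applies the named transforms left to right, in declaration order.
    public static string Apply(string value, params string[] names)
    {
        foreach (string name in names)
        {
            if (!Known.TryGetValue(name, out var transform))
                throw new ArgumentException($"Unknown transform '{name}'.", nameof(names));
            value = transform(value);
        }
        return value;
    }
}
```

For example, `TransformPipelineSketch.Apply("  Jane@Example.COM ", "lowercase", "trim")` returns `"jane@example.com"`.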
Built-In Transforms
lowercase
Converts the entire value to lowercase using the invariant culture.
| Input | Output |
|---|---|
| "Jane@Example.COM" | "jane@example.com" |
| "ACME Corp" | "acme corp" |
| "already lower" | "already lower" |
Use on: email addresses, usernames, domain names.
trim
Removes leading and trailing whitespace (spaces, tabs, newlines).
| Input | Output |
|---|---|
| " jane@example.com " | "jane@example.com" |
| "\tjane\n" | "jane" |
| "no change" | "no change" |
Use on: any field where leading or trailing whitespace might appear from user input or data imports.
alphanumeric
Removes all characters that are not ASCII letters (a–z, A–Z) or digits (0–9). Useful for normalising names or identifiers that might contain punctuation.
| Input | Output |
|---|---|
| "O'Brien" | "OBrien" |
| "Smith-Jones" | "SmithJones" |
| "+1 (555) 867-5309" | "15558675309" |
Combine with lowercase for case-insensitive matching
alphanumeric alone does not change case. Use ["lowercase", "alphanumeric"] if you want case-insensitive matching.
digits
Retains only ASCII digit characters (0–9). All other characters are removed. Designed for phone numbers, tax IDs, and other numeric identifiers.
| Input | Output |
|---|---|
| "+1 (555) 867-5309" | "15558675309" |
| "SSN: 123-45-6789" | "123456789" |
| "GB VAT 123 456 789" | "123456789" |
last4
Retains only the last 4 characters of the value as it stands after any preceding transforms have run. Values shorter than 4 characters are returned unchanged. Commonly used for partial credit card or SSN matching.
| Input | Output |
|---|---|
| "4111111111111111" | "1111" |
| "123-45-6789" | "6789" |
| "AB12" | "AB12" |
| "AB" | "AB" (shorter than 4, returned as-is) |
Combine last4 with digits for card numbers
Use ["digits", "last4"] to strip formatting characters before taking the last four digits. This ensures "4111-1111-1111-1111" and "4111111111111111" produce the same result.
first_char
Retains only the first character of the value. Useful for bucketed or initial-based lookups.
| Input | Output |
|---|---|
| "Jane" | "J" |
| "jane" | "j" |
| "" | "" (empty string is preserved) |
Low cardinality warning
For typical Latin-alphabet data, first_char yields only a few dozen distinct values: 26 letters per case (26 in total if lowercase runs first), plus digits and symbols. Such a low-cardinality blind index is susceptible to frequency analysis. See Security Considerations.
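To see why low cardinality matters, it can help to count how many values fall into each bucket. The sketch below is illustrative only: `CardinalitySketch`, `FirstChar` and `BucketSizes` are hypothetical names, and `FirstChar` simply follows the behaviour described in the table above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CardinalitySketch
{
    // first_char as described above: keep the first character, preserve "".
    public static string FirstChar(string value) =>
        value.Length == 0 ? "" : value.Substring(0, 1);

    // Counts how many values land in each first-character bucket.
    public static Dictionary<string, int> BucketSizes(IEnumerable<string> values) =>
        values.GroupBy(v => FirstChar(v))
              .ToDictionary(g => g.Key, g => g.Count());
}
```

Given the names "Jane", "John", "Julia", "Alice" and "jane", `BucketSizes` reports three buckets: "J" with 3 entries, "A" with 1 and "j" with 1. Because each bucket maps to a single fingerprint, an observer who sees only the index column can still see that one fingerprint is three times as common as the others.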
Transform Ordering
Transforms are applied in the order they are declared. Order matters.
Example: ["trim", "lowercase", "digits"]

```text
Input: " +1 (555) 867-5309 "
trim → "+1 (555) 867-5309"
lowercase → "+1 (555) 867-5309" (no letters, no change)
digits → "15558675309"
```

Example: ["digits", "last4"]

```text
Input: "4111-1111-1111-1111"
digits → "4111111111111111"
last4 → "1111"
```

Reversing the order hands last4 the formatted string first. Separators can then take up some of the four retained characters, leaving fewer than four digits once digits runs.
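The order dependence can be checked with a small sketch. `OrderingSketch` and its members are hypothetical stand-ins written from the descriptions of digits and last4 above, not the library's own code.

```csharp
using System;
using System.Linq;

public static class OrderingSketch
{
    // digits: keep ASCII 0-9 only.
    public static string Digits(string v) =>
        new string(v.Where(c => c >= '0' && c <= '9').ToArray());

    // last4: keep the last four characters; shorter values are returned as-is.
    public static string Last4(string v) =>
        v.Length <= 4 ? v : v.Substring(v.Length - 4);

    // Applies the transforms left to right.
    public static string Apply(string value, params Func<string, string>[] transforms) =>
        transforms.Aggregate(value, (current, t) => t(current));
}
```

For the input `"555-12-34"`, `Apply(input, Digits, Last4)` returns `"1234"`, whereas `Apply(input, Last4, Digits)` returns `"234"`: last4 first keeps `"2-34"`, and digits then drops the hyphen. For `"4111-1111-1111-1111"` both orders happen to return `"1111"`, which is why the difference is easy to miss in testing.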
Custom Transforms
Use WithTransform() to add inline custom transforms in the fluent API:
```csharp
// Inline custom transforms: no class or registration needed
var transformServices = new ServiceCollection();
var transformBuilder = transformServices.AddTayra(opts => opts.LicenseKey = licenseKey);
transformBuilder.Entity<IndexedCustomer>(e =>
{
    e.DataSubjectId(c => c.CustomerId);
    e.PersonalData(c => c.Email);
    e.BlindIndex(c => c.Email)
        .WithTransform(value => value.Split('@')[0]) // extract local part
        .WithLowercase()
        .StoredIn(c => c.EmailIndex);
});
```

Custom transforms are plain functions, so no class or registration is needed. They compose with the built-in transforms in the pipeline.
Custom Transform Rules
- The function must be pure: the same input always produces the same output.
- The function must not throw on an empty string.
- Transforms should be fast (no I/O, no allocations if avoidable).
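A quick smoke test can catch violations of the first two rules before a transform is wired into an index. The helper below is a hypothetical sketch (`CustomTransformChecks` and `LooksSafe` are not part of the library). It can only detect obvious problems; passing it does not prove that a function is pure.

```csharp
using System;

public static class CustomTransformChecks
{
    // Example custom transform from the text: keep the local part of an email address.
    public static string LocalPart(string value) => value.Split('@')[0];

    // Hypothetical smoke test: the transform must not throw on "" or on the
    // samples, and calling it twice on the same sample must give the same result.
    public static bool LooksSafe(Func<string, string> transform, params string[] samples)
    {
        try
        {
            transform("");
            foreach (string sample in samples)
            {
                if (transform(sample) != transform(sample)) return false;
            }
            return true;
        }
        catch (Exception)
        {
            return false;
        }
    }
}
```

`LooksSafe(LocalPart, "jane@example.com", "no-at-sign")` returns `true`, because `Split('@')[0]` yields an empty string for empty input rather than throwing. A transform such as `v => v.Substring(0, 1)` fails the check, since `Substring(0, 1)` throws on an empty string.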
Transform Reference Summary
| Name | Effect | Typical Use |
|---|---|---|
| lowercase | Converts to invariant lowercase | Email, username |
| trim | Removes leading/trailing whitespace | Any user-input field |
| alphanumeric | Keeps only [a-zA-Z0-9] | Names, identifiers |
| digits | Keeps only [0-9] | Phone numbers, tax IDs |
| last4 | Keeps last 4 characters | Card numbers, SSN suffix |
| first_char | Keeps first character only | Bucketed lookups |
See Also
- Blind Indexes Overview — How HMAC blind indexes work
- Security Considerations — Cardinality and frequency analysis risks
- Querying — Using blind indexes in EF Core and Marten queries
