Handling audio data often involves processing conversations that contain personally identifiable information such as names, phone numbers, or financial details. Redacting PII helps you comply with privacy regulations like GDPR and CCPA/CPRA, protect your users’ sensitive data, and reduce the risk of data breaches when storing or sharing transcripts.
Usage
Add "pii_redaction": true to your request to redact all detected PII in the transcript. Sensitive entities are replaced with markers in the output.
Optional configuration
You can customize the behavior with pii_redaction_config:
entity_types: Preset or list of PII entity types to redact (e.g. ["GDPR"]). See Supported entity types for available presets.
processed_text_type: How to replace detected PII:
MASK: Each character is replaced by a mask (e.g. “John Smith” → #### #####).
MARKER: Placeholder labels like [NAME_1], [EMAIL_1]. The same entity always gets the same ID.
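To make the two replacement styles concrete, here is a minimal local sketch in Python. It is not the service's implementation: the `mask` and `redact` helpers, the `(surface_text, label)` input format, and the labels passed in are all assumptions for illustration, since detection and replacement actually happen on the server.

```python
import re


def mask(value: str) -> str:
    # Replace every non-whitespace character with '#', keeping spaces intact.
    return re.sub(r"\S", "#", value)


def redact(text: str, entities: list[tuple[str, str]], mode: str = "MARKER") -> str:
    """Illustrative only. entities holds (surface_text, label) pairs
    that a detector is assumed to have found already."""
    ids: dict[tuple[str, str], int] = {}  # (label, surface) -> marker ID
    counters: dict[str, int] = {}         # label -> highest ID used so far
    for surface, label in entities:
        if mode == "MASK":
            replacement = mask(surface)
        else:
            key = (label, surface)
            if key not in ids:
                counters[label] = counters.get(label, 0) + 1
                ids[key] = counters[label]
            replacement = f"[{label}_{ids[key]}]"
        # Every occurrence of the same surface text gets the same replacement.
        text = text.replace(surface, replacement)
    return text


sample = "The order is for John Smith. Yes, John Smith placed it."
found = [("John Smith", "NAME")]
print(redact(sample, found, "MASK"))
print(redact(sample, found, "MARKER"))
```

In this sketch the marker ID is keyed on the label and the exact surface text, which is one simple way to get the "same entity, same ID" behavior described above.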
Example body
Pre-recorded
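As an illustration, a request body enabling redaction might be assembled like this in Python. The `audio_url` field and its value are placeholders, and nesting `entity_types` and `processed_text_type` under `pii_redaction_config` is inferred from the options described above, so check the API reference for the exact schema.

```python
import json

# Hypothetical request body: "audio_url" and its value are placeholders,
# and the nesting of the config keys is an assumption.
body = {
    "audio_url": "https://example.com/call.wav",
    "pii_redaction": True,
    "pii_redaction_config": {
        "entity_types": ["GDPR"],          # a preset, or a list of specific types
        "processed_text_type": "MARKER",   # or "MASK"
    },
}

payload = json.dumps(body, indent=2)
print(payload)
```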
Example output
Without PII redaction (raw transcript):
Hi, I’m calling about the order for John Smith. Can you confirm the delivery to john.smith@company.com? Yes, John Smith placed it yesterday.
With PII redaction (processed_text_type="MASK"):
Hi, I’m calling about the order for #### #####. Can you confirm the delivery to ######################? Yes, #### ##### placed it yesterday.
With PII redaction (processed_text_type="MARKER"):
Hi, I’m calling about the order for [NAME_1]. Can you confirm the delivery to [EMAIL_1]? Yes, [NAME_1] placed it yesterday.
The same entity mentioned multiple times receives the same marker ID (e.g. “John Smith” becomes [NAME_1] both times), so you can track references across the transcript while keeping sensitive data redacted.
This consistency is also useful for downstream tasks using LLMs, which can reason about entities (e.g. “the person referred to as [NAME_1]”) without ever seeing the raw PII.
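One way to use this consistency downstream is to group marker mentions by ID. The sketch below assumes markers always look like `[LABEL_N]` (uppercase label, underscore, number), which matches the examples above but is not a documented guarantee; `collect_markers` is a hypothetical helper.

```python
import re
from collections import Counter

# Assumed marker shape: [LABEL_N], e.g. [NAME_1] or [EMAIL_1].
MARKER_RE = re.compile(r"\[([A-Z][A-Z_]*)_(\d+)\]")


def collect_markers(redacted: str) -> Counter:
    """Count how often each marker (such as NAME_1) appears in the text."""
    return Counter(f"{label}_{num}" for label, num in MARKER_RE.findall(redacted))


redacted = (
    "Hi, I'm calling about the order for [NAME_1]. "
    "Can you confirm the delivery to [EMAIL_1]? "
    "Yes, [NAME_1] placed it yesterday."
)
counts = collect_markers(redacted)
print(counts)
```

Because the counts are keyed on the marker rather than the original value, this kind of analysis never needs access to the raw PII.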
Supported entity types
When using entity_types, you can pass a preset group or a list of specific types.
Preset groups
GDPR — EU General Data Protection Regulation
Personal data entities covered by the EU GDPR.
Includes: AGE, DRIVER_LICENSE, DOB, EMAIL_ADDRESS, HEALTHCARE_NUMBER, IP_ADDRESS, LANGUAGE, LOCATION, LOCATION_ADDRESS, LOCATION_ADDRESS_STREET, LOCATION_CITY, LOCATION_COORDINATE, LOCATION_COUNTRY, LOCATION_STATE, LOCATION_ZIP, NAME, NAME_FAMILY, NAME_GIVEN, NAME_MEDICAL_PROFESSIONAL, NUMERICAL_PII, PASSPORT_NUMBER, PHONE_NUMBER, SSN, URL, USERNAME, VEHICLE_ID, BANK_ACCOUNT, CREDIT_CARD, CREDIT_CARD_EXPIRATION, CVV, BLOOD_TYPE, CONDITION, DRUG, INJURY, MEDICAL_PROCESS
GDPR_SENSITIVE — GDPR Article 9 sensitive data
Sensitive personal data as defined by GDPR Article 9.
Includes: GENDER, LANGUAGE, ORIGIN, PHYSICAL_ATTRIBUTE, POLITICAL_AFFILIATION, RELIGION, SEXUALITY
HIPAA_SAFE_HARBOR — USA HIPAA Safe Harbor
De-identification standard for US healthcare data under HIPAA.
Includes: ACCOUNT_NUMBER, AGE, DATE, DATE_INTERVAL, DOB, DRIVER_LICENSE, EMAIL_ADDRESS, HEALTHCARE_NUMBER, IP_ADDRESS, LOCATION, LOCATION_ADDRESS, LOCATION_ADDRESS_STREET, LOCATION_CITY, LOCATION_COORDINATE, LOCATION_ZIP, NAME, NAME_FAMILY, NAME_GIVEN, NAME_MEDICAL_PROFESSIONAL, NUMERICAL_PII, PASSPORT_NUMBER, PHONE_NUMBER, SSN, URL, VEHICLE_ID, BANK_ACCOUNT, CREDIT_CARD, CREDIT_CARD_EXPIRATION, CVV
HEALTH_INFORMATION — Health-related entities
PCI — Payment Card Industry
PCI-specific financial entities only.
Includes: BANK_ACCOUNT, CREDIT_CARD, CREDIT_CARD_EXPIRATION, CVV, ROUTING_NUMBER
CPRA — California Privacy Rights Act
Personal data entities covered by the California Privacy Rights Act.
Includes: DOB, DRIVER_LICENSE, EMAIL_ADDRESS, GENDER, HEALTHCARE_NUMBER, IP_ADDRESS, LOCATION_ADDRESS, LOCATION_ADDRESS_STREET, LOCATION_CITY, LOCATION_COORDINATE, LOCATION_ZIP, NAME, NAME_FAMILY, NAME_GIVEN, NAME_MEDICAL_PROFESSIONAL, NUMERICAL_PII, ORIGIN, PASSPORT_NUMBER, PASSWORD, PHONE_NUMBER, PHYSICAL_ATTRIBUTE, POLITICAL_AFFILIATION, RELIGION, SEXUALITY, SSN, URL, USERNAME, VEHICLE_ID, BANK_ACCOUNT, CREDIT_CARD, CREDIT_CARD_EXPIRATION, CVV, BLOOD_TYPE, CONDITION, DRUG, INJURY, MEDICAL_PROCESS
APPI — Japan Act on the Protection of Personal Information
Personal data entities covered by Japan’s APPI.
Includes: AGE, DOB, DRIVER_LICENSE, EMAIL_ADDRESS, HEALTHCARE_NUMBER, IP_ADDRESS, LOCATION, LOCATION_ADDRESS, LOCATION_ADDRESS_STREET, LOCATION_CITY, LOCATION_COORDINATE, LOCATION_COUNTRY, LOCATION_STATE, LOCATION_ZIP, NAME, NAME_FAMILY, NAME_GIVEN, NAME_MEDICAL_PROFESSIONAL, NUMERICAL_PII, OCCUPATION, ORGANIZATION, ORGANIZATION_MEDICAL_FACILITY, PASSPORT_NUMBER, PASSWORD, PHONE_NUMBER, SSN, VEHICLE_ID, BANK_ACCOUNT, CREDIT_CARD, CREDIT_CARD_EXPIRATION, CVV, DRUG
APPI_SENSITIVE — APPI sensitive data
Sensitive data subset under Japan’s APPI.
Includes: CONDITION, DRUG, GENDER, INJURY, LANGUAGE, MARITAL_STATUS, MEDICAL_PROCESS, ORIGIN, PHYSICAL_ATTRIBUTE, POLITICAL_AFFILIATION, RELIGION, SEXUALITY
QUEBEC_PRIVACY_ACT — Quebec Law 25
Personal data entities covered by Quebec’s Privacy Act / Law 25.
Includes: AGE, DATE, DATE_INTERVAL, DOB, DRIVER_LICENSE, EMAIL_ADDRESS, HEALTHCARE_NUMBER, IP_ADDRESS, LOCATION_ADDRESS, LOCATION_ADDRESS_STREET, LOCATION_CITY, LOCATION_COORDINATE, LOCATION_ZIP, NAME, NAME_FAMILY, NAME_GIVEN, NAME_MEDICAL_PROFESSIONAL, NUMERICAL_PII, OCCUPATION, ORGANIZATION, ORGANIZATION_MEDICAL_FACILITY, ORIGIN, PASSPORT_NUMBER, PHONE_NUMBER, POLITICAL_AFFILIATION, RELIGION, SSN, BANK_ACCOUNT, CREDIT_CARD, CREDIT_CARD_EXPIRATION, CVV, BLOOD_TYPE, CONDITION, DRUG, INJURY, MEDICAL_PROCESS, STATISTICS
CORE_ENTITIES — Common PII types
Core set of the most commonly used PII entity types across all regulations.
CCI — Corporate Confidential Information
Business and corporate data entities.
Includes: ACCOUNT_NUMBER, DATE, DATE_INTERVAL, EMAIL_ADDRESS, FILENAME, LOCATION, LOCATION_ADDRESS, LOCATION_ADDRESS_STREET, LOCATION_CITY, LOCATION_COUNTRY, LOCATION_STATE, LOCATION_ZIP, MONEY, NUMERICAL_PII, OCCUPATION, ORGANIZATION, PASSWORD, TIME, URL, VEHICLE_ID, BANK_ACCOUNT, CREDIT_CARD, CREDIT_CARD_EXPIRATION, CVV, ROUTING_NUMBER, DRUG, MEDICAL_PROCESS
LIDI — Locally Identifiable Data
Entities that can identify individuals within a local context.
NUMERICAL_EXCL_PCI — Numerical PII (excluding PCI)
Numerical PII entities, excluding PCI-specific types.
Includes: ACCOUNT_NUMBER, NUMERICAL_PII, SSN, etc.
Excludes: CREDIT_CARD, CVV, BANK_ACCOUNT, ROUTING_NUMBER
Individual entity types
Core PII
| Entity | What it catches |
|---|---|
| NAME | Full person names |
| EMAIL_ADDRESS | Email addresses |
| PHONE_NUMBER | Phone/fax numbers |
| LOCATION_ADDRESS | Full mailing addresses |
| DATE | Specific dates |
| DOB | Dates of birth |
| SSN | Social security numbers (+ intl. equivalents) |
| PASSPORT_NUMBER | Passport numbers |
| DRIVER_LICENSE | Driver’s license numbers |
| IP_ADDRESS | IPv4 / IPv6 addresses |
| URL | Web addresses |
| USERNAME | Logins, handles |
Financial / PCI
| Entity | What it catches |
|---|---|
| CREDIT_CARD | Credit card numbers (incl. last 4) |
| BANK_ACCOUNT | Bank accounts, IBAN |
| CVV | Card verification codes |
Sensitive / GDPR Article 9
| Entity | What it catches |
|---|---|
| ORIGIN | Nationality, ethnicity |
| RELIGION | Religious affiliation |
| POLITICAL_AFFILIATION | Political opinions |
| SEXUALITY | Sexual orientation |
| PHYSICAL_ATTRIBUTE | Race, physical traits |
Healthcare (HIPAA / GDPR)
| Entity | What it catches |
|---|---|
| CONDITION | Medical conditions, diseases |
| DRUG | Medications, supplements |
| HEALTHCARE_NUMBER | Health plan IDs, medical record numbers |
| MEDICAL_PROCESS | Treatments, procedures |