Pre-recorded PII Redaction automatically detects and replaces sensitive entities (names, emails, addresses, etc.) in your transcript output. This feature is only available for pre-recorded transcription.

Handling audio data often involves processing conversations that contain personally identifiable information such as names, phone numbers, or financial details. Redacting PII helps you comply with privacy regulations like GDPR and CCPA/CPRA, protect your users’ sensitive data, and reduce the risk of data breaches when storing or sharing transcripts.

Usage

Add "pii_redaction": true to your request to redact all detected PII in the transcript. Sensitive entities will be replaced with markers in the output.
{
  "audio_url": "YOUR_AUDIO_URL",
  "pii_redaction": true
}

Optional configuration

You can customize the behavior with pii_redaction_config:
entity_types (string[])
Preset or list of PII entity types to redact (e.g. ["GDPR"]). See Supported entity types for available presets.

processed_text_type (enum, default: "MARKER")
How to replace detected PII:
  • MASK: Each character is replaced by a mask character (e.g. “John Smith” → #### #####).
  • MARKER: Each entity is replaced by a placeholder label such as [NAME_1] or [EMAIL_1]. Repeated mentions of the same entity share the same ID.
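
To make the difference between the two modes concrete, here is a minimal local sketch in Python. It is not the service's implementation: it assumes the entity spans are already known, and the helper names (find_spans, mask_text, marker_text) are hypothetical. It only illustrates how MASK preserves length while MARKER assigns one ID per distinct entity.

```python
from typing import Dict, List, Tuple

# A detected entity: (start, end, label). Spans are assumed not to overlap.
Span = Tuple[int, int, str]


def find_spans(text: str, value: str, label: str) -> List[Span]:
    """Hypothetical stand-in for detection: locate every exact occurrence of value."""
    spans: List[Span] = []
    start = text.find(value)
    while start != -1:
        spans.append((start, start + len(value), label))
        start = text.find(value, start + len(value))
    return spans


def mask_text(text: str, spans: List[Span]) -> str:
    """MASK mode: replace each non-space character inside a span with '#'."""
    chars = list(text)
    for start, end, _ in spans:
        for i in range(start, end):
            if not chars[i].isspace():
                chars[i] = "#"
    return "".join(chars)


def marker_text(text: str, spans: List[Span]) -> str:
    """MARKER mode: replace each span with [LABEL_N]; equal values reuse the same N."""
    ids: Dict[Tuple[str, str], int] = {}
    counters: Dict[str, int] = {}
    out: List[str] = []
    cursor = 0
    for start, end, label in sorted(spans):
        key = (label, text[start:end])
        if key not in ids:
            counters[label] = counters.get(label, 0) + 1
            ids[key] = counters[label]
        out.append(text[cursor:start])
        out.append(f"[{label}_{ids[key]}]")
        cursor = end
    out.append(text[cursor:])
    return "".join(out)


text = "Order for John Smith. Email john.smith@company.com. John Smith placed it."
spans = find_spans(text, "John Smith", "NAME") + find_spans(
    text, "john.smith@company.com", "EMAIL"
)
print(mask_text(text, spans))
print(marker_text(text, spans))  # Order for [NAME_1]. Email [EMAIL_1]. [NAME_1] placed it.
```

MASK keeps the transcript length and word shapes intact, which can matter if you align text with timestamps; MARKER loses the length but keeps entity identity, so repeated mentions stay linkable.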

Example body

{
  "audio_url": "YOUR_AUDIO_URL",
  "pii_redaction": true,
  "pii_redaction_config": {
    "entity_types": ["GDPR"],
    "processed_text_type": "MARKER"
  }
}
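
If you assemble this body in code, a small helper can keep the optional fields consistent. The sketch below is only an illustration: build_request_body is a hypothetical name, and the check against MASK and MARKER simply mirrors the two values documented above. How the body is sent (endpoint, authentication) is not covered on this page, so the example stops at serialization.

```python
import json
from typing import Any, Dict, List, Optional

# The two replacement modes documented for processed_text_type.
ALLOWED_TEXT_TYPES = {"MASK", "MARKER"}


def build_request_body(
    audio_url: str,
    entity_types: Optional[List[str]] = None,
    processed_text_type: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a request body with PII redaction enabled (hypothetical helper)."""
    body: Dict[str, Any] = {"audio_url": audio_url, "pii_redaction": True}
    config: Dict[str, Any] = {}
    if entity_types:
        config["entity_types"] = list(entity_types)
    if processed_text_type is not None:
        if processed_text_type not in ALLOWED_TEXT_TYPES:
            raise ValueError(f"unsupported processed_text_type: {processed_text_type!r}")
        config["processed_text_type"] = processed_text_type
    # Only attach the config object when at least one option was set.
    if config:
        body["pii_redaction_config"] = config
    return body


print(json.dumps(build_request_body("YOUR_AUDIO_URL", ["GDPR"], "MARKER"), indent=2))
```

Calling build_request_body("YOUR_AUDIO_URL") with no options yields the minimal body from the Usage section, leaving the defaults to the service.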

Example output

Without PII redaction (raw transcript):
Hi, I’m calling about the order for John Smith. Can you confirm the delivery to john.smith@company.com? Yes, John Smith placed it yesterday.
With PII redaction (processed_text_type="MASK"):
Hi, I’m calling about the order for #### #####. Can you confirm the delivery to ######################? Yes, #### ##### placed it yesterday.
With PII redaction (processed_text_type="MARKER"):
Hi, I’m calling about the order for [NAME_1]. Can you confirm the delivery to [EMAIL_1]? Yes, [NAME_1] placed it yesterday.
The same entity mentioned multiple times receives the same marker ID (e.g. “John Smith” becomes [NAME_1] both times), so you can track references across the transcript while keeping sensitive data redacted.
This consistency also helps downstream LLM tasks: a model can reason about entities (e.g. “the person referred to as [NAME_1]”) without ever seeing the raw PII.
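
Because marker IDs are stable within a transcript, you can count or group mentions without access to the original values. The following Python sketch assumes markers always follow the [LABEL_N] shape seen in the example output; the regular expression and the function name count_marker_mentions are illustrative assumptions, not part of the API.

```python
import re
from collections import Counter

# Assumed marker shape: uppercase label, underscore, numeric ID, in square brackets.
MARKER_RE = re.compile(r"\[([A-Z][A-Z_]*)_(\d+)\]")


def count_marker_mentions(redacted: str) -> Counter:
    """Count how many times each marker (e.g. NAME_1) appears in a redacted transcript."""
    return Counter(f"{label}_{num}" for label, num in MARKER_RE.findall(redacted))


redacted = (
    "Hi, I'm calling about the order for [NAME_1]. "
    "Can you confirm the delivery to [EMAIL_1]? Yes, [NAME_1] placed it yesterday."
)
print(count_marker_mentions(redacted))  # Counter({'NAME_1': 2, 'EMAIL_1': 1})
```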

Supported entity types

When using entity_types, you can pass a preset group or a list of specific types.

Preset groups

Personal data entities covered by the EU GDPR. Includes: AGE, DRIVER_LICENSE, DOB, EMAIL_ADDRESS, HEALTHCARE_NUMBER, IP_ADDRESS, LANGUAGE, LOCATION, LOCATION_ADDRESS, LOCATION_ADDRESS_STREET, LOCATION_CITY, LOCATION_COORDINATE, LOCATION_COUNTRY, LOCATION_STATE, LOCATION_ZIP, NAME, NAME_FAMILY, NAME_GIVEN, NAME_MEDICAL_PROFESSIONAL, NUMERICAL_PII, PASSPORT_NUMBER, PHONE_NUMBER, SSN, URL, USERNAME, VEHICLE_ID, BANK_ACCOUNT, CREDIT_CARD, CREDIT_CARD_EXPIRATION, CVV, BLOOD_TYPE, CONDITION, DRUG, INJURY, MEDICAL_PROCESS
Sensitive personal data as defined by GDPR Article 9. Includes: GENDER, LANGUAGE, ORIGIN, PHYSICAL_ATTRIBUTE, POLITICAL_AFFILIATION, RELIGION, SEXUALITY
De-identification standard for US healthcare data under HIPAA. Includes: ACCOUNT_NUMBER, AGE, DATE, DATE_INTERVAL, DOB, DRIVER_LICENSE, EMAIL_ADDRESS, HEALTHCARE_NUMBER, IP_ADDRESS, LOCATION, LOCATION_ADDRESS, LOCATION_ADDRESS_STREET, LOCATION_CITY, LOCATION_COORDINATE, LOCATION_ZIP, NAME, NAME_FAMILY, NAME_GIVEN, NAME_MEDICAL_PROFESSIONAL, NUMERICAL_PII, PASSPORT_NUMBER, PHONE_NUMBER, SSN, URL, VEHICLE_ID, BANK_ACCOUNT, CREDIT_CARD, CREDIT_CARD_EXPIRATION, CVV
PCI-specific financial entities only. Includes: BANK_ACCOUNT, CREDIT_CARD, CREDIT_CARD_EXPIRATION, CVV, ROUTING_NUMBER
Personal data entities covered by the California Privacy Rights Act. Includes: DOB, DRIVER_LICENSE, EMAIL_ADDRESS, GENDER, HEALTHCARE_NUMBER, IP_ADDRESS, LOCATION_ADDRESS, LOCATION_ADDRESS_STREET, LOCATION_CITY, LOCATION_COORDINATE, LOCATION_ZIP, NAME, NAME_FAMILY, NAME_GIVEN, NAME_MEDICAL_PROFESSIONAL, NUMERICAL_PII, ORIGIN, PASSPORT_NUMBER, PASSWORD, PHONE_NUMBER, PHYSICAL_ATTRIBUTE, POLITICAL_AFFILIATION, RELIGION, SEXUALITY, SSN, URL, USERNAME, VEHICLE_ID, BANK_ACCOUNT, CREDIT_CARD, CREDIT_CARD_EXPIRATION, CVV, BLOOD_TYPE, CONDITION, DRUG, INJURY, MEDICAL_PROCESS
Personal data entities covered by Japan’s APPI. Includes: AGE, DOB, DRIVER_LICENSE, EMAIL_ADDRESS, HEALTHCARE_NUMBER, IP_ADDRESS, LOCATION, LOCATION_ADDRESS, LOCATION_ADDRESS_STREET, LOCATION_CITY, LOCATION_COORDINATE, LOCATION_COUNTRY, LOCATION_STATE, LOCATION_ZIP, NAME, NAME_FAMILY, NAME_GIVEN, NAME_MEDICAL_PROFESSIONAL, NUMERICAL_PII, OCCUPATION, ORGANIZATION, ORGANIZATION_MEDICAL_FACILITY, PASSPORT_NUMBER, PASSWORD, PHONE_NUMBER, SSN, VEHICLE_ID, BANK_ACCOUNT, CREDIT_CARD, CREDIT_CARD_EXPIRATION, CVV, DRUG
Sensitive data subset under Japan’s APPI. Includes: CONDITION, DRUG, GENDER, INJURY, LANGUAGE, MARITAL_STATUS, MEDICAL_PROCESS, ORIGIN, PHYSICAL_ATTRIBUTE, POLITICAL_AFFILIATION, RELIGION, SEXUALITY
Personal data entities covered by Quebec’s Privacy Act / Law 25. Includes: AGE, DATE, DATE_INTERVAL, DOB, DRIVER_LICENSE, EMAIL_ADDRESS, HEALTHCARE_NUMBER, IP_ADDRESS, LOCATION_ADDRESS, LOCATION_ADDRESS_STREET, LOCATION_CITY, LOCATION_COORDINATE, LOCATION_ZIP, NAME, NAME_FAMILY, NAME_GIVEN, NAME_MEDICAL_PROFESSIONAL, NUMERICAL_PII, OCCUPATION, ORGANIZATION, ORGANIZATION_MEDICAL_FACILITY, ORIGIN, PASSPORT_NUMBER, PHONE_NUMBER, POLITICAL_AFFILIATION, RELIGION, SSN, BANK_ACCOUNT, CREDIT_CARD, CREDIT_CARD_EXPIRATION, CVV, BLOOD_TYPE, CONDITION, DRUG, INJURY, MEDICAL_PROCESS, STATISTICS
Core set of the most commonly used PII entity types across all regulations.
Business and corporate data entities. Includes: ACCOUNT_NUMBER, DATE, DATE_INTERVAL, EMAIL_ADDRESS, FILENAME, LOCATION, LOCATION_ADDRESS, LOCATION_ADDRESS_STREET, LOCATION_CITY, LOCATION_COUNTRY, LOCATION_STATE, LOCATION_ZIP, MONEY, NUMERICAL_PII, OCCUPATION, ORGANIZATION, PASSWORD, TIME, URL, VEHICLE_ID, BANK_ACCOUNT, CREDIT_CARD, CREDIT_CARD_EXPIRATION, CVV, ROUTING_NUMBER, DRUG, MEDICAL_PROCESS
Entities that can identify individuals within a local context.
Numerical PII entities, excluding PCI-specific types. Includes: ACCOUNT_NUMBER, NUMERICAL_PII, SSN, etc. Excludes: CREDIT_CARD, CVV, BANK_ACCOUNT, ROUTING_NUMBER

Individual entity types

Core PII

Entity | What it catches
NAME | Full person names
EMAIL_ADDRESS | Email addresses
PHONE_NUMBER | Phone/fax numbers
LOCATION_ADDRESS | Full mailing addresses
DATE | Specific dates
DOB | Dates of birth
SSN | Social security numbers (+ intl. equivalents)
PASSPORT_NUMBER | Passport numbers
DRIVER_LICENSE | Driver’s license numbers
IP_ADDRESS | IPv4 / IPv6 addresses
URL | Web addresses
USERNAME | Logins, handles

Financial / PCI

Entity | What it catches
CREDIT_CARD | Credit card numbers (incl. last 4)
BANK_ACCOUNT | Bank accounts, IBAN
CVV | Card verification codes

Sensitive / GDPR Article 9

Entity | What it catches
ORIGIN | Nationality, ethnicity
RELIGION | Religious affiliation
POLITICAL_AFFILIATION | Political opinions
SEXUALITY | Sexual orientation
PHYSICAL_ATTRIBUTE | Race, physical traits

Healthcare (HIPAA / GDPR)

Entity | What it catches
CONDITION | Medical conditions, diseases
DRUG | Medications, supplements
HEALTHCARE_NUMBER | Health plan IDs, medical record numbers
MEDICAL_PROCESS | Treatments, procedures