Appearance
Best Practices
Recommendations for integrating Questa Anonymizer in production.
Production runs self-hosted. These integration patterns (retries, high-volume chunking, secure map storage) apply to a production self-hosted instance. The hosted demo (
demo.questa-ai.online) is for evaluation only. See Hosted vs Self-Hosted.
1. Filter Entity Types
Running with all entity types enabled increases latency and false positives. Scope detection to the entities relevant to your use case.
python
# Scope to relevant entity types only
payload = {
"text": text,
"entities": "PERSON_NAME,EMAIL_ADDRESS,PHONE_NUMBER"
}| Use Case | Recommended Entities |
|---|---|
| Customer support logs | PERSON_NAME,EMAIL_ADDRESS,PHONE_NUMBER,ADDRESS |
| Financial documents | CREDIT_CARD,IBAN,VAT_NUMBER,NATIONAL_ID |
| Medical records | PERSON_NAME,DATE,ADDRESS,NATIONAL_ID |
| HR documents | PERSON_NAME,EMAIL_ADDRESS,PHONE_NUMBER,ADDRESS,NATIONAL_ID |
| Source code / logs | EMAIL_ADDRESS,API_KEY,LICENSE_KEY,USERNAME |
2. Handle Errors Gracefully
python
import time
def anonymize_with_retry(client, payload, max_retries=3):
for attempt in range(max_retries):
try:
response = client.post("/anonymize/text", json=payload)
if response.status_code == 200:
return response.json()
if response.status_code == 401:
raise PermissionError("Bearer token is missing or invalid")
if response.status_code == 429:
retry_after = int(response.headers.get("Retry-After", "5"))
time.sleep(retry_after)
continue
if response.status_code >= 500:
time.sleep(2 ** attempt)
continue
response.raise_for_status()
except requests.exceptions.RequestException as e:
if attempt == max_retries - 1:
raise
time.sleep(2 ** attempt)3. Use Start Index for Chunked Processing
When processing text in chunks, pass start_index to ensure placeholder numbering is globally unique.
python
offset = 0
for chunk in split_into_chunks(large_text, chunk_size=10000):
result = anonymize_text(chunk, start_index=offset)
offset += result["entities"]4. Store De-anonymisation Maps Securely
If you need to reverse anonymization, store the map in an encrypted database. Never expose the map to untrusted users.
python
import json
from cryptography.fernet import Fernet
key = Fernet.generate_key()
cipher = Fernet(key)
encrypted = cipher.encrypt(json.dumps(result["map"]).encode())5. Implement Exponential Backoff
python
import time
from functools import wraps
def retry_with_backoff(max_retries=5, base_delay=1.0):
def decorator(func):
@wraps(func)
def wrapper(*args, **kwargs):
for attempt in range(max_retries):
try:
return func(*args, **kwargs)
except Exception as e:
if attempt == max_retries - 1:
raise
delay = base_delay * (2 ** attempt)
time.sleep(delay)
return wrapper
return decorator6. Security
- Always send requests over HTTPS.
- Validate file types before uploading (check MIME type server-side).
- Set file size limits at your reverse proxy (recommended: 100 MB max).
- Never hardcode your license key (self-hosted) or evaluation key (hosted demo) in source code. Use environment variables or a secrets manager.
Next: Entity Types