Data Privacy in AI
The set of concerns and practices around protecting personal and sensitive information when using AI systems, covering training data, user inputs, model outputs, and data retention.
Why it matters
Every prompt you send to an AI service is data you are sharing with a third party. Understanding how AI providers handle your data is essential for compliance, trust, and responsible deployment.
Key concerns
- Training data usage — are your prompts and conversations used to train future models? Most enterprise AI agreements offer opt-outs, but default consumer plans often do not.
- Data residency — where is your data processed and stored? Regulations like the GDPR restrict transfers of personal data outside certain jurisdictions, so the physical location of your provider's infrastructure matters.
- Input sensitivity — pasting customer data, source code, or financial records into a prompt means that data leaves your environment.
- Output leakage — models can sometimes reproduce training data, including personal information they should not have memorized.
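The output-leakage concern above is often addressed with a post-generation filter that screens model responses before they reach users. A minimal sketch, assuming a regex-based screen (real deployments typically combine regexes with NER-based PII detectors; the `screen_output` function and pattern set here are illustrative):

```python
import re

# Illustrative patterns only -- not an exhaustive PII taxonomy.
LEAK_PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),  # email addresses
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),    # US SSN format
]

def screen_output(model_output: str) -> str:
    """Withhold model output that appears to contain personal data."""
    for pattern in LEAK_PATTERNS:
        if pattern.search(model_output):
            return "[response withheld: possible personal data detected]"
    return model_output

print(screen_output("The capital of France is Paris."))
print(screen_output("Her SSN is 123-45-6789."))
```

A filter like this trades recall for simplicity: it catches formatted identifiers but misses free-text PII such as names or addresses, which is why it is usually layered with other controls rather than used alone.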
Practical guidelines
- Review terms of service — understand whether your inputs are used for training. Use enterprise tiers that guarantee they are not.
- Minimize sensitive data — anonymize or redact PII before including it in prompts.
- Use local models — for highly sensitive use cases, run models on-premises so data never leaves your network.
- Implement access controls — not everyone in your organization needs access to the same AI tools with the same data permissions.
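The redaction guideline above can be sketched as a pre-processing pass applied before any prompt leaves your environment. This is a minimal, assumption-laden example: the `redact` helper and its regex patterns are illustrative, and production systems typically rely on dedicated PII-detection libraries or NER models rather than hand-rolled regexes:

```python
import re

# Hypothetical pattern set; each match is replaced with a typed placeholder
# so the prompt stays readable while the raw PII never leaves your network.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace matched PII with a typed placeholder before prompting."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

prompt = "Contact jane.doe@example.com or 555-867-5309 about the refund."
print(redact(prompt))
# -> Contact [EMAIL] or [PHONE] about the refund.
```

Using typed placeholders (rather than deleting the PII outright) preserves enough context for the model to reason about the request, and the mapping can be kept locally if you need to re-insert the real values into the response.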
Related Terms
- Guardrails — Programmatic constraints placed around AI model inputs and outputs to prevent harmful, off-topic, or policy-violating behavior.
- AI Alignment — The challenge of ensuring AI systems act in accordance with human intentions and values — making them do what we actually want, not just what we literally ask for.
- AI Bias — Systematic unfairness in AI outputs caused by skewed training data, flawed labelling, or model design choices that reflect and amplify existing societal inequities.