Evaluation & Safety

Data Privacy in AI

The set of concerns and practices around protecting personal and sensitive information when using AI systems, covering training data, user inputs, model outputs, and data retention.

Why it matters

Every prompt you send to an AI service is data you are sharing with a third party. Understanding how AI providers handle your data is essential for compliance, trust, and responsible deployment.

Key concerns

  • Training data usage — are your prompts and conversations used to train future models? Most enterprise AI agreements offer opt-outs, but default consumer plans often do not.
  • Data residency — where is your data processed and stored? Regulations such as the GDPR restrict cross-border transfers, so personal data may need to stay within certain jurisdictions.
  • Input sensitivity — pasting customer data, source code, or financial records into a prompt means that data leaves your environment.
  • Output leakage — models can sometimes reproduce training data, including personal information they should not have memorized.

Practical guidelines

  • Review terms of service — understand whether your inputs are used for training. Use enterprise tiers that guarantee they are not.
  • Minimize sensitive data — anonymize or redact PII before including it in prompts.
  • Use local models — for highly sensitive use cases, run models on-premises so data never leaves your network.
  • Implement access controls — not everyone in your organization needs access to the same AI tools with the same data permissions.
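The "minimize sensitive data" guideline can be sketched as a pre-processing pass that runs before a prompt leaves your environment. The `redact` helper and the regex patterns below are illustrative assumptions, not a complete PII solution; production redaction should use a vetted library or service:

```python
import re

# Minimal PII-redaction sketch (illustrative only): these patterns catch common
# formats for emails, US-style phone numbers, and card-like digit runs. Real
# data contains many formats these regexes will miss.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "PHONE": re.compile(r"\b(?:\+?1[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b"),
    "CARD": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def redact(text: str) -> str:
    """Replace each match with a typed placeholder before sending the text to an AI service."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

prompt = "Contact Jane at jane.doe@example.com or 555-123-4567 about the refund."
print(redact(prompt))
# -> Contact Jane at [EMAIL] or [PHONE] about the refund.
```

Typed placeholders such as `[EMAIL]` preserve enough context for the model to respond usefully while keeping the actual values out of the provider's logs and any future training data.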