Legal Concerns When Uploading to Public LLMs

Podcast discussing the use of confidential information with AI models (8 min):
https://podcasts.apple.com/us/podcast/ai-in-regulated-industries-ai-agents-ai-training-when/id1548733275?i=1000729691423

Further reading:
https://resea.ai/share/bWEyaHY1cTM4Zm1ldXI3

Uploading confidential information to a service like ChatGPT generally constitutes a disclosure to a third party (OpenAI). Depending on the context, that disclosure can violate confidentiality obligations arising from non-disclosure agreements (NDAs), employment contracts, or legal requirements for protecting sensitive data. Here’s a breakdown of why, based on how such AI services handle data:

1. Data Handling by the AI Provider

  • When you input data into ChatGPT, it is transmitted to OpenAI’s servers, retained for some period (OpenAI has stated that temporary chats may be kept for up to 30 days for safety and moderation purposes), and potentially used to improve the service or train models unless you explicitly opt out.

  • This means the information is no longer solely under your control and could be accessed by OpenAI employees, vendors, or affiliates for operational reasons.

  • Even if you opt out of model training, the act of uploading is itself a form of sharing with a third party, which may breach confidentiality if the information is protected (e.g., trade secrets, client data, or personally identifiable information).

2. Legal and Ethical Risks

Under NDAs or Contracts

  • If the information is covered by an NDA, sharing it with an AI tool like ChatGPT typically violates the agreement because it’s an unauthorized disclosure to an external entity.

  • The same applies in employment scenarios where company policies prohibit such sharing.

Privacy Laws

  • For regulated data such as protected health information under HIPAA in the US or personal data under the EU’s GDPR, uploading can mean non-compliance: consumer AI services generally do not provide the contractual safeguards these regimes require (for example, a HIPAA business associate agreement), and uploads may result in unintended exposure.

Professional Obligations

  • In fields like law or medicine, inputting privileged or confidential client data risks waiving privilege or breaching ethical duties.

General Risks

  • Data leaks have occurred in AI systems (a 2023 ChatGPT bug, for example, briefly exposed other users’ chat titles), and even without malice, features like shared chat links or third-party integrations can expose information.

  • Analyses of enterprise usage report that employees routinely paste confidential data into tools like ChatGPT; some estimates put potential leaks at hundreds of incidents per week in an average company.

3. Exceptions and Mitigations

  • If the information is your own non-sensitive data with no external obligations attached, uploading it may not violate confidentiality in a strict legal sense, but it is still risky given potential data breaches and future policy changes.

  • Enterprise versions of AI tools (e.g., ChatGPT Enterprise) often have stronger privacy controls, such as a commitment not to train on your data, which reduces risk compared to consumer versions; a fully self-hosted model (see the sketch after this list) avoids third-party disclosure altogether.

  • Always review the AI service’s terms and privacy policy before uploading anything. For ChatGPT, opt out of data usage for training via their settings, but this doesn’t eliminate all disclosure risks.
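
For the strongest isolation, a self-hosted model (also suggested in the summary below) keeps data on your own infrastructure. Here is a minimal sketch, assuming a locally hosted model served behind an OpenAI-compatible endpoint (for example, Ollama on its default port); the base URL, model name, and placeholder API key are illustrative assumptions, not endorsements of any particular stack:

```python
# Minimal sketch: send a prompt to a self-hosted model instead of a
# public service, so the text never leaves your own infrastructure.
# ASSUMPTIONS: a local server exposing an OpenAI-compatible API
# (e.g., Ollama at its default port) and a model named "llama3".
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # assumed local endpoint, not OpenAI's servers
    api_key="unused-locally",              # placeholder; many local servers ignore the key
)

response = client.chat.completions.create(
    model="llama3",  # whatever model you actually host
    messages=[{"role": "user", "content": "Summarize this internal memo: ..."}],
)
print(response.choices[0].message.content)
```

Because the prompt never crosses your network boundary to a third party, the disclosure problem discussed above largely disappears, though internal access controls and logging still matter.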

In Summary

Yes, you are likely violating confidentiality by uploading such information to a public AI like ChatGPT, as it involves disclosing it to a third party without guaranteed protections.
Consult a legal professional for advice specific to your situation, and consider anonymizing data or using self-hosted AI alternatives if possible.
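
On the anonymization point, a minimal redaction sketch follows. The regex patterns and placeholder tokens are illustrative assumptions only; simple patterns like these miss many identifiers (personal names, addresses, free-text context), so production workflows typically pair dedicated PII-detection tooling with human review:

```python
# Minimal sketch: strip obvious identifiers (emails, phone numbers,
# US SSNs) from text before it is sent to any external service.
# ASSUMPTIONS: the patterns and placeholders below are illustrative;
# they do NOT catch names, addresses, or contextual identifiers.
import re

REDACTION_PATTERNS = {
    "[EMAIL]": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "[PHONE]": re.compile(r"\b\d{3}[\s.-]\d{3}[\s.-]\d{4}\b"),
    "[SSN]": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace each matched identifier with its placeholder token."""
    for placeholder, pattern in REDACTION_PATTERNS.items():
        text = pattern.sub(placeholder, text)
    return text

memo = "Contact Jane at jane.doe@example.com or 555-867-5309 about case 412."
print(redact(memo))
# -> Contact Jane at [EMAIL] or [PHONE] about case 412.
# Note: the name "Jane" survives; catching names requires NER-style tooling.
```

Redaction reduces but does not eliminate disclosure risk: surrounding context can still identify a person or a deal even after the obvious identifiers are stripped.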