ComplyBar | How to Prevent Confidential Data Uploads to ChatGPT

A single employee pasting a client spreadsheet into ChatGPT can create a significant compliance exposure. Yet most organisations have no technical controls to prevent it and no way to detect when it happens. This guide covers the practical steps to build a defensible data protection posture around AI tool usage.

Step 1: Know What AI Tools Are Actually Being Used

Before you can control AI tool usage, you need to know what is happening. In most organisations, the range of tools in use is far broader than IT realises:

ChatGPT (personal accounts, often on personal devices)
Microsoft Copilot (may be enabled by default in Microsoft 365 tenants)
Google Gemini (may be enabled in Google Workspace)
Perplexity, Claude, Grok and dozens of specialist AI tools
AI features embedded in Outlook, Word, Excel and other productivity apps

An AI tool audit — asking staff directly, reviewing browser history on managed devices, and examining network logs — typically reveals 3–5 times more tools than management expects.
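
The network-log part of such an audit can be sketched in a few lines of Python. This is a minimal illustration, not a complete discovery tool: the log layout (whitespace-separated fields with the hostname in the third column), the sample lines and the `AI_DOMAINS` watch list are all assumptions to adapt to your own proxy or firewall export.

```python
from collections import Counter
from typing import Optional

# Hypothetical watch list; extend it with the tools relevant to your organisation.
AI_DOMAINS = {
    "chatgpt.com",
    "chat.openai.com",
    "gemini.google.com",
    "copilot.microsoft.com",
    "claude.ai",
    "perplexity.ai",
    "grok.com",
}


def matches_ai_domain(host: str) -> Optional[str]:
    """Return the watched domain that `host` equals or is a subdomain of."""
    host = host.lower().rstrip(".")
    for domain in AI_DOMAINS:
        if host == domain or host.endswith("." + domain):
            return domain
    return None


def count_ai_tool_hits(log_lines, host_field=2):
    """Count requests per watched domain (assumed whitespace-separated lines)."""
    hits = Counter()
    for line in log_lines:
        fields = line.split()
        if len(fields) <= host_field:
            continue  # skip malformed or short lines
        domain = matches_ai_domain(fields[host_field])
        if domain:
            hits[domain] += 1
    return hits


# Invented sample lines in the assumed format: timestamp, client IP, host, method, path.
sample_log = [
    "2024-05-01T09:12:03 10.0.0.14 chatgpt.com POST /",
    "2024-05-01T09:12:40 10.0.0.22 www.perplexity.ai GET /",
    "2024-05-01T09:13:05 10.0.0.14 chatgpt.com GET /",
    "2024-05-01T09:13:09 10.0.0.31 intranet.example.com GET /home",
]
print(count_ai_tool_hits(sample_log).most_common())
```

Sorting the result by count gives a quick first view of which tools deserve attention. It will miss tools used on personal devices or networks, which is why the audit also relies on asking staff directly.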

Step 2: Classify Your Data First

You cannot protect what you have not classified. Before implementing AI controls, you need at minimum a basic data classification framework:

Public: Information that is safe to share publicly.
Internal: Business information not intended for public release.
Confidential: Client data, commercially sensitive information, personal information.
Restricted: Highly sensitive personal information (health, financial, HR records) and legally privileged information.

The rule is simple: Confidential and Restricted data should never enter a consumer AI tool. This rule needs to be clearly communicated, not assumed.
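
Expressed as code, the rule might look like the sketch below. The `Classification` enum and the function name are illustrative assumptions. Note that the rule as stated only settles the two upper tiers, so how Internal data is treated remains a policy decision.

```python
from enum import IntEnum


class Classification(IntEnum):
    """The four tiers above, ordered from least to most sensitive."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4


def blocked_from_consumer_ai(level: Classification) -> bool:
    """Confidential and Restricted data must never enter a consumer AI tool."""
    return level >= Classification.CONFIDENTIAL


for level in Classification:
    verdict = "blocked" if blocked_from_consumer_ai(level) else "not blocked by this rule"
    print(f"{level.name}: {verdict}")
```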

Step 3: Write a Clear AI Acceptable Use Policy

The policy must be unambiguous. Vague language like “use AI tools responsibly” creates no protection. A defensible policy explicitly states:

Which AI tools are approved for which purposes
That no client personal information may be entered into any AI tool without explicit approval
That no confidential business information may be entered into consumer AI tools
That file uploads to AI tools are prohibited for non-public documents
The consequences of violation
The process for requesting approval to use a new AI tool

Critically: employees must sign or acknowledge this policy. An unacknowledged policy is very difficult to enforce and provides limited legal protection.

Step 4: Implement Technical Controls

Policy without enforcement is wishful thinking. The most effective technical controls are:

Browser-Level Controls

Deploy browser extensions that detect file uploads to AI tool domains and display a warning or block the action
Configure browser extensions to flag when text from sensitive documents is pasted into AI interfaces
Use managed browser profiles to enforce policies across all managed devices
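
To make the paste check concrete, here is a minimal sketch of the detection logic such an extension might apply. A real extension would be written in JavaScript and run in the browser; Python is used here only to show the idea. The two patterns (email addresses, and 13-digit numbers that pass a Luhn checksum, which is the shape of a South African ID number) are assumptions and far from exhaustive.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Exactly 13 consecutive digits, not part of a longer digit run.
ID_CANDIDATE_RE = re.compile(r"(?<!\d)\d{13}(?!\d)")


def luhn_valid(digits: str) -> bool:
    """Standard Luhn checksum: double every second digit from the right."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_sensitive_patterns(text: str) -> list:
    """Return labels for the kinds of sensitive-looking content found in `text`."""
    findings = []
    if EMAIL_RE.search(text):
        findings.append("email address")
    if any(luhn_valid(m.group()) for m in ID_CANDIDATE_RE.finditer(text)):
        findings.append("possible ID number")
    return findings


pasted = "Please summarise: Jane Doe, jane@example.com, ID 8001015009087"
print(find_sensitive_patterns(pasted))
```

Pattern matching of this kind produces both false positives and misses, so it works best as a warning prompt that supports the policy rather than as the only safeguard.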

Network-Level Controls

Configure web filtering to block uploads to consumer AI tool endpoints (while allowing browsing)
Implement TLS (SSL) inspection to monitor encrypted traffic to AI tool domains on managed networks
Use a CASB (Cloud Access Security Broker) to monitor and control access to cloud services, particularly in organisations running Microsoft 365 or Google Workspace
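
The "block uploads while allowing browsing" idea can be sketched as a simple decision function. This assumes the filter can see the HTTP method, content type and body size, which in practice requires TLS inspection. The domain list, the `Request` type and the size threshold are hypothetical, and real AI tools use varied upload mechanisms that a rule this simple will not fully cover.

```python
from dataclasses import dataclass

# Hypothetical block list and threshold; tune both to your environment.
CONSUMER_AI_DOMAINS = {"chatgpt.com", "gemini.google.com", "claude.ai"}
UPLOAD_METHODS = {"POST", "PUT", "PATCH"}
MAX_BODY_BYTES = 64 * 1024  # larger bodies are treated as bulk uploads


@dataclass
class Request:
    method: str
    host: str
    content_type: str = ""
    body_bytes: int = 0


def is_consumer_ai_host(host: str) -> bool:
    host = host.lower()
    return any(host == d or host.endswith("." + d) for d in CONSUMER_AI_DOMAINS)


def decide(request: Request) -> str:
    """Return "allow" or "block" for one request."""
    if not is_consumer_ai_host(request.host):
        return "allow"
    if request.method.upper() not in UPLOAD_METHODS:
        return "allow"  # plain browsing stays possible
    if request.content_type.lower().startswith("multipart/form-data"):
        return "block"  # typical shape of a file upload
    if request.body_bytes > MAX_BODY_BYTES:
        return "block"  # large pasted or attached content
    return "allow"


print(decide(Request("GET", "chatgpt.com")))
print(decide(Request("POST", "chatgpt.com", "multipart/form-data; boundary=x", 2_000_000)))
```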

Endpoint Controls

Configure DLP (Data Loss Prevention) rules to prevent documents classified as Confidential or Restricted from being uploaded to non-approved destinations
Apply sensitivity labels in Microsoft 365 or Google Workspace that restrict copy-paste and sharing behaviour
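
The logic of such a DLP rule is roughly the following. This is a sketch of the decision only, not the configuration syntax of any particular DLP product; the `UploadEvent` type, the label names and the approved destination list are assumptions.

```python
from dataclasses import dataclass

PROTECTED_LABELS = {"Confidential", "Restricted"}
# Hypothetical allow list of approved upload destinations.
APPROVED_DESTINATIONS = {"files.example.com", "approved-ai.example.com"}


@dataclass
class UploadEvent:
    filename: str
    label: str        # classification label attached to the document
    destination: str  # hostname the file is being uploaded to


def evaluate(event: UploadEvent) -> str:
    """Return "block" or "allow" for a single upload attempt."""
    if event.label not in PROTECTED_LABELS:
        return "allow"
    if event.destination.lower() in APPROVED_DESTINATIONS:
        return "allow"
    return "block"


events = [
    UploadEvent("price-list.pdf", "Public", "chatgpt.com"),
    UploadEvent("client-debtors.xlsx", "Confidential", "chatgpt.com"),
    UploadEvent("client-debtors.xlsx", "Confidential", "files.example.com"),
]
for event in events:
    print(event.filename, "->", event.destination, ":", evaluate(event))
```

A rule like this is only as good as the labelling behind it: an unlabelled copy of a Confidential document passes straight through, which is one reason classification comes before controls.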

Step 5: Move Staff to Approved Enterprise AI Tools

Blocking AI tools without providing an alternative creates frustration and shadow IT. The answer is to provide approved alternatives:

Microsoft Copilot for Microsoft 365 — Processes data within your Microsoft 365 tenant boundary, with a data processing agreement in place.
ChatGPT Enterprise or Team — Offers a data processing agreement and does not use conversations for training.
Azure OpenAI Service — Processes data within your Azure subscription. Conversations are not used for model training.

Whichever tools are approved, review the data processing agreement and privacy terms carefully before approving them for use with personal or confidential data.

Step 6: Train Staff on What “Confidential Data” Actually Means

Many employees do not recognise what counts as personal information under POPIA (South Africa’s Protection of Personal Information Act). Training should use realistic examples drawn from their actual roles:

A client’s name and email address are personal information
An employee’s salary is personal information
A client’s tax number is personal information
A spreadsheet of debtors contains personal information if it includes names
A scanned ID document is always Restricted personal information

Step 7: Monitor and Audit

Controls without monitoring create a false sense of security. Minimum monitoring requirements:

Monthly review of AI tool usage logs on managed devices
Quarterly staff survey to identify new AI tools in use
Annual review of the AI acceptable use policy and approved tool list
Incident response procedure for when a prohibited upload is detected or reported
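
A review cadence like this is easy to let slip, so it can help to track it explicitly. The sketch below flags overdue reviews; the task names mirror the list above, while the day counts are approximations of "monthly", "quarterly" and "annual" chosen for illustration.

```python
from datetime import date, timedelta

# Approximate intervals for the three recurring reviews (assumed values).
REVIEW_INTERVALS = {
    "AI tool usage log review": timedelta(days=31),
    "Staff survey on new AI tools": timedelta(days=92),
    "Policy and approved tool list review": timedelta(days=366),
}


def overdue_tasks(last_completed: dict, today: date) -> list:
    """Return tasks never completed or last completed longer ago than their interval."""
    overdue = []
    for task, interval in REVIEW_INTERVALS.items():
        last = last_completed.get(task)
        if last is None or today - last > interval:
            overdue.append(task)
    return overdue


# Invented example data: the policy review has never been recorded.
last_completed = {
    "AI tool usage log review": date(2024, 5, 1),
    "Staff survey on new AI tools": date(2024, 1, 10),
}
print(overdue_tasks(last_completed, today=date(2024, 5, 20)))
```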

The Bottom Line

Preventing confidential data from reaching AI tools is a governance challenge, not just a technical one. It requires classification, policy, training, technical controls and monitoring working together. Organisations that implement all five layers are in a defensible position. Those that rely on any single layer are not.

Find out where your business stands on this risk.

ComplyBar helps businesses identify hidden risks in how information, AI tools, email, documents and cloud systems are used. A structured assessment gives management the visibility to know, not just assume.

Built for POPIA support, AI governance, data leak prevention, employee risk awareness, information governance and audit evidence.
