AI Browsers are Harvesting Information Without Your Consent

Home
/
Insights
/
AI Browsers are Harvesting Information Without Your Consent

October 24, 2025

Chris Penn's LinkedIn post stopped me cold. It opened my eyes to something I’d suspected, but never really tested. Chris did.

He'd run a simple experiment with Google Gemini's Chrome integration. He asked it to record all the reactions on a LinkedIn post from Marcus Sheridan. The AI browser, of course, dutifully obliged. Then he asked for the LinkedIn profile URLs of everyone who reacted. It delivered. Finally, he requested a downloadable CSV file of all 292 people who engaged with the post.

The AI compiled it without hesitation.

"All that data normally invisible to Google is now on Google's servers for processing," Chris wrote. Then he had the AI transcribe all the comments on an AI ethics post. The irony wasn't lost on him.

His closing insight hit hard: "Did the 292 people that engaged on Rebecca's post opt into AI data collection? Maybe, maybe not. But they are not given a choice and their data is being vacuumed up by AI-enabled browsers being used by others."

That post sent me down a security rabbit hole. What I found was worse than I expected.

Your Browser Is Reading Everything You See

Here's what makes AI browsers fundamentally different from traditional ones. When you use ChatGPT Atlas, Perplexity Comet, or Google Gemini in Chrome, you're not just browsing. The AI assistant reads the HTML of every page you share with it. It parses the content, extracts the data, and transmits it to company servers for processing.

This happens because of how these tools work. The AI model must process both your command and the webpage content in the same context. That page showing Marcus Sheridan's LinkedIn post? The browser received the HTML containing all 292 names, profile links, and reactions. When Chris asked Gemini to extract that data, it simply read what was already there and formatted it for him.

The boundary between what you control and what the website controls has collapsed. Traditional browsers kept those worlds separate. AI browsers merge them by design.

Server-Side Processing: The Moment of Exposure

Despite marketing claims about "local storage" and "on-device" processing, all three major AI browsers transmit your page content to their servers when you activate AI features. This transmission is required for the AI to function. The models are too large to run locally with full capability.

OpenAI's Atlas uses a "browser memories" feature that processes page content on their servers and stores AI-generated summaries. Google's Gemini uses "the content of the tab you're viewing" and may temporarily log it to your account. Perplexity's Comet collects "URLs, page text, images, and other resources" as browsing data.

That server transmission is where the privacy exposure occurs. Once your page content hits their infrastructure, it's subject to their data retention policies, training practices, and security vulnerabilities.

Who's Protecting You by Default?

The three browsers take dramatically different approaches to your data.

OpenAI ChatGPT Atlas is the most protective. Browsing content is not used for model training by default. You must explicitly enable two separate toggles to opt in: "Improve the model for everyone" and "Include web browsing." Even if you opt in, OpenAI says it won't train on pages that block GPTBot.

Perplexity Comet operates on an opt-out model. For individual users, browsing data and queries are used to improve AI models and search functionality by default. You must manually disable the "AI Data Usage" toggle in settings. Most users don't.

Google Gemini in Chrome also uses opt-out defaults. If "Keep Activity" is enabled, which it often is, your saved activity is used to develop and improve Google services, including training generative AI models. A subset of chats may be human-reviewed and stored for up to three years.

The business model is clear. Privacy is a premium feature. Enterprise and business customers get strong protections with no training and no human review. Individual consumers subsidize model development with their data unless they know to opt out.

The People You Never See

Here's what Chris Penn identified that matters most. When you use an AI browser, you're not just exposing your own data. You're transmitting everyone else's data that appears on your screen.

Those 292 people who reacted to Marcus Sheridan's post? They didn't consent to having their information extracted and sent to Google's servers. The commenters on Rebecca Bultsma's AI ethics post? Same story. Email recipients visible in your inbox, social media connections on a profile page, names listed in a company directory. All of them become collateral data collection.

These people have no consent mechanism. They can't opt out of something they don't know is happening. The AI browser is reading the HTML your browser received, which includes data about third parties who have no relationship with the browser vendor.

This is the "Matrix" problem Chris described. As users browse with AI-enabled tools, they become unwitting data harvesters. They're doing the AI companies' work for them, transmitting information the companies themselves might not be able to access because of paywalls, login requirements, or access restrictions.

But It Gets Even More Concerning

You think you’re sweating, now? Think about everywhere you browse while logged in. Banking sites. Email. Medical portals. Paywalled journalism. Your company's intranet. Private communities. Confidential SaaS applications.

All three AI browsers can see whatever your browser can see once you share a tab with them. The content is already rendered on your screen. The browser just reads the HTML and transmits it for AI processing.

The Electronic Frontier Foundation tested Atlas and found it created and stored browser memories related to highly sensitive information. The AI captured details about a user "registering for sexual and reproductive health services via Planned Parenthood Direct" and included the name of a real doctor. This happened despite OpenAI's claims of automated PII filters.

The filters exist. They just fail in practice. The gap between vendor policy and actual performance is real and documented.

For banking and financial sites, Atlas's "agent mode" is designed to pause on financial institutions and cannot access saved passwords. But that protection doesn't stop you from explicitly sharing a bank statement page with ChatGPT. If you ask it to analyze your transaction history for budget insights, the AI processes that content on OpenAI's servers.

The same applies to emails. Comet offers Gmail and Calendar connectors that are opt-in. Once connected, email contents you ask it to process are in scope for the browser's data handling practices. For consumer accounts, that potentially includes model training unless you've opted out. For enterprise accounts, Perplexity commits that data is never used for training.

The Unsolved Security Problem

Beyond data collection for training, there's a more immediate threat. Indirect prompt injection is what OpenAI's Chief Information Security Officer calls "a frontier, unsolved security problem."

Security researchers demonstrated a proof-of-concept attack called "CometJacking" on Perplexity's browser. A malicious URL contained hidden instructions for the AI agent. When a user clicked the link, the browser's AI interpreted those hidden commands as legitimate user intent. The attack instructed Comet to access data from other open tabs, like a logged-in Gmail session, encode the sensitive data using base64 to bypass security filters, and exfiltrate it to an attacker-controlled server.

The vulnerability exists because the AI model processes webpage content and user commands in the same context. It cannot reliably distinguish between "instructions you gave it" and "instructions embedded in a webpage you're viewing."

OpenAI acknowledges there's no technical solution yet. Their mitigation strategy places the burden on users. They recommend "Logged Out Mode" to run the agent without access to authenticated sessions, and "Watch Mode" that requires users to keep sensitive tabs in the foreground and actively monitor the agent's actions.

This shifts security responsibility from the system to the user. You must perform continuous risk assessment and make perfect decisions about when to deploy the AI agent. That's an unsustainable model.

Enterprise Protection vs. Consumer Exposure

The contrast between enterprise and consumer protections is significant.

For Google Workspace customers, Gemini submissions aren't used for training and aren't human-reviewed. The platform is covered by Google's extensive compliance certifications, including SOC 2, ISO, and HIPAA.

Perplexity commits that enterprise user data is never used to train or fine-tune models. The platform holds SOC 2 Type II certification.

OpenAI's Atlas, however, is explicitly in "early access" for business and enterprise customers. It's not yet covered by OpenAI's SOC 2 or ISO attestations. Some Atlas-specific data types may not be covered by enterprise retention, segregation, or deletion policies. The documentation advises customers to avoid using Atlas for regulated, confidential, or production data.

For individual consumers, the protections are weaker across all three platforms. Default settings favor data collection. Opt-outs are available, but users must know they exist, where to find them, and actually know and care enough to click the button.

What You Can Actually Do

The practical guidance from a security perspective is straightforward but unsatisfying.

Use separate browser profiles or virtual machines for sensitive tasks. Don't enable AI features on profiles you use for banking, healthcare, legal work, or confidential business applications.

Never grant AI browsers access to email or calendar connectors. The demonstrated ability of prompt injection attacks to exfiltrate data from connected services makes this too risky.

Disable all training and data sharing features immediately after installation. For Atlas, keep "Improve the model for everyone" and "Include web browsing" turned off. For Comet, disable the "AI Data Usage" toggle and use Incognito mode for sensitive browsing. For Gemini, turn "Keep Activity" off in your Gemini Apps Activity settings.

For enterprises, prohibit the use of these browsers on corporate devices until vendors can provide verifiable technical solutions to prompt injection and bring products fully under enterprise compliance certifications.

Recognition, Not Resolution

These tools offer genuine utility. The ability to interact with web content through natural language is powerful. The productivity gains are real for certain tasks.

But the technology is not mature from a security and privacy standpoint. The core innovation that makes these browsers compelling is the same feature that creates systemic vulnerability. The AI must process trusted commands and untrusted content together, and there's currently no reliable way to prevent their exploitation.

The burden of security falls on users to maintain constant vigilance. That's not sustainable. And the invisible data harvesting of third parties who never consented creates a fundamental consent problem for the AI era.

Use these tools if you choose. But use them with your eyes open. Understand that your browsing activity isn't just about you anymore. You're transmitting data about everyone whose information appears on your screen.

Until vendors solve prompt injection at the model level and provide meaningful consent mechanisms for third-party data, these are experimental tools. Powerful, yes. But premature for any use case involving confidential, sensitive, or regulated information.

Chris Penn saw it clearly. We're all doing the machines' work now. The question is whether we're doing it knowingly or unknowingly.

Key Citations:

Share0

Tweet0

Share0

AI Browsers are Harvesting Information Without Your Consent

Your Browser Is Reading Everything You See

Server-Side Processing: The Moment of Exposure

Who's Protecting You by Default?

The People You Never See

But It Gets Even More Concerning

The Unsolved Security Problem

Enterprise Protection vs. Consumer Exposure

What You Can Actually Do

Recognition, Not Resolution

Key Citations:

95% of AI Initiatives Fail

What are the First Practical Steps to Bring AI Into my Company

Your Marketing Team Is Falling Behind on AI. Here’s How to Fix It.

How to Conduct an AI Readiness Assessment for Your Organization

5 Reasons Enterprise AI Strategies Fail at Scale

The Real Cost of Not Training Employees on AI by Company Size