The short version. Most AI workspace tools fall into one of three privacy postures. Cloud workspaces (Notion AI, ChatGPT Plus, Reflect) store your data on their servers and route AI through their pipeline. BYOK desktop tools (Projelli, Cursor, Continue.dev) keep your data on your machine and route AI calls directly to your chosen provider with your key. Fully local AI (Ollama, LM Studio) keeps everything on your machine, including the model. The right posture depends on the threat you actually care about. This page maps each.
I'm going to walk through where your data actually goes when you use an AI workspace tool. Not the marketing version. The technical version, with the specific terms of service that govern each path.
Most founders I talk to are vaguely uncomfortable with putting their pitch decks and customer interview notes into ChatGPT, but haven't read the terms closely enough to know what's actually true. So they either trust the marketing language ("we don't sell your data") or they don't use AI on the sensitive work at all, which means they leave a lot of value on the table. There's a better posture, and it starts with knowing the actual data flow.
Before talking about where the data goes, take stock of what's actually in there. For an indie founder, an AI workspace typically holds:

- Pitch decks and pitch drafts
- Customer interview notes
- Strategy documents and long-form strategic thinking
- Draft blog posts and feature brainstorms
- The AI conversations about all of the above
Most of this is more sensitive than the email you encrypt and the passwords you pay a manager to protect. Yet the privacy posture for AI workspaces is often weaker than either. Closing that gap is what this page is about.
Every AI workspace falls into one of three categories. Knowing which category you're using is the first move.
Category 1: cloud workspace. Examples: Notion AI, ChatGPT Plus, Claude.ai Pro, Reflect, Mem.ai, Tana.
Data flow: your document leaves your machine, lands on the workspace company's servers, and is forwarded from there to the AI provider for inference. The response comes back through the same pipeline.
Who has a copy: the workspace company, the AI provider, you. Three parties.
Who can see your prompt: workspace company employees with access (typically gated by SOC 2 controls), AI provider employees with access (similar), and anyone the workspace company sells to or shares with under their TOS.
What changes if you cancel: data deletion is governed by the workspace company's retention policy. Notion retains data for 30 days post-deletion; other tools vary.
Category 2: BYOK desktop. Examples: Projelli, Cursor (with BYOK), Continue.dev, Obsidian + AI plugins.
Data flow: your files stay on your machine. When you run an AI action, the selected text goes directly from your machine to your chosen provider's API, authenticated with your own key. No intermediary server.
Who has a copy: you (the file on your machine) and the AI provider (in their API logs). Two parties.
Who can see your prompt: you and the AI provider. The desktop app's company is not in the path; no app-company server ever sees your data.
What changes if you cancel: nothing. Your files are on your machine, in Markdown. Uninstalling the desktop app doesn't delete your data; you'd have to delete the folder yourself.
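To make the BYOK path concrete, here's a minimal sketch of what that single network call looks like, hitting Anthropic's Messages API directly with your own key. The model name and the environment-variable key handling are illustrative; a desktop app would read the key from the OS keychain instead.

```typescript
// BYOK in one request: your machine talks straight to the provider.
// No app-company server sits between you and api.anthropic.com.
async function askClaude(prompt: string): Promise<string> {
  const res = await fetch("https://api.anthropic.com/v1/messages", {
    method: "POST",
    headers: {
      "x-api-key": process.env.ANTHROPIC_API_KEY!, // your key, your account
      "anthropic-version": "2023-06-01",
      "content-type": "application/json",
    },
    body: JSON.stringify({
      model: "claude-sonnet-4-5", // illustrative; any model your key can access
      max_tokens: 1024,
      messages: [{ role: "user", content: prompt }],
    }),
  });
  const data = await res.json();
  return data.content[0].text; // the Messages API returns a content array
}
```

The privacy property falls out of the shape of the call: the only hostname in the request is the provider's.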
Category 3: fully local AI. Examples: Ollama running Llama 3 or Mistral; LM Studio with any open-weight model; Projelli with the Ollama provider configured.
Data flow: none, in the network sense. The prompt goes to a model process running on your own hardware and the response comes back the same way; nothing crosses the network.
Who has a copy: you. One party.
Who can see your prompt: you.
This is the strictest posture. The trade-off is model quality: open-weight models in 2026 are competitive with cloud frontier models on many tasks but lag on others (long-form strategic synthesis, very long context, latest knowledge cutoffs).
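For comparison with the BYOK sketch above, here's the same call against a local Ollama server; its documented REST API listens on localhost:11434 by default. The model name is whatever you've pulled locally.

```typescript
// Fully local inference: the request never leaves localhost.
// Assumes Ollama is running (`ollama serve`) and the model has been pulled.
async function askLocal(prompt: string): Promise<string> {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({
      model: "llama3",  // any open-weight model you've pulled
      prompt,
      stream: false,    // return one JSON object instead of a token stream
    }),
  });
  const data = await res.json();
  return data.response; // Ollama's non-streaming reply field
}
```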
Here is where each provider's policy stands for BYOK (API tier) usage. Read these for yourself; they update.

OpenAI API: inputs and outputs are not used for model training by default; API logs are retained for 30 days by default. Source: OpenAI's API data usage policy.
Anthropic API: inputs and outputs are not used for training by default under Anthropic's commercial terms.
Local models (Ollama, LM Studio): no provider policy applies, because there is no provider in the path.
For high-sensitivity content (anything you'd be uncomfortable seeing in a court filing), the lowest-exposure posture is local Ollama. The next-lowest is BYOK with Anthropic API. After that, BYOK with OpenAI API. Notion AI and consumer ChatGPT Plus are higher-exposure than any of those.
For lower-sensitivity content (drafting blog posts, brainstorming features, working through a stuck pitch), any of these postures is reasonable. The cost of strict privacy is convenience and sometimes quality. Pick the right posture for the actual document.
Privacy talk gets vague when there's no specific threat in mind. Three concrete ones for an indie founder, ranked by likelihood:
Threat 1: policy drift. The most likely threat is also the most boring. The vendor changes their data policy in a way you don't notice. They update the consumer-tier training-data clause. They get acquired. They sunset a feature. Your data was fine yesterday and is being used differently tomorrow.
Mitigation: keep the data in a posture where the vendor can't unilaterally change the rules on stuff that's already on your hard drive. Local-first BYOK accomplishes this.
Threat 2: a breach. A workspace company has a security incident. Your data is among what's exfiltrated. This is rare but not unheard of (Notion, Asana, Confluence, and Slack have all had incidents in the last decade).
Mitigation: the only data the workspace company has is the data you sent to them. With local-first BYOK, that's nothing. With cloud workspace, it's everything.
Threat 3: legal discovery. Less likely for most founders, but real. A regulator, opposing counsel, or law enforcement requests records. Whatever the workspace company holds is potentially producible. Whatever you hold on your machine is also potentially producible, but the path is different and the protections (attorney-client privilege, the Fifth Amendment against self-incrimination) are stronger.
Mitigation: keep workspace data local; AI provider sees only the specific text you sent for inference, not the whole archive.
One threat this page deliberately doesn't cover: determined, targeted state-actor attacks. Defending against those is nation-state-grade work, beyond any consumer software. If you genuinely have that threat model (security researcher, dissident, journalist on certain beats), you need air-gapped machines and operational security beyond an "AI workspace privacy guide."
For a founder doing strategic work with AI in 2026, here's the working checklist:

- Keep your files on your machine, in a portable format like Markdown.
- Route AI calls directly to your chosen provider with your own API key, on the API tier, not the consumer tier.
- Use a local model (Ollama) for anything you'd be uncomfortable seeing in a court filing.
- Turn on full-disk encryption and keep API keys in the OS keychain.
- Re-read your provider's data policy once in a while; it updates.
Specifically, here's what Projelli does: your files live on your machine as Markdown, your API keys live in your OS keychain, AI calls go directly from your machine to your chosen provider, and no Projelli server ever sits in the data path.
Source code is open at github.com/projelli/projelli. The full privacy policy is at /legal/privacy.
Does ChatGPT train on my data?
Consumer ChatGPT (chatgpt.com) trains on conversations by default for free and Plus users; you can opt out in Settings → Data Controls. The OpenAI API (which powers BYOK tools like Projelli) does not train on inputs by default. Source: OpenAI's API data usage policy.
Does Claude train on my data?
Anthropic's commercial API (BYOK) does not train on inputs by default per their commercial terms. Claude.ai's consumer tier may use opted-in conversations. Anthropic's policy is generally stricter than OpenAI's at the consumer tier.
Does Notion AI see my data?
Yes. Notion stores all workspace data on Notion's servers, and Notion AI processes that data through its model providers (OpenAI and Anthropic) to generate responses. Per Notion's documentation, customer data is not used to train these third-party models. But the data lives on Notion's servers regardless of AI use.
How private are local-first AI tools?
Local-first AI tools store your conversations on your machine and don't have their own servers in the data path. The only network call goes from your machine directly to your chosen AI provider, using your own API key. The tool's company never sees your prompts. This is more private than any cloud workspace, but the AI provider still sees the specific text you choose to send for inference.
Can I run AI fully offline?
Yes. With Ollama or LM Studio running open-weight models, the AI itself runs on your machine. Nothing leaves your device. The trade-off is quality: open-weight models lag behind Claude / GPT / Gemini on long-form strategic work. Most founders use a hybrid: BYOK Claude for high-stakes tasks, local Ollama for volume tasks where 90% quality is fine; see the sketch below.
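As a sketch of that hybrid, reusing the two hypothetical helpers from earlier (askClaude and askLocal), routing can be as simple as a sensitivity flag. The flag and function names are assumptions for illustration, not Projelli's actual API.

```typescript
// Hybrid posture: high-stakes text stays on the machine, everything else
// goes out to your chosen provider under your own key.
type Sensitivity = "high" | "normal";

async function ask(prompt: string, sensitivity: Sensitivity): Promise<string> {
  return sensitivity === "high"
    ? askLocal(prompt)    // local Ollama: never leaves your device
    : askClaude(prompt);  // BYOK Claude: leaves only for the provider
}
```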
Can AI providers be subpoenaed?
All major US AI providers can be compelled to produce data via subpoena or court order. The retention windows differ: OpenAI keeps API logs for 30 days by default; Anthropic for similar windows. Local-first tools don't have this exposure for the data on your machine, only for the specific text you sent to the AI provider during inference.
What if my laptop is stolen?
Encrypt the disk (FileVault on macOS, BitLocker on Windows). With disk encryption, a stolen machine reveals nothing. The API keys in the OS keychain are protected by your account password, separately from the disk encryption.
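If you want to verify that encryption is actually on, both operating systems ship a status command: fdesetup on macOS and manage-bde on Windows (the latter needs an elevated prompt). Here's a rough sketch of wrapping those checks; the output matching is an approximation.

```typescript
// Checks full-disk encryption status using the stock OS tools.
// Output parsing is approximate; treat this as a sketch, not a guarantee.
import { execSync } from "node:child_process";

function diskEncryptionOn(): boolean {
  if (process.platform === "darwin") {
    // FileVault prints "FileVault is On." when enabled
    return execSync("fdesetup status").toString().includes("FileVault is On");
  }
  if (process.platform === "win32") {
    // BitLocker reports "Protection Status: Protection On" for the drive
    return execSync("manage-bde -status C:").toString().includes("Protection On");
  }
  return false; // other platforms: check LUKS or your distro's equivalent
}
```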
Projelli is local-first. Your files are on your machine. Your API keys are in your OS keychain. Your AI conversations never touch our servers.
Get Projelli