
Data Privacy & Security

Understand AI provider data policies to make informed choices

Why It Matters

As an AI agent, SailFish sends the following data to AI providers:

  • Terminal output (IPs, paths, processes)
  • Command history (may contain secrets)
  • Server configurations
  • Business code & logs

And the risks go beyond training: breaches, subpoenas, and internal access are all threats.

International Providers

For these providers, API data is not used for model training by default, backed by explicit contractual commitments. Consumer products (ChatGPT, Claude.ai, etc.) follow different policies.

Anthropic (Claude) No training by default
Training: Not used for training by default
Retention: Auto-deleted within 30 days
Encryption: TLS in-transit + at-rest encryption
Zero retention: Zero retention agreement available
Data ownership: User owns inputs and outputs

Training is opt-in only, via the Developer Partner Program. The cleanest data policy among all providers.

View policy →
OpenAI (GPT-4o / o3) No training by default
Training: Not used by default (since Mar 2023)
Retention: Abuse monitoring logs kept 30 days
Encryption: AES-256 at-rest + TLS 1.2+ in-transit
Zero retention: Available (enterprise, by request)
Data ownership: User owns inputs and outputs

Most transparent policies, richest enterprise options. Supports customer-managed encryption keys (EKM).

View policy →
Google (Gemini) No training by default
Training: Paid API: not used; free tier: may be used
Retention: Abuse detection logs kept 55 days
Encryption: In-transit and at-rest encryption
Zero retention: Not available
Data ownership: Paid API data not used to improve products

Free tier has training risk. Longer retention (55 days) than peers. Enterprise covered by Cloud DPA.

View policy →

China-based Providers

Policies below are for each provider's API platform (not consumer apps). Most API platforms have separate service agreements that may be stricter than consumer products. All providers are subject to PIPL and Data Security Law.

Bailian (Alibaba) No training by default
Training: Explicitly states data will not be used for training
Opt-out: N/A (no training by default)

Official docs state: "We will never use your data for model training." AES-256 encryption. Clearest API policy among China providers.

View policy →
DeepSeek Opt-out available
Training: May use by default (de-identified)
Opt-out: Toggle off "data for experience optimization"

Has a separate Open Platform Terms of Service, but its training policy still defers to the main privacy policy.

View policy →
Doubao (Volcengine) Opt-out available
Training: May use by default (de-identified)
Opt-out: Toggle off "help improve model"

API served via Volcengine (Ark platform) with separate Model Service Agreement and Data Authorization Agreement.

View policy →
Qianfan (Baidu) Separate agreement
Training: Separate security whitepaper for API
Opt-out: See Qianfan-specific terms

API via Qianfan platform (not Ernie Bot app). Has independent agreement, security whitepaper, AES-256 encryption, Level 3 security certification.

View policy →
MiniMax Opt-out available
Training: May use by default (de-identified)
Opt-out: Contact support to opt out

The open platform (platform.minimaxi.com) has a separate platform agreement. MiniMax is gaining prominence with its abab model series and Hailuo AI video generation.

View policy →
Zhipu GLM Vague policy
Training: Not explicitly stated
Opt-out: Not specified

Open platform (bigmodel.cn) has separate privacy policy. Users own their data. Prohibits using outputs to train competing models.

View policy →
Kimi (Moonshot) Vague policy
Training: May use to "improve service"
Opt-out: Not specified

Open platform (platform.moonshot.ai) has separate Terms of Service and privacy policy.

View policy →

Risks Beyond Training

"Not used for training" is good news, but your data still faces multiple risks during transit and storage:

Server Breach
Data kept during retention periods is exposed if servers are compromised. Longer retention = higher risk.
Internal Access
Human reviewers may see your conversations during abuse monitoring and content moderation.
Government Requests
Law enforcement in different jurisdictions can compel providers to hand over user data.
Third-party Subprocessors
Cloud infrastructure providers, content moderation vendors may also access your data.
Retention Window
Even without training, data sitting on servers for 30-55 days widens the exposure window.
Cross-border Transfer
Data may be stored in different countries with varying levels of legal protection.

Overall Safety Rating

Recommended
No training + encryption + clear retention
Anthropic · OpenAI · Google (paid) · Bailian (Alibaba)
Use with care
Manual opt-out needed / separate agreements
DeepSeek · Doubao · Qianfan (Baidu) · MiniMax
Not for sensitive data
Training policy unclear
Zhipu · Kimi · Google (free)

Legal Landscape

Choose by Scenario

High-sensitivity (finance/gov)
Self-host open-source models (DeepSeek / Qwen)
Enterprise / production
Anthropic / OpenAI paid API
Personal dev / learning
Any API (ensure training toggle is off)
Public information
Any provider
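For the high-sensitivity case, self-hosting means your client talks to a local endpoint instead of a cloud API, so prompts never leave your network. A minimal sketch, assuming an Ollama-style server on localhost:11434 (the endpoint URL and model name are illustrative; adapt them to your own serving stack):

```python
import json
import urllib.request

# Assumed local Ollama-style endpoint; nothing here goes to a cloud provider.
LOCAL_ENDPOINT = "http://localhost:11434/api/generate"

def build_request(prompt: str, model: str = "deepseek-r1:7b") -> urllib.request.Request:
    """Build a completion request aimed at the local model server."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    return urllib.request.Request(
        LOCAL_ENDPOINT,
        data=payload,
        headers={"Content-Type": "application/json"},
    )

# Sending is a one-liner once a local server is running:
#   urllib.request.urlopen(build_request("Summarize this server log: ..."))
req = build_request("Summarize this server log: ...")
```

The same pattern works for any OpenAI-compatible local server (vLLM, llama.cpp, etc.); only the URL and payload shape change.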

Security Tips

  1. Prefer APIs over consumer products — API data policies are generally stricter than those of consumer apps like ChatGPT or Ernie Bot
  2. For China-based providers, check settings — confirm "data for model improvement" toggles are disabled
  3. Never paste secrets directly in conversations — even with no-training promises, data still transits their servers
  4. Review provider policies periodically — privacy policies can change, so re-check every 6 months
  5. Consider local models for sensitive environments — data never leaves your machine, the safest option
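Tip 3 can be partially automated: scrub obvious secrets from terminal output before it is sent to any provider. A minimal sketch using regexes for a few common token formats (the patterns are illustrative, not exhaustive; real secret scanning warrants a dedicated tool):

```python
import re

# Illustrative patterns for common secret formats; not exhaustive.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                           # AWS access key ID
    re.compile(r"sk-[A-Za-z0-9_-]{20,}"),                      # generic "sk-" API key
    re.compile(r"ghp_[A-Za-z0-9]{36}"),                        # GitHub personal access token
    re.compile(r"(?i)(password|passwd|secret)\s*[=:]\s*\S+"),  # key=value credentials
]

def redact(text: str, placeholder: str = "[REDACTED]") -> str:
    """Replace anything matching a known secret pattern before sending upstream."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text

out = redact("export AWS_KEY=AKIAABCDEFGHIJKLMNOP\npassword: hunter2")
# → "export AWS_KEY=[REDACTED]\n[REDACTED]"
```

Running such a filter on terminal output and command history before it enters the agent's context reduces, but does not eliminate, the exposure from tips 1–3.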

Based on publicly available provider policy documents as of March 2026. For reference only, not legal advice. Policies may change — verify via original links before use.