Your Query Data Is Not Retained
We do not store, log, or retain the content of your inference requests or responses. Prompts and completions are processed transiently to serve your request and are never persisted. The only usage data we retain is the token-accounting metadata required to operate and bill the service: the counts of your input tokens, output tokens, and cached tokens. This metadata records how much you used, never what you sent or received. We have no record of the substance of your queries, and your data is never used to train or improve models.
Security
Encrypted in transit
All API traffic is encrypted using industry-standard TLS.
Scoped per organization
API keys are issued per organization, required for every request, and revocable at any time.
Least-privilege access
Internal access to production systems is restricted to authorized personnel.
Vetted subprocessors
We build on a small set of established, security-conscious infrastructure providers.
Privacy
We collect only the personal data needed to provide and support the service, such as account identity, contact details, and basic technical and usage data. You retain rights over your personal data, including access, correction, and erasure. For full details, see our Privacy Policy and Security pages.
Enterprise
For organizations with additional requirements, we offer:
- Dedicated endpoints with isolated compute and no shared infrastructure. See Dedicated Endpoints.
- On-prem deployment to run inference in your own cloud or data center. See On-Prem.
- Custom security reviews with your security team.