Cloud API - Subconscious Docs

The Subconscious Cloud API is the fastest way to access our inference infrastructure. All requests go through api.subconscious.dev, our managed gateway that handles authentication, rate limiting, and routing.

How It Works

OpenAI compatible

Use the standard OpenAI SDK with no custom libraries needed

Global edge

Low-latency access from anywhere

Managed infrastructure

We handle scaling, reliability, and uptime

Architecture

Your API requests flow through a managed gateway:

Your application calls api.subconscious.dev/v1/chat/completions
The gateway authenticates your API key and enforces rate limits
The request is routed to the inference cluster
Responses stream back to your application

API Keys

API keys are created and managed from your Subconscious dashboard. Each organization can have multiple active keys.

Python

from openai import OpenAI

client = OpenAI(
    api_key="your-api-key",               # From your dashboard
    base_url="https://api.subconscious.dev/v1",
)

Default Limits

Limit	Default
Tokens per minute	1,000,000
Requests per minute	100
Tokens per day	10,000,000

Need higher limits? See Dedicated Endpoints or contact us.

Billing

Usage is billed per token. See Pricing for current rates.

Credits are added from your dashboard
Usage is deducted automatically per request
Auto-pay is available for uninterrupted service
Usage dashboards show real-time consumption

Thinking Mode Dedicated Endpoints

Documentation Index

​How It Works

OpenAI compatible

Global edge

Managed infrastructure

​Architecture

​API Keys

​Default Limits

​Billing

How It Works

Architecture

API Keys

Default Limits

Billing