Streaming delivers tokens as they’re generated, which enables responsive UIs and real-time output. Subconscious uses the standard OpenAI streaming format: Server-Sent Events (SSE) with ChatCompletionChunk objects.
Basic Streaming
Set stream=True to receive a stream of chunks instead of waiting for the full response:
from openai import OpenAI

# Point the standard OpenAI client at the Subconscious endpoint.
client = OpenAI(
    api_key="your-api-key",
    base_url="https://api.subconscious.dev/v1",
)

stream = client.chat.completions.create(
    model="subconscious/tim-qwen3.6-27b",
    messages=[{"role": "user", "content": "Explain how neural networks learn."}],
    stream=True,
)

# Print each text fragment as it arrives; the final chunk carries no content.
for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)
print()  # end the line once the stream finishes
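If you need the complete response as a single string rather than printed output, the fragments can be accumulated as they arrive. The following is a minimal sketch: `collect_text` and `fake_chunk` are hypothetical helper names, and the fake chunks only mimic the attributes used above so the example runs without a network call.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_text(chunks: Iterable) -> str:
    """Join every delta.content fragment into the full response text."""
    parts = []
    for chunk in chunks:
        # Skip chunks with no choices or an empty delta (e.g. the final chunk).
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts)


def fake_chunk(content=None):
    """Hypothetical stand-in that mimics the attributes of a streamed chunk."""
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [fake_chunk("Hello"), fake_chunk(" world"), fake_chunk(None)]
full_text = collect_text(demo)
print(full_text)  # Hello world
```

With a real request, passing the `stream` object returned by `client.chat.completions.create(..., stream=True)` in place of `demo` should work the same way.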
Each event in the stream is a data: line containing a JSON ChatCompletionChunk object:
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","model":"subconscious/tim-qwen3.6-27b","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","model":"subconscious/tim-qwen3.6-27b","choices":[{"index":0,"delta":{"content":" world"},"finish_reason":null}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","model":"subconscious/tim-qwen3.6-27b","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
data: [DONE]
The final chunk has finish_reason: "stop" and is followed by data: [DONE].
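If you read the HTTP response directly instead of using the SDK, the wire format above can be parsed with a few lines of code. This is a rough sketch, not a full SSE implementation: `iter_content` is a hypothetical helper, it handles only `data:` lines, and it assumes each JSON payload fits on one line as in the sample events.

```python
import json
from typing import Iterable, Iterator


def iter_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield delta.content strings from raw SSE lines, stopping at [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel, not JSON
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                yield content


sample_lines = [
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    "",
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" world"},"finish_reason":null}]}',
    "",
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "",
    "data: [DONE]",
]

parsed_text = "".join(iter_content(sample_lines))
print(parsed_text)  # Hello world
```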
Error Handling
Errors during streaming are delivered as SSE events. The OpenAI SDK raises them as exceptions while you iterate, so keep the loop inside the try block:
from openai import APIError

try:
    stream = client.chat.completions.create(
        model="subconscious/tim-qwen3.6-27b",
        messages=[{"role": "user", "content": "Hello"}],
        stream=True,
    )
    for chunk in stream:
        content = chunk.choices[0].delta.content
        if content:
            print(content, end="", flush=True)
except APIError as e:
    # Only HTTP status errors carry status_code; connection and mid-stream errors do not.
    status = getattr(e, "status_code", None)
    print(f"API error: {status} - {e.message}")
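Because an error can arrive after some text has already been streamed, it may be useful to keep the partial output rather than discard it. The sketch below shows one way to do that. `read_with_partial`, `fake_chunk`, and `flaky_stream` are hypothetical names, and a `RuntimeError` stands in for the SDK exception so the example runs offline.

```python
from types import SimpleNamespace


def read_with_partial(stream, error_types=(Exception,)):
    """Return (text, error); text holds whatever arrived before a failure."""
    parts = []
    try:
        for chunk in stream:
            if chunk.choices and chunk.choices[0].delta.content:
                parts.append(chunk.choices[0].delta.content)
    except error_types as exc:
        # Keep the fragments received so far and report the error to the caller.
        return "".join(parts), exc
    return "".join(parts), None


def fake_chunk(content=None):
    """Hypothetical stand-in that mimics the attributes of a streamed chunk."""
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


def flaky_stream():
    """Simulated stream that fails after delivering one fragment."""
    yield fake_chunk("Partial")
    raise RuntimeError("connection dropped")


partial_text, stream_error = read_with_partial(flaky_stream())
print(repr(partial_text), stream_error)
```

With the real SDK you would likely pass `error_types=(APIError,)` so that unrelated exceptions still propagate.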
Usage Statistics
To receive token usage in the stream, include stream_options:
stream = client.chat.completions.create(
    model="subconscious/tim-qwen3.6-27b",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
    stream_options={"include_usage": True},
)

for chunk in stream:
    if chunk.usage:
        # The usage chunk reports token counts for the whole request.
        print(f"Input: {chunk.usage.prompt_tokens}, Output: {chunk.usage.completion_tokens}")
    elif chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
Usage data is included in the final chunk of the stream.
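To return both the text and the token counts from one pass over the stream, the two can be collected together. This is a sketch under an assumption: in OpenAI's own API the usage chunk has an empty `choices` list, and the code guards for that case in the event that Subconscious behaves the same way. `read_text_and_usage` and the fake chunks are hypothetical and exist only so the example runs offline.

```python
from types import SimpleNamespace


def read_text_and_usage(chunks):
    """Return (text, usage), where usage is None if no chunk reported it."""
    parts, usage = [], None
    for chunk in chunks:
        if getattr(chunk, "usage", None):
            usage = chunk.usage
        # Guard against an empty choices list on the usage chunk (assumption).
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts), usage


def content_chunk(text):
    """Hypothetical chunk carrying a text fragment and no usage data."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)], usage=None)


# Hypothetical final chunk: no choices, usage populated.
usage_chunk = SimpleNamespace(
    choices=[],
    usage=SimpleNamespace(prompt_tokens=9, completion_tokens=2),
)

text, usage = read_text_and_usage([content_chunk("Hi"), content_chunk("!"), usage_chunk])
print(text, usage.prompt_tokens, usage.completion_tokens)  # Hi! 9 2
```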