Streaming - Subconscious Docs

Streaming delivers tokens as they are generated, which enables responsive UIs and real-time output. Subconscious streams in the wire format of the endpoint you call: the OpenAI Chat Completions format emits Server-Sent Events (SSE) carrying ChatCompletionChunk objects, and the Anthropic Messages format emits the Anthropic event protocol (message_start, content_block_delta, message_stop, and related events).

Basic Streaming

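Set stream=True (stream: true in TypeScript and in raw JSON requests), or use client.messages.stream(...) with the Anthropic SDK, to receive a stream of chunks instead of waiting for the full response. The examples below show, in order, the OpenAI SDK (Python, then TypeScript), the Anthropic SDK (Python, then TypeScript), and curl requests against each endpoint: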

from openai import OpenAI

client = OpenAI(
    api_key="your-api-key",
    base_url="https://api.subconscious.dev/v1",
)

stream = client.chat.completions.create(
    model="subconscious/tim-qwen3.6-27b",
    messages=[{"role": "user", "content": "Explain how neural networks learn."}],
    stream=True,
)

for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)
print()

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "your-api-key",
  baseURL: "https://api.subconscious.dev/v1",
});

const stream = await client.chat.completions.create({
  model: "subconscious/tim-qwen3.6-27b",
  messages: [{ role: "user", content: "Explain how neural networks learn." }],
  stream: true,
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
console.log();

from anthropic import Anthropic

client = Anthropic(
    auth_token="your-api-key",
    base_url="https://api.subconscious.dev",
)

with client.messages.stream(
    model="subconscious/tim-qwen3.6-27b",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain how neural networks learn."}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
print()

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  authToken: "your-api-key",
  baseURL: "https://api.subconscious.dev",
});

const stream = client.messages.stream({
  model: "subconscious/tim-qwen3.6-27b",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Explain how neural networks learn." }],
});

for await (const event of stream) {
  if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
    process.stdout.write(event.delta.text);
  }
}
console.log();

curl https://api.subconscious.dev/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "subconscious/tim-qwen3.6-27b",
    "messages": [{"role": "user", "content": "Explain how neural networks learn."}],
    "stream": true
  }'

curl https://api.subconscious.dev/v1/messages \
  -H "x-api-key: YOUR_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "subconscious/tim-qwen3.6-27b",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Explain how neural networks learn."}],
    "stream": true
  }'

SSE Format

Chat Completions

Each event in the stream is a data: line containing a JSON ChatCompletionChunk object:

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","model":"subconscious/tim-qwen3.6-27b","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","model":"subconscious/tim-qwen3.6-27b","choices":[{"index":0,"delta":{"content":" world"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","model":"subconscious/tim-qwen3.6-27b","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]

The final chunk has finish_reason: "stop" and is followed by data: [DONE].
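If you consume the stream without an SDK, the format above can be decoded with a few lines of standard-library Python. This is a minimal sketch, not the SDK's implementation: the helper names iter_chunks and collect_text are hypothetical, and the sample lines are abbreviated versions of the events shown above.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def iter_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded chunk objects from raw SSE lines until [DONE]."""
    for raw in lines:
        line = raw.strip()
        # Skip blank separator lines and anything that is not a data field.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # sentinel: the stream is finished
        yield json.loads(payload)


def collect_text(lines: Iterable[str]) -> Tuple[str, Optional[str]]:
    """Concatenate delta.content and report the last finish_reason seen."""
    parts = []
    finish_reason = None
    for chunk in iter_chunks(lines):
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
            if choice.get("finish_reason") is not None:
                finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason


sample = [
    'data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":" world"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "",
    "data: [DONE]",
]
print(collect_text(sample))  # ('Hello world', 'stop')
```

In a real client, the lines would come from the HTTP response body rather than a list; the parsing logic stays the same.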

Messages

The Messages endpoint emits the Anthropic event protocol. Each SSE message has an event: type and a data: JSON payload, progressing through message_start, one or more content blocks (content_block_start → content_block_delta → content_block_stop), then message_delta and message_stop:

event: message_start
data: {"type":"message_start","message":{"id":"msg_abc123","type":"message","role":"assistant","content":[],"model":"subconscious/tim-qwen3.6-27b","stop_reason":null,"stop_sequence":null,"usage":{"input_tokens":12,"output_tokens":0,"cache_read_input_tokens":0}}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" world"}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn","stop_sequence":null},"usage":{"output_tokens":18}}

event: message_stop
data: {"type":"message_stop"}

Unlike the Chat Completions format, there is no data: [DONE] sentinel; the stream ends with the message_stop event.
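The event sequence above can also be decoded by hand. The sketch below pairs each event: line with the data: line that follows it, accumulates text_delta fragments, and stops at message_stop. The helper names iter_events and collect_message_text are hypothetical, and the sample is an abbreviated copy of the events shown above.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def iter_events(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, payload) pairs from raw SSE lines."""
    event_name = ""
    for raw in lines:
        line = raw.strip()
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            payload = json.loads(line[len("data:"):].strip())
            # Fall back to the payload's own "type" if no event: line preceded it.
            yield event_name or payload.get("type", ""), payload
            event_name = ""


def collect_message_text(lines: Iterable[str]) -> Tuple[str, Optional[str]]:
    """Join all text deltas and return them with the reported stop_reason."""
    parts = []
    stop_reason = None
    for name, payload in iter_events(lines):
        if name == "content_block_delta":
            delta = payload.get("delta", {})
            if delta.get("type") == "text_delta":
                parts.append(delta.get("text", ""))
        elif name == "message_delta":
            stop_reason = payload.get("delta", {}).get("stop_reason")
        elif name == "message_stop":
            break  # end of stream; there is no [DONE] sentinel
    return "".join(parts), stop_reason


sample = [
    "event: message_start",
    'data: {"type":"message_start","message":{"id":"msg_abc123","usage":{"input_tokens":12,"output_tokens":0}}}',
    "",
    "event: content_block_delta",
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello"}}',
    "",
    "event: content_block_delta",
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" world"}}',
    "",
    "event: message_delta",
    'data: {"type":"message_delta","delta":{"stop_reason":"end_turn","stop_sequence":null},"usage":{"output_tokens":18}}',
    "",
    "event: message_stop",
    'data: {"type":"message_stop"}',
]
print(collect_message_text(sample))  # ('Hello world', 'end_turn')
```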

Error Handling

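Errors that occur mid-stream are delivered as SSE events, and both SDKs raise them as exceptions. Because a failure can happen after the first chunk has arrived, the try block must cover the iteration loop as well as the initial request: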

from openai import APIError

try:
    stream = client.chat.completions.create(
        model="subconscious/tim-qwen3.6-27b",
        messages=[{"role": "user", "content": "Hello"}],
        stream=True,
    )
    for chunk in stream:
        content = chunk.choices[0].delta.content
        if content:
            print(content, end="", flush=True)
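except APIError as e:
    # status_code exists only on HTTP status errors; connection and mid-stream errors lack it.
    status = getattr(e, "status_code", None)
    print(f"API error: {status} - {e.message}")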

import OpenAI from "openai";

try {
  const stream = await client.chat.completions.create({
    model: "subconscious/tim-qwen3.6-27b",
    messages: [{ role: "user", content: "Hello" }],
    stream: true,
  });
  for await (const chunk of stream) {
    const content = chunk.choices[0]?.delta?.content;
    if (content) process.stdout.write(content);
  }
} catch (error) {
  if (error instanceof OpenAI.APIError) {
    console.error(`API error: ${error.status} - ${error.message}`);
  }
}

from anthropic import APIError

try:
    with client.messages.stream(
        model="subconscious/tim-qwen3.6-27b",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello"}],
    ) as stream:
        for text in stream.text_stream:
            print(text, end="", flush=True)
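except APIError as e:
    # status_code exists only on HTTP status errors; connection and mid-stream errors lack it.
    status = getattr(e, "status_code", None)
    print(f"API error: {status} - {e.message}")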

import Anthropic from "@anthropic-ai/sdk";

try {
  const stream = client.messages.stream({
    model: "subconscious/tim-qwen3.6-27b",
    max_tokens: 1024,
    messages: [{ role: "user", content: "Hello" }],
  });
  for await (const event of stream) {
    if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
      process.stdout.write(event.delta.text);
    }
  }
} catch (error) {
  if (error instanceof Anthropic.APIError) {
    console.error(`API error: ${error.status} - ${error.message}`);
  }
}
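When reading the raw stream without an SDK, you have to detect error events yourself. This page does not show the shape of an error event, so the sketch below rests on an assumption: that the payload carries a top-level "error" object with a "message" field, as in the OpenAI and Anthropic conventions. StreamError and check_payload are hypothetical names, and the sample error payload is illustrative only.

```python
import json
from typing import Optional


class StreamError(Exception):
    """Hypothetical exception for an error reported inside a stream."""


def check_payload(data_line: str) -> Optional[dict]:
    """Decode one data: line; raise StreamError if it reports an error."""
    text = data_line.strip()
    if text.startswith("data:"):
        text = text[len("data:"):].strip()
    if not text or text == "[DONE]":
        return None
    payload = json.loads(text)
    # Assumption: errors appear under a top-level "error" key.
    error = payload.get("error") if isinstance(payload, dict) else None
    if error:
        if isinstance(error, dict):
            message = error.get("message", "unknown error")
        else:
            message = str(error)
        raise StreamError(message)
    return payload


try:
    check_payload('data: {"type":"error","error":{"type":"overloaded_error","message":"Overloaded"}}')
except StreamError as exc:
    print(f"stream failed: {exc}")  # stream failed: Overloaded
```

Raising as soon as the error payload is seen mirrors what the SDKs do, so the calling code can use the same try/except structure either way.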

Usage Statistics

For the Chat Completions format, include stream_options to receive token usage in the final chunk. For the Messages format, usage is built in: input_tokens arrives on message_start and output_tokens on message_delta.

stream = client.chat.completions.create(
    model="subconscious/tim-qwen3.6-27b",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
    stream_options={"include_usage": True},
)

for chunk in stream:
    # Check usage first: the chunk that carries usage may have an empty choices list.
    if chunk.usage:
        print(f"\nInput: {chunk.usage.prompt_tokens}, Output: {chunk.usage.completion_tokens}")
    elif chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

with client.messages.stream(
    model="subconscious/tim-qwen3.6-27b",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
    final = stream.get_final_message()
    print(f"\nInput: {final.usage.input_tokens}, Output: {final.usage.output_tokens}")

For Chat Completions, usage data is sent only when stream_options includes include_usage, and it arrives in the final chunk of the stream.
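For the Messages format, the two usage fields arrive on different events, so a client that reads raw events has to merge them. A minimal sketch, assuming the events have already been decoded into dictionaries shaped like the payloads in the SSE Format section; usage_from_events is a hypothetical helper name.

```python
from typing import Dict, Iterable, Optional


def usage_from_events(events: Iterable[dict]) -> Dict[str, Optional[int]]:
    """Merge input_tokens (message_start) and output_tokens (message_delta)."""
    usage: Dict[str, Optional[int]] = {"input_tokens": None, "output_tokens": None}
    for event in events:
        if event.get("type") == "message_start":
            start_usage = event.get("message", {}).get("usage", {})
            usage["input_tokens"] = start_usage.get("input_tokens")
        elif event.get("type") == "message_delta":
            # Keep the latest value in case more than one message_delta is sent.
            usage["output_tokens"] = event.get("usage", {}).get("output_tokens")
    return usage


events = [
    {"type": "message_start", "message": {"usage": {"input_tokens": 12, "output_tokens": 0}}},
    {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hello"}},
    {"type": "message_delta", "delta": {"stop_reason": "end_turn"}, "usage": {"output_tokens": 18}},
    {"type": "message_stop"},
]
print(usage_from_events(events))  # {'input_tokens': 12, 'output_tokens': 18}
```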
