OpenAI launches GPT-5.3 Codex Spark on Cerebras WSE-3

OpenAI released GPT-5.3-Codex-Spark, a faster coding model running on the Cerebras Wafer Scale Engine 3 that generates more than 1,000 tokens per second, a roughly 15x speedup, and is available now to ChatGPT Pro users.

OpenAI released GPT-5.3-Codex-Spark, a smaller, lower-latency version of its coding model built to provide real-time suggestions as developers type. The model runs on Cerebras Systems’ Wafer Scale Engine 3 and is available now to ChatGPT Pro subscribers.

Codex-Spark is designed for immediate, line-by-line assistance rather than end-to-end codebase design. The model produces more than 1,000 tokens per second and delivers roughly a 15x speed improvement compared with the standard GPT-5.3-Codex, according to OpenAI. Users can interrupt generation mid-stream to stop or change output while iterating on code.
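The interrupt-mid-stream workflow can be sketched with a plain Python generator; this is an illustrative simulation of consuming a token stream and cutting it off early, not OpenAI's actual API (the function names and token list here are invented for the example):

```python
def token_stream(tokens):
    """Simulate a model emitting tokens one at a time."""
    for tok in tokens:
        yield tok

def consume_until(stream, stop_token):
    """Collect streamed tokens, interrupting as soon as stop_token arrives."""
    received = []
    for tok in stream:
        received.append(tok)
        if tok == stop_token:
            break  # interrupt mid-stream; later tokens are never pulled
    return received

tokens = ["def", "add", "(", "a", ",", "b", ")", ":", "return", "a", "+", "b"]
partial = consume_until(token_stream(tokens), ":")
print(partial)  # tokens up to and including ":"
```

Because the stream is lazy, breaking out of the loop means the remaining tokens are never generated, which is the property that makes stopping or redirecting a fast model cheap while iterating.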

Access to Codex-Spark is offered through the Codex desktop app, a command-line interface and a Visual Studio Code extension. Each channel has its own usage meter, reflecting the different usage patterns expected across graphical editors, terminal workflows and integrated development environments.

The model runs exclusively on Cerebras’ WSE-3 wafer-scale processors rather than on clustered discrete GPUs. A single WSE-3 chip contains hundreds of thousands of cores on one large piece of silicon, a design that reduces the communication overhead involved in networking many separate GPUs. OpenAI attributes Spark’s lower latency and high token throughput to both software changes and the chip’s architecture.

OpenAI and Cerebras are both privately held and issue no public stock or tokens, so the release has no direct effect on public equity or crypto markets. OpenAI has not announced plans to extend Codex-Spark beyond ChatGPT Pro or given a timetable for broader availability.

The company indicated future Codex releases will seek to combine Spark-style real-time responsiveness with the longer-horizon planning capabilities of the full Codex model. Developers working on smart contracts and security-sensitive code may use the faster, interruptible suggestions to speed iteration and code review.
