Integration Guide: Connecting Codex CLI to Pomex

This guide explains how to configure OpenAI Codex CLI (v0.128.0+) to use the Pomex API gateway as its backend. Pomex implements the OpenAI Responses API (/v1/responses) with full streaming support, making it compatible with Codex's latest architecture.

Important (2026 Update): Starting from Codex CLI v0.128.0, the simple OPENAI_BASE_URL environment variable is no longer sufficient. Codex now requires explicit provider configuration via ~/.codex/config.toml and uses the Responses API (wire_api = "responses") by default. The legacy /v1/chat/completions endpoint is no longer supported by Codex.

What is Codex CLI?

Codex CLI is OpenAI's open-source coding agent that runs in your terminal. It uses the /v1/responses endpoint with streaming SSE to power an interactive coding assistant that can read, write, and execute code in your project.

By pointing Codex at Pomex, you can use supported GPT models as the backend while benefiting from Pomex's unified billing, prompt caching, and multi-provider failover.

Prerequisites

  1. A Pomex Account: You need an active account with a valid API key.
  2. Your API Key: Locate your unique API key in the Pomex dashboard.
  3. Codex CLI Installed: Install via npm:
    npm install -g @openai/codex

API Compatibility

Codex CLI v0.128.0+ exclusively uses the Responses API (/v1/responses) with streaming SSE. The legacy /v1/chat/completions endpoint is no longer used by Codex.

/v1/responses (Required)
    Must support streaming SSE with typed events (response.created, response.output_text.delta, response.completed). Pomex fully supports this.

/v1/chat/completions (Not used)
    Codex no longer calls this endpoint. Do not configure Codex to use it.
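
You can exercise the endpoint directly before involving Codex. Below is a minimal sketch using curl, assuming POMEX_API_KEY is already exported; the request body follows the standard Responses API shape, and exact response fields may vary:

# -N disables buffering so SSE events print as they arrive
curl -N https://api.pomex.ai/v1/responses \
  -H "Authorization: Bearer $POMEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-5.3-codex",
    "input": "Say hello",
    "stream": true
  }'

A healthy stream opens with a response.created event and terminates with response.completed.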

Configuration: config.toml (Required)

Since v0.128.0, Codex CLI uses a TOML configuration file for provider management. Simple environment variables like OPENAI_BASE_URL are no longer sufficient — Codex will ignore them and attempt to connect to the official OpenAI WebSocket endpoint instead.

Step 1: Create the Config File

mkdir -p ~/.codex && nano ~/.codex/config.toml

Step 2: Add Pomex as a Provider

# ~/.codex/config.toml

model_provider = "pomex"
model = "openai/gpt-5.3-codex"

# Define Pomex as a custom model provider
[model_providers.pomex]
name = "Pomex API"
base_url = "https://api.pomex.ai/v1"
env_key = "POMEX_API_KEY"   # Name of the env var holding your API key
wire_api = "responses"          # Use the Responses API protocol (required)

Step 3: Set the API Key Environment Variable

# Add to your ~/.zshrc or ~/.bashrc
export POMEX_API_KEY="sk-rh-xxxxxxxxxxxx"

Step 4: Launch Codex

source ~/.zshrc   # reload env
codex

Key insight: The wire_api = "responses" setting is critical. It tells Codex to use HTTP-based SSE streaming to your base_url instead of attempting a WebSocket connection to OpenAI's default wss:// endpoint.

Configuration Reference

Top-level fields

model_provider (string, required)
    Must match the key in [model_providers.<name>] (e.g. "pomex").

model (string, required)
    Default model to use (e.g. openai/gpt-5.3-codex).

[model_providers.<name>]

name (string, optional)
    Human-readable display name for this provider.

base_url (string, required)
    API base URL. Must include /v1. Codex appends /responses to this.

env_key (string, required)
    The name of the environment variable that holds your API key. Do NOT put the actual key here.

wire_api (string, required)
    Must be "responses". Tells Codex to use the Responses API over HTTP SSE instead of WebSocket.

Selecting a Model

Override the default model per session:

# Use a specific model for this session
codex --model openai/gpt-5.3-codex

Recommended Models for Codex

openai/gpt-5.3-codex
    Optimized for coding tasks. Best balance of speed and quality.

How It Works

Under the hood, the integration works as follows:

  1. Codex reads ~/.codex/config.toml and resolves the provider's base_url and API key.
  2. Because wire_api = "responses", Codex sends POST /v1/responses with "stream": true over HTTPS (not WebSocket).
  3. Pomex converts the request to the appropriate provider format internally (Claude, Gemini, GPT, etc.).
  4. The response is streamed back as typed SSE events (response.created, response.output_text.delta, response.completed).
  5. Codex receives the stream and renders output in your terminal in real time.

Pomex's /v1/responses endpoint implements the full Responses API streaming protocol, including the response.completed terminal event that Codex requires to confirm the stream has finished.
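
For reference, here is an abbreviated view of that stream on the wire (payloads trimmed; exact fields vary by model and request):

event: response.created
data: {"type":"response.created", ...}

event: response.output_text.delta
data: {"type":"response.output_text.delta","delta":"Hel"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","delta":"lo"}

event: response.completed
data: {"type":"response.completed", ...}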

Verifying the Configuration

After setting up, verify the integration works:

# Quick test — should produce a response from Pomex
codex "What is 2 + 2?"

If successful, you'll see the model respond in your terminal. You can also verify in your Pomex dashboard that usage is being logged under the /v1/responses endpoint.

Troubleshooting

Codex Still Connects to wss://api.openai.com

This means the config.toml is not being read. Check:

  1. The file is located at ~/.codex/config.toml, not ~/.config/codex/ or another path.
  2. The TOML syntax is valid; a malformed file may be silently skipped.
  3. The top-level model_provider value exactly matches the [model_providers.<name>] key (here, "pomex").
  4. wire_api = "responses" is set on the provider entry.
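
One quick way to rule out a TOML syntax error is to parse the file yourself. A minimal sketch using Python's standard tomllib module (requires Python 3.11+):

python3 - <<'EOF'
import pathlib, tomllib
# tomllib raises TOMLDecodeError with a line number if the file is malformed
cfg = tomllib.loads((pathlib.Path.home() / ".codex" / "config.toml").read_text())
print("config.toml parses OK; model_provider =", cfg.get("model_provider"))
EOF
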
Error: "stream disconnected before completion"

The SSE stream closed without sending a response.completed event. Ensure you are using Pomex v00083+ which includes the terminal event fallback.

Error: "Invalid API Key" or 401 Unauthorized

Verify:

  1. POMEX_API_KEY is exported in the shell you launch Codex from (run source ~/.zshrc after editing your shell profile).
  2. The env_key value in config.toml exactly matches the variable name (POMEX_API_KEY).
  3. The key was copied in full from the Pomex dashboard, with no surrounding whitespace or quotes.
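
A quick sanity check, run from the same shell you launch Codex in:

# Empty output means the variable is not set in this shell session
echo "$POMEX_API_KEY"

If the variable is set but Pomex still returns 401, retry the curl example from the API Compatibility section to determine whether the problem lies in Codex's configuration or in the key itself.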

Error: "model not found" or 400 Bad Request

Ensure the model name follows Pomex's naming convention: provider/model-name (e.g. openai/gpt-5.3-codex). Check the Models page for the full list.

Why OPENAI_BASE_URL No Longer Works

In Codex v0.128.0+, the priority order is:

  1. ~/.codex/config.toml: highest priority; controls provider routing and the wire protocol.
  2. Legacy OPENAI_BASE_URL: ignored whenever config.toml is present, and otherwise bypassed because Codex defaults to its WebSocket endpoint.

The config.toml approach is required because Codex needs to know both the base URL and the wire protocol (wire_api). A simple URL environment variable cannot express both.
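
To make the contrast concrete, here is roughly what each approach can express:

# Legacy: a bare URL, with no way to state which wire protocol to use
export OPENAI_BASE_URL="https://api.pomex.ai/v1"

# Current: config.toml pairs the URL with the protocol Codex should speak
#   [model_providers.pomex]
#   base_url = "https://api.pomex.ai/v1"
#   wire_api = "responses"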

If you continue to experience issues, please contact our support team at support@pomex.ai.