Image Generation

Pomex supports image generation through three API interfaces, all documented on this page:

Interface	Endpoint	Request Shape
Chat Completions	`POST /v1/chat/completions`	OpenAI Chat format — works with OpenAI SDK directly
Gemini Native	`POST /v1beta/models/{model}:generateContent`	Google `contents / parts` format
OpenAI Images	`POST /v1/images/generations`	OpenAI-style `model / prompt / size` JSON

Chat Completions Interface

POST /v1/chat/completions

Generate images using the standard OpenAI Chat Completions endpoint. This is the recommended approach if you already use the OpenAI SDK—no code changes needed beyond switching the model name. Both GPT Image and Gemini Image models are supported.

For image models, this endpoint is non-streaming only: setting `stream: true` returns an error. The `tools`, `response_format`, `reasoning`, and `logprobs` features are not supported for image generation.
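A client can check for these restrictions before sending a request. The following is a minimal sketch, assuming a hypothetical helper named `check_image_chat_request` (it is not part of Pomex or the OpenAI SDK) that inspects the request body as a plain dict. The field names come from the note above; how the server treats each unsupported field is not specified here, so the helper only reports them.

```python
# Hypothetical client-side guard; field names are taken from the note above.
UNSUPPORTED_FIELDS = ("tools", "response_format", "reasoning", "logprobs")


def check_image_chat_request(payload: dict) -> list[str]:
    """Return a list of fields in the payload that the note above flags."""
    problems = []
    if payload.get("stream"):
        problems.append("stream must be false or omitted for image models")
    for field in UNSUPPORTED_FIELDS:
        if field in payload:
            problems.append(f"{field} is not supported for image generation")
    return problems
```

An empty list means the payload avoids every restriction listed above.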

Available Models

Model ID	Description
openai/gpt-image-1	OpenAI GPT Image 1
openai/gpt-image-1.5	OpenAI GPT Image 1.5
openai/gpt-image-2	OpenAI GPT Image 2
google/gemini-2.5-flash-image	Gemini 2.5 Flash with image generation
google/gemini-3-pro-image-preview	Gemini 3 Pro image generation preview
google/gemini-3.1-flash-image-preview	Gemini 3.1 Flash image generation preview

Request Body

Field	Type	Required	Description
model	string	Required	An image model ID from the table above.
messages	array	Required	Array of message objects. The last `user` message's text content is used as the image prompt. Content can be a plain string or an array of content parts.
n	number	Optional	Number of images to generate (default: `1`). Only applies to GPT Image models.
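The prompt-selection rule in the `messages` row can be pictured with a short sketch. This is an illustration only, not Pomex's server code: `last_user_prompt` is a hypothetical helper, and joining several text parts with a space is an assumption, since the table does not say how multiple text parts are combined.

```python
def last_user_prompt(messages: list[dict]) -> str:
    """Return the text of the last user message (string or content-part form)."""
    for message in reversed(messages):
        if message.get("role") != "user":
            continue
        content = message.get("content")
        if isinstance(content, str):
            return content
        # Array form: keep only parts shaped like {"type": "text", "text": "..."}.
        # Joining with a space is an assumption made for this sketch.
        return " ".join(
            part["text"] for part in (content or []) if part.get("type") == "text"
        )
    raise ValueError("no user message found")
```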

Response Format

The response follows the standard Chat Completions shape. Generated images are returned as image_url content parts with base64 data URIs:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1735689600,
  "model": "openai/gpt-image-2",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": [
          {
            "type": "image_url",
            "image_url": {
              "url": "data:image/png;base64,iVBORw0KGgo..."
            }
          }
        ]
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 1290,
    "total_tokens": 1302
  }
}
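To show how a response of this shape might be unpacked, here is a minimal sketch that works on the parsed JSON as a plain dict. `extract_images` is a hypothetical helper name. It reads the MIME type from the data URI instead of assuming PNG, and it treats any non-list `content` as carrying no images, which is an assumption about text-only replies.

```python
import base64


def extract_images(response: dict) -> list[tuple[str, bytes]]:
    """Return (mime_type, image_bytes) for each image_url part in the first choice."""
    images = []
    content = response["choices"][0]["message"]["content"]
    if not isinstance(content, list):
        return images  # assumption: a string or null content carries no images
    for part in content:
        if part.get("type") != "image_url":
            continue
        # The URL looks like "data:image/png;base64,<payload>".
        header, _, b64_data = part["image_url"]["url"].partition(",")
        mime_type = header.removeprefix("data:").split(";", 1)[0]
        images.append((mime_type, base64.b64decode(b64_data)))
    return images
```

Each returned tuple can be written straight to disk, using the MIME type to pick a file extension.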

Examples — GPT Image

# Generate an image and save it to a file
curl https://api.pomex.ai/v1/chat/completions \
  -H "Authorization: Bearer $POMEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-image-2",
    "messages": [
      {"role": "user", "content": "A cute baby otter wearing a tiny top hat, watercolor style"}
    ]
  }' | jq -r 'first(.choices[0].message.content[] | select(.type == "image_url")) | .image_url.url' \
     | sed 's/^data:image\/[^;]*;base64,//' \
     | base64 -d > otter.png

import base64
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.pomex.ai/v1",
)

response = client.chat.completions.create(
    model="openai/gpt-image-2",
    messages=[
        {"role": "user", "content": "A cute baby otter wearing a tiny top hat, watercolor style"}
    ],
)

# Find the image part in the response (may be preceded by text)
for part in response.choices[0].message.content:
    if part["type"] == "image_url":
        img_b64 = part["image_url"]["url"].split(",", 1)[1]  # drop the "data:<mime>;base64," prefix
        img_bytes = base64.b64decode(img_b64)
        with open("otter.png", "wb") as f:
            f.write(img_bytes)
        print("Saved otter.png")
        break

Examples — Gemini Image

# Generate an image with Gemini and save it
curl https://api.pomex.ai/v1/chat/completions \
  -H "Authorization: Bearer $POMEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-3.1-flash-image-preview",
    "messages": [
      {"role": "user", "content": "A serene mountain landscape at sunset, oil painting style"}
    ]
  }' | jq -r 'first(.choices[0].message.content[] | select(.type == "image_url")) | .image_url.url' \
     | sed 's/^data:image\/[^;]*;base64,//' \
     | base64 -d > landscape.png

import base64
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.pomex.ai/v1",
)

response = client.chat.completions.create(
    model="google/gemini-3.1-flash-image-preview",
    messages=[
        {"role": "user", "content": "A serene mountain landscape at sunset, oil painting style"}
    ],
)

# Find the image part in the response (may be preceded by text)
for part in response.choices[0].message.content:
    if part["type"] == "image_url":
        img_b64 = part["image_url"]["url"].split(",", 1)[1]  # drop the "data:<mime>;base64," prefix
        img_bytes = base64.b64decode(img_b64)
        with open("landscape.png", "wb") as f:
            f.write(img_bytes)
        print("Saved landscape.png")
        break

Gemini Native Interface

POST /v1beta/models/{model}:generateContent

Generate images using Google's Gemini image generation models. This endpoint uses the native Gemini generateContent API format—not the OpenAI Images format.

This endpoint is non-streaming only. The response is returned as a single JSON object once generation is complete.

For text-only Gemini requests (`:generateContent` on any text model, plus `:streamGenerateContent`), see Generate Content. The same URL pattern serves both flows; Pomex selects the handler based on the resolved model class.

Available Gemini Models

Model ID	Description
google/gemini-2.5-flash-image	Gemini 2.5 Flash with image generation
google/gemini-3-pro-image-preview	Gemini 3 Pro image generation preview
google/gemini-3.1-flash-image-preview	Gemini 3.1 Flash image generation preview

The google/ prefix is optional in the URL. For example, both /v1beta/models/gemini-2.5-flash-image:generateContent and /v1beta/models/google/gemini-2.5-flash-image:generateContent work.
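As a small illustration of the two equivalent URL forms, the sketch below builds either one from a model ID. `generate_content_url` and `BASE_URL` are hypothetical names chosen for this example; the host is the one used in the curl examples on this page.

```python
BASE_URL = "https://api.pomex.ai"  # host taken from the examples on this page


def generate_content_url(model: str, include_prefix: bool = False) -> str:
    """Build the :generateContent URL, with or without the optional google/ prefix."""
    name = model.removeprefix("google/")
    if include_prefix:
        name = f"google/{name}"
    return f"{BASE_URL}/v1beta/models/{name}:generateContent"
```

Both forms address the same model, so the choice is a matter of convention in the calling code.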

Authentication

This endpoint supports two authentication methods:

Method	Header	Example
Bearer token	`Authorization`	`Authorization: Bearer rh_your_api_key`
Google-style API key	`x-goog-api-key`	`x-goog-api-key: rh_your_api_key`
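The two methods differ only in which header carries the key. A minimal sketch, assuming a hypothetical `auth_headers` helper that also sets the JSON content type used in the examples on this page:

```python
def auth_headers(api_key: str, google_style: bool = False) -> dict[str, str]:
    """Return request headers for one of the two authentication methods."""
    headers = {"Content-Type": "application/json"}
    if google_style:
        headers["x-goog-api-key"] = api_key  # Google-style API key header
    else:
        headers["Authorization"] = f"Bearer {api_key}"  # Bearer token header
    return headers
```

The Google-style header may be convenient when adapting code that already sends `x-goog-api-key`.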

Request Body

Field	Type	Required	Description
contents	array	Required	Array of `Content` objects with `role` and `parts`. See Content Object.
generationConfig	object	Optional	Generation parameters (temperature, responseModalities, etc.). Defaults to `responseModalities: ["TEXT", "IMAGE"]` if omitted.
safetySettings	array	Optional	Safety filter thresholds for content categories.
systemInstruction	object	Optional	System-level instruction as a `Content` object.
cachedContent	string	Optional	Resource name of cached content (e.g., `projects/my-project/cachedContents/abc123`).

Content Object

Field	Type	Required	Description
role	string	Required	`"user"` or `"model"`
parts	array	Required	Array of parts. Each part contains one of: `text` (string) or `inlineData` (object with `mimeType` and base64-encoded `data`).
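A `Content` object can be assembled from a text instruction and, optionally, raw image bytes. The sketch below is one way to do it; `user_content` is a hypothetical helper, and the `image/png` default is only a convenience for this example.

```python
import base64
from typing import Optional


def user_content(text: str, image: Optional[bytes] = None, mime_type: str = "image/png") -> dict:
    """Build a user Content object with an optional inlineData part before the text."""
    parts = []
    if image is not None:
        encoded = base64.b64encode(image).decode("ascii")
        parts.append({"inlineData": {"mimeType": mime_type, "data": encoded}})
    parts.append({"text": text})
    return {"role": "user", "parts": parts}
```

The result goes into the `contents` array of the request body, for example `{"contents": [user_content("A red bicycle")]}`.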

Response Body

Field	Type	Description
candidates	array	Array of candidate responses. Each has `content` (with `parts`), `finishReason`, and optional `safetyRatings`.
modelVersion	string	The model version used.
usageMetadata	object	Token usage: `promptTokenCount`, `candidatesTokenCount`, `totalTokenCount`.
promptFeedback	object	Present if the prompt was blocked. Contains `blockReason` and `safetyRatings`.

Generated images appear as inlineData parts in the candidate's content, with mimeType (e.g., image/png) and base64-encoded data.
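The response fields above suggest a simple parsing routine: check `promptFeedback` first, then split the first candidate's parts into images and text. The following is a sketch under those assumptions; `parse_gemini_response` is a hypothetical name, and raising `RuntimeError` for a blocked prompt is a choice made for this example.

```python
import base64


def parse_gemini_response(data: dict) -> tuple[list[tuple[str, bytes]], list[str]]:
    """Return ([(mime_type, image_bytes), ...], [text, ...]) from the first candidate."""
    feedback = data.get("promptFeedback") or {}
    if feedback.get("blockReason"):
        # A blocked prompt may come back without usable candidates.
        raise RuntimeError(f"prompt blocked: {feedback['blockReason']}")
    images, texts = [], []
    for part in data["candidates"][0]["content"]["parts"]:
        if "inlineData" in part:
            inline = part["inlineData"]
            images.append((inline["mimeType"], base64.b64decode(inline["data"])))
        elif "text" in part:
            texts.append(part["text"])
    return images, texts
```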

Examples

Text-to-Image

curl https://api.pomex.ai/v1beta/models/gemini-2.5-flash-image:generateContent \
  -H "Authorization: Bearer $POMEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "A cute baby otter wearing a tiny top hat, watercolor style"}]
      }
    ]
  }'

import requests
import base64

response = requests.post(
    "https://api.pomex.ai/v1beta/models/gemini-2.5-flash-image:generateContent",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    json={
        "contents": [
            {
                "role": "user",
                "parts": [{"text": "A cute baby otter wearing a tiny top hat, watercolor style"}],
            }
        ],
    },
)

response.raise_for_status()  # stop here on HTTP errors instead of parsing an error body
data = response.json()
for part in data["candidates"][0]["content"]["parts"]:
    if "inlineData" in part:
        img_bytes = base64.b64decode(part["inlineData"]["data"])
        with open("otter.png", "wb") as f:
            f.write(img_bytes)
        print("Saved otter.png")
    elif "text" in part:
        print(part["text"])

Image Editing

Send an existing image along with an editing instruction:

# Base64-encode your image first (-w0 disables line wrapping in GNU base64; on macOS, use: base64 -i photo.png)
IMG_B64=$(base64 -w0 photo.png)

curl https://api.pomex.ai/v1beta/models/gemini-3-pro-image-preview:generateContent \
  -H "Authorization: Bearer $POMEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [
          {"inlineData": {"mimeType": "image/png", "data": "'${IMG_B64}'"}},
          {"text": "Remove the background and replace it with a beach scene"}
        ]
      }
    ]
  }'

import requests
import base64

with open("photo.png", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode()

response = requests.post(
    "https://api.pomex.ai/v1beta/models/gemini-3-pro-image-preview:generateContent",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    json={
        "contents": [
            {
                "role": "user",
                "parts": [
                    {"inlineData": {"mimeType": "image/png", "data": img_b64}},
                    {"text": "Remove the background and replace it with a beach scene"},
                ],
            }
        ],
    },
)

response.raise_for_status()  # stop here on HTTP errors instead of parsing an error body
data = response.json()
for part in data["candidates"][0]["content"]["parts"]:
    if "inlineData" in part:
        img_bytes = base64.b64decode(part["inlineData"]["data"])
        with open("edited.png", "wb") as f:
            f.write(img_bytes)
        print("Saved edited.png")

With Generation Config

curl https://api.pomex.ai/v1beta/models/gemini-3.1-flash-image-preview:generateContent \
  -H "Authorization: Bearer $POMEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "A serene mountain landscape at sunset"}]
      }
    ],
    "generationConfig": {
      "temperature": 1.0,
      "responseModalities": ["TEXT", "IMAGE"]
    }
  }'

Sample Response

{
  "candidates": [
    {
      "content": {
        "role": "model",
        "parts": [
          {
            "inlineData": {
              "mimeType": "image/png",
              "data": "iVBORw0KGgo..."
            }
          },
          {
            "text": "Here is your otter wearing a top hat!"
          }
        ]
      },
      "finishReason": "STOP"
    }
  ],
  "modelVersion": "gemini-2.5-flash-image",
  "usageMetadata": {
    "promptTokenCount": 12,
    "candidatesTokenCount": 1290,
    "totalTokenCount": 1302
  }
}

Gemini Error Format

This endpoint returns errors in Google API format, different from the OpenAI and Anthropic error formats used by other endpoints.

{
  "error": {
    "code": 400,
    "message": "body must be valid JSON",
    "status": "INVALID_ARGUMENT"
  }
}

See Errors for the full list of HTTP status codes and retry guidance.
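A client can turn a body of this shape into an exception. The sketch below assumes only the three fields shown in the sample (`code`, `message`, `status`); `GeminiApiError` and `raise_for_gemini_error` are hypothetical names, and the fallback values for missing fields are choices made for this example.

```python
class GeminiApiError(Exception):
    """Carries the fields of a Google-style error body."""

    def __init__(self, code: int, status: str, message: str):
        super().__init__(f"{code} {status}: {message}")
        self.code = code
        self.status = status
        self.message = message


def raise_for_gemini_error(body: dict) -> None:
    """Raise GeminiApiError if the parsed JSON body contains an "error" object."""
    error = body.get("error")
    if error is None:
        return
    raise GeminiApiError(
        error.get("code", 0),
        error.get("status", "UNKNOWN"),
        error.get("message", ""),
    )
```

Call it on the parsed JSON before reading `candidates`, so that error bodies never reach the image-decoding code.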

OpenAI Images Interface

POST /v1/images/generations

Generate images using OpenAI-compatible request and response shapes. This interface is separate from the Gemini native `:generateContent` endpoint and must be called at `/v1/images/generations`.

Available OpenAI Image Models

Model ID	Description
openai/gpt-image-1	OpenAI GPT Image 1
openai/gpt-image-1.5	OpenAI GPT Image 1.5
openai/gpt-image-2	OpenAI GPT Image 2

Request Body

Field	Type	Required	Description
model	string	Required	One of `openai/gpt-image-1`, `openai/gpt-image-1.5`, or `openai/gpt-image-2`.
prompt	string	Required	Text prompt describing the image to generate.
size	string	Optional	Output resolution, such as `1024x1024`.
quality	string	Optional	Quality tier, such as `medium`.
output_compression	number	Optional	Output compression level (for supported formats).
output_format	string	Optional	Output image format, such as `png` or `jpeg`.
n	number	Optional	Number of images to generate. Commonly `1`.
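Since every field except `model` and `prompt` is optional, a request body can be built by including only the options that are set. The following is a minimal sketch; `build_image_request` is a hypothetical helper, and rejecting unknown option names is a client-side choice for this example, not documented server behavior.

```python
from typing import Any

# Optional fields from the table above.
OPTIONAL_FIELDS = {"size", "quality", "output_compression", "output_format", "n"}


def build_image_request(model: str, prompt: str, **options: Any) -> dict:
    """Build a /v1/images/generations body, omitting options left as None."""
    unknown = set(options) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unsupported fields: {sorted(unknown)}")
    payload = {"model": model, "prompt": prompt}
    payload.update({key: value for key, value in options.items() if value is not None})
    return payload
```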

Examples

This example uses openai/gpt-image-2 and returns base64 image data at data[0].b64_json.

curl -v https://api.pomex.ai/v1/images/generations \
  -H "Authorization: Bearer $POMEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-image-2",
    "prompt": "A cute baby otter wearing a tiny top hat, watercolor style",
    "size": "1024x1024",
    "quality": "medium",
    "output_compression": 100,
    "output_format": "png",
    "n": 1
  }' | jq -r '.data[0].b64_json' | base64 --decode > generated_image_gpt_image_2.png

import base64
import requests

response = requests.post(
    "https://api.pomex.ai/v1/images/generations",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    json={
        "model": "openai/gpt-image-2",
        "prompt": "A cute baby otter wearing a tiny top hat, watercolor style",
        "size": "1024x1024",
        "quality": "medium",
        "output_compression": 100,
        "output_format": "png",
        "n": 1,
    },
)

response.raise_for_status()
payload = response.json()

img_b64 = payload["data"][0]["b64_json"]
img_bytes = base64.b64decode(img_b64)
with open("generated_image_gpt_image_2.png", "wb") as f:
    f.write(img_bytes)

print("Saved generated_image_gpt_image_2.png")

Sample Response

{
  "created": 1735689600,
  "data": [
    {
      "b64_json": "iVBORw0KGgoAAAANSUhEUgAA..."
    }
  ]
}
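When `n` is greater than 1, the `data` array would presumably hold one entry per image. The sketch below decodes every entry rather than only `data[0]`; `decode_images` is a hypothetical helper, and it assumes each entry carries a `b64_json` field as in the sample above.

```python
import base64


def decode_images(payload: dict) -> list[bytes]:
    """Decode the b64_json field of every entry in the response's data array."""
    return [base64.b64decode(item["b64_json"]) for item in payload.get("data", [])]
```

The returned byte strings can then be written to numbered files, one per image.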

OpenAI Images Error Format

This interface returns OpenAI-style errors:

{
  "error": {
    "message": "Invalid value for 'model'",
    "type": "invalid_request_error",
    "param": "model",
    "code": "invalid_model"
  }
}
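For logging, an error body of this shape can be condensed into one line. A minimal sketch, assuming only the four fields shown in the sample; `describe_openai_error` is a hypothetical helper name.

```python
def describe_openai_error(body: dict) -> str:
    """Summarize an OpenAI-style error body as "message (type=..., param=..., code=...)"."""
    error = body.get("error") or {}
    message = error.get("message", "unknown error")
    details = [f"{key}={error[key]}" for key in ("type", "param", "code") if error.get(key)]
    return f"{message} ({', '.join(details)})" if details else message
```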