Image Generation
Pomex supports image generation through three API interfaces on this page:
| Interface | Endpoint | Request Shape |
|---|---|---|
| Chat Completions | POST /v1/chat/completions | OpenAI Chat format; works with the OpenAI SDK directly |
| Gemini Native | POST /v1beta/models/{model}:generateContent | Google contents / parts format |
| OpenAI Images | POST /v1/images/generations | OpenAI-style model / prompt / size JSON |
Chat Completions Interface
Generate images using the standard OpenAI Chat Completions endpoint. This is the recommended approach if you already use the OpenAI SDK: as the examples in this section show, the only changes are the base URL, the API key, and the model name. Both GPT Image and Gemini Image models are supported.
This endpoint is non-streaming only for image models. Setting stream: true will return an error. Features like tools, response_format, reasoning, and logprobs are not supported for image generation.
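Since these restrictions only surface at request time, a local check before sending can save a round trip. The following is a minimal sketch, not part of the Pomex API: the helper name check_image_request and the wording of its messages are hypothetical, and the list of flagged fields comes only from the paragraph above.

```python
# Fields the paragraph above lists as unsupported for image generation.
UNSUPPORTED_FIELDS = ("tools", "response_format", "reasoning", "logprobs")


def check_image_request(payload: dict) -> list[str]:
    """Return a list of problems found in a Chat Completions image request."""
    problems = []
    # stream: true is documented to return an error for image models.
    if payload.get("stream"):
        problems.append("stream must be false or omitted for image models")
    for field in UNSUPPORTED_FIELDS:
        if field in payload:
            problems.append(f"{field} is not supported for image generation")
    return problems


problems = check_image_request({
    "model": "openai/gpt-image-2",
    "messages": [{"role": "user", "content": "A red bicycle"}],
    "stream": True,
    "tools": [],
})
for problem in problems:
    print(problem)
```

An empty list means the payload passed these particular checks; it says nothing about other validation the server may perform.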
Available Models
| Model ID | Description |
|---|---|
| openai/gpt-image-1 | OpenAI GPT Image 1 |
| openai/gpt-image-1.5 | OpenAI GPT Image 1.5 |
| openai/gpt-image-2 | OpenAI GPT Image 2 |
| google/gemini-2.5-flash-image | Gemini 2.5 Flash with image generation |
| google/gemini-3-pro-image-preview | Gemini 3 Pro image generation preview |
| google/gemini-3.1-flash-image-preview | Gemini 3.1 Flash image generation preview |
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Required | An image model ID from the table above. |
| messages | array | Required | Array of message objects. The last user message's text content is used as the image prompt. Content can be a plain string or an array of content parts. |
| n | number | Optional | Number of images to generate (default: 1). Only applies to GPT Image models. |
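To see which text would end up as the image prompt for a given conversation, the rule for messages can be mimicked on the client. This is a sketch under assumptions: the function name effective_prompt is hypothetical, text parts are assumed to use the OpenAI content-part shape {"type": "text", "text": ...}, and joining several text parts with a space is a guess, since the table does not say how Pomex combines them.

```python
from typing import Optional


def effective_prompt(messages: list) -> Optional[str]:
    """Return the text of the last user message, or None if it has no text."""
    for message in reversed(messages):
        if message.get("role") != "user":
            continue
        content = message.get("content")
        if isinstance(content, str):
            return content
        # Assumption: text parts look like {"type": "text", "text": "..."}.
        texts = [p["text"] for p in (content or []) if p.get("type") == "text"]
        return " ".join(texts) if texts else None
    return None


messages = [
    {"role": "user", "content": "A lighthouse at dawn"},
    {"role": "assistant", "content": "Sure."},
    {"role": "user", "content": [{"type": "text", "text": "Make it a watercolor"}]},
]
print(effective_prompt(messages))  # Make it a watercolor
```

Note that earlier user messages are ignored in this sketch, matching the table's statement that only the last user message is used.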
Response Format
The response follows the standard Chat Completions shape. Generated images are returned as image_url content parts with base64 data URIs:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1735689600,
  "model": "openai/gpt-image-2",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": [
          {
            "type": "image_url",
            "image_url": {
              "url": "data:image/png;base64,iVBORw0KGgo..."
            }
          }
        ]
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 1290,
    "total_tokens": 1302
  }
}
Examples: GPT Image
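Before the full request examples, here is a small offline sketch of decoding the image_url data URIs from the response shape shown earlier. It works on an already-parsed message dictionary and reads the MIME type from the URI instead of assuming PNG. The helper names decode_data_uri and images_from_message are hypothetical, not part of any SDK.

```python
import base64


def decode_data_uri(uri: str) -> tuple[str, bytes]:
    """Split a base64 data URI into (mime_type, raw_bytes)."""
    header, _, payload = uri.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URI")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)


def images_from_message(message: dict) -> list[tuple[str, bytes]]:
    """Collect every image_url part; text parts are skipped."""
    content = message.get("content")
    if not isinstance(content, list):
        return []
    return [
        decode_data_uri(part["image_url"]["url"])
        for part in content
        if part.get("type") == "image_url"
    ]


# Stand-in for a real response message; the bytes are not a valid image.
fake_uri = "data:image/png;base64," + base64.b64encode(b"\x89PNG").decode()
message = {"role": "assistant", "content": [{"type": "image_url", "image_url": {"url": fake_uri}}]}
for mime_type, data in images_from_message(message):
    print(mime_type, len(data))
```

Collecting all image parts rather than only the first is useful when n is greater than 1.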
# Generate an image and save it to a file
curl https://api.pomex.ai/v1/chat/completions \
  -H "Authorization: Bearer $POMEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-image-2",
    "messages": [
      {"role": "user", "content": "A cute baby otter wearing a tiny top hat, watercolor style"}
    ]
  }' | jq -r 'first(.choices[0].message.content[] | select(.type == "image_url")) | .image_url.url' \
  | sed 's/^data:[^,]*,//' \
  | base64 -d > otter.png

import base64
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.pomex.ai/v1",
)
response = client.chat.completions.create(
    model="openai/gpt-image-2",
    messages=[
        {"role": "user", "content": "A cute baby otter wearing a tiny top hat, watercolor style"}
    ],
)
# Find the image part in the response (may be preceded by text)
for part in response.choices[0].message.content:
    if part["type"] == "image_url":
        # Drop the "data:<mime>;base64," prefix, whatever the MIME type is
        img_b64 = part["image_url"]["url"].split(",", 1)[1]
        img_bytes = base64.b64decode(img_b64)
        with open("otter.png", "wb") as f:
            f.write(img_bytes)
        print("Saved otter.png")
        break
Examples: Gemini Image
# Generate an image with Gemini and save it
curl https://api.pomex.ai/v1/chat/completions \
  -H "Authorization: Bearer $POMEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-3.1-flash-image-preview",
    "messages": [
      {"role": "user", "content": "A serene mountain landscape at sunset, oil painting style"}
    ]
  }' | jq -r 'first(.choices[0].message.content[] | select(.type == "image_url")) | .image_url.url' \
  | sed 's/^data:[^,]*,//' \
  | base64 -d > landscape.png

import base64
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.pomex.ai/v1",
)
response = client.chat.completions.create(
    model="google/gemini-3.1-flash-image-preview",
    messages=[
        {"role": "user", "content": "A serene mountain landscape at sunset, oil painting style"}
    ],
)
# Find the image part in the response (may be preceded by text)
for part in response.choices[0].message.content:
    if part["type"] == "image_url":
        # Drop the "data:<mime>;base64," prefix, whatever the MIME type is
        img_b64 = part["image_url"]["url"].split(",", 1)[1]
        img_bytes = base64.b64decode(img_b64)
        with open("landscape.png", "wb") as f:
            f.write(img_bytes)
        print("Saved landscape.png")
        break
Gemini Native Interface
Generate images using Google's Gemini image generation models. This endpoint uses the native Gemini generateContent API format—not the OpenAI Images format.
This endpoint is non-streaming only. The response is returned as a single JSON object once generation is complete.
For text-only Gemini requests (:generateContent on any text model, plus :streamGenerateContent) see Generate Content. The same URL pattern serves both flows — Pomex selects the handler based on the resolved model class.
Available Gemini Models
| Model ID | Description |
|---|---|
| google/gemini-2.5-flash-image | Gemini 2.5 Flash with image generation |
| google/gemini-3-pro-image-preview | Gemini 3 Pro image generation preview |
| google/gemini-3.1-flash-image-preview | Gemini 3.1 Flash image generation preview |
The google/ prefix is optional in the URL. For example, both /v1beta/models/gemini-2.5-flash-image:generateContent and /v1beta/models/google/gemini-2.5-flash-image:generateContent work.
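The URL pattern can be captured in a one-line builder. This is an illustrative sketch: generate_content_url is a hypothetical helper, and the host is the one used in the examples on this page.

```python
BASE_URL = "https://api.pomex.ai"  # host taken from the examples on this page


def generate_content_url(model_id: str, base_url: str = BASE_URL) -> str:
    """Build the generateContent URL; model_id may include the google/ prefix."""
    return f"{base_url.rstrip('/')}/v1beta/models/{model_id}:generateContent"


# Both forms are documented to work.
print(generate_content_url("gemini-2.5-flash-image"))
print(generate_content_url("google/gemini-2.5-flash-image"))
```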
Authentication
This endpoint supports two authentication methods:
| Method | Header | Example |
|---|---|---|
| Bearer token | Authorization | Authorization: Bearer rh_your_api_key |
| Google-style API key | x-goog-api-key | x-goog-api-key: rh_your_api_key |
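The two methods differ only in which header carries the key. A minimal sketch of building either header set follows; the function name auth_headers and its style argument are hypothetical and exist only for illustration.

```python
def auth_headers(api_key: str, style: str = "bearer") -> dict:
    """Return request headers for one of the two documented auth methods."""
    headers = {"Content-Type": "application/json"}
    if style == "bearer":
        headers["Authorization"] = f"Bearer {api_key}"
    elif style == "google":
        headers["x-goog-api-key"] = api_key
    else:
        raise ValueError(f"unknown auth style: {style!r}")
    return headers


print(auth_headers("rh_your_api_key"))
print(auth_headers("rh_your_api_key", style="google"))
```

The Google-style header may be convenient when reusing client code written against Google's own endpoint, though that is a usage suggestion rather than something this page states.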
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| contents | array | Required | Array of Content objects with role and parts. See Content Object. |
| generationConfig | object | Optional | Generation parameters (temperature, responseModalities, etc.). Defaults to responseModalities: ["TEXT", "IMAGE"] if omitted. |
| safetySettings | array | Optional | Safety filter thresholds for content categories. |
| systemInstruction | object | Optional | System-level instruction as a Content object. |
| cachedContent | string | Optional | Resource name of cached content (e.g., projects/my-project/cachedContents/abc123). |
Content Object
| Field | Type | Required | Description |
|---|---|---|---|
| role | string | Required | "user" or "model" |
| parts | array | Required | Array of parts. Each part contains one of: text (string) or inlineData (object with mimeType and base64-encoded data). |
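The Content structure can be assembled from small builders, which keeps the base64 step in one place. This is a sketch; text_part, image_part, and user_content are hypothetical names, and the field names follow the table above.

```python
import base64


def text_part(text: str) -> dict:
    """A part carrying plain text."""
    return {"text": text}


def image_part(data: bytes, mime_type: str = "image/png") -> dict:
    """A part carrying base64-encoded image bytes as inlineData."""
    encoded = base64.b64encode(data).decode("ascii")
    return {"inlineData": {"mimeType": mime_type, "data": encoded}}


def user_content(*parts: dict) -> dict:
    """A Content object with role "user"."""
    return {"role": "user", "parts": list(parts)}


# Placeholder bytes stand in for a real image file.
body = {
    "contents": [
        user_content(image_part(b"not-a-real-image"), text_part("Make the sky purple"))
    ]
}
print(body["contents"][0]["role"], len(body["contents"][0]["parts"]))
```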
Response Body
| Field | Type | Description |
|---|---|---|
| candidates | array | Array of candidate responses. Each has content (with parts), finishReason, and optional safetyRatings. |
| modelVersion | string | The model version used. |
| usageMetadata | object | Token usage: promptTokenCount, candidatesTokenCount, totalTokenCount. |
| promptFeedback | object | Present if the prompt was blocked. Contains blockReason and safetyRatings. |
Generated images appear as inlineData parts in the candidate's content, with mimeType (e.g., image/png) and base64-encoded data.
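A response body can be separated into images and text offline, once it has been parsed as JSON. The sketch below uses only the fields from the table above; the function name split_parts and the choice to raise RuntimeError on a blocked prompt are illustrative assumptions.

```python
import base64


def split_parts(response_json: dict):
    """Return (images, texts) from a generateContent response body."""
    feedback = response_json.get("promptFeedback") or {}
    if feedback.get("blockReason"):
        raise RuntimeError(f"prompt blocked: {feedback['blockReason']}")
    images, texts = [], []
    for candidate in response_json.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            if "inlineData" in part:
                inline = part["inlineData"]
                images.append((inline["mimeType"], base64.b64decode(inline["data"])))
            elif "text" in part:
                texts.append(part["text"])
    return images, texts


# Hand-built stand-in for a real response; the bytes are not a valid image.
sample = {
    "candidates": [{
        "content": {"role": "model", "parts": [
            {"inlineData": {"mimeType": "image/png",
                            "data": base64.b64encode(b"fake-png").decode()}},
            {"text": "Here is your image."},
        ]},
        "finishReason": "STOP",
    }]
}
images, texts = split_parts(sample)
print(len(images), images[0][0], texts)
```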
Examples
Text-to-Image
curl https://api.pomex.ai/v1beta/models/gemini-2.5-flash-image:generateContent \
  -H "Authorization: Bearer $POMEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "A cute baby otter wearing a tiny top hat, watercolor style"}]
      }
    ]
  }'

import requests
import base64

response = requests.post(
    "https://api.pomex.ai/v1beta/models/gemini-2.5-flash-image:generateContent",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    json={
        "contents": [
            {
                "role": "user",
                "parts": [{"text": "A cute baby otter wearing a tiny top hat, watercolor style"}],
            }
        ],
    },
)
# Fail early on HTTP errors instead of hitting a KeyError below
response.raise_for_status()
data = response.json()
# Save image parts and print any accompanying text
for part in data["candidates"][0]["content"]["parts"]:
    if "inlineData" in part:
        img_bytes = base64.b64decode(part["inlineData"]["data"])
        with open("otter.png", "wb") as f:
            f.write(img_bytes)
        print("Saved otter.png")
    elif "text" in part:
        print(part["text"])
Image Editing
Send an existing image along with an editing instruction:
# Base64-encode your image first (-w0 disables line wrapping in GNU base64)
IMG_B64=$(base64 -w0 photo.png)
curl https://api.pomex.ai/v1beta/models/gemini-3-pro-image-preview:generateContent \
  -H "Authorization: Bearer $POMEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [
          {"inlineData": {"mimeType": "image/png", "data": "'${IMG_B64}'"}},
          {"text": "Remove the background and replace it with a beach scene"}
        ]
      }
    ]
  }'

import requests
import base64

# Read the source image and base64-encode it for the inlineData part
with open("photo.png", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode()
response = requests.post(
    "https://api.pomex.ai/v1beta/models/gemini-3-pro-image-preview:generateContent",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    json={
        "contents": [
            {
                "role": "user",
                "parts": [
                    {"inlineData": {"mimeType": "image/png", "data": img_b64}},
                    {"text": "Remove the background and replace it with a beach scene"},
                ],
            }
        ],
    },
)
# Fail early on HTTP errors instead of hitting a KeyError below
response.raise_for_status()
data = response.json()
for part in data["candidates"][0]["content"]["parts"]:
    if "inlineData" in part:
        img_bytes = base64.b64decode(part["inlineData"]["data"])
        with open("edited.png", "wb") as f:
            f.write(img_bytes)
        print("Saved edited.png")
With Generation Config
curl https://api.pomex.ai/v1beta/models/gemini-3.1-flash-image-preview:generateContent \
  -H "Authorization: Bearer $POMEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "A serene mountain landscape at sunset"}]
      }
    ],
    "generationConfig": {
      "temperature": 1.0,
      "responseModalities": ["TEXT", "IMAGE"]
    }
  }'
Sample Response
{
  "candidates": [
    {
      "content": {
        "role": "model",
        "parts": [
          {
            "inlineData": {
              "mimeType": "image/png",
              "data": "iVBORw0KGgo..."
            }
          },
          {
            "text": "Here is your otter wearing a top hat!"
          }
        ]
      },
      "finishReason": "STOP"
    }
  ],
  "modelVersion": "gemini-2.5-flash-image",
  "usageMetadata": {
    "promptTokenCount": 12,
    "candidatesTokenCount": 1290,
    "totalTokenCount": 1302
  }
}
Gemini Error Format
This endpoint returns errors in Google API format, different from the OpenAI and Anthropic error formats used by other endpoints.
{
  "error": {
    "code": 400,
    "message": "body must be valid JSON",
    "status": "INVALID_ARGUMENT"
  }
}
See Errors for the full list of HTTP status codes and retry guidance.
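Client code can turn this error body into a single readable line. The sketch below relies only on the three fields shown in the sample; describe_gemini_error is a hypothetical helper, and the output format is an arbitrary choice.

```python
def describe_gemini_error(body: dict) -> str:
    """Format a Google-style error body as "<code> <status>: <message>"."""
    err = body.get("error") or {}
    return f"{err.get('code')} {err.get('status')}: {err.get('message')}"


sample_error = {
    "error": {
        "code": 400,
        "message": "body must be valid JSON",
        "status": "INVALID_ARGUMENT",
    }
}
print(describe_gemini_error(sample_error))  # 400 INVALID_ARGUMENT: body must be valid JSON
```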
OpenAI Images Interface
Generate images using OpenAI-compatible request and response shapes. This interface is different from Gemini native :generateContent and should be called via /v1/images/generations.
Available OpenAI Image Models
| Model ID | Description |
|---|---|
| openai/gpt-image-1 | OpenAI-compatible image generation model |
| openai/gpt-image-1.5 | OpenAI-compatible image generation model (v1.5) |
| openai/gpt-image-2 | OpenAI-compatible image generation model (v2) |
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Required | One of openai/gpt-image-1, openai/gpt-image-1.5, or openai/gpt-image-2. |
| prompt | string | Required | Text prompt describing the image to generate. |
| size | string | Optional | Output resolution, such as 1024x1024. |
| quality | string | Optional | Quality tier, such as medium. |
| output_compression | number | Optional | Output compression level (for supported formats). |
| output_format | string | Optional | Output image format, such as png or jpeg. |
| n | number | Optional | Number of images to generate. Commonly 1. |
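A request body for this interface can be built so that optional fields are sent only when set. This is a sketch under assumptions: images_request is a hypothetical helper, and rejecting unknown option names is a client-side choice, not documented server behavior.

```python
# Optional fields from the table above.
OPTIONAL_FIELDS = {"size", "quality", "output_compression", "output_format", "n"}


def images_request(model: str, prompt: str, **options) -> dict:
    """Build a /v1/images/generations body, skipping options left as None."""
    unknown = set(options) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    payload = {"model": model, "prompt": prompt}
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload


payload = images_request(
    "openai/gpt-image-2",
    "A cute baby otter wearing a tiny top hat, watercolor style",
    size="1024x1024",
    quality=None,  # left out of the request body
    n=1,
)
print(sorted(payload))  # ['model', 'n', 'prompt', 'size']
```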
Examples
This example uses openai/gpt-image-2 and returns base64 image data at data[0].b64_json.
curl -v https://api.pomex.ai/v1/images/generations \
  -H "Authorization: Bearer $POMEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-image-2",
    "prompt": "A cute baby otter wearing a tiny top hat, watercolor style",
    "size": "1024x1024",
    "quality": "medium",
    "output_compression": 100,
    "output_format": "png",
    "n": 1
  }' | jq -r '.data[0].b64_json' | base64 --decode > generated_image_gpt_image_2.png

import base64
import requests

response = requests.post(
    "https://api.pomex.ai/v1/images/generations",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    json={
        "model": "openai/gpt-image-2",
        "prompt": "A cute baby otter wearing a tiny top hat, watercolor style",
        "size": "1024x1024",
        "quality": "medium",
        "output_compression": 100,
        "output_format": "png",
        "n": 1,
    },
)
response.raise_for_status()
payload = response.json()
# The first generated image is returned as base64 at data[0].b64_json
img_b64 = payload["data"][0]["b64_json"]
img_bytes = base64.b64decode(img_b64)
with open("generated_image_gpt_image_2.png", "wb") as f:
    f.write(img_bytes)
print("Saved generated_image_gpt_image_2.png")
Sample Response
{
  "created": 1735689600,
  "data": [
    {
      "b64_json": "iVBORw0KGgoAAAANSUhEUgAA..."
    }
  ]
}
OpenAI Images Error Format
This interface returns OpenAI-style errors:
{
  "error": {
    "message": "Invalid value for 'model'",
    "type": "invalid_request_error",
    "param": "model",
    "code": "invalid_model"
  }
}
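Because the Gemini Native interface and this interface return differently shaped error bodies, a client that calls both may want one normalized view. The sketch below is an assumption-laden illustration: summarize_error is a hypothetical helper, and treating the presence of a status field as the marker of a Google-style error is a heuristic based only on the two samples on this page.

```python
def summarize_error(body: dict) -> dict:
    """Normalize a Google-style or OpenAI-style error body into one shape."""
    err = body.get("error") or {}
    if "status" in err:
        # Google-style: numeric code plus a symbolic status.
        return {"format": "google", "code": err.get("status"),
                "message": err.get("message"), "param": None}
    # OpenAI-style: string code, plus type and param.
    return {"format": "openai", "code": err.get("code"),
            "message": err.get("message"), "param": err.get("param")}


openai_error = {
    "error": {
        "message": "Invalid value for 'model'",
        "type": "invalid_request_error",
        "param": "model",
        "code": "invalid_model",
    }
}
google_error = {
    "error": {"code": 400, "message": "body must be valid JSON", "status": "INVALID_ARGUMENT"}
}
print(summarize_error(openai_error)["code"])  # invalid_model
print(summarize_error(google_error)["code"])  # INVALID_ARGUMENT
```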