GPT Image 生成

概述

OpenAI API 允许使用 GPT Image模型（包括最新的 gpt-image-2）根据文本提示生成和编辑图像。可以通过两个 API 访问图像生成功能：

图像 API

从 gpt-image-1 及更高版本的模型开始，图像 API 提供两个端点，每个端点都具有不同的功能：

生成：根据文本提示从头开始生成图像
编辑：使用新的提示修改现有图像，可以部分或全部修改

图像 API 还包含一个变体端点，适用于支持该功能的模型，例如 DALL·E 2。

Responses API

Responses API 允许你在对话或多步骤流程中生成图像。它将图像生成作为内置功能，并支持在上下文中处理图像输入和输出。

与 Image API 相比，它增加了以下功能：

多轮编辑：通过提示词对图像进行迭代式的高保真编辑
灵活的输入：不仅支持字节流，还支持将图像文件 ID 作为输入图像

Responses API 的图像生成工具使用其专有的 GPT Image 模型选择机制。有关支持调用此工具的主线模型的详细信息，请参阅下方的支持模型列表。

选择合适的 API

如果你只需根据一个提示生成或编辑单张图像，Image API 是你的最佳选择。
如果你希望使用 GPT Image 构建具有对话交互性和可编辑性的图像体验，请选择 Responses API。

这两款 API 均支持通过调整质量、尺寸、格式和压缩程度来自定义输出结果。透明背景功能取决于模型是否支持。

本指南重点介绍 GPT Image。

生成图片

可以使用图片生成端点根据文本提示创建图片，或者使用 Responses API 中的图片生成工具在对话过程中生成图片。

如需了解有关自定义输出（尺寸、质量、格式、压缩）的更多信息，请参阅下文的“自定义图片输出”部分。

可以设置 n 参数，以便在单次请求中一次性生成多张图片（默认情况下，API 返回单张图片）。

Image API

from openai import OpenAI
import base64
client = OpenAI()

prompt = """
A children's book drawing of a veterinarian using a stethoscope to 
listen to the heartbeat of a baby otter.
"""

result = client.images.generate(
    model="gpt-image-2",
    prompt=prompt
)

image_base64 = result.data[0].b64_json
image_bytes = base64.b64decode(image_base64)

# Save the image to a file
with open("otter.png", "wb") as f:
    f.write(image_bytes)

Responses API

from openai import OpenAI
import base64

client = OpenAI() 

response = client.responses.create(
    model="gpt-5.5",
    input="Generate an image of gray tabby cat hugging an otter with an orange scarf",
    tools=[{"type": "image_generation"}],
)

# Save the image to a file
image_data = [
    output.result
    for output in response.output
    if output.type == "image_generation_call"
]

if image_data:
    image_base64 = image_data[0]
    with open("otter.png", "wb") as f:
        f.write(base64.b64decode(image_base64))