Multi-turn image generation with gpt-image-2

半兽人   Posted: 2026-05-11   Last updated: 2026-05-15 19:55:35

Multi-turn image generation

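With the Responses API, you can build multi-turn conversations that involve image generation in two ways: by supplying the output of earlier image generation calls in the context (passing only the image ID is enough), or by using the previous_response_id parameter. This lets you iterate on an image across several turns, refining the prompt, applying new instructions, and improving the visual output as the conversation progresses.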

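With the image generation tool in the Responses API, supported models can choose either to generate a new image or to edit an image that is already in the conversation. The optional action parameter controls this behavior: set action to "auto" to let the model decide, to "generate" to always create a new image, or to "edit" to force an edit when an image is present in the context.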

from openai import OpenAI
import base64

client = OpenAI() 

response = client.responses.create(
    model="gpt-5.5",
    input="Generate an image of gray tabby cat hugging an otter with an orange scarf",
    tools=[{"type": "image_generation", "action": "generate"}],
)

# Save the image to a file
image_data = [
    output.result
    for output in response.output
    if output.type == "image_generation_call"
]

if image_data:
    image_base64 = image_data[0]
    with open("otter.png", "wb") as f:
        f.write(base64.b64decode(image_base64))

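If you force an edit without providing an image in the context, the call returns an error. Leave action set to "auto" to let the model decide when to generate and when to edit.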
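
One way to avoid the forced-edit error is to pick the action on the client side before making the call. The sketch below is only an illustration: `build_image_tool` and its `has_context_image` flag are hypothetical names introduced here, not part of the OpenAI SDK, and the only fields it relies on are the `type` and `action` values described above.

```python
VALID_ACTIONS = {"auto", "generate", "edit"}


def build_image_tool(action="auto", has_context_image=False):
    """Return an image_generation tool dict (hypothetical helper, not an SDK function)."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    # Forcing an edit with no image in the context makes the call return an error,
    # so fall back to "auto" and let the model decide.
    if action == "edit" and not has_context_image:
        action = "auto"
    return {"type": "image_generation", "action": action}


print(build_image_tool("edit"))
print(build_image_tool("edit", has_context_image=True))
```

The returned dict can be passed as the single element of the `tools` list in `client.responses.create(...)`. Falling back to "auto" is a design choice for this sketch; raising an exception instead would be equally reasonable if a silent downgrade is not wanted.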

Using a previous response ID

from openai import OpenAI
import base64

client = OpenAI()

response = client.responses.create(
    model="gpt-5.5",
    input="Generate an image of gray tabby cat hugging an otter with an orange scarf",
    tools=[{"type": "image_generation"}],
)

image_data = [
    output.result
    for output in response.output
    if output.type == "image_generation_call"
]

if image_data:
    image_base64 = image_data[0]

    with open("cat_and_otter.png", "wb") as f:
        f.write(base64.b64decode(image_base64))


# Follow up

response_fwup = client.responses.create(
    model="gpt-5.5",
    previous_response_id=response.id,
    input="Now make it look realistic",
    tools=[{"type": "image_generation"}],
)

image_data_fwup = [
    output.result
    for output in response_fwup.output
    if output.type == "image_generation_call"
]

if image_data_fwup:
    image_base64 = image_data_fwup[0]
    with open("cat_and_otter_realistic.png", "wb") as f:
        f.write(base64.b64decode(image_base64))
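
The two calls above repeat the same pattern, so a longer editing session can be written as a loop that passes each response's ID into the next request. The following is a minimal sketch under that assumption; `extract_images` and `refine` are hypothetical helper names, and `client` is expected to be an `OpenAI()` instance as in the example above.

```python
import base64


def extract_images(response):
    """Decode every image_generation_call result in a response into raw bytes."""
    return [
        base64.b64decode(item.result)
        for item in response.output
        if item.type == "image_generation_call"
    ]


def refine(client, prompts, model="gpt-5.5"):
    """Send prompts in order, linking each turn to the previous response."""
    previous_id = None
    images_per_turn = []
    for prompt in prompts:
        kwargs = {
            "model": model,
            "input": prompt,
            "tools": [{"type": "image_generation"}],
        }
        # The first turn has no predecessor; later turns chain to the last response.
        if previous_id is not None:
            kwargs["previous_response_id"] = previous_id
        response = client.responses.create(**kwargs)
        previous_id = response.id
        images_per_turn.append(extract_images(response))
    return images_per_turn
```

Called as `refine(client, ["Generate an image of ...", "Now make it look realistic"])`, it returns one list of decoded images per turn, which can then be written to files with `open(path, "wb")` as in the example above.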

Using an image ID

import openai
import base64

response = openai.responses.create(
    model="gpt-5.5",
    input="Generate an image of gray tabby cat hugging an otter with an orange scarf",
    tools=[{"type": "image_generation"}],
)

image_generation_calls = [
    output
    for output in response.output
    if output.type == "image_generation_call"
]

image_data = [output.result for output in image_generation_calls]

if image_data:
    image_base64 = image_data[0]

    with open("cat_and_otter.png", "wb") as f:
        f.write(base64.b64decode(image_base64))


# Follow up

response_fwup = openai.responses.create(
    model="gpt-5.5",
    input=[
        {
            "role": "user",
            "content": [{"type": "input_text", "text": "Now make it look realistic"}],
        },
        {
            "type": "image_generation_call",
            "id": image_generation_calls[0].id,
        },
    ],
    tools=[{"type": "image_generation"}],
)

image_data_fwup = [
    output.result
    for output in response_fwup.output
    if output.type == "image_generation_call"
]

if image_data_fwup:
    image_base64 = image_data_fwup[0]
    with open("cat_and_otter_realistic.png", "wb") as f:
        f.write(base64.b64decode(image_base64))
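
The follow-up request above is built from two parts: the new user message and a reference to the earlier image by its ID. As a rough sketch, that construction can be pulled into small helpers; `image_call_ids` and `followup_input` are hypothetical names introduced here, and the dict shapes simply mirror the example above.

```python
def image_call_ids(response):
    """Collect the IDs of all image_generation_call items in a response."""
    return [
        item.id
        for item in response.output
        if item.type == "image_generation_call"
    ]


def followup_input(text, image_id):
    """Build the input list for a follow-up turn that refers to an image by ID."""
    return [
        {"role": "user", "content": [{"type": "input_text", "text": text}]},
        {"type": "image_generation_call", "id": image_id},
    ]
```

With these, the follow-up call becomes `openai.responses.create(model="gpt-5.5", input=followup_input("Now make it look realistic", image_call_ids(response)[0]), tools=[{"type": "image_generation"}])`.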

Results

"Generate an image of a gray tabby cat hugging an otter with an orange scarf."

A cat and an otter

"Now make it look realistic."

A cat and an otter
