Sample: Mixed Text-and-Image Output from the Knowledge Base
Last updated: 2025.04.25 11:52:07 · First published: 2025.04.25 11:52:07

Overview

This document describes a best-practice approach for reproducing the console-style interleaved text-and-image answer rendering.

  • APIs involved: search_knowledge (new), chat_completions (new)
  • How it works:
    1. Through prompt tuning, have the LLM output the chunk id (point_id) of each cited reference and explicitly mark whether the corresponding image is suitable as an illustration
    2. Parse the model output, locate the reference tags, download and cache the corresponding images (see the sketch after this list), and render the matching style on the frontend
  • Limitations:
    1. A VLM (vision-language) model from the image-text understanding series is recommended; other models may reduce illustration accuracy
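
Since the attachment links returned by search_knowledge are temporary (valid for 10 minutes; see the response description below), a typical client downloads and caches each image before rendering. Below is a minimal sketch of that caching step; the cache_image helper name and the local-directory cache layout are assumptions for illustration, not part of the API.

import os
import requests

CACHE_DIR = "image_cache"  # hypothetical local cache directory

def cache_image(point_id: str, image_link: str) -> str:
    """Download a temporary attachment link and cache it locally.

    Returns the local file path. The link expires after 10 minutes,
    so this should run soon after search_knowledge returns.
    """
    os.makedirs(CACHE_DIR, exist_ok=True)
    local_path = os.path.join(CACHE_DIR, f"{point_id}.png")
    if not os.path.exists(local_path):
        resp = requests.get(image_link, timeout=10)
        resp.raise_for_status()
        with open(local_path, "wb") as f:
            f.write(resp.content)
    return local_path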

Prerequisites
  1. Complete account registration, real-name verification, AK/SK key acquisition, and signature generation as described on the "Signature Authentication and Invocation Examples" page. After creating a knowledge base and uploading documents, you can use the code below to run knowledge Q&A with answers rendered as interleaved text and images.

Function Descriptions
  1. Request preparation

Build the HTTP request and sign it to guarantee that the request is valid and secure (the imports and global variables used by these snippets are defined in the complete sample below).

def prepare_request(method, path, params=None, data=None, doseq=0):
    if params:
        for key in params:
            if (
                    isinstance(params[key], int)
                    or isinstance(params[key], float)
                    or isinstance(params[key], bool)
            ):
                params[key] = str(params[key])
            elif isinstance(params[key], list):
                if not doseq:
                    params[key] = ",".join(params[key])
    r = Request()
    r.set_schema("http")
    r.set_method(method)
    r.set_connection_timeout(10)
    r.set_socket_timeout(10)
    headers = {
        "Accept": "application/json",
        "Content-Type": "application/json; charset=utf-8",
        "Host": g_knowledge_base_domain,
        "V-Account-Id": account_id,
    }
    r.set_headers(headers)
    if params:
        r.set_query(params)
    r.set_host(g_knowledge_base_domain)
    r.set_path(path)
    if data is not None:
        r.set_body(json.dumps(data))

    # Generate the signature
    credentials = Credentials(ak, sk, "air", "cn-north-1")
    SignerV4.sign(r, credentials)
    return r
  2. Knowledge retrieval: send a query to the knowledge base service to fetch reference material relevant to the user's question. You must set the get_attachment_link parameter to True.
def search_knowledge():
    method = "POST"
    path = "/api/knowledge/collection/search_knowledge"
    request_params = {
        "project": "default",
        "name": name,
        "query": query,
        "limit": 10,
        "pre_processing": {
            "need_instruction": True,
            "return_token_usage": True,
            "messages": [
                {
                    "role": "system",
                    "content": ""
                },
                {
                    "role": "user",
                    "content": query
                }
            ],
            "rewrite": False
        },
        "dense_weight": 0.5,
        "post_processing": {
            "get_attachment_link": True,
            "rerank_only_chunk": False,
            "rerank_switch": False,
            "chunk_group": True,
            "chunk_diffusion_count": 0
        }
    }
    info_req = prepare_request(method=method, path=path, data=request_params)
    rsp = requests.request(
        method=info_req.method,
        url="http://{}{}".format(g_knowledge_base_domain, info_req.path),
        headers=info_req.headers,
        data=info_req.body
    )
    # Optionally print the search_knowledge response to verify that the retrieval results meet expectations
    # print("search res = {}".format(rsp.text))  
    return rsp.text
  3. Prompt generation: following the knowledge base's placeholder rules, splice the retrieval results into the base prompt to build the full system prompt for the model (the helpers is_vision_model and get_content_for_prompt and the base_prompt template used below are defined in the complete sample further down).
def generate_prompt(rsp_txt):
    rsp = json.loads(rsp_txt)
    if rsp["code"] != 0:
        return "", []
    prompt = ""
    image_urls = []
    rsp_data = rsp["data"]
    points = rsp_data["result_list"]
    using_vlm = is_vision_model("Doubao-1-5-vision-pro-32k")
    image_cnt = 0

    for point in points:
        # Extract the image link
        if using_vlm and "chunk_attachment" in point:
            image_link = point["chunk_attachment"][0]["link"]
            if image_link:
                image_urls.append(image_link)
                image_cnt += 1
        # Append the system fields first
        doc_info = point["doc_info"]
        for system_field in ["doc_name", "title", "chunk_title", "content", "point_id"]:
            if system_field == 'doc_name' or system_field == 'title':
                if system_field in doc_info:
                    prompt += f"{system_field}: {doc_info[system_field]}\n"
            else:
                if system_field in point:
                    if system_field == "content":
                        prompt += f"content: {get_content_for_prompt(point, image_cnt)}\n"
                    else:
                        prompt += f"{system_field}: {point[system_field]}\n"
        if "table_chunk_fields" in point:
            table_chunk_fields = point["table_chunk_fields"]
        for self_field in []:  # fill in your custom table field names here, if any
                # Use next() to find the first matching entry in table_chunk_fields
                find_one = next((item for item in table_chunk_fields if item["field_name"] == self_field), None)
                if find_one:
                    prompt += f"{self_field}: {find_one['field_value']}\n"

        prompt += "---\n"

    return base_prompt.format(prompt), image_urls
  4. Answer generation: call the chat_completions API to request the VLM model and generate an answer based on the user's question and the reference material.
def chat_completion(message, stream=False, return_token_usage=True, temperature=0.7, max_tokens=4096):
    method = "POST"
    path = "/api/knowledge/chat/completions"
    request_params = {
        "messages": message,
        "stream": stream,
        "return_token_usage": return_token_usage,
        "model": "Doubao-1-5-vision-pro-32k",
        "max_tokens": max_tokens,
        "temperature": temperature,
        "model_version": "250115"
    }

    info_req = prepare_request(method=method, path=path, data=request_params)
    rsp = requests.request(
        method=info_req.method,
        url="http://{}{}".format(g_knowledge_base_domain, info_req.path),
        headers=info_req.headers,
        data=info_req.body
    )
    rsp.encoding = "utf-8"
    return rsp.text  # Return the response text
  5. Raw answer post-processing: extract the <reference> tags from the raw answer and use their data-ref and data-img-ref attributes to match suitable illustration image URLs in the search_knowledge response. The concrete steps:
    1. Parse the response data: the function receives two key parameters, the model's response text (response_text) and the search result text (search_result_text), and parses both JSON strings into Python dictionaries
    2. Extract the generated answer: pull the answer text from response_data; it may contain reference tags of the form <reference data-ref="XXXX" data-img-ref="true"></reference>
    3. Capture the point_id and the illustration flag with a regular expression:
      1. Use the pattern r'<reference data-ref="([^"]+)" data-img-ref="([^"]+)"></reference>'
      2. This pattern captures two key pieces of information:
        1. The first capture group (([^"]+)) extracts the data-ref attribute value, i.e. the point_id
        2. The second capture group (([^"]+)) extracts the data-img-ref attribute value, which indicates whether the chunk's image is suitable as an illustration
    4. Locating and extracting the image link:
      1. Check that img_ref is True, confirming this is an illustration reference
      2. Use the point_id to look up the corresponding chunk object in point_map
      3. Check that the chunk contains a "chunk_attachment" field
      4. Iterate over the chunk_attachment list looking for an attachment with a "link" field
      5. Extract image_link from the attachment; this is the image URL
def process_chat_completion_response(response_text, search_result_text):
    """处理chat completion的返回结果,提取生成的答案并处理引用标签"""
    response_data = json.loads(response_text)
    
    if response_data["code"] != 0:
        return "Sorry, an error occurred while processing the answer."

    # Validate the returned data structure
    if "data" not in response_data or not response_data["data"]:
        return "Sorry, the returned data format is invalid."

    # Extract the generated answer according to the actual data structure
    try:
        generated_answer = None
        if "choices" in response_data["data"]:
            generated_answer = response_data["data"]["choices"][0]["message"]["content"]
        elif "message" in response_data["data"]:
            generated_answer = response_data["data"]["message"]["content"]
        elif "generated_answer" in response_data["data"]:
            generated_answer = response_data["data"]["generated_answer"]
        else:
            return "Sorry, the answer could not be extracted from the returned data."
    except (KeyError, IndexError) as e:
        print("Error extracting answer:", str(e))
        return "Sorry, an error occurred while extracting the answer."

    # Parse the reference tags
    import re
    reference_pattern = r'<reference data-ref="([^"]+)" data-img-ref="([^"]+)"></reference>'

    # Parse the search result to collect all chunk info
    search_result = json.loads(search_result_text)
    points = []
    if search_result["code"] == 0 and "data" in search_result and "result_list" in search_result["data"]:
        points = search_result["data"]["result_list"]

    # Build a point_id-to-chunk map for fast lookup
    point_map = {point.get("point_id"): point for point in points if "point_id" in point}

    def replace_reference(match):
        point_id = match.group(1)
        img_ref = match.group(2).lower() == 'true'

        # Only process the image when data-img-ref is true
        if img_ref and point_id in point_map:
            point = point_map[point_id]
            if "chunk_attachment" in point and point["chunk_attachment"]:
                for attachment in point["chunk_attachment"]:
                    if "link" in attachment:
                        image_link = attachment["link"]
                        # Convert to a Markdown image link (optional); this emits the image reference in Markdown
                        return f'![image]({image_link})'

        # If the conditions are not met or no image is found, return an empty string
        return ''

    # Replace all reference tags
    processed_answer = re.sub(reference_pattern, replace_reference, generated_answer)

    # Update generated_answer in the original response
    if "generated_answer" in response_data["data"]:
        response_data["data"]["generated_answer"] = processed_answer
    elif "choices" in response_data["data"]:
        response_data["data"]["choices"][0]["message"]["content"] = processed_answer
    elif "message" in response_data["data"]:
        response_data["data"]["message"]["content"] = processed_answer

    # Return the full, modified response
    return json.dumps(response_data, ensure_ascii=False, indent=2)
  6. Main flow control: chain knowledge retrieval, prompt generation, model answering, and answer post-processing into a complete Q&A pipeline that supports interleaved text-and-image output.
def search_knowledge_and_chat_completion():
    # 1. Run search_knowledge
    search_result_text = search_knowledge()
    # 2. Generate the prompt
    prompt, image_urls = generate_prompt(search_result_text)
    # TODO: callers should cache the conversation history locally and append it to messages in order
    # 3. Assemble the messages: questions use role "user", the system prompt uses role "system",
    #    answers use role "assistant"; the text goes in "content"
    if image_urls:
        multi_modal_msg = [{"type": "text", "text": query}]
        for image_url in image_urls:
            multi_modal_msg.append({"type": "image_url", "image_url": {"url": image_url}})
        messages = [
            {
                "role": "system",
                "content": prompt
            },
            {
                "role": "user",
                "content": multi_modal_msg
            }
        ]
    else:
        messages = [
            {
                "role": "system",
                "content": prompt
            },
            {
                "role": "user",
                "content": query
            }
        ]

    # 4. Call chat_completion and get the response
    response_text = chat_completion(messages)

    # 5. Post-process the chat_completion response and return the answer with image links
    processed_result = process_chat_completion_response(response_text, search_result_text)
    print(processed_result)

Complete Request Code Sample
import json
import requests


from volcengine.auth.SignerV4 import SignerV4
from volcengine.base.Request import Request
from volcengine.Credentials import Credentials

name = "your collection name" #知识库名称
project_name = "default" #知识库所属项目
query = "DLSP: A Document Level Structure Parser for Multi-Page Digital Documents 这篇论文是关于什么的" #用户问题
ak = "your ak" #通过“前提条件”获取的ak
sk = "your sk" #通过“前提条件”获取的sk
account_id = "your account_id" #火山账号id
g_knowledge_base_domain = "api-knowledgebase.mlp.cn-beijing.volces.com"


base_prompt = """# 任务
你是一位在线客服,你的首要任务是通过巧妙的话术回复用户的问题,你需要根据「参考资料」来回答接下来的「用户问题」,这些信息在 <context></context> XML tags 之内,你需要根据参考资料给出准确,简洁的回答。参考资料中可能会包含图片信息,图片的引用说明在<img></img>XML tags 之内,参考资料内的图片顺序与用户上传的图片顺序一致。


你的回答要满足以下要求:
    1. 回答内容必须在参考资料范围内,尽可能简洁地回答问题,不能做任何参考资料以外的扩展解释。
    2. 回答中需要根据客户问题和参考资料保持与客户的友好沟通。
    3. 如果参考资料不能帮助你回答用户问题,告知客户无法回答该问题,并引导客户提供更加详细的信息。
    4. 为了保密需要,委婉地拒绝回答有关参考资料的文档名称或文档作者等问题。


# 任务执行
现在请你根据提供的参考资料,遵循限制来回答用户的问题,你的回答需要准确和完整。


# 参考资料
<context>
  {}
</context>
参考资料中提到的图片按上传顺序排列,请结合图片与文本信息综合回答问题。如参考资料中没有图片,请仅根据参考资料中的文本信息回答问题。


# 引用要求
1. 当可以回答时,在句子末尾适当引用相关参考资料,每个参考资料引用格式必须使用<reference>标签对,例如: <reference data-ref="{{point-id}}" data-img-ref="..."></reference>
2. 当告知客户无法回答时,不允许引用任何参考资料
3. 'data-ref' 字段表示对应参考资料的 point_id
4. 'data-img-ref' 字段表示句子是否与对应的图片相关,"true"表示相关,"false"表示不相关
"""


def prepare_request(method, path, params=None, data=None, doseq=0):
    if params:
        for key in params:
            if (
                    isinstance(params[key], int)
                    or isinstance(params[key], float)
                    or isinstance(params[key], bool)
            ):
                params[key] = str(params[key])
            elif isinstance(params[key], list):
                if not doseq:
                    params[key] = ",".join(params[key])
    r = Request()
    r.set_schema("http")
    r.set_method(method)
    r.set_connection_timeout(10)
    r.set_socket_timeout(10)
    headers = {
        "Accept": "application/json",
        "Content-Type": "application/json; charset=utf-8",
        "Host": g_knowledge_base_domain,
        "V-Account-Id": account_id,
    }
    r.set_headers(headers)
    if params:
        r.set_query(params)
    r.set_host(g_knowledge_base_domain)
    r.set_path(path)
    if data is not None:
        r.set_body(json.dumps(data))


    # Generate the signature
    credentials = Credentials(ak, sk, "air", "cn-north-1")
    SignerV4.sign(r, credentials)
    return r



def search_knowledge():
    method = "POST"
    path = "/api/knowledge/collection/search_knowledge"
    request_params = {
        "project": "default",
        "name": name,
        "query": query,
        "limit": 10,
        "pre_processing": {
            "need_instruction": True,
            "return_token_usage": True,
            "messages": [
                {
                    "role": "system",
                    "content": ""
                },
                {
                    "role": "user",
                    "content": query
                }
            ],
            "rewrite": False
        },
        "dense_weight": 0.5,
        "post_processing": {
            "get_attachment_link": True,
            "rerank_only_chunk": False,
            "rerank_switch": False,
            "chunk_group": True,
            "chunk_diffusion_count": 0
        }
    }
    info_req = prepare_request(method=method, path=path, data=request_params)
    rsp = requests.request(
        method=info_req.method,
        url="http://{}{}".format(g_knowledge_base_domain, info_req.path),
        headers=info_req.headers,
        data=info_req.body
    )
    # print("search res = {}".format(rsp.text))  
    return rsp.text

def is_vision_model(model_name):
    if model_name is None:
        return False
    return "vision" in model_name
    
def get_content_for_prompt(point: dict, image_num: int) -> str:
    content = point["content"]
    original_question = point.get("original_question")
    if original_question:
        # FAQ recall scenario: content holds only the answer, so prepend the original question
        return f'When a similar question is asked, answer with the corresponding answer. Question: "{original_question}". Answer: "{content}"'
    if image_num > 0 and "chunk_attachment" in point and point["chunk_attachment"][0]["link"]:
        placeholder = f"<img>Image {image_num}</img>"
        return content + placeholder
    return content


def generate_prompt(rsp_txt):
    rsp = json.loads(rsp_txt)
    if rsp["code"] != 0:
        return "", []
    prompt = ""
    image_urls = []
    rsp_data = rsp["data"]
    points = rsp_data["result_list"]
    using_vlm = is_vision_model("Doubao-1-5-vision-pro-32k")
    image_cnt = 0


    for point in points:
        # Extract the image link
        if using_vlm and "chunk_attachment" in point:
            image_link = point["chunk_attachment"][0]["link"]
            if image_link:
                image_urls.append(image_link)
                image_cnt += 1
        # Append the system fields first
        doc_info = point["doc_info"]
        for system_field in ["doc_name", "title", "chunk_title", "content", "point_id"]:
            if system_field == 'doc_name' or system_field == 'title':
                if system_field in doc_info:
                    prompt += f"{system_field}: {doc_info[system_field]}\n"
            else:
                if system_field in point:
                    if system_field == "content":
                        prompt += f"content: {get_content_for_prompt(point, image_cnt)}\n"
                    else:
                        prompt += f"{system_field}: {point[system_field]}\n"
        if "table_chunk_fields" in point:
            table_chunk_fields = point["table_chunk_fields"]
        for self_field in []:  # fill in your custom table field names here, if any
                # Use next() to find the first matching entry in table_chunk_fields
                find_one = next((item for item in table_chunk_fields if item["field_name"] == self_field), None)
                if find_one:
                    prompt += f"{self_field}: {find_one['field_value']}\n"


        prompt += "---\n"


    return base_prompt.format(prompt), image_urls


def chat_completion(message, stream=False, return_token_usage=True, temperature=0.7, max_tokens=4096):
    method = "POST"
    path = "/api/knowledge/chat/completions"
    request_params = {
        "messages": message,
        "stream": stream,
        "return_token_usage": return_token_usage,
        "model": "Doubao-1-5-vision-pro-32k",
        "max_tokens": max_tokens,
        "temperature": temperature,
        "model_version": "250115"
    }


    info_req = prepare_request(method=method, path=path, data=request_params)
    rsp = requests.request(
        method=info_req.method,
        url="http://{}{}".format(g_knowledge_base_domain, info_req.path),
        headers=info_req.headers,
        data=info_req.body
    )
    rsp.encoding = "utf-8"
    return rsp.text  # Return the response text


def process_chat_completion_response(response_text, search_result_text):
    """处理chat completion的返回结果,提取生成的答案并处理引用标签"""
    response_data = json.loads(response_text)


    if response_data["code"] != 0:
        return "Sorry, an error occurred while processing the answer."

    # Validate the returned data structure
    if "data" not in response_data or not response_data["data"]:
        return "Sorry, the returned data format is invalid."

    # Extract the generated answer according to the actual data structure
    try:
        generated_answer = None
        if "choices" in response_data["data"]:
            generated_answer = response_data["data"]["choices"][0]["message"]["content"]
        elif "message" in response_data["data"]:
            generated_answer = response_data["data"]["message"]["content"]
        elif "generated_answer" in response_data["data"]:
            generated_answer = response_data["data"]["generated_answer"]
        else:
            return "Sorry, the answer could not be extracted from the returned data."
    except (KeyError, IndexError) as e:
        print("Error extracting answer:", str(e))
        return "Sorry, an error occurred while extracting the answer."


    # Parse the reference tags
    import re
    reference_pattern = r'<reference data-ref="([^"]+)" data-img-ref="([^"]+)"></reference>'

    # Parse the search result to collect all chunk info
    search_result = json.loads(search_result_text)
    points = []
    if search_result["code"] == 0 and "data" in search_result and "result_list" in search_result["data"]:
        points = search_result["data"]["result_list"]

    # Build a point_id-to-chunk map for fast lookup
    point_map = {point.get("point_id"): point for point in points if "point_id" in point}


    def replace_reference(match):
        point_id = match.group(1)
        img_ref = match.group(2).lower() == 'true'


        # Only process the image when data-img-ref is true
        if img_ref and point_id in point_map:
            point = point_map[point_id]
            if "chunk_attachment" in point and point["chunk_attachment"]:
                for attachment in point["chunk_attachment"]:
                    if "link" in attachment:
                        image_link = attachment["link"]
                        # Convert to a Markdown image link
                        return f'![image]({image_link})'

        # If the conditions are not met or no image is found, return an empty string
        return ''


    # Replace all reference tags
    processed_answer = re.sub(reference_pattern, replace_reference, generated_answer)


    # Update generated_answer in the original response
    if "generated_answer" in response_data["data"]:
        response_data["data"]["generated_answer"] = processed_answer
    elif "choices" in response_data["data"]:
        response_data["data"]["choices"][0]["message"]["content"] = processed_answer
    elif "message" in response_data["data"]:
        response_data["data"]["message"]["content"] = processed_answer


    # Return the full, modified response
    return json.dumps(response_data, ensure_ascii=False, indent=2)


def search_knowledge_and_chat_completion():
    # 1. Run search_knowledge
    search_result_text = search_knowledge()
    # 2. Generate the prompt
    prompt, image_urls = generate_prompt(search_result_text)
    # TODO: callers should cache the conversation history locally and append it to messages in order
    # 3. Assemble the messages: questions use role "user", the system prompt uses role "system",
    #    answers use role "assistant"; the text goes in "content"
    if image_urls:
        multi_modal_msg = [{"type": "text", "text": query}]
        for image_url in image_urls:
            multi_modal_msg.append({"type": "image_url", "image_url": {"url": image_url}})
        messages = [
            {
                "role": "system",
                "content": prompt
            },
            {
                "role": "user",
                "content": multi_modal_msg
            }
        ]
    else:
        messages = [
            {
                "role": "system",
                "content": prompt
            },
            {
                "role": "user",
                "content": query
            }
        ]


    # 4. Call chat_completion and get the response
    response_text = chat_completion(messages)


    # 5. Post-process the chat_completion response and return the answer with image links
    processed_result = process_chat_completion_response(response_text, search_result_text)
    print(processed_result)


if __name__ == "__main__":
    search_knowledge_and_chat_completion()

search_knowledge Response Description

search_knowledge returns the chunks most similar to the query, together with their content. If a chunk contains an image and you set the get_attachment_link parameter of search_knowledge to True, the response provides a temporary download link for the image in the link field under chunk_attachment; note that the link is only valid for 10 minutes.

{
    "code": 0,
    "data": {
        "collection_name": "test_pdf_zy",
        "count": 10,
        "token_usage": {
            "embedding_token_usage": {
                "prompt_tokens": 35,
                "completion_tokens": 0,
                "total_tokens": 35
            },
            "rerank_token_usage": 0
        },
        "result_list": [
            {
                "id": "_sys_auto_gen_doc_id-14474546439936949726-0",
                "content": "Seed-Thinking-v1.5: Advancing Superb Reasoning Models with Reinforcement Learning\nByteDance Seed\nFull author list in Contributions\nAbstract\nWe introduce Seed-Thinking-v1.5, capable of reasoning through thinking before responding, resulting in improved performance on a wide range of benchmarks. Seed-Thinking-v1.5 achieves 86.7 on AIME 2024, 55.0 on Codeforces and 77.3 on GPQA, demonstrating excellent reasoning abilities in STEM and coding. Beyond reasoning tasks, the method demonstrates notable generalization across diverse domains. For instance, it surpasses DeepSeek R1 by 8% in win rate on non-reasoning tasks, indicating its broader applicability. Compared to other state-of-the-art reasoning models, Seed-Thinking-v1.5 is a Mixture-of-Experts (MoE) model with a relatively small size, featuring 20B activated and 200B total parameters. As part of our effort to assess generalized reasoning, we develop two internal benchmarks, BeyondAIME and Codeforces, both of which will be publicly released to support future research.\n[Date: April 10, 2025\nFigure 1 Benchmark performance on reasoning tasks]Seed-Thinking-v1.5 DeepSeek R1 Gemini 2.5 pro OpenAI o3-mini-high 100 92.0 87.3 86.7 86.7 86.5 84.0 79.8 79.7 80 77.3 74.0 71.5 67.5 65.0 63.8 63.6 58.8 60 Accuracy (%) 56.3 55.0 49.2 49.3 48.0 47.0 45.0 42.4 39.9 40 27.6 25.8 20 18.3 AIME 2024 AIME 2025 Codeforces SWE-bench GPQA Diamond ARC-AGI Beyond AIME",
                "score": 0.4485875368118286,
                "point_id": "_sys_auto_gen_doc_id-14474546439936949726-0",
                "chunk_title": "Seed-Thinking-v1.5: Advancing Superb Reasoning Models with Reinforcement LearningAbstractDate: April 10, 2025\nFigure 1 Benchmark performance on reasoning tasks",
                "chunk_id": 0,
                "process_time": 1744807812,
                "doc_info": {
                    "doc_id": "_sys_auto_gen_doc_id-14474546439936949726",
                    "doc_name": "seed-thinking-v1.5.pdf",
                    "create_time": 1744807788,
                    "doc_type": "pdf",
                    "doc_meta": "[{\"field_name\":\"doc_id\",\"field_type\":\"string\",\"field_value\":\"_sys_auto_gen_doc_id-14474546439936949726\"}]",
                    "source": "tos_fe",
                    "title": "Seed-Thinking-v1.5: Advancing Superb Reasoning Models with Reinforcement Learning"
                },
                "recall_position": 2,
                "chunk_type": "image",
                "chunk_source": "document",
                "update_time": 1744807812,
                "chunk_attachment": [
                    {
                        "uuid": "c68dede9-c362-4fa9-b254-666893a92b33",
                        "caption": "Date: April 10, 2025\nFigure 1 Benchmark performance on reasoning tasks",
                        "type": "image",
                        "link": "https://knowledgebase-image.tos-cn-beijing.volces.com/c68dede9-c362-****-b254-666893a92b33?X-Tos-Algorithm=TOS4-HMAC-SHA256\u0026X-Tos-Credential=AKLTYzA0NGUwNTUzMjlhNGZjN2E5ZTQwYjlmZWJiYTMwODM%2F20250424%2Fcn-beijing%2Ftos%2Frequest\u0026X-Tos-Date=20250424T035401Z\u0026X-Tos-Expires=600\u0026X-Tos-Signature=****a4d950b661cb4dcc2e726146730eb817421bd95891e646de3caf4e0bddbc\u0026X-Tos-SignedHeaders=host"
                    }
                ],
                "FromRecallQueue": "master_recall",
                "original_coordinate": {
                    "page_no": [
                        0,
                        0,
                        0,
                        0,
                        0,
                        0
                    ],
                    "bbox": [
                        [
                            0.1568839869281046,
                            0.13181597727272726,
                            0.8431165032679738,
                            0.17835829545454546
                        ],
                        [
                            0.4337450980392157,
                            0.2216444696969697,
                            0.5670680555555556,
                            0.2342235732323232
                        ],
                        [
                            0.3855555392156863,
                            0.24445758838383838,
                            0.6152536764705883,
                            0.25703661616161616
                        ],
                        [
                            0.4569754901960784,
                            0.303469797979798,
                            0.5430257026143791,
                            0.3185647474747475
                        ],
                        [
                            0.15502287581699345,
                            0.3281949747474748,
                            0.8464071241830066,
                            0.4766287878787879
                        ],
                        [
                            0.12087380652334176,
                            0.5504872485844776,
                            0.8730337105545343,
                            0.89134986954506
                        ]
                    ]
                }
            },
# The remaining nine retrieved chunks are omitted
... ...
    },
    "message": "success",
    "request_id": "02174546684150800000000000000000000ffff0a00580c55b970"
}
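
For reference, here is a minimal sketch that walks such a response and collects the temporary attachment links (with their point_id and caption) before they expire; the parse_attachment_links helper name is illustrative, and json is assumed to be imported as in the sample above.

def parse_attachment_links(search_result_text):
    """Collect (point_id, caption, link) triples from a search_knowledge response."""
    result = json.loads(search_result_text)
    links = []
    if result.get("code") != 0:
        return links
    for point in result.get("data", {}).get("result_list", []):
        for attachment in point.get("chunk_attachment", []):
            if attachment.get("link"):
                links.append((point.get("point_id"),
                              attachment.get("caption", ""),
                              attachment["link"]))
    return links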

chat_completions Response Description

In this sample code, the generated_answer field of the raw chat_completions response uses <reference> tags to cite the image sources the generated answer drew on.

  • A complete <reference> tag looks like: <reference data-ref="_sys_auto_gen_doc_id-426769*****41885639-0" data-img-ref="true"></reference> (in the raw JSON the angle brackets appear escaped as \u003c and \u003e)
  • data-ref="_sys_auto_gen_doc_id-42676952*****885639-0": the unique identifier (point_id) of the cited chunk; use it to identify and look up the corresponding chunk.
  • data-img-ref="true/false": indicates whether the answer content is related to the image. If true, the image in that chunk is relevant to the answer and should be processed and rendered; if false, the model judged the image in that chunk irrelevant to the answer, and no processing or rendering is needed.
{
    "code":0,
    "data":{
        "generated_answer":""Seed-Thinking-v1.5 是一种能够在回复前通过思考进行推理的模型,在广泛的基准测试中性能有所提升。\u003creference data-ref=\"_sys_auto_gen_doc_id-14474546439936949726-0\" data-img-ref=\"true\"\u003e\u003c/reference\u003e\u003creference data-ref=\"_sys_auto_gen_doc_id-14474546439936949726-1\" data-img-ref=\"false\"\u003e\u003c/reference\u003e。  ",
        "usage":"{
            \"prompt_tokens\":5841,
            \"completion_tokens\":94,
            \"total_tokens\":5935,
            \"prompt_tokens_details\":{\"cached_tokens\":0},
            \"completion_tokens_details\":{\"reasoning_tokens\":0}
            }
    \n"},
    "message":"success",
    "request_id":"02174531357890000000000000000000000ffff0a0040921bc0f9"
}
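
As a quick sanity check, the following minimal sketch runs the document's reference pattern over a decoded generated_answer like the one above (json.loads already turns the \u003c escapes back into angle brackets) and prints each captured point_id together with its illustration flag. The sample_answer string here is a translated stand-in for the sample response.

import re

sample_answer = (
    'Seed-Thinking-v1.5 is a model capable of reasoning through thinking before responding.'
    '<reference data-ref="_sys_auto_gen_doc_id-14474546439936949726-0" data-img-ref="true"></reference>'
    '<reference data-ref="_sys_auto_gen_doc_id-14474546439936949726-1" data-img-ref="false"></reference>'
)

pattern = r'<reference data-ref="([^"]+)" data-img-ref="([^"]+)"></reference>'
for point_id, img_ref in re.findall(pattern, sample_answer):
    # Prints each point_id with "true" or "false"; only "true" entries should be rendered as images
    print(point_id, img_ref)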