Customizing LLM Context - Real-Time Communication (RTC) - Volcengine

Customizing LLM Context

Last updated: 2025.04.09 13:59:42. First published: 2025.04.01 17:09:08.

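During a voice interaction between a user and an agent, you may need to pass in custom context dynamically so that the LLM understands the user's state more accurately and the conversation feels more natural. Custom context can include the user's real-time physiological data, game data, character settings, and similar information. It is passed to the LLM as a UserMessage according to the priority you set, and it takes effect in the current conversation turn.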

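Use cases

Scenario	Description	Example
Gaming companion	Generates suitable replies from the user's real-time game data, giving the user more attentive company.	Custom input: "The user's current score is 0-14 and they are behind on gold; give them item build suggestions." Agent output: "I can see you are playing from behind. I suggest building defensive items."
Health consultation assistant	Generates personalized advice from the user's real-time physiological data, making health guidance more precise.	Custom input: "The user's current heart rate is 130 and their face is pale; advise them to rest." Agent output: "Your heart rate has stayed elevated. I suggest pausing your workout and replenishing electrolytes."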

Prerequisites

You have built a complete AI application by following the Scenario Setup guide.

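Implementation

You can pass custom LLM context from either the server or the client. Choose the approach that matches where your business requests originate. For example, if your AI application handles requests on the server, pass the LLM context from the server as well to reduce request latency.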

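Passing LLM context from the server

Call the UpdateVoiceChat API and set the following parameters to pass custom context to the LLM:

Parameter	Type	Description
AppId	String	The AppId of your RTC application. See Obtaining the AppId.
RoomId	String	The custom session room ID.
TaskId	String	The custom session task ID.
Command	String	Set to `ExternalTextToLLM`, which indicates that custom context is being passed to the LLM.
Message	String	The custom context to pass to the LLM. Must not exceed 200 characters.
InterruptMode	Integer	Processing priority of the context passed to the LLM. `1`: high priority. The agent terminates the current interaction, sends the text to the LLM for processing, and outputs the result. `2`: medium priority. The agent waits until the current interaction ends, then sends the text to the LLM for processing and outputs the result. `3`: low priority. If the agent is currently interacting, it discards the text. If it is not interacting, it sends the text to the LLM for processing and outputs the result.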
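
The three InterruptMode values can be read as a small decision table. The sketch below is only an illustrative model of the behavior described above, not SDK or service code. The names `resolveAction` and `Action` are hypothetical, and the assumption that modes `1` and `2` are processed immediately when the agent is idle is inferred from the descriptions rather than stated in them.

```typescript
// Illustrative model of the InterruptMode semantics; not part of any SDK.
type InterruptMode = 1 | 2 | 3;
type Action =
  | 'interrupt-and-process'  // stop the current interaction, then process the text
  | 'process-after-current'  // wait for the current interaction to finish
  | 'process-now'            // agent is idle, process immediately
  | 'discard';               // drop the text

// What happens to the text when the agent is in the middle of an interaction.
const whileInteracting: Record<InterruptMode, Action> = {
  1: 'interrupt-and-process',
  2: 'process-after-current',
  3: 'discard',
};

function resolveAction(mode: InterruptMode, agentInteracting: boolean): Action {
  // Assumption: when the agent is idle, every mode is processed right away.
  return agentInteracting ? whileInteracting[mode] : 'process-now';
}
```

Under this model, low priority suits ambient updates that can safely be lost, while high priority suits events the agent should react to at once.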

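Refer to the following example to pass custom LLM context from the server:

POST https://rtc.volcengineapi.com?Action=UpdateVoiceChat&Version=2024-12-01
{
    "AppId": "661e****543cf",
    "RoomId": "Room1",
    "TaskId": "task1",
    "Command": "ExternalTextToLLM",
    "Message": "Give some item build suggestions. The user's current situation: behind on gold, playing a mage, high damage output, fragile",
    "InterruptMode": 2
}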
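
The request above can be assembled programmatically before it is handed to whatever HTTP client your server uses. The following is a minimal sketch, not an official client: `buildExternalTextRequest` and `ExternalTextRequest` are hypothetical names, the 200-character limit is assumed to be counted in Unicode code points, and any authentication or request signing that the endpoint requires is left out.

```typescript
// Shape of the UpdateVoiceChat body for the ExternalTextToLLM command.
interface ExternalTextRequest {
  AppId: string;
  RoomId: string;
  TaskId: string;
  Command: 'ExternalTextToLLM';
  Message: string;
  InterruptMode: 1 | 2 | 3;
}

const UPDATE_VOICE_CHAT_URL =
  'https://rtc.volcengineapi.com?Action=UpdateVoiceChat&Version=2024-12-01';
const MAX_MESSAGE_CHARS = 200;

function buildExternalTextRequest(
  appId: string,
  roomId: string,
  taskId: string,
  message: string,
  interruptMode: 1 | 2 | 3,
): { url: string; body: string } {
  // Assumption: the limit counts Unicode code points, not UTF-8 bytes.
  const charCount = Array.from(message).length;
  if (charCount > MAX_MESSAGE_CHARS) {
    throw new RangeError(`Message has ${charCount} characters, limit is ${MAX_MESSAGE_CHARS}`);
  }
  const payload: ExternalTextRequest = {
    AppId: appId,
    RoomId: roomId,
    TaskId: taskId,
    Command: 'ExternalTextToLLM',
    Message: message,
    InterruptMode: interruptMode,
  };
  // The caller sends this as a POST request; signing is not handled here.
  return { url: UPDATE_VOICE_CHAT_URL, body: JSON.stringify(payload) };
}
```

Rejecting over-long messages locally avoids a round trip for input that exceeds the documented limit.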

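Passing LLM context from the client

Use the SendUserBinaryMessage interface to pass custom context to the LLM. Its buffer parameter must carry content in a specific format: the magic_number field, followed by the length field, followed by the control_message field, as described in the following table.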

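Parameter	Type	Description
magic_number	binary	Message format identifier. For this scenario it is always `ctrl`, which marks the message as a control message.
length	binary	Length of the LLM context message in bytes, stored in big-endian order. It gives the byte length of the `control_message` field.
control_message	binary	The LLM context configuration, in JSON format. See the control_message format for details.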
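
To make the byte layout concrete, here is a minimal sketch that packs and unpacks such a buffer with `DataView`. It assumes a 4-byte magic number and a 4-byte length field, which matches what the sample code in this document writes. The names `encodeControlFrame` and `decodeControlFrame` are hypothetical, and the decoder exists only to check a buffer locally; it is not part of the RTC SDK.

```typescript
const MAGIC = 'ctrl';
const HEADER_BYTES = 8; // 4-byte magic number + 4-byte big-endian length

function encodeControlFrame(controlMessage: string): Uint8Array {
  const payload = new TextEncoder().encode(controlMessage); // UTF-8 bytes
  const frame = new Uint8Array(HEADER_BYTES + payload.length);
  for (let i = 0; i < MAGIC.length; i++) {
    frame[i] = MAGIC.charCodeAt(i);
  }
  // The third argument `false` selects big-endian byte order.
  new DataView(frame.buffer).setUint32(4, payload.length, false);
  frame.set(payload, HEADER_BYTES);
  return frame;
}

function decodeControlFrame(frame: Uint8Array): string {
  if (frame.length < HEADER_BYTES) {
    throw new Error('frame is shorter than the 8-byte header');
  }
  const magic = new TextDecoder().decode(frame.subarray(0, 4));
  if (magic !== MAGIC) {
    throw new Error(`unexpected magic number: ${magic}`);
  }
  const view = new DataView(frame.buffer, frame.byteOffset, frame.byteLength);
  const length = view.getUint32(4, false);
  if (frame.length !== HEADER_BYTES + length) {
    throw new Error('length field does not match the payload size');
  }
  return new TextDecoder().decode(frame.subarray(HEADER_BYTES));
}
```

For a 7-byte control message, the header bytes are `63 74 72 6c 00 00 00 07`. Note that the length field counts UTF-8 bytes, so it is larger than the character count whenever the JSON contains non-ASCII text.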

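control_message format:

Parameter	Type	Description
Command	String	The control command. Set to `ExternalTextToLLM`.
Message	String	The context to pass to the LLM. Must not exceed 200 characters.
InterruptMode	Int	Processing priority of the context passed to the LLM. `1`: high priority. The agent terminates the current interaction, sends the text to the LLM for processing, and outputs the result. `2`: medium priority. The agent waits until the current interaction ends, then sends the text to the LLM for processing and outputs the result. `3`: low priority. If the agent is currently interacting, it discards the text. If it is not interacting, it sends the text to the LLM for processing and outputs the result.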

Refer to the following examples to pass custom LLM context from the client. The examples are given in C++, Java, and TypeScript, in that order.

C++

#include <cstdint>
#include <cstring>
#include <memory>
#include <string>
#include <nlohmann/json.hpp>

// The functions below are assumed to be members of a class that holds rtcRoom_,
// the RTC room object that has already been joined.

// Pack the message as: magic number + 4-byte big-endian length + message bytes.
void buildBinaryMessage(const std::string& magic_number, const std::string& message, size_t& binary_message_length, std::shared_ptr<uint8_t[]>& binary_message) {
    const auto magic_number_length = magic_number.size();
    const auto message_length = static_cast<uint32_t>(message.size());

    binary_message_length = magic_number_length + 4 + message_length;
    binary_message = std::shared_ptr<uint8_t[]>(new uint8_t[binary_message_length]);
    std::memcpy(binary_message.get(), magic_number.data(), magic_number_length);
    // Write the length field, most significant byte first.
    binary_message[magic_number_length] = static_cast<uint8_t>((message_length >> 24) & 0xFF);
    binary_message[magic_number_length + 1] = static_cast<uint8_t>((message_length >> 16) & 0xFF);
    binary_message[magic_number_length + 2] = static_cast<uint8_t>((message_length >> 8) & 0xFF);
    binary_message[magic_number_length + 3] = static_cast<uint8_t>(message_length & 0xFF);
    std::memcpy(binary_message.get() + magic_number_length + 4, message.data(), message_length);
}

// Wrap the message in a "ctrl" control frame and send it to the given user.
// Returns -1 if the room has not been created.
int sendUserBinaryMessage(const std::string& uid, const std::string& message) {
    if (rtcRoom_ != nullptr) {
        size_t length = 0;
        std::shared_ptr<uint8_t[]> binary_message = nullptr;
        buildBinaryMessage("ctrl", message, length, binary_message);
        return rtcRoom_->sendUserBinaryMessage(uid.c_str(), static_cast<int>(length), binary_message.get());
    }
    return -1;
}

// Pass context information to the LLM.
void sendLLMMessage(const std::string& uid, const std::string& content) {
    nlohmann::json json_data;
    json_data["Command"] = "ExternalTextToLLM";
    json_data["Message"] = content;
    json_data["InterruptMode"] = 1; // Valid values: 1, 2, 3
    sendUserBinaryMessage(uid, json_data.dump());
}

Java

import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import org.json.JSONException;
import org.json.JSONObject;
// MessageConfig and the type of rtcRoom_ come from the RTC SDK.

// Pass context information to the LLM.
public void sendLLMMessage(String userId, String content) {
    JSONObject json = new JSONObject();
    try {
        json.put("Command", "ExternalTextToLLM");
        json.put("Message", content);
        json.put("InterruptMode", 1); // Valid values: 1, 2, 3
    } catch (JSONException e) {
        throw new RuntimeException(e);
    }
    String jsonString = json.toString();
    byte[] buildBinary = buildBinaryMessage("ctrl", jsonString);
    sendUserBinaryMessage(userId, buildBinary);
}

// Pack the message as: magic number + 4-byte big-endian length + content bytes.
private byte[] buildBinaryMessage(String magic_number, String content) {
    byte[] prefixBytes = magic_number.getBytes(StandardCharsets.UTF_8);
    byte[] contentBytes = content.getBytes(StandardCharsets.UTF_8);
    int contentLength = contentBytes.length;

    ByteBuffer buffer = ByteBuffer.allocate(prefixBytes.length + 4 + contentLength);
    buffer.order(ByteOrder.BIG_ENDIAN);
    buffer.put(prefixBytes);
    buffer.putInt(contentLength);
    buffer.put(contentBytes);
    return buffer.array();
}

// Send the packed buffer to the given user; does nothing if the room has not been created.
public void sendUserBinaryMessage(String userId, byte[] buffer) {
    if (rtcRoom_ != null) {
        rtcRoom_.sendUserBinaryMessage(userId, buffer, MessageConfig.RELIABLE_ORDERED);
    }
}

TypeScript

import VERTC from '@volcengine/rtc';
const BotName = 'RobotMan_';
const CommandKey = 'ctrl';
const engine = VERTC.createEngine('Your AppID');

/**
 * @brief Command types
 */
enum COMMAND {
  /**
   * @brief Pass context information to the LLM
   */
  EXTERNAL_TEXT_TO_LLM = 'ExternalTextToLLM',
}

/**
 * @brief Interrupt priority
 */
enum INTERRUPT_PRIORITY {
  /**
   * @brief Placeholder, so that HIGH starts at 1
   */
  NONE,
  /**
   * @brief High priority. The agent terminates the current interaction, sends the text to the LLM for processing, and outputs the result.
   */
  HIGH,
  /**
   * @brief Medium priority. The agent waits until the current interaction ends, then sends the text to the LLM for processing and outputs the result.
   */
  MEDIUM,
  /**
   * @brief Low priority. If the agent is currently interacting, it discards the text. If it is not interacting, it sends the text to the LLM for processing and outputs the result.
   */
  LOW,
}

/**
 * @brief Wrap a string as a TLV: 4-byte type, 4-byte big-endian length, then the UTF-8 value
 */
function stringToTLV(inputString: string, type = '') {
  // Type field: the first 4 characters of `type`, zero-padded if shorter.
  const typeBuffer = new Uint8Array(4);

  for (let i = 0; i < type.length && i < typeBuffer.length; i++) {
    typeBuffer[i] = type.charCodeAt(i);
  }

  const valueBuffer = new TextEncoder().encode(inputString);
  const valueLength = valueBuffer.length;

  const tlvBuffer = new Uint8Array(typeBuffer.length + 4 + valueLength);

  tlvBuffer.set(typeBuffer, 0);

  // Length field, most significant byte first.
  tlvBuffer[4] = (valueLength >>> 24) & 0xff;
  tlvBuffer[5] = (valueLength >>> 16) & 0xff;
  tlvBuffer[6] = (valueLength >>> 8) & 0xff;
  tlvBuffer[7] = valueLength & 0xff;

  tlvBuffer.set(valueBuffer, 8);

  return tlvBuffer.buffer;
}

/**
 * @brief Pass context information to the LLM
 */
engine.sendUserBinaryMessage(
  BotName,
  stringToTLV(
    JSON.stringify({
      Command: COMMAND.EXTERNAL_TEXT_TO_LLM,
      Message: "Give some item build suggestions. The user's current situation: behind on gold, playing a mage, high damage output, fragile",
      InterruptMode: INTERRUPT_PRIORITY.HIGH,
    }),
    CommandKey,
  )
);