Scenario Setup (Web)
Last updated: 2024.09.25 17:59:10    First published: 2024.07.17 15:35:25

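This article describes how to integrate the Volcengine RTC Web SDK and call the corresponding server-side APIs to implement real-time AI interaction.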

Overall business flow

The following diagram shows the implementation flow for real-time AI interaction:

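Prerequisites

Before you begin, see the "Enable Services" and "Obtain a Temporary Token" guides to activate the required services, configure policies, and obtain a token.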

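Core feature implementation

The following sections show how to integrate the RTC Web SDK and the server-side OpenAPI to join and leave a room, listen for callbacks, and call the agent to implement the AI call logic. For the complete sequence, see "API sequence diagram".

The sample code below is based on RTC Web SDK v4.58.9.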

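1. Preliminary definitions

Define an RtcClient class that manages RTC operations: creating the RTC engine, listening for events, managing audio devices, and controlling the agent.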

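import VERTC, {
  IRTCEngine,
  RoomProfileType,
  onUserJoinedEvent,
  onUserLeaveEvent,
  MediaType,
  StreamRemoveReason,
} from '@volcengine/rtc';
// openAPIs wraps your calls to the server-side OpenAPI. The path is a placeholder; implement the module to fit your business logic.
import openAPIs from '/path/openAPIs';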

/**
 * @brief Event Listeners
 */
export interface IEventListener {
  [VERTC.events.onUserJoined]: (e: onUserJoinedEvent) => void;
  [VERTC.events.onUserLeave]: (e: onUserLeaveEvent) => void;
  [VERTC.events.onUserPublishStream]: (e: { userId: string; mediaType: MediaType }) => void;
  [VERTC.events.onUserUnpublishStream]: (e: { userId: string; mediaType: MediaType; reason: StreamRemoveReason }) => void;
  [VERTC.events.onUserStartAudioCapture]: (e: { userId: string }) => void;
  [VERTC.events.onUserStopAudioCapture]: (e: { userId: string }) => void;
  [VERTC.events.onRoomBinaryMessageReceived]: (e: { userId: string; message: ArrayBuffer }) => void;
}

/**
 * @brief Basic options
 */
export interface BasicOptions {
  appId: string;
  token?: string;
  userId: string;
  roomId: string;
}

/**
 * @brief RTC Client
 */
export class RtcClient {
  
  /**
   * @brief RTC Engine
   */
  engine!: IRTCEngine;
  /**
   * @brief Basic configuration
   */
  config!: BasicOptions;
  /**
   * @brief Whether the AI bot is currently enabled
   */
  audioBotEnabled = false;

  /**
   * @brief Initialize the engine
   */
  createEngine = (props: BasicOptions) => {
    this.config = props;
    this.engine = VERTC.createEngine(this.config.appId);
  };

  /**
   * @brief Register event listeners
   */
  addEventListeners = (events: IEventListener) => {
    // Object.keys() returns string[], so narrow the keys before indexing into events.
    for (const event of Object.keys(events) as (keyof IEventListener)[]) {
      this.engine.on(event, events[event]);
    }
  };
  
  /**
   * @brief Get the IDs of the available audio capture devices
   */
  async getDevices(): Promise<{
    audioInputs: string[];
  }> {
    const audioInputs = await VERTC.enumerateAudioCaptureDevices();
    return {
      // Return device IDs, which is what startAudioCapture() takes.
      audioInputs: audioInputs.map((input) => input.deviceId),
    };
  }

  /**
   * @brief Start internal audio capture
   */
  startAudioCapture = async (mic: string) => {
    await this.engine.startAudioCapture(mic);
  };

  /**
   * @brief Stop internal audio capture
   */
  stopAudioCapture = async () => {
    await this.engine.stopAudioCapture();
  };

  /**
   * @brief Publish the stream
   */
  publishStream = (mediaType: MediaType) => {
    this.engine.publishStream(mediaType);
  };

  /**
   * @brief Stop publishing the stream
   */
  unpublishStream = (mediaType: MediaType) => {
    this.engine.unpublishStream(mediaType);
  };

  /**
   * @brief Join the room
   */
  joinRoom = ({ token, username }: {
    token: string;
    username?: string;
  }): Promise<void> => {
    return this.engine.joinRoom(
      token,
      this.config.roomId,
      {
        userId: this.config.userId,
        extraInfo: JSON.stringify({
          user_name: username || this.config.userId,
          user_id: this.config.userId,
        }),
      },
      {
        /** Optionally publish automatically after joining the room */
        isAutoPublish: true,
        isAutoSubscribeAudio: true,
        roomProfileType: RoomProfileType.chat,
      }
    );
  };

  /**
   * @brief Leave the room
   */
  leaveRoom = async () => {
    // Wait for the AI bot to stop before leaving the room.
    await this.stopAudioBot(this.config.roomId, this.config.userId);
    await this.engine.leaveRoom();
  };

  /**
   * @brief Start the AI bot
   */
  startAudioBot = async (
    roomId: string,
    userId: string,
    config: Record<string, unknown>,
  ) => {
    if (this.audioBotEnabled) {
      await this.stopAudioBot(roomId, userId);
    }
    /** Call the OpenAPI to start the AI bot. See the OpenAPI documentation for details. */
    await openAPIs.StartVoiceChat(config); // openAPIs wraps the calls to the server-side OpenAPI; implement it to fit your business logic.
    this.audioBotEnabled = true;
  };

  /**
   * @brief Stop the AI bot
   */
  stopAudioBot = async (roomId: string, userId: string) => {
    if (this.audioBotEnabled) {
      await openAPIs.StopVoiceChat({
        AppId: 'Your App ID',
        BusinessId: 'Your Business ID',
        RoomId: roomId,
        TaskId: userId,
      });
      this.audioBotEnabled = false;
    }
  };

  /**
   * @brief Interrupt the agent while it is speaking
   */
  stopAudioVoice = async (roomId: string, userId: string) => {
    if (this.audioBotEnabled) {
      const res = await openAPIs.UpdateVoiceChat({
        AppId: 'Your App ID',
        BusinessId: 'Your Business ID',
        RoomId: roomId,
        TaskId: userId,
        Command: 'interrupt',
      });
      return res;
    }
    // The agent is not running, so there is nothing to interrupt.
    return Promise.reject(new Error('Failed to interrupt the AI agent'));
  };
}

export default new RtcClient();

2. Initialize and listen for callbacks

Initialize the RTC engine and add the relevant event listeners.

import VERTC, {
    onUserJoinedEvent, onUserLeaveEvent,
    MediaType,
    StreamRemoveReason,
} from '@volcengine/rtc';
import RtcClient from '/path/RtcClient';

...

// createEngine() requires the basic options; replace the placeholders with your own values.
RtcClient.createEngine({
    appId: 'Your App ID',
    userId: 'Your user id',
    roomId: 'Your room id',
});
RtcClient.addEventListeners({
    [VERTC.events.onUserJoined]: (e: onUserJoinedEvent) => { ... },// A user joined the room
    [VERTC.events.onUserLeave]: (e: onUserLeaveEvent) => { ... },// A user left the room
    [VERTC.events.onUserPublishStream]: (e: { userId: string; mediaType: MediaType }) => { ... },// A user published an audio/video stream
    [VERTC.events.onUserUnpublishStream]: (e: { userId: string; mediaType: MediaType, reason: StreamRemoveReason }) => { ... },// A user stopped publishing an audio/video stream
    [VERTC.events.onUserStartAudioCapture]: (e: { userId: string }) => { ... },// A user started audio capture
    [VERTC.events.onUserStopAudioCapture]: (e: { userId: string }) => { ... },// A user stopped audio capture
    ...,
})
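
As one way to fill in the handler bodies elided above, the sketch below tracks which remote users are in the room, so the page can tell when the agent has joined or left. It is a minimal illustration, not SDK code: RemoteUserRegistry, UserEvent, and the bot user ID are hypothetical names, and the { userInfo: { userId } } event shape is an assumption about what onUserJoinedEvent and onUserLeaveEvent carry, so check it against the SDK type definitions.

```typescript
// Assumed minimal event shape; verify against onUserJoinedEvent / onUserLeaveEvent in the SDK.
interface UserEvent {
  userInfo: { userId: string };
}

// Hypothetical helper that tracks remote users and whether the agent is in the room.
class RemoteUserRegistry {
  private readonly users = new Set<string>();
  private readonly botUserId: string;

  constructor(botUserId: string) {
    this.botUserId = botUserId;
  }

  // Intended as the onUserJoined handler.
  handleUserJoined = (e: UserEvent): void => {
    this.users.add(e.userInfo.userId);
  };

  // Intended as the onUserLeave handler.
  handleUserLeave = (e: UserEvent): void => {
    this.users.delete(e.userInfo.userId);
  };

  // True while the agent's user ID is among the remote users.
  get isBotPresent(): boolean {
    return this.users.has(this.botUserId);
  }

  get remoteUserCount(): number {
    return this.users.size;
  }
}

const registry = new RemoteUserRegistry('bot-001');
registry.handleUserJoined({ userInfo: { userId: 'bot-001' } });
console.log(registry.isBotPresent); // true
registry.handleUserLeave({ userInfo: { userId: 'bot-001' } });
console.log(registry.isBotPresent); // false
```

The two handlers could then be passed as the onUserJoined and onUserLeave entries of addEventListeners.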

3. Join the room

The user joins the room and starts audio capture.

import RtcClient from '/path/RtcClient';

const token = 'Your token';
const username = 'Username';

...

// getDevices() resolves to an object, so destructure the device list from it.
const { audioInputs } = await RtcClient.getDevices();
const defaultUsableDevice = audioInputs[0];

await RtcClient.joinRoom({ token, username });
await RtcClient.startAudioCapture(defaultUsableDevice);
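
The snippet above takes the first device in the list and does not handle an empty list. A slightly more defensive selection might look like the following sketch. pickMicrophone and preferredId are hypothetical names, and preferring an entry whose ID is "default" is an assumption based on how some browsers label the system default device.

```typescript
// Hypothetical helper: choose a microphone device ID from the list returned by getDevices().
function pickMicrophone(deviceIds: string[], preferredId?: string): string {
  const first = deviceIds[0];
  if (first === undefined) {
    throw new Error('No audio capture device is available');
  }
  // 1. Reuse the device the user chose previously, if it is still in the list.
  if (preferredId !== undefined && deviceIds.indexOf(preferredId) !== -1) {
    return preferredId;
  }
  // 2. Assumption: some browsers expose the system default device under the ID "default".
  if (deviceIds.indexOf('default') !== -1) {
    return 'default';
  }
  // 3. Otherwise fall back to the first device in the list.
  return first;
}

console.log(pickMicrophone(['mic-a', 'default', 'mic-b'])); // default
console.log(pickMicrophone(['mic-a', 'mic-b'], 'mic-b'));   // mic-b
console.log(pickMicrophone(['mic-a', 'mic-b'], 'gone'));    // mic-a
```

Throwing on an empty list lets the caller show a "no microphone found" message before joining the room instead of failing later during capture.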

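4. Call the agent

Call the agent to implement the AI call. To implement this module, call the relevant "AIGC - Conversational AI real-time interaction" OpenAPI to start, stop, and interrupt the agent.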

import RtcClient from '/path/RtcClient';

const roomId = 'Your room id';
const userId = 'Your user id';
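const config = {
    // Configure according to the OpenAPI documentation
};
...

// The calls below are listed together for reference; invoke each one when your business logic requires it.
// stopAudioVoice() rejects if the agent is not running.
await RtcClient.startAudioBot(roomId, userId, config);// Start the agent
await RtcClient.stopAudioBot(roomId, userId);// Stop the agent
await RtcClient.stopAudioVoice(roomId, userId);// Interrupt the agent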
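
The openAPIs object used by RtcClient is left to the integrator. One common arrangement is for the browser to call your own business server, which signs the request and forwards it to the OpenAPI, so that account credentials never reach the page. The sketch below builds such a request under that assumption. The /api/rtc path, the Action query parameter, and the buildProxyRequest and callOpenApi names are all hypothetical and must match whatever your server actually exposes.

```typescript
// Actions used by RtcClient in this article.
type VoiceChatAction = 'StartVoiceChat' | 'StopVoiceChat' | 'UpdateVoiceChat';

interface ProxyRequest {
  url: string;
  init: { method: 'POST'; headers: Record<string, string>; body: string };
}

// Hypothetical: build the request sent to your own server, which signs and forwards it.
function buildProxyRequest(
  action: VoiceChatAction,
  body: Record<string, unknown>,
  baseUrl = '/api/rtc', // assumed proxy endpoint on your business server
): ProxyRequest {
  return {
    url: `${baseUrl}?Action=${encodeURIComponent(action)}`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(body),
    },
  };
}

// Send the request with fetch and fail loudly on a non-2xx response.
async function callOpenApi(
  action: VoiceChatAction,
  body: Record<string, unknown>,
): Promise<unknown> {
  const { url, init } = buildProxyRequest(action, body);
  const res = await fetch(url, init);
  if (!res.ok) {
    throw new Error(`${action} failed with HTTP ${res.status}`);
  }
  return res.json();
}

// An object with the three methods that RtcClient calls.
const openAPIs = {
  StartVoiceChat: (body: Record<string, unknown>) => callOpenApi('StartVoiceChat', body),
  StopVoiceChat: (body: Record<string, unknown>) => callOpenApi('StopVoiceChat', body),
  UpdateVoiceChat: (body: Record<string, unknown>) => callOpenApi('UpdateVoiceChat', body),
};
```

Keeping the request construction in a separate pure function makes it easy to unit test without a network connection.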

5. Leave the room

The user stops audio capture and leaves the room.

import RtcClient from '/path/RtcClient';

const roomId = 'Your room id';
const userId = 'Your user id';

...

await RtcClient.stopAudioCapture();
await RtcClient.leaveRoom();
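
The API table later in this article also lists destroyEngine(), which the snippet above does not call. The sketch below shows one possible teardown order that keeps going if an individual step fails, so that a failure in one step does not prevent the remaining cleanup. runTeardown and TeardownStep are hypothetical names, and the steps are passed in as plain functions so the example does not depend on the SDK.

```typescript
type TeardownStep = { name: string; run: () => Promise<void> | void };

// Hypothetical helper: run every step in order and collect failures
// instead of stopping at the first one.
async function runTeardown(steps: TeardownStep[]): Promise<string[]> {
  const failed: string[] = [];
  for (const step of steps) {
    try {
      await step.run();
    } catch {
      failed.push(step.name);
    }
  }
  return failed;
}

// Example wiring with stand-in functions. In an application these would wrap
// RtcClient.stopAudioCapture(), RtcClient.leaveRoom(), and a call that destroys
// the engine (assumed here to be the destroyEngine() API listed in the table).
const order: string[] = [];
const steps: TeardownStep[] = [
  { name: 'stopAudioCapture', run: () => { order.push('stopAudioCapture'); } },
  { name: 'leaveRoom', run: async () => { throw new Error('simulated failure'); } },
  { name: 'destroyEngine', run: () => { order.push('destroyEngine'); } },
];

runTeardown(steps).then((failed) => {
  console.log(order.join(', '));  // stopAudioCapture, destroyEngine
  console.log(failed.join(', ')); // leaveRoom
});
```

Returning the names of the failed steps lets the caller decide whether to retry or only log them.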

API sequence diagram

Core API and callback reference

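API

Feature                                               API
Create an RTCEngine instance                          createEngine()
Set video publishing parameters                       setVideoEncoderConfig()
Start local audio capture                             startAudioCapture()
Start local video capture                             startVideoCapture()
Join an RTC room                                      joinRoom()
Leave the room                                        leaveRoom()
Destroy the engine instance                           destroyEngine()
Publish the local microphone-captured media stream    publishStream()
Unpublish the local microphone-captured media stream  unpublishStream()
Enumerate available microphone devices                enumerateAudioCaptureDevices()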

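Callbacks

Feature                                                      Callback
A visible remote user joins the room                         onUserJoined
A visible remote user leaves the room                        onUserLeave
A remote user publishes a microphone-captured media stream   onUserPublishStream
A remote user's microphone-captured media stream is removed  onUserUnpublishStream
A remote user starts microphone capture                      onUserStartAudioCapture
A remote user stops microphone capture                       onUserStopAudioCapture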