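This article describes how to integrate the Volcengine RTC Web SDK and call the corresponding server-side APIs to implement real-time AI interaction.
You can also follow the logic in this article to implement the same scenario with the client APIs of other platforms.
Before you begin, see Enable Services and Obtain a Temporary Token to enable the required services, configure policies, and obtain a token.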
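The following figure shows the implementation flow for real-time AI interaction:
For the complete sequence diagram, see API Sequence Diagram.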
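The following sections show how to integrate the RTC Web SDK with the server-side OpenAPI to join and leave a room, listen for callbacks, and invoke an AI agent to implement the AI call logic.
The sample code below is based on RTC Web SDK v4.58.9.
Define an RtcClient class to manage RTC operations, including creating the RTC engine, listening for events, managing audio devices, and controlling the AI agent.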

    import VERTC, {
      IRTCEngine,
      MediaType,
      StreamRemoveReason,
      onUserJoinedEvent,
      onUserLeaveEvent,
    } from '@volcengine/rtc';
    // openAPIs wraps the calls to the server-side OpenAPI. Implement it to suit
    // your own business logic; the path below is a placeholder.
    import openAPIs from '/path/openAPIs';

    /**
     * @brief Event Listeners
     */
    export interface IEventListener {
      [VERTC.events.onUserJoined]: (e: onUserJoinedEvent) => void;
      [VERTC.events.onUserLeave]: (e: onUserLeaveEvent) => void;
      [VERTC.events.onUserPublishStream]: (e: { userId: string; mediaType: MediaType }) => void;
      [VERTC.events.onUserUnpublishStream]: (e: {
        userId: string;
        mediaType: MediaType;
        reason: StreamRemoveReason;
      }) => void;
      [VERTC.events.onUserStartAudioCapture]: (e: { userId: string }) => void;
      [VERTC.events.onUserStopAudioCapture]: (e: { userId: string }) => void;
      [VERTC.events.onRoomBinaryMessageReceived]: (e: { userId: string; message: ArrayBuffer }) => void;
    }

    /**
     * @brief Basic options
     */
    export interface BasicOptions {
      appId: string;
      token?: string;
      userId: string;
      roomId: string;
    }

    /**
     * @brief RTC Client
     */
    export class RtcClient {
      /**
       * @brief RTC Engine
       */
      engine!: IRTCEngine;

      /**
       * @brief Basic configuration
       */
      config!: BasicOptions;

      /**
       * @brief Whether the AI bot is currently enabled
       */
      audioBotEnabled = false;

      /**
       * @brief Initialize the engine
       */
      createEngine = (props: BasicOptions) => {
        this.config = props;
        this.engine = VERTC.createEngine(this.config.appId);
      };

      /**
       * @brief Register event listeners
       */
      addEventListeners = (events: IEventListener) => {
        for (const event of Object.keys(events) as (keyof IEventListener)[]) {
          this.engine.on(event, events[event]);
        }
      };

      /**
       * @brief Get the device IDs of the available audio input devices
       */
      async getDevices(): Promise<{ audioInputs: string[] }> {
        const audioInputs = await VERTC.enumerateAudioCaptureDevices();
        return {
          audioInputs: audioInputs.map((input) => input.deviceId),
        };
      }

      /**
       * @brief Start internal audio capture
       */
      startAudioCapture = async (mic: string) => {
        await this.engine.startAudioCapture(mic);
      };

      /**
       * @brief Stop internal audio capture
       */
      stopAudioCapture = async () => {
        await this.engine.stopAudioCapture();
      };

      /**
       * @brief Publish the stream
       */
      publishStream = (mediaType: MediaType) => {
        this.engine.publishStream(mediaType);
      };

      /**
       * @brief Stop publishing the stream
       */
      unpublishStream = (mediaType: MediaType) => {
        this.engine.unpublishStream(mediaType);
      };

      /**
       * @brief Join the room
       */
      joinRoom = ({ token, username }: { token: string; username?: string }): Promise<void> => {
        return this.engine.joinRoom(
          token,
          this.config.roomId,
          {
            userId: this.config.userId,
            extraInfo: JSON.stringify({
              user_name: username || this.config.userId,
              user_id: this.config.userId,
            }),
          },
          {
            /** Set to true to publish automatically after joining the room */
            isAutoPublish: true,
            isAutoSubscribeAudio: true,
          }
        );
      };

      /**
       * @brief Leave the room (stops the AI bot first)
       */
      leaveRoom = async () => {
        await this.stopAudioBot(this.config.roomId, this.config.userId);
        await this.engine.leaveRoom();
      };

      /**
       * @brief Start the AI bot
       */
      startAudioBot = async (
        roomId: string,
        userId: string,
        config: Record<string, unknown>,
      ) => {
        if (this.audioBotEnabled) {
          await this.stopAudioBot(roomId, userId);
        }
        /** Call the OpenAPI to start the AI bot. See the OpenAPI documentation for details. */
        await openAPIs.StartVoiceChat(config);
        this.audioBotEnabled = true;
      };

      /**
       * @brief Stop the AI bot
       */
      stopAudioBot = async (roomId: string, userId: string) => {
        if (this.audioBotEnabled) {
          await openAPIs.StopVoiceChat({
            AppId: 'Your App ID',
            BusinessId: 'Your Business ID',
            RoomId: roomId,
            TaskId: userId,
          });
          this.audioBotEnabled = false;
        }
      };

      /**
       * @brief Interrupt the AI agent while it is speaking
       */
      stopAudioVoice = async (roomId: string, userId: string) => {
        if (this.audioBotEnabled) {
          const res = await openAPIs.UpdateVoiceChat({
            AppId: 'Your App ID',
            BusinessId: 'Your Business ID',
            RoomId: roomId,
            TaskId: userId,
            Command: 'interrupt',
          });
          return res;
        }
        return Promise.reject(new Error('AI interrupt failed'));
      };
    }

    export default new RtcClient();

Initialize the RTC engine and add the relevant event listeners.

    import VERTC, {
      onUserJoinedEvent,
      onUserLeaveEvent,
      MediaType,
      StreamRemoveReason,
    } from '@volcengine/rtc';
    import RtcClient from '/path/RtcClient';

    ...

    // createEngine() requires the basic options; replace the placeholders with your own values.
    RtcClient.createEngine({
      appId: 'Your App ID',
      userId: 'Your user id',
      roomId: 'Your room id',
    });
    RtcClient.addEventListeners({
      // A user joined the room
      [VERTC.events.onUserJoined]: (e: onUserJoinedEvent) => { ... },
      // A user left the room
      [VERTC.events.onUserLeave]: (e: onUserLeaveEvent) => { ... },
      // A user published an audio/video stream
      [VERTC.events.onUserPublishStream]: (e: { userId: string; mediaType: MediaType }) => { ... },
      // A user stopped publishing an audio/video stream
      [VERTC.events.onUserUnpublishStream]: (e: { userId: string; mediaType: MediaType, reason: StreamRemoveReason }) => { ... },
      // A user started audio capture
      [VERTC.events.onUserStartAudioCapture]: (e: { userId: string }) => { ... },
      // A user stopped audio capture
      [VERTC.events.onUserStopAudioCapture]: (e: { userId: string }) => { ... },
      ...,
    });

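The callback bodies are elided in the sample. As one possible way to fill them in, the sketch below keeps track of which remote users are in the room and publishing, which can help decide when the AI agent is ready to talk. `RemoteUserTracker` and its method names are hypothetical and not part of the RTC SDK; each method takes a plain user ID, so extracting that ID from the SDK event payload is left to the caller and should be checked against the SDK typings.

```typescript
// Hypothetical helper (not an SDK class): tracks remote users by user ID.
class RemoteUserTracker {
  private inRoom = new Set<string>();
  private publishing = new Set<string>();

  // Call from the onUserJoined callback.
  userJoined(userId: string): void {
    this.inRoom.add(userId);
  }

  // Call from the onUserLeave callback; a user who left no longer publishes.
  userLeft(userId: string): void {
    this.inRoom.delete(userId);
    this.publishing.delete(userId);
  }

  // Call from the onUserPublishStream callback.
  streamPublished(userId: string): void {
    this.publishing.add(userId);
  }

  // Call from the onUserUnpublishStream callback.
  streamUnpublished(userId: string): void {
    this.publishing.delete(userId);
  }

  // True once the given user (for example, the AI agent) is in the room and publishing.
  isReady(userId: string): boolean {
    return this.inRoom.has(userId) && this.publishing.has(userId);
  }
}
```

Keeping this state outside the callbacks means the UI can query `isReady()` at any time instead of reacting to individual events.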
The user joins the room and starts audio capture.

    import RtcClient from '/path/RtcClient';

    const token = 'Your token';
    const username = 'Username';

    ...

    // getDevices() resolves to an object; audioInputs holds the microphone device IDs.
    const { audioInputs } = await RtcClient.getDevices();
    const defaultUsableDevice = audioInputs[0];
    await RtcClient.joinRoom({ token, username });
    await RtcClient.startAudioCapture(defaultUsableDevice);

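The sample takes the first device in the list and does not handle an empty list. A minimal sketch of a more defensive selection is shown below; `pickMicrophone` and the `preferredId` parameter are hypothetical names introduced for illustration, and the function only assumes it receives the device ID strings returned by `getDevices()`.

```typescript
// Hypothetical helper: choose a microphone device ID from the enumerated list.
// Prefers a previously saved device if it is still present, else the first one.
function pickMicrophone(deviceIds: string[], preferredId?: string): string {
  if (preferredId !== undefined && deviceIds.indexOf(preferredId) !== -1) {
    return preferredId;
  }
  const first = deviceIds[0];
  if (first === undefined) {
    // No microphone found (or permission not granted): fail with a clear message.
    throw new Error('No audio input device available');
  }
  return first;
}
```

The returned ID can then be passed to `RtcClient.startAudioCapture()` in place of `defaultUsableDevice`.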
Invoke the AI agent to implement the AI call. To do this, call the relevant real-time conversational AI OpenAPI endpoints to start, stop, and interrupt the agent.

    import RtcClient from '/path/RtcClient';

    const roomId = 'Your room id';
    const userId = 'Your user id';
    const config = {
      // Configure according to the OpenAPI documentation
    };

    ...

    await RtcClient.startAudioBot(roomId, userId, config); // Start the agent
    await RtcClient.stopAudioBot(roomId, userId); // Stop the agent
    await RtcClient.stopAudioVoice(roomId, userId); // Interrupt the agent

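The `openAPIs` object used by `RtcClient` is left for you to implement. The sketch below shows one possible shape, assuming the browser forwards each request to your own backend, which then calls the server-side OpenAPI. The base URL, the `Action` query parameter, and the names `createOpenAPIs`, `FetchLike`, and `VoiceChatAction` are all assumptions made for illustration, not a documented interface.

```typescript
// Hypothetical base URL of your own backend that forwards requests to the OpenAPI.
const PROXY_BASE_URL = 'https://your-backend.example.com/rtc-openapi';

// Minimal subset of the fetch API that this sketch relies on.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

type VoiceChatAction = 'StartVoiceChat' | 'StopVoiceChat' | 'UpdateVoiceChat';

function createOpenAPIs(fetchImpl: FetchLike, baseUrl: string = PROXY_BASE_URL) {
  // POST the JSON body to the backend and return the parsed JSON response.
  const call = async (
    action: VoiceChatAction,
    body: Record<string, unknown>,
  ): Promise<unknown> => {
    const res = await fetchImpl(`${baseUrl}?Action=${action}`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(body),
    });
    if (!res.ok) {
      throw new Error(`${action} failed with HTTP ${res.status}`);
    }
    return res.json();
  };

  return {
    StartVoiceChat: (config: Record<string, unknown>) => call('StartVoiceChat', config),
    StopVoiceChat: (params: Record<string, unknown>) => call('StopVoiceChat', params),
    UpdateVoiceChat: (params: Record<string, unknown>) => call('UpdateVoiceChat', params),
  };
}
```

Injecting the fetch implementation keeps the wrapper easy to test, and routing through a backend keeps account credentials out of browser code.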
The user stops audio capture and leaves the room.

    import RtcClient from '/path/RtcClient';

    const roomId = 'Your room id';
    const userId = 'Your user id';

    ...

    await RtcClient.stopAudioCapture();
    await RtcClient.leaveRoom();

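If one cleanup call rejects, the calls after it in the sequence above never run, so the user might stay in the room. One way to make cleanup more robust is sketched below: run every step in order and collect the names of the steps that failed. `runTeardown` and `TeardownStep` are hypothetical names, not SDK APIs.

```typescript
// Hypothetical helper: a named cleanup step that may be sync or async.
interface TeardownStep {
  name: string;
  run: () => unknown;
}

// Runs all steps in order; a failing step does not prevent later steps from running.
// Resolves to the names of the steps that threw or rejected.
async function runTeardown(steps: TeardownStep[]): Promise<string[]> {
  const failed: string[] = [];
  for (const step of steps) {
    try {
      await step.run();
    } catch (err) {
      failed.push(step.name);
    }
  }
  return failed;
}
```

For example, the steps could wrap `RtcClient.stopAudioCapture()` and `RtcClient.leaveRoom()`, with the returned list used for logging or a retry.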
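Feature | API |
---|---|
Create an RTCEngine instance | createEngine() |
Set video publishing parameters | setVideoEncoderConfig() |
Start local audio capture | startAudioCapture() |
Start local video capture | startVideoCapture() |
Join an RTC room | joinRoom() |
Leave the room | leaveRoom() |
Destroy the engine instance | destroyEngine() |
Publish the local media stream captured by the microphone | publishStream() |
Stop publishing the local media stream captured by the microphone | unpublishStream() |
Enumerate available microphone devices | enumerateAudioCaptureDevices() |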
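Feature | Callback |
---|---|
A visible remote user joins the room | onUserJoined |
A visible remote user leaves the room | onUserLeave |
A remote user publishes a media stream captured by the microphone | onUserPublishStream |
A remote user's microphone-captured media stream is removed | onUserUnpublishStream |
A remote user starts microphone capture | onUserStartAudioCapture |
A remote user stops microphone capture | onUserStopAudioCapture |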