Adding Built-in Voice Chat Integration to Demo Project

Integrate voice chat into the demo project for real-time interaction with the avatar

Welcome to Voice Chat Integration!

This guide walks you through the next step of enhancing your existing Vite demo project using the KonPro SDK by integrating built-in voice chat functionality. Building on the initial setup, we'll show you how to enable voice mode for real-time interaction with the avatar, allowing users to switch seamlessly between text and voice input.

Please note: The built-in TTS/voice mode is tightly integrated with KonPro's internal LLM/knowledge base. Currently, custom avatar speech input is not supported when voice chat is enabled. If you want to provide custom input, you will need to integrate your own STT solution.

1. Update index.html Structure

Add buttons to switch between text and voice modes, and include a section to manage voice controls.

html
<!-- Add mode switching buttons -->
<div class="chat-modes" role="group">
  <button id="textModeBtn" class="active">Text Mode</button>
  <button id="voiceModeBtn" disabled>Voice Mode</button>
</div>

<!-- Add voice mode controls section -->
<section id="voiceModeControls" role="group" style="display: none">
  <div id="voiceStatus"></div>
</section>

2. Update main.ts Code

Add New DOM References

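In main.ts, add references to the new buttons and control sections, plus a variable that tracks the current mode. The textModeControls lookup assumes that your existing text input section has id="textModeControls"; the HTML in step 1 does not add that id, so set it on that section if it is missing.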

typescript
// Add these DOM elements
const textModeBtn = document.getElementById("textModeBtn") as HTMLButtonElement;
const voiceModeBtn = document.getElementById("voiceModeBtn") as HTMLButtonElement;
const textModeControls = document.getElementById("textModeControls") as HTMLElement;
const voiceModeControls = document.getElementById("voiceModeControls") as HTMLElement;
const voiceStatus = document.getElementById("voiceStatus") as HTMLElement;

// Add mode tracking
let currentMode: "text" | "voice" = "text";
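
The "as" casts above satisfy the compiler but do not check that the element exists, so a missing id only surfaces later as a null-reference error. One possible guard is sketched below. The names requireElement and lookup are hypothetical and not part of the KonPro SDK; in the browser you would pass (id) => document.getElementById(id) as the lookup.

```typescript
// Hypothetical helper (not part of the KonPro SDK): resolve an element by id
// and fail immediately, with a readable message, if it is missing.
function requireElement<T>(lookup: (id: string) => T | null, id: string): T {
  const element = lookup(id);
  if (element === null) {
    throw new Error(`Missing element with id "${id}"`);
  }
  return element;
}

// Stand-in for the page so the sketch runs outside a browser.
const fakePage: Record<string, { id: string }> = {
  textModeBtn: { id: "textModeBtn" },
  voiceModeBtn: { id: "voiceModeBtn" },
};
const fakeLookup = (id: string) => fakePage[id] ?? null;

// Resolves normally because the fake page contains this id.
const foundButton = requireElement(fakeLookup, "voiceModeBtn");
```

Failing at lookup time points directly at the missing id, which is easier to debug than an error raised later inside switchMode.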

Update Avatar Initialization

Modify avatar initialization to handle voice chat events and display appropriate status updates.

typescript
async function initializeAvatarSession() {
  const token = await fetchAccessToken();
  avatar = new StreamingAvatar({ token });

  sessionData = await avatar.createStartAvatar({
    quality: AvatarQuality.High,
    avatarName: "default",
    disableIdleTimeout: true,
    language: "en",  // Language code ("en"), not the language name ("English")
  });

  // Add voice chat event listeners
  avatar.on(StreamingEvents.USER_START, () => {
    voiceStatus.textContent = "Listening...";
  });
  avatar.on(StreamingEvents.USER_STOP, () => {
    voiceStatus.textContent = "Processing...";
  });
  avatar.on(StreamingEvents.AVATAR_START_TALKING, () => {
    voiceStatus.textContent = "Avatar is speaking...";
  });
  avatar.on(StreamingEvents.AVATAR_STOP_TALKING, () => {
    voiceStatus.textContent = "Waiting for you to speak...";
  });
}
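
The four listeners above differ only in the status text they set, so the wording can be kept in a single table. The sketch below is one way to do that, not something the SDK requires. VoiceChatEvent, STATUS_TEXT, and statusFor are hypothetical local names that mirror the StreamingEvents members used above; the actual string values of the SDK enum are not assumed.

```typescript
// Hypothetical local names mirroring the four StreamingEvents members used above.
type VoiceChatEvent =
  | "USER_START"
  | "USER_STOP"
  | "AVATAR_START_TALKING"
  | "AVATAR_STOP_TALKING";

// Single place to edit the wording shown in the #voiceStatus element.
const STATUS_TEXT: Record<VoiceChatEvent, string> = {
  USER_START: "Listening...",
  USER_STOP: "Processing...",
  AVATAR_START_TALKING: "Avatar is speaking...",
  AVATAR_STOP_TALKING: "Waiting for you to speak...",
};

// Look up the status text for a given voice chat event.
function statusFor(voiceEvent: VoiceChatEvent): string {
  return STATUS_TEXT[voiceEvent];
}
```

Each handler body would then read voiceStatus.textContent = statusFor("USER_START"), and so on for the other three events.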

Add Voice Chat Functions

Create one function that starts voice chat and another that switches between text and voice modes.

typescript
async function startVoiceChat() {
  if (!avatar) return;
  
  try {
    await avatar.startVoiceChat({
      useSilencePrompt: false
    });
    voiceStatus.textContent = "Waiting for you to speak...";
  } catch (error) {
    console.error("Error starting voice chat:", error);
    voiceStatus.textContent = "Error starting voice chat";
  }
}

async function switchMode(mode: "text" | "voice") {
  if (currentMode === mode) return;
  
  currentMode = mode;
  
  if (mode === "text") {
    textModeBtn.classList.add("active");
    voiceModeBtn.classList.remove("active");
    textModeControls.style.display = "block";
    voiceModeControls.style.display = "none";
    if (avatar) {
      await avatar.closeVoiceChat();
    }
  } else {
    textModeBtn.classList.remove("active");
    voiceModeBtn.classList.add("active");
    textModeControls.style.display = "none";
    voiceModeControls.style.display = "block";
    if (avatar) {
      await startVoiceChat();
    }
  }
}
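
The decision logic inside switchMode can also be expressed as a pure function, which makes it testable without a browser or an avatar session. The following is a minimal sketch under that assumption; ChatMode, ModeSwitchPlan, and planModeSwitch are hypothetical names, and the sdkCall field only names the method that switchMode goes on to call.

```typescript
type ChatMode = "text" | "voice";

// Describes what switchMode has to do for one transition.
interface ModeSwitchPlan {
  activeButton: "textModeBtn" | "voiceModeBtn";
  visibleControls: "textModeControls" | "voiceModeControls";
  sdkCall: "startVoiceChat" | "closeVoiceChat";
}

// Returns null when the target mode is already active, mirroring the early
// return at the top of switchMode.
function planModeSwitch(current: ChatMode, target: ChatMode): ModeSwitchPlan | null {
  if (current === target) return null;
  return target === "voice"
    ? {
        activeButton: "voiceModeBtn",
        visibleControls: "voiceModeControls",
        sdkCall: "startVoiceChat",
      }
    : {
        activeButton: "textModeBtn",
        visibleControls: "textModeControls",
        sdkCall: "closeVoiceChat",
      };
}
```

switchMode would then apply the returned plan to the DOM and call the named method on the avatar, keeping the branching in one small, easily checked place.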

Enable Voice Mode Button

Enable the voice mode button once the avatar stream is ready.

typescript
function handleStreamReady(event: any) {
  if (event.detail && videoElement) {
    videoElement.srcObject = event.detail;
    videoElement.onloadedmetadata = () => {
      videoElement.play().catch(console.error);
    };
    voiceModeBtn.disabled = false;  // Enable voice mode after stream is ready
  }
}

Add Event Listeners

Register a click handler on each mode button so that it switches to the corresponding mode.

typescript
// Add these with your other event listeners
textModeBtn.addEventListener("click", () => switchMode("text"));
voiceModeBtn.addEventListener("click", () => switchMode("voice"));

Important Notes:

  • The voice mode button starts disabled and is enabled only once the stream is ready
  • Set language to the code "en", not the name "English"
  • The voice chat status text updates automatically through the event listeners
  • Voice chat starts when you switch to voice mode
  • Switching back to text mode calls closeVoiceChat() to end the voice session; keep this cleanup if you modify switchMode
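
If the language value comes from user-facing settings, it may arrive as a name rather than a code. The helper below is a hedged sketch of normalizing such input before passing it to createStartAvatar. LANGUAGE_CODES and toLanguageCode are hypothetical names, the table lists a few ISO 639-1 codes for illustration, and this guide does not state which codes the SDK accepts beyond "en".

```typescript
// Hypothetical lookup from language names to ISO 639-1 codes. Which codes the
// SDK accepts beyond "en" is not stated in this guide.
const LANGUAGE_CODES: Record<string, string> = {
  english: "en",
  spanish: "es",
  french: "fr",
  german: "de",
};

function toLanguageCode(input: string): string {
  const key = input.trim().toLowerCase();
  // Already a two-letter code: pass it through (lowercased).
  if (/^[a-z]{2}$/.test(key)) return key;
  const code = LANGUAGE_CODES[key];
  if (code === undefined) {
    throw new Error(`Unknown language "${input}"`);
  }
  return code;
}
```

Throwing on unknown input keeps a misspelled language name from reaching the session configuration unnoticed.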

Conclusion

In this guide, we've walked through the process of integrating built-in voice chat into the Vite demo project using KonPro's Streaming API. By following these steps, you can seamlessly switch between text and voice modes, allowing for a more interactive experience. Keep in mind that for custom speech input with voice chat, you will need to integrate your own STT solution.

Updated: 4 months ago
