Streaming API Integration: using LiveKit

Boost real-time video streaming with KonPro and LiveKit integration

Welcome to LiveKit Integration!

This guide demonstrates how to use the KonPro streaming API endpoints together with the LiveKit client SDK for real-time video streaming, giving you a simple, browser-based development interface.

Key Features

  • This native API integration enables real-time video generation using KonPro avatars over a streaming connection.
  • It uses the LiveKit SDK to establish a WebRTC-based stream, allowing continuous, low-latency video output as text is streamed in.
  • It is ideal for real-time or dynamic avatar communication experiences where latency matters, such as virtual meetings or live assistants.
⚠️

Note

For Node.js environments, we strongly recommend using the Streaming Avatar SDK package, as it offers a more robust solution. This guide focuses on the raw LiveKit implementation, intended for basic use cases as well as for developers who require more customization options or wish to integrate with existing LiveKit infrastructure.

Implementation Guide

Overview

In this guide:

  • The LiveKit CDN client for easy setup without npm packages.
  • Simplified WebSocket handling.
  • Support for both Talk (LLM) and Repeat modes.
  • Real-time event monitoring for both WebSocket and LiveKit events.

Prerequisites

  • API Token from KonPro
  • Basic understanding of JavaScript and LiveKit

Step 1: Basic HTML Setup

Create an HTML file with the necessary elements and include the LiveKit JS Client SDK minified CDN version:

HTML

<!DOCTYPE html>
<html lang="en">
  <head>
    <script src="https://cdn.jsdelivr.net/npm/livekit-client/dist/livekit-client.umd.min.js"></script>
  </head>

  <body>
    <div>
      <div>
        <div>
          <button id="startBtn">Start</button>
          <button id="closeBtn">Close</button>
        </div>
      </div>
      <div>
        <input id="taskInput" type="text" placeholder="Enter text" />
        <button id="talkBtn">Talk</button>
      </div>
    </div>
    <video id="mediaElement" autoplay></video>

    <script>
      // JavaScript code goes here
    </script>
  </body>
</html>

Step 2: Configuration

JavaScript
const API_CONFIG = {
  serverUrl: "https://api.konpro.ai",
  token: "YOUR_API_TOKEN"
};

// Global state
let sessionInfo = null;
let room = null;
let mediaStream = null;

// DOM elements
const mediaElement = document.getElementById("mediaElement");
const taskInput = document.getElementById("taskInput");

Step 3: Core Implementation

3.1 Create and Start Session

javascript
async function createSession() {
  // Create new session
  const response = await fetch(`${API_CONFIG.serverUrl}/v1/streaming.new`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${API_CONFIG.token}`
    },
    body: JSON.stringify({
      version: "v1",
      avatar_id: "YOUR_AVATAR_ID"
    })
  });
  
  sessionInfo = await response.json();
  
  // Start streaming
  await fetch(`${API_CONFIG.serverUrl}/v1/streaming.start`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${API_CONFIG.token}`
    },
    body: JSON.stringify({
      session_id: sessionInfo.session_id
    })
  });
  
  // Connect to LiveKit room
  room = new LivekitClient.Room();
  await room.connect(sessionInfo.url, sessionInfo.access_token);
  
  // Handle media streams: collect subscribed tracks into a single
  // MediaStream and attach it to the video element
  mediaStream = new MediaStream();
  room.on(LivekitClient.RoomEvent.TrackSubscribed, (track) => {
    if (track.kind === "video" || track.kind === "audio") {
      mediaStream.addTrack(track.mediaStreamTrack);
      mediaElement.srcObject = mediaStream;
    }
  });
}

Note: The LiveKit CDN build is accessed through the LivekitClient global namespace. All LiveKit classes and constants must be prefixed with LivekitClient (e.g., LivekitClient.Room, LivekitClient.RoomEvent).

3.2 Send Text to Avatar

JavaScript
async function sendText(text) {
  await fetch(`${API_CONFIG.serverUrl}/v1/streaming.task`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${API_CONFIG.token}`
    },
    body: JSON.stringify({
      session_id: sessionInfo.session_id,
      text: text,
      task_type: "talk"  // or "repeat" to make avatar repeat exactly what you say
    })
  });
}

3.3 Close Session

JavaScript
async function closeSession() {
  await fetch(`${API_CONFIG.serverUrl}/v1/streaming.stop`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": `Bearer ${API_CONFIG.token}`
    },
    body: JSON.stringify({
      session_id: sessionInfo.session_id
    })
  });
  
  if (room) {
    room.disconnect();
  }
  
  mediaElement.srcObject = null;
  sessionInfo = null;
  room = null;
  mediaStream = null;
}

Step 4: Event Listeners

JavaScript

// Start session
document.querySelector("#startBtn").addEventListener("click", async () => {
  await createSession();
});

// Close session
document.querySelector("#closeBtn").addEventListener("click", closeSession);

// Send text
document.querySelector("#talkBtn").addEventListener("click", () => {
  const text = taskInput.value.trim();
  if (text) {
    sendText(text);
    taskInput.value = "";
  }
});

Further Features

1. Task Types

The task endpoint supports different task types:

  • talk: Avatar processes text through LLM before speaking
  • repeat: Avatar repeats the exact input text
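
To keep the request body for /v1/streaming.task in one place, you can build it with a small helper that also validates the task type. This is a sketch; buildTaskPayload is a hypothetical name, not part of the KonPro API:

```javascript
// Hypothetical helper (not part of the KonPro SDK): builds the body for
// POST /v1/streaming.task and rejects unknown task types.
const TASK_TYPES = ["talk", "repeat"];

function buildTaskPayload(sessionId, text, taskType = "talk") {
  if (!TASK_TYPES.includes(taskType)) {
    throw new Error(`Unknown task_type: ${taskType}`);
  }
  return {
    session_id: sessionId,
    text: text,
    task_type: taskType,
  };
}
```

You would then pass the result to JSON.stringify in the fetch call from Step 3.2.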

2. WebSocket Events

Monitor KonPro Avatar state through WebSocket events:

JavaScript
const wsUrl = `wss://api.konpro.ai/v1/ws/streaming.chat?session_id=${sessionInfo.session_id}&session_token=${API_CONFIG.token}&silence_response=false`;
const ws = new WebSocket(wsUrl);

ws.addEventListener("message", (event) => {
  const data = JSON.parse(event.data);
  console.log("Event:", data);
});

3. LiveKit Room Events

Monitor room state and media tracks:

JavaScript

room.on(LivekitClient.RoomEvent.DataReceived, (message) => {
  const data = new TextDecoder().decode(message);
  console.log("Room message:", JSON.parse(data));
});
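
Because DataReceived delivers the payload as a Uint8Array, the decode-and-parse step can be factored into a small helper. This is a sketch; decodeRoomMessage is a hypothetical name, and the fallback for non-JSON payloads is an assumption:

```javascript
// Hypothetical helper: decodes a LiveKit DataReceived payload (Uint8Array)
// into a JavaScript object, falling back to the raw text for non-JSON data.
function decodeRoomMessage(payload) {
  const text = new TextDecoder().decode(payload);
  try {
    return JSON.parse(text);
  } catch {
    return { raw: text };
  }
}
```

Usage inside the handler: `room.on(LivekitClient.RoomEvent.DataReceived, (payload) => console.log("Room message:", decodeRoomMessage(payload)));`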

System Flow

  1. Session setup (steps 1-3)
  2. Video streaming (step 4)
  3. Avatar interaction loop (step 5)
  4. Session closure (step 6)
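
The flow above can be sketched as a tiny state machine that enforces the call order new → start → task → stop. This is a conceptual illustration only, not part of the API:

```javascript
// Conceptual sketch: the session lifecycle as a state machine.
// Each action maps the current state to the next; anything else is invalid.
const TRANSITIONS = {
  idle: { new: "created" },          // streaming.new
  created: { start: "streaming" },   // streaming.start
  streaming: { task: "streaming",    // streaming.task (repeatable)
               stop: "closed" },     // streaming.stop
  closed: {},
};

function nextState(state, action) {
  const next = (TRANSITIONS[state] || {})[action];
  if (!next) throw new Error(`Cannot "${action}" while in state "${state}"`);
  return next;
}
```

For example, calling streaming.task before streaming.start would be rejected by this model, mirroring the order the guide's functions must be called in.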

Complete Demo Code

For the complete working implementation, please refer to our GitHub repository or use the Streaming Avatar SDK package for a more robust solution.

📝

Note

The complete demo code contains extensive JavaScript with complex template literals. For production use, we recommend using the @konpro/streaming-avatar npm package which provides a more robust and maintainable solution.

LiveKit Client SDKs

Here's the list of LiveKit client SDK repositories:

  • client-sdk-flutter (Dart): Flutter Client SDK for LiveKit
  • client-sdk-js (TypeScript): LiveKit browser client SDK (JavaScript)
  • client-sdk-swift (Swift): LiveKit Swift Client SDK for iOS, macOS, tvOS, and visionOS
  • client-sdk-android (Kotlin): LiveKit SDK for Android
  • client-sdk-unity (C#): Official Unity SDK for LiveKit
  • client-sdk-react-native (TypeScript): Official React Native SDK for LiveKit
  • client-sdk-react-native-expo-plugin (TypeScript): Expo plugin for the React Native SDK
  • client-sdk-unity-web (C#): Official LiveKit SDK for Unity WebGL
  • client-sdk-cpp (C++): C++ SDK for LiveKit

You can explore these repos for more detailed information.

Conclusion

The LiveKit-based implementation of KonPro's Streaming API provides a streamlined approach to integrating interactive avatars into web applications. While this guide covers the basics of browser-side implementation, remember that for production Node.js environments, the @konpro/streaming-avatar npm package offers a more comprehensive solution.
