
Vonage ML Transformers

Vonage ML Transformers is a library that implements machine learning algorithms for the web. It is based on @vonage/media-processor, MediaPipe, and TFLite.

@vonage/media-processor

The Media Processor library is Vonage's implementation of insertable streams for supported browsers. Documentation can be found here.

MediaPipe

MediaPipe is an open-source library under the MIT license that we use for video enhancements. For our background blur/replacement solution we use MediaPipe's Selfie Segmentation solution. This library adds support for all MediaPipe JS solutions, which helps developers build creative features with any MediaPipe JS module.

For example:

  • Funny hats
  • Dynamic zoom
  • Eye gaze
  • Hands detection
  • And much more...

Sample applications

Sample applications can be found here.

Background visual effects (out-of-the-box solution)

This sample uses the Vonage Video web SDK (OpenTok) OT.Publisher API (setVideoMediaProcessorConnector) to plug the Vonage Media Processor library into a Vonage Video (OpenTok) web application.

Implementation details:

  • Uses the MediaPipe Selfie Segmentation solution.
  • The process runs in a web worker.
  • MediaPipe solutions are based on WebGL and wasm (SIMD).
  • The solution does not come with the MediaPipe binaries bundled; we host the static assets on an AWS CloudFront CDN. The allow-listed CloudFront IPs can be found here.
  • MediaProcessorConfig allows you to define mediapipeBaseAssetsUri, which lets you self-host the MediaPipe assets. However, we do NOT recommend this.

Configure

Configure the post-processing action.

Blur:

let config: MediaProcessorConfig = {
  mediapipeBaseAssetsUri: 'https://example.com', // Optional; the library provides static assets by default.
  transformerType: 'BackgroundBlur',
  radius: BlurRadius.Low // BlurRadius.Low = 5px, BlurRadius.High = 10px, or any number (in px)
}

Silhouette:

let config: MediaProcessorConfig = {
  mediapipeBaseAssetsUri: 'https://example.com', // Optional; the library provides static assets by default.
  transformerType: 'SilhouetteBlur',
  radius: BlurRadius.Low // BlurRadius.Low = 5px, BlurRadius.High = 10px, or any number (in px)
}

Virtual (image):

let config: MediaProcessorConfig = {
  mediapipeBaseAssetsUri: 'https://example.com', // Optional; the library provides static assets by default.
  transformerType: 'VirtualBackground',
  backgroundAssetUri: 'https://some-url-to-image.com'
}

Video:

let config: MediaProcessorConfig = {
  mediapipeBaseAssetsUri: 'https://example.com', // Optional; the library provides static assets by default.
  transformerType: 'VideoBackground',
  backgroundAssetUri: 'https://some-url-to-video.com'
}

Create Media Processor

After configuring the desired post-processing, use the createVonageMediaProcessor helper function to create a VonageMediaProcessor:

const processor = await createVonageMediaProcessor(config);
publisher.setVideoMediaProcessorConnector(processor.getConnector());

Change configuration

To change the post-processing config in flight, call setBackgroundOptions; the publisher does not need to be involved:

await processor.setBackgroundOptions(newConfig);
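
For example, to switch from background blur to a virtual background while publishing (a minimal sketch; the image URL is a placeholder):

const newConfig: MediaProcessorConfig = {
  transformerType: 'VirtualBackground',
  backgroundAssetUri: 'https://some-url-to-image.com'
}
await processor.setBackgroundOptions(newConfig)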

Disable/enable processing

You can disable and re-enable post-processing with the disable/enable methods.

const processor = await createVonageMediaProcessor(config);
processor.disable();
processor.enable();
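
For example, you could wire these methods to a UI control (a sketch; the button element and id are hypothetical, not part of the library):

// Sketch: toggle background processing from a (hypothetical) button.
const toggle = document.getElementById('bg-toggle') as HTMLButtonElement
let processingEnabled = true
toggle.onclick = () => {
  processingEnabled ? processor.disable() : processor.enable()
  processingEnabled = !processingEnabled
}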

Errors, Warnings and Statistics

isSupported

Checks whether the current browser can run the library; the returned promise rejects if it cannot.

try {
  await isSupported();
} catch (e) {
  console.error(e);
}
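
A typical pattern is to gate processor creation on the support check and fall back to plain publishing otherwise (a minimal sketch using only the APIs shown in this document):

try {
  await isSupported()
  const processor = await createVonageMediaProcessor(config)
  publisher.setVideoMediaProcessorConnector(processor.getConnector())
} catch (e) {
  // Unsupported browser: publish without post-processing.
  console.error(e)
}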

Emitter Registration

This solution supports Emittery. You can listen for events directly on the VonageMediaProcessor instance:

processor.on('error', (eventData: ErrorData) => {
  console.error(eventData)
})
processor.on('warn', (eventData: WarnData) => {
  console.warn(eventData)
})
processor.on('pipelineInfo', (eventData: PipelineInfoData) => {
  console.info(eventData)
})

Frame Drop warning

If you would like to be notified about frame rate drops, use setTrackExpectedRate(number) to set the expected frame rate of the process.

processor.setTrackExpectedRate(30) // or any other value
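
Combined with the 'warn' event shown above, this lets you react when the processed track falls below the expected rate (a sketch; the exact shape of the warning payload is an assumption):

processor.setTrackExpectedRate(30)
processor.on('warn', (eventData: WarnData) => {
  // Inspect eventData and decide how to react, e.g. disable
  // processing to recover frame rate on weak devices.
  console.warn('frame rate warning', eventData)
})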

Statistics

The API collects statistics for usage and debugging purposes; however, it is up to you to activate it.

Turn statistics on:

const metadata: VonageMetadata = {
  appId: 'video SDK app id',
  sourceType: 'video',
  proxyUrl: 'https://some-proxy.com' // optional
};
setVonageMetadata(metadata)

Turn statistics off (statistics are off by default):

setVonageMetadata(null)

That's all you need to do in order to use our out-of-the-box background solution.

MediaPipe Helper

The library provides a helper class for all MediaPipe JS solutions:

  • Face Mesh
  • Face Detection
  • Hands
  • Holistic
  • Objectron
  • Pose
  • Selfie Segmentation

Configure MediaPipe solution

Each solution's configuration is up to you; a concrete Face Mesh example follows the stubs below.

Face Mesh:

let option: FaceMeshOptions = {
  ...
}

Face Detection:

let option: FaceDetectionOptions = {
  ...
}

Hands:

let option: HandsOptions = {
  ...
}

Holistic:

let option: HolisticOptions = {
  ...
}

Objectron:

let option: ObjectronOptions = {
  ...
}

Pose:

let option: PoseOptions = {
  ...
}

Selfie Segmentation:

let option: SelfieSegmentationOptions = {
  ...
}
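
For instance, a typical Face Mesh configuration might look like this (option names come from the MediaPipe Face Mesh JS API; the values are illustrative):

let option: FaceMeshOptions = {
  maxNumFaces: 1,              // track a single face
  refineLandmarks: true,       // refine landmarks around eyes and lips
  minDetectionConfidence: 0.5,
  minTrackingConfidence: 0.5
}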

MediaPipe Helper

MediapipeHelper is a helper class that initiates and runs MediaPipe modules. This class must be initialized on the application main thread.

Create MediaPipe helper:

In this example we use Face Mesh, but the flow is the same for all the other models.

let mediapipeConfig: MediapipeConfig = {
  modelType: "face_mesh",
  listener: (results: FaceMeshResults): void => {
    // Do something with the results.
  },
  options: option, // the FaceMeshOptions object configured above
  assetsUri: 'https://some-url-to-facemesh-binaries.com' // Optional - Vonage provides static assets for all MediaPipe modules.
}
let mediapipeHelper: MediapipeHelper = new MediapipeHelper()
mediapipeHelper.initialize(mediapipeConfig).then(() => {
  // Ready to send frames.
}).catch(e => {
  console.error(e)
})

Using MediaPipe helper class:

In this example we demonstrate how to use the MediaPipe helper with a transformer running on the main application thread. We also provide two sample apps that run the MediaPipe helper on the main application thread and, concurrently, the transformer in a Web Worker thread:

  1. Auto zoom - uses face detection to zoom in on the main person. Sample here.
  2. Custom MediaPipe - runs MediaPipe on both the application main thread and a Web Worker thread. Sample here.

Create transformer:

class MediapipeTransformer implements Transformer {
  mediapipeHelper: MediapipeHelper
  results?: FaceMeshResults

  constructor() {
    this.mediapipeHelper = new MediapipeHelper()
  }

  init(): Promise<void> {
    return new Promise<void>((resolve, reject) => {
      let mediapipeConfig: MediapipeConfig = {
        modelType: "face_mesh",
        listener: (results: FaceMeshResults): void => {
          this.results = results
        },
        options: option, // the FaceMeshOptions object configured earlier
        assetsUri: 'https://some-url-to-facemesh-binaries.com' // Optional - Vonage provides static assets for all MediaPipe modules.
      }
      this.mediapipeHelper.initialize(mediapipeConfig).then(() => {
        resolve()
      }).catch(e => {
        reject(e)
      })
    })
  }

  // The start function is optional.
  start(controller: TransformStreamDefaultController) {
    // In this sample nothing needs to be done.
  }

  // The transform function is mandatory.
  transform(frame: VideoFrame, controller: TransformStreamDefaultController) {
    createImageBitmap(frame).then(image => {
      const timestamp = frame.timestamp
      frame.close()
      this.mediapipeHelper.send(image).then(() => {
        if (this.results) {
          // Do something with this.results, then enqueue the processed frame.
          // If no results have arrived yet, the frame is dropped.
          controller.enqueue(new VideoFrame(image, { timestamp }))
        }
      }).catch(e => {
        console.error(e)
        // The original frame is already closed, so pass the image through unprocessed.
        controller.enqueue(new VideoFrame(image, { timestamp }))
      })
    }).catch(e => {
      console.error(e)
      controller.enqueue(frame)
    })
  }

  // When using the MediaPipe helper, close() must be called to avoid memory leaks.
  flush(controller: TransformStreamDefaultController) {
    this.mediapipeHelper.close().then(() => {
    }).catch(e => {
      console.error(e)
    })
  }
}
export default MediapipeTransformer;

Use the transformer:

const mediapipeTransformer: MediapipeTransformer = new MediapipeTransformer()
mediapipeTransformer.init().then(() => {
  const mediaProcessor: MediaProcessor = new MediaProcessor()
  const transformers = [mediapipeTransformer]
  mediaProcessor.setTransformers(transformers)
  const connector: MediaProcessorConnector = new MediaProcessorConnector(mediaProcessor)
  ...
  publisher.setVideoMediaProcessorConnector(connector)
  ...
}).catch(e => {
  console.error(e)
})

License

This project is licensed under the terms of the MIT license and is available for free.