Vonage ML Transformers
Vonage ML Transformers is a library that implements machine learning algorithms for the web. It is built on @vonage/media-processor, MediaPipe, and TFLite.
@vonage/media-processor
The Media Processor library is Vonage's implementation of insertable streams for supported browsers. Documentation can be found here.
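For context, the sketch below shows the kind of insertable-streams pipeline the library builds on: frames are read from a camera track, passed through a transform, and emitted as a new stream. This is not the library's own code; it uses only standard browser APIs (MediaStreamTrackProcessor, MediaStreamTrackGenerator, TransformStream), which are currently available in Chromium-based browsers and may need extra TypeScript type declarations.
// A minimal insertable-streams pipeline, independent of the library.
async function buildPipeline(): Promise<MediaStream> {
  const stream = await navigator.mediaDevices.getUserMedia({ video: true })
  const track = stream.getVideoTracks()[0]
  const processor = new MediaStreamTrackProcessor({ track })
  const generator = new MediaStreamTrackGenerator({ kind: 'video' })
  const transformer = new TransformStream({
    transform(frame: VideoFrame, controller) {
      // A real transformer would modify the frame here (e.g., blur the background).
      controller.enqueue(frame)
    }
  })
  processor.readable.pipeThrough(transformer).pipeTo(generator.writable)
  return new MediaStream([generator])
}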
MediaPipe
MediaPipe is an open-source library under the MIT license that we use for video enhancements. For our background blur/replacement solution we use MediaPipe's Selfie Segmentation solution. This library adds support for all MediaPipe JS solutions, which helps developers build creative features with any MediaPipe JS module.
For example:
- Funny hats
- Dynamic zoom
- Eye gaze
- Hands detection
- And much more...
Sample applications
Sample applications can be found here.
Background visual effects (out-of-the-box solution)
This sample uses the Vonage Video web SDK (OpenTok) OT.Publisher API (setVideoMediaProcessorConnector) to use the Vonage Media Processor library in a Vonage Video (OpenTok) web application.
Implementation details:
- Uses the MediaPipe Selfie Segmentation solution.
- The process runs in a web worker.
- MediaPipe solutions are based on WebGL and WebAssembly (SIMD).
- The solution does not come with MediaPipe binaries bundled. We host the static assets on an AWS CloudFront CDN. The allow-listed IPs for CloudFront can be found here.
MediaProcessorConfig lets you define mediapipeBaseAssetsUri, which allows you to self-host the MediaPipe assets. However, we do NOT recommend this.
Configure
Configure the post-process action.
Blur:
let config: MediaProcessorConfig = {
  mediapipeBaseAssetsUri: 'https://example.com', // Optional; the library provides static assets by default.
  transformerType: 'BackgroundBlur',
  radius: BlurRadius.Low // BlurRadius.Low (5px), BlurRadius.High (10px), or any number of pixels
}
Silhouette:
let config: MediaProcessorConfig = {
  mediapipeBaseAssetsUri: 'https://example.com', // Optional; the library provides static assets by default.
  transformerType: 'SilhouetteBlur',
  radius: BlurRadius.High // BlurRadius.Low (5px), BlurRadius.High (10px), or any number of pixels
}
Virtual (image):
let config: MediaProcessorConfig = {
  mediapipeBaseAssetsUri: 'https://example.com', // Optional; the library provides static assets by default.
  transformerType: 'VirtualBackground',
  backgroundAssetUri: 'https://some-url-to-image.com'
}
Video:
let config: MediaProcessorConfig = {
  mediapipeBaseAssetsUri: 'https://example.com', // Optional; the library provides static assets by default.
  transformerType: 'VideoBackground',
  backgroundAssetUri: 'https://some-url-to-video.com'
}
Create Media Processor
After configuring the desired post-processing, use the createVonageMediaProcessor helper function to create a VonageMediaProcessor:
const processor = await createVonageMediaProcessor(config);
publisher.setVideoMediaProcessorConnector(processor.getConnector());
Change configuration
To change the post-process configuration in flight, call setBackgroundOptions; there is no need to involve the publisher:
await processor.setBackgroundOptions(newConfig);
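For example, assuming setBackgroundOptions accepts the same MediaProcessorConfig shape used at creation time, you could switch from blur to a virtual background at runtime:
const newConfig: MediaProcessorConfig = {
  transformerType: 'VirtualBackground',
  backgroundAssetUri: 'https://some-url-to-image.com' // placeholder image URL
}
await processor.setBackgroundOptions(newConfig)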
Disable/enable processing
You can disable and re-enable post-processing using the disable/enable functions.
const processor = await createVonageMediaProcessor(config);
processor.disable();
processor.enable();
Errors, Warnings and Statistics
isSupported
Checks whether the current browser can run our library; the returned promise rejects if it cannot.
try {
await isSupported();
} catch(e) {
console.error(e);
}
Emitter Registration
This solution supports Emittery. You can listen to events directly on VonageMediaProcessor:
processor.on('error', (eventData: ErrorData) => {
  console.error(eventData)
})
processor.on('warn', (eventData: WarnData) => {
  console.warn(eventData)
})
processor.on('pipelineInfo', (eventData: PipelineInfoData) => {
  console.info(eventData)
})
Frame Drop warning
If you would like to be notified about frame-rate drops, use setTrackExpectedRate(number) to set the expected frame rate of the process:
processor.setTrackExpectedRate(30) // or any other value
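One possible pattern for reacting to drops, assuming (this pairing is our assumption, not documented behavior) that frame-rate drops are surfaced through the 'warn' event shown earlier:
processor.setTrackExpectedRate(30)
processor.on('warn', (eventData: WarnData) => {
  // Assumption: a frame rate below the expected value is reported here.
  // React by lowering video quality or disabling the effect.
  console.warn('frame rate below expected', eventData)
})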
Statistics
The API collects statistics for usage and debugging purposes. However, it is up to you to activate it.
Turn statistics on:
const metadata: VonageMetadata = {
appId: 'video SDK app id',
sourceType: 'video',
proxyUrl: 'https://some-proxy.com' //optional
};
setVonageMetadata(metadata)
Turn statistics off (statistics are off by default):
setVonageMetadata(null)
That's all you need to do in order to use our out-of-the-box background solution.
MediaPipe Helper
The library provides a helper class for all MediaPipe JS solutions:
- Face Mesh
- Face Detection
- Hands
- Holistic
- Objectron
- Pose
- Selfie Segmentation
Configure MediaPipe solution
Each solution takes its own options object; how you configure it is up to you.
Face Mesh:
let option: FaceMeshOptions = {
...
}
Face Detection:
let option: FaceDetectionOptions = {
...
}
Hands:
let option: HandsOptions = {
...
}
Holistic:
let option: HolisticOptions = {
...
}
Objectron:
let option: ObjectronOptions = {
...
}
Pose:
let option: PoseOptions = {
...
}
Selfie Segmentation:
let option: SelfieSegmentationOptions = {
...
}
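As a concrete illustration, here is a filled-in version of the Face Mesh options object above. The option names come from the MediaPipe JS Face Mesh API; the values are illustrative, and we assume FaceMeshOptions mirrors that API:
let option: FaceMeshOptions = {
  maxNumFaces: 1,              // track a single face
  refineLandmarks: true,       // refine landmarks around the eyes and lips
  minDetectionConfidence: 0.5, // minimum confidence for the face detector
  minTrackingConfidence: 0.5   // minimum confidence for landmark tracking
}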
MediapipeHelper
MediapipeHelper is a helper class that initiates and runs MediaPipe modules.
This class must be initialized on the application's main thread.
Create MediaPipe helper:
In this example we will use Face Mesh, but the flow is the same for all the other models.
function mediaPipeListener(results: FaceMeshResults): void {
  // Do something with the results.
}

let mediapipeConfig: MediapipeConfig = {
  modelType: 'face_mesh',
  listener: mediaPipeListener,
  options: option, // the FaceMeshOptions object configured above
  assetsUri: 'https://some-url-to-facemesh-binaries.com' // Optional; Vonage provides static assets for all MediaPipe modules.
}

let mediapipeHelper: MediapipeHelper = new MediapipeHelper()
mediapipeHelper.initialize(mediapipeConfig).then(() => {
  // The helper is ready to receive frames.
}).catch(e => {
  console.error(e)
})
Using MediaPipe helper class:
In this example we demonstrate how to use the MediaPipe helper with a transformer running on the main application thread. We also have two sample apps that run the MediaPipe helper on the main application thread while the transformer runs in a Web Worker thread:
- Auto zoom - uses face detection to zoom in on the main person; sample here.
- Custom MediaPipe - runs MediaPipe both on the application main thread and in a Web Worker thread; sample here.
Create transformer:
class MediapipeTransformer implements Transformer {
  mediapipeHelper: MediapipeHelper
  results?: FaceMeshResults

  constructor() {
    this.mediapipeHelper = new MediapipeHelper()
  }

  init(): Promise<void> {
    const mediapipeConfig: MediapipeConfig = {
      modelType: 'face_mesh',
      listener: (results: FaceMeshResults): void => {
        this.results = results
      },
      options: option, // the FaceMeshOptions object configured above
      assetsUri: 'https://some-url-to-facemesh-binaries.com' // Optional; Vonage provides static assets for all MediaPipe modules.
    }
    return this.mediapipeHelper.initialize(mediapipeConfig)
  }

  // The start function is optional.
  start(controller: TransformStreamDefaultController) {
    // In this sample nothing needs to be done.
  }

  // The transform function is mandatory.
  transform(frame: VideoFrame, controller: TransformStreamDefaultController) {
    createImageBitmap(frame).then(image => {
      const timestamp = frame.timestamp
      frame.close()
      this.mediapipeHelper.send(image).then(() => {
        if (this.results) {
          // Render the processed output and enqueue it as a new frame.
          controller.enqueue(new VideoFrame(/* processed image source */, { timestamp }))
        }
      }).catch(e => {
        // The original frame was already closed above, so it cannot be re-enqueued here.
        console.error(e)
      })
    }).catch(e => {
      // createImageBitmap failed; pass the original frame through untouched.
      console.error(e)
      controller.enqueue(frame)
    })
  }

  // When using the MediaPipe helper, the close function must be called to avoid memory leaks.
  flush(controller: TransformStreamDefaultController) {
    this.mediapipeHelper.close().then(() => {
    }).catch(e => {
      console.error(e)
    })
  }
}
export default MediapipeTransformer;
Use the transformer:
const mediapipeTransformer: MediapipeTransformer = new MediapipeTransformer()
mediapipeTransformer.init().then(() => {
  const mediaProcessor: MediaProcessor = new MediaProcessor()
  const transformers = [mediapipeTransformer]
  mediaProcessor.setTransformers(transformers)
  const connector: MediaProcessorConnector = new MediaProcessorConnector(mediaProcessor)
  ...
  publisher.setVideoMediaProcessorConnector(connector)
  ...
}).catch(e => {
  console.error(e)
})
License
This project is licensed under the terms of the MIT license and is available for free.