Transcription on a SageMaker Endpoint
Transcription inference on Amazon SageMaker Inference
This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook.
Near real-time transcription inference using the Whisper model
Table of Contents
Background
Amazon Transcribe is the go-to service for transcription on AWS. However, for languages it does not support, we can use other models (in our case Whisper) deployed on Amazon SageMaker for inference. For short audio files where inference takes up to 60 seconds, we can use real-time inference. For inference that takes longer than 60 seconds, or when we want to save on costs by autoscaling the instance count to zero when there are no requests to process, asynchronous inference should be used.
Notebook scope
This notebook provides two deployment options for the Whisper model, real-time and asynchronous inference, including auto-scaling setup and an asynchronous inference invocation example.
We used the Data Science image to run the notebook.
1. Prepare the model for inference
Create custom inference code
The requirements.txt file lists the libraries needed to run the inference code.
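As a rough illustration of what such inference code might look like, the sketch below implements the four handler functions that the SageMaker PyTorch serving container looks for (`model_fn`, `input_fn`, `predict_fn`, `output_fn`). It is not the notebook's own script. The checkpoint file name `base.pt`, the accepted `audio/*` content types, and the JSON response shape are all assumptions made for the example.

```python
import json
import os
import tempfile


def model_fn(model_dir):
    """Load the Whisper checkpoint that SageMaker extracted into model_dir."""
    # Imported lazily so this module can be imported without openai-whisper installed.
    import whisper

    # "base.pt" is an assumed file name; use whichever checkpoint you packaged.
    return whisper.load_model(os.path.join(model_dir, "base.pt"))


def input_fn(request_body, request_content_type):
    """Write the raw audio bytes to a temporary file and return its path."""
    # Assumption for this sketch: only audio/* content types are accepted.
    if not request_content_type.startswith("audio/"):
        raise ValueError(f"Unsupported content type: {request_content_type}")
    with tempfile.NamedTemporaryFile(suffix=".audio", delete=False) as f:
        f.write(request_body)
        return f.name


def predict_fn(input_data, model):
    """Transcribe the audio file, then delete the temporary copy."""
    try:
        result = model.transcribe(input_data)
    finally:
        os.remove(input_data)
    return result["text"].strip()


def output_fn(prediction, accept):
    """Serialize the transcript as a JSON document."""
    return json.dumps({"text": prediction})
```

Keeping the `whisper` import inside `model_fn` means the other handlers can be exercised locally with a stand-in model object before anything is deployed.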
Uploading the model to S3
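One possible way to package and upload the model is sketched below. It assumes the layout commonly used with the SageMaker PyTorch container (checkpoint at the archive root, inference script and requirements.txt under `code/`). The function names and the default archive name are hypothetical, and `s3_client` is expected to be a client created with `boto3.client("s3")`.

```python
import os
import tarfile


def package_model(checkpoint_path, code_dir, archive_path="model.tar.gz"):
    """Bundle the checkpoint and the inference code into one gzip archive."""
    with tarfile.open(archive_path, "w:gz") as tar:
        # The checkpoint goes at the archive root.
        tar.add(checkpoint_path, arcname=os.path.basename(checkpoint_path))
        # The inference script and requirements.txt go under code/.
        tar.add(code_dir, arcname="code")
    return archive_path


def upload_model(s3_client, archive_path, bucket, key):
    """Upload the archive and return the S3 URI to use as the model data location."""
    s3_client.upload_file(archive_path, bucket, key)
    return f"s3://{bucket}/{key}"
```

The returned URI is what would be passed as the model data location when the SageMaker model is created.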
2. Real-time inference
Deploying the model to a real-time inference endpoint
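A minimal sketch of a real-time deployment using the low-level boto3 SageMaker API follows. It assumes a SageMaker model named `model_name` has already been registered (for example with `create_model`) and that `sm_client` comes from `boto3.client("sagemaker")`. The function name, the variant name, and the default instance type are illustrative choices, not values taken from the notebook.

```python
def deploy_realtime_endpoint(sm_client, model_name, endpoint_name,
                             instance_type="ml.g4dn.xlarge", instance_count=1):
    """Create an endpoint config and endpoint, then wait until it is in service."""
    config_name = f"{endpoint_name}-config"
    sm_client.create_endpoint_config(
        EndpointConfigName=config_name,
        ProductionVariants=[{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InstanceType": instance_type,  # assumed default; pick what fits your model
            "InitialInstanceCount": instance_count,
        }],
    )
    sm_client.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=config_name,
    )
    # Block until the endpoint reaches the InService state.
    sm_client.get_waiter("endpoint_in_service").wait(EndpointName=endpoint_name)
    return config_name
```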
Execute inference
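Invoking the real-time endpoint could look roughly like the sketch below. It assumes `runtime_client` was created with `boto3.client("sagemaker-runtime")`, that the audio file is small enough to send in a single request, and that the endpoint's inference code returns a JSON body. The function name and the default content type are hypothetical.

```python
import json


def transcribe_realtime(runtime_client, endpoint_name, audio_path,
                        content_type="audio/wav"):
    """Send a local audio file to the endpoint and return the parsed response."""
    with open(audio_path, "rb") as f:
        payload = f.read()
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType=content_type,
        Body=payload,
    )
    # Assumes the inference code serializes its output as JSON.
    return json.loads(response["Body"].read())
```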
3. Asynchronous inference
For inference that takes longer than 60 seconds, or when we want to save on costs by autoscaling the instance count to zero when there are no requests to process, asynchronous inference should be used.
Deploying the model to an asynchronous inference endpoint
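What distinguishes an asynchronous endpoint at deployment time is the `AsyncInferenceConfig` block in its endpoint configuration. The sketch below only builds the request parameters, so it can be inspected before anything is created. The helper name, variant name, and default instance type are assumptions for illustration.

```python
def build_async_endpoint_config(config_name, model_name, output_s3_uri,
                                instance_type="ml.g4dn.xlarge", max_concurrent=1):
    """Return keyword arguments for the SageMaker create_endpoint_config call."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InstanceType": instance_type,  # assumed default
            "InitialInstanceCount": 1,
        }],
        "AsyncInferenceConfig": {
            # Results of each invocation are written under this S3 prefix.
            "OutputConfig": {"S3OutputPath": output_s3_uri},
            # Limits how many requests one instance processes at a time.
            "ClientConfig": {"MaxConcurrentInvocationsPerInstance": max_concurrent},
        },
    }
```

With a client from `boto3.client("sagemaker")`, the result would be passed as `sm_client.create_endpoint_config(**params)`, followed by `create_endpoint` with the same config name.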
Execute inference
Setting up autoscaling for the asynchronous endpoint
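Autoscaling for an asynchronous endpoint is configured through Application Auto Scaling. The sketch below registers the endpoint variant as a scalable target with a minimum capacity of zero and attaches a target-tracking policy on the `ApproximateBacklogSizePerInstance` metric. It assumes `aas_client` comes from `boto3.client("application-autoscaling")`. The policy name, maximum capacity, target backlog, and cooldown values are illustrative, not the notebook's settings.

```python
def configure_async_autoscaling(aas_client, endpoint_name, variant_name="AllTraffic",
                                max_capacity=2, target_backlog=5.0):
    """Allow the endpoint variant to scale between 0 and max_capacity instances."""
    resource_id = f"endpoint/{endpoint_name}/variant/{variant_name}"
    dimension = "sagemaker:variant:DesiredInstanceCount"

    aas_client.register_scalable_target(
        ServiceNamespace="sagemaker",
        ResourceId=resource_id,
        ScalableDimension=dimension,
        MinCapacity=0,  # scale in to zero when the queue is empty
        MaxCapacity=max_capacity,
    )
    aas_client.put_scaling_policy(
        PolicyName=f"{endpoint_name}-backlog-scaling",  # hypothetical name
        ServiceNamespace="sagemaker",
        ResourceId=resource_id,
        ScalableDimension=dimension,
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration={
            # Desired number of queued requests per instance (assumed value).
            "TargetValue": target_backlog,
            "CustomizedMetricSpecification": {
                "MetricName": "ApproximateBacklogSizePerInstance",
                "Namespace": "AWS/SageMaker",
                "Dimensions": [{"Name": "EndpointName", "Value": endpoint_name}],
                "Statistic": "Average",
            },
            "ScaleInCooldown": 300,
            "ScaleOutCooldown": 300,
        },
    )
    return resource_id
```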
4. Invoke Whisper on SageMaker Endpoint for Asynchronous inference
This section demonstrates how to invoke an asynchronous inference endpoint, using the asynchronous endpoint deployed in section 3.
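An asynchronous invocation takes the S3 location of the input rather than the payload itself, and returns immediately with the S3 location where the result will be written. A minimal sketch, assuming `runtime_client` comes from `boto3.client("sagemaker-runtime")` and that the audio file has already been uploaded to S3 (the function name and default content type are hypothetical):

```python
def submit_async_transcription(runtime_client, endpoint_name, input_s3_uri,
                               content_type="audio/wav"):
    """Queue one audio file for transcription and return the result's S3 URI."""
    response = runtime_client.invoke_endpoint_async(
        EndpointName=endpoint_name,
        InputLocation=input_s3_uri,
        ContentType=content_type,
    )
    # The output object does not exist yet; it appears once processing finishes.
    return response["OutputLocation"]
```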
Check Output Location
Get Result
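Because the output object only appears once processing has finished, retrieving the result usually means polling the output location. The sketch below is one way to do that, assuming `s3_client` comes from `boto3.client("s3")` and `output_location` is the URI returned by the asynchronous invocation. The helper names, timeout, and polling interval are illustrative.

```python
import time
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/key' into ('bucket', 'some/key')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri}")
    return parsed.netloc, parsed.path.lstrip("/")


def wait_for_result(s3_client, output_location, timeout_seconds=600, poll_seconds=5):
    """Poll S3 until the output object exists, then return its raw bytes."""
    bucket, key = split_s3_uri(output_location)
    deadline = time.monotonic() + timeout_seconds
    while True:
        listing = s3_client.list_objects_v2(Bucket=bucket, Prefix=key)
        # Require an exact key match, since Prefix can also match longer keys.
        if any(obj["Key"] == key for obj in listing.get("Contents", [])):
            return s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
        if time.monotonic() >= deadline:
            raise TimeoutError(f"No result at {output_location} after {timeout_seconds}s")
        time.sleep(poll_seconds)
```

The returned bytes are whatever the endpoint's inference code produced; if that is JSON, `json.loads` can be applied to the result.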
Example for multiple invocations (can be used to test the autoscaling)
5. Clean up
Remember to delete your endpoints after use, as you will be charged for the instances used in this demo.
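A minimal clean-up sketch using the low-level SageMaker API is shown below. It assumes `sm_client` comes from `boto3.client("sagemaker")` and that you know the names of the endpoint, its endpoint configuration, and the model. The function name is hypothetical.

```python
def delete_endpoint_resources(sm_client, endpoint_name, config_name, model_name):
    """Delete the endpoint first (it is what incurs instance charges), then its config and model."""
    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    sm_client.delete_model(ModelName=model_name)
```

This would be called once for the real-time endpoint and once for the asynchronous endpoint.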
Notebook CI Test Results
This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.