OpenAI Whisper

Azure audio whisper (preview) example

Note: There is a newer version of the openai library available. See https://github.com/openai/openai-python/discussions/742

This example shows how to use the Azure OpenAI Whisper model to transcribe audio files.

Setup

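First, we install the necessary dependencies. This notebook uses the pre-1.0 interface of the openai package (for example openai.Audio.transcribe), so pin the package below version 1.0.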

Next, we'll import our libraries and configure the Python OpenAI SDK to work with the Azure OpenAI service.

Note: In this example, we configure the library to use the Azure API by setting the variables in code. For development, consider setting the following environment variables instead:

OPENAI_API_BASE
OPENAI_API_KEY
OPENAI_API_TYPE
OPENAI_API_VERSION
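
As a rough sketch of the environment-variable approach, the helper below checks that the four variables are present before any request is made. The name read_azure_settings is hypothetical and not part of the openai package. The pre-1.0 library is expected to pick these variables up itself on import, so the check only serves to fail early with a clear message.

```python
import os

# The four variables listed above.
REQUIRED_VARS = (
    "OPENAI_API_BASE",
    "OPENAI_API_KEY",
    "OPENAI_API_TYPE",
    "OPENAI_API_VERSION",
)


def read_azure_settings(env=None):
    """Return the Azure OpenAI settings found in `env` (defaults to os.environ).

    Hypothetical helper for illustration. Raises KeyError if any variable
    is unset or empty.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling read_azure_settings() at the top of a script surfaces a missing variable immediately instead of as an authentication error on the first request.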

To access the Azure OpenAI Service, we first need to create the appropriate resources in the Azure Portal (the Microsoft Docs provide a detailed guide on how to do this).

Once the resource is created, the first thing we need is its endpoint. You can find it in the "Keys and Endpoints" section under "Resource Management". With the endpoint in hand, we set up the SDK:
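
A minimal sketch of that setup follows, assuming the pre-1.0 openai module and its module-level api_base and api_version settings. The helper name configure_endpoint, the placeholder endpoint, the placeholder deployment name, and the API version string are all assumptions for illustration; check the service documentation for a version that supports Whisper.

```python
from types import SimpleNamespace


def configure_endpoint(sdk, endpoint, api_version):
    """Point the legacy openai module (passed in as `sdk`) at an Azure resource."""
    sdk.api_base = endpoint        # endpoint copied from the Azure Portal
    sdk.api_version = api_version  # REST API version the requests should target
    return sdk


# Placeholder values: substitute your own endpoint and a supported API version.
settings = configure_endpoint(
    SimpleNamespace(),  # stand-in so the sketch runs without the SDK; pass `openai` in practice
    endpoint="https://<your-resource-name>.openai.azure.com/",
    api_version="2023-09-01-preview",  # assumed preview version, verify before use
)

# Name of your Whisper deployment in Azure (placeholder), needed when transcribing.
deployment_id = "<deployment-id-for-your-whisper-model>"
```

Passing the module in as an argument keeps the sketch runnable without the SDK installed; in a notebook you would call configure_endpoint(openai, ...) or assign the two attributes directly.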

Authentication

The Azure OpenAI service supports multiple authentication mechanisms that include API keys and Azure credentials.

Authentication using API key

To set up the OpenAI SDK to use an Azure API key, we set api_type to azure and api_key to a key associated with the endpoint (you can find this key in "Keys and Endpoints" under "Resource Management" in the Azure Portal).
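
The two assignments can be sketched as below, assuming the pre-1.0 openai module. The helper name use_api_key is hypothetical, and the key is read from OPENAI_API_KEY with a placeholder fallback so that the sketch runs without credentials.

```python
import os
from types import SimpleNamespace


def use_api_key(sdk, api_key):
    """Configure the legacy openai module (passed in as `sdk`) for API-key auth."""
    sdk.api_type = "azure"  # selects Azure-style API-key authentication
    sdk.api_key = api_key   # one of the keys listed for the resource
    return sdk


# Stand-in namespace so the sketch runs without the SDK; pass `openai` in practice.
sdk = use_api_key(
    SimpleNamespace(),
    os.environ.get("OPENAI_API_KEY", "<your-api-key>"),
)
```

Reading the key from the environment rather than hard-coding it keeps it out of the saved notebook.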

Authentication using Azure Active Directory

Let's now see how we can get an access token via Azure Active Directory authentication.
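
One way to sketch this, assuming the pre-1.0 openai module and a credential object that exposes get_token (as the credentials in the azure-identity package do), is shown below. The helper name use_azure_ad is hypothetical, and the scope string is the Cognitive Services scope that we assume applies to Azure OpenAI.

```python
# Scope assumed for Azure Cognitive Services, which Azure OpenAI is part of.
SCOPE = "https://cognitiveservices.azure.com/.default"


def use_azure_ad(sdk, credential):
    """Fetch an access token and configure `sdk` (the legacy openai module) to send it.

    `credential` is any object with a get_token(*scopes) method that returns
    an object carrying `token` and `expires_on` attributes.
    """
    token = credential.get_token(SCOPE)
    sdk.api_type = "azure_ad"  # tells the SDK the key is an Azure AD token
    sdk.api_key = token.token  # the token string is sent in place of an API key
    return token
```

With azure-identity installed, a call such as use_azure_ad(openai, DefaultAzureCredential()) would be the typical usage; the returned token carries an expires_on timestamp that indicates when it must be renewed.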

A token is valid for a period of time, after which it will expire. To ensure a valid token is sent with every request, you can refresh an expiring token by hooking into requests.auth:
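
The refresh logic can be sketched as a small callable that caches the token and renews it shortly before expiry. This is an illustrative version, not necessarily the notebook's original cell: the class name TokenRefresh and the 300-second margin are assumptions, and the credential is any object with a get_token method whose result has token and expires_on attributes.

```python
import time


class TokenRefresh:
    """Callable auth hook that attaches a bearer token, refreshing it near expiry."""

    def __init__(self, credential, scopes, margin_seconds=300):
        self.credential = credential
        self.scopes = scopes
        self.margin_seconds = margin_seconds  # renew this long before expiry
        self.cached_token = None

    def __call__(self, request):
        # Fetch a new token if none is cached or the cached one is about to expire.
        if (
            self.cached_token is None
            or self.cached_token.expires_on - time.time() < self.margin_seconds
        ):
            self.cached_token = self.credential.get_token(*self.scopes)
        request.headers["Authorization"] = f"Bearer {self.cached_token.token}"
        return request
```

To hook this into requests, assign an instance to the auth attribute of a requests.Session (the conventional form subclasses requests.auth.AuthBase, though requests also accepts a plain callable). The pre-1.0 SDK can then be given that session through its openai.requestssession setting; treat that attribute name as an assumption to verify against your installed version.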

Audio transcription

Audio transcription, or speech-to-text, is the process of converting spoken words into text. Use the openai.Audio.transcribe method to transcribe an audio file stream to text.

You can get sample audio files from the Azure AI Speech SDK repository on GitHub.
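
A minimal sketch of the transcription call follows, assuming the pre-1.0 openai module is passed in as sdk and that a sample audio file has already been saved locally. The helper name transcribe_file is hypothetical, and the model name "whisper-1" is an assumption: the method expects a model argument, while on Azure the deployment_id is what selects the deployed model.

```python
def transcribe_file(sdk, path, deployment_id):
    """Transcribe the audio file at `path` and return the recognized text.

    `sdk` is expected to be the legacy openai module (or an object exposing
    the same Audio.transcribe interface).
    """
    # Open in binary mode so the SDK can upload the raw audio stream.
    with open(path, "rb") as audio:
        result = sdk.Audio.transcribe(
            file=audio,
            model="whisper-1",            # assumed model name
            deployment_id=deployment_id,  # your Whisper deployment in Azure
        )
    return result.text
```

In a notebook this would be called as print(transcribe_file(openai, "<path-to-sample>.wav", deployment_id)), where the path is a placeholder for whichever sample file you downloaded.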
