Get Embeddings From Dataset

azure-openai-samplesBasic_Samplesdotnetembeddingscsharp

Get embeddings

This notebook contains snippets you can use to embed text with the 'text-embedding-ada-002' model via the Azure OpenAI API.

Installation

Install the Azure OpenAI SDK using the command below.

Run this cell. It prompts you for the apiKey, endPoint, and embedding deployment name.

Import namespaces and create an instance of OpenAIClient using azureOpenAIEndpoint and azureOpenAIKey.

1. Load the dataset

The dataset used in this example is fine-food reviews from Amazon. It contains a total of 568,454 food reviews that Amazon users left up to October 2012. For illustration purposes, we will use a subset consisting of the 1,000 most recent reviews. The reviews are in English and tend to be positive or negative. Each review has a ProductId, UserId, Score, review title (Summary) and review body (Text).

We will combine the review summary and review text into a single combined text. The model encodes this combined text and outputs a single vector embedding.
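As a rough illustration of the combining step, the sketch below joins a review's Summary and Text into one string. The `Review` record, the `Combine` helper, the sample values, and the "Title: ...; Content: ..." layout are all assumptions made for this example; the notebook's own cell may format the text differently.

```csharp
using System;

// Sample values are made up for illustration; they are not rows from the dataset.
var review = new Review("Great taffy", "  Soft, chewy, and arrived quickly. ");
string combined = Combine(review);
Console.WriteLine(combined);
// prints "Title: Great taffy; Content: Soft, chewy, and arrived quickly."

// Joins the title and body into one string, trimming stray whitespace.
// The "Title: ...; Content: ..." layout is an assumption, not necessarily the notebook's format.
string Combine(Review r) =>
    $"Title: {r.Summary.Trim()}; Content: {r.Text.Trim()}";

// Hypothetical record mirroring the Summary and Text columns.
record Review(string Summary, string Text);
```

Keeping the title and body in one labeled string means each review yields exactly one input, and therefore one embedding vector.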

Let's load the fine_food_reviews_1k.csv dataset using the value kernel.

Load the latest Microsoft.Data.Analysis package.

Loading extensions from `C:\Users\dicolomb\.nuget\packages\microsoft.data.analysis\0.21.0\interactive-extensions\dotnet\Microsoft.Data.Analysis.Interactive.dll`

Use the tokenizer to calculate the token count.

2. Get embeddings and save them for future reuse

Use the batch approach when calculating a lot of embeddings.
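One possible shape for the batch approach is sketched below: split the texts into fixed-size groups with `Enumerable.Chunk` (.NET 6 or later) and make one request per group. `EmbedBatch` is a hypothetical stand-in that returns dummy vectors so the sketch runs without credentials; in the notebook it would be replaced by the real embeddings call. The batch size of 4 is also an assumption, so check the input limit of your deployment.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Placeholder inputs; in the notebook these would be the combined review texts.
List<string> texts = Enumerable.Range(1, 10).Select(i => $"review {i}").ToList();

// Hypothetical batch size; check the per-request input limit of your deployment.
const int batchSize = 4;

var embeddings = new List<float[]>();
foreach (string[] batch in texts.Chunk(batchSize))
{
    // One call per batch instead of one call per text.
    embeddings.AddRange(EmbedBatch(batch));
}

Console.WriteLine($"{texts.Count} texts, {embeddings.Count} embeddings");

// Stand-in for the real embeddings request: returns one dummy vector per input, in input order.
IReadOnlyList<float[]> EmbedBatch(IReadOnlyList<string> inputs) =>
    inputs.Select(t => new float[] { t.Length }).ToList();
```

Because the results are appended in chunk order, `embeddings[i]` still lines up with `texts[i]`, which makes it straightforward to attach the vectors back to the data frame rows.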

Save the data for later use.
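The notebook's own save cell is not reproduced here. As a minimal sketch of one way to persist embeddings, the example below writes them to a JSON file with System.Text.Json and reads them back. The `EmbeddedReview` record, the sample rows, and the output path are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Made-up rows for illustration; real text-embedding-ada-002 vectors have 1,536 dimensions.
var rows = new List<EmbeddedReview>
{
    new("B000TEST01", "Title: Great taffy; Content: Soft and chewy.", new[] { 0.12f, -0.03f, 0.44f }),
    new("B000TEST02", "Title: Stale; Content: Arrived past its date.", new[] { -0.21f, 0.09f, 0.05f }),
};

// Hypothetical output path.
string path = Path.Combine(Path.GetTempPath(), "fine_food_reviews_with_embeddings.json");

// Write everything in one file so the embeddings can be reused without calling the API again.
File.WriteAllText(path, JsonSerializer.Serialize(rows));

// Read the file back to confirm the round trip.
var loaded = JsonSerializer.Deserialize<List<EmbeddedReview>>(File.ReadAllText(path))
             ?? new List<EmbeddedReview>();
Console.WriteLine($"Reloaded {loaded.Count} rows with {loaded[0].Embedding.Length}-dimensional vectors");

// Hypothetical row type: the product id, the combined text, and its embedding.
record EmbeddedReview(string ProductId, string Combined, float[] Embedding);
```

JSON keeps each vector as a native array. A CSV file would also work, but the vector would need to be packed into a single delimited column.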
