How To Use Jina V2 Embeddings
Introduction
In this notebook, we will extend the Jina Late Chunking implementation example to index the documents and embeddings to Elasticsearch, and run queries against those documents.
The Jina part of the implementation will be kept unchanged.
This is supporting material for the following blog post: https://www.elastic.co/search-labs/blog/how-to-use-jina-v2-embeddings
Late Chunking
This notebook explains how "Late Chunking" can be implemented. First, you need to install the requirements:
Then we load the model we want to use for the embeddings. We choose jinaai/jina-embeddings-v2-base-en, but any other model that supports mean pooling works. However, models with a large maximum context length are preferred.
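As a rough sketch, loading the model with Hugging Face transformers looks like the following. The package names in the comment (`transformers`, `torch`) are assumptions about the requirements; the load is wrapped in a function so nothing is downloaded at import time.

```python
def load_model(name: str = "jinaai/jina-embeddings-v2-base-en"):
    """Load tokenizer and model. trust_remote_code=True is needed because
    the Jina v2 models ship custom modeling code on the Hub."""
    # assumes: pip install transformers torch
    from transformers import AutoModel, AutoTokenizer  # imported lazily

    tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
    model = AutoModel.from_pretrained(name, trust_remote_code=True)
    return tokenizer, model
```

Calling `load_model()` downloads the weights on first use, so a machine with network access (and ideally a GPU) is assumed.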
Now we define the text we want to encode and split it into chunks. The chunk_by_sentences function also returns span annotations, which record the token boundaries of each chunk and are needed later for the chunked pooling.
Now let's try to segment a toy example.
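A simplified, self-contained sketch of chunk_by_sentences is shown below. The real notebook version splits on the tokenizer's sentence-ending token ids; here we split on "." and count whitespace tokens so the example runs standalone. The toy text is an illustration, not the blog post's exact example.

```python
def chunk_by_sentences(text: str):
    """Return sentence chunks plus a (start, end) token span per chunk."""
    chunks, spans, start = [], [], 0
    for part in text.split("."):
        sentence = part.strip()
        if not sentence:
            continue
        n_tokens = len(sentence.split())          # stand-in for real tokenization
        chunks.append(sentence + ".")
        spans.append((start, start + n_tokens))   # token range of this chunk
        start += n_tokens
    return chunks, spans

text = "Berlin is the capital and largest city of Germany. The city has a rich history."
chunks, spans = chunk_by_sentences(text)
```

Each span covers the tokens of one sentence, so the spans tile the document without gaps.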
Now we encode the chunks with both the traditional and the context-sensitive late_chunking method:
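The pooling step that distinguishes late chunking can be sketched as follows. The model call is simulated with random token embeddings; in the notebook these come from the model's last hidden state after encoding the whole document at once.

```python
import numpy as np

def late_chunking_pool(token_embeddings, span_annotations):
    """Mean-pool the full document's token embeddings over each chunk span.

    Because every token was contextualized against the whole document before
    pooling, each chunk vector carries document-level context."""
    return [token_embeddings[start:end].mean(axis=0)
            for start, end in span_annotations]

rng = np.random.default_rng(0)
doc_tokens = rng.normal(size=(15, 8))   # 15 tokens, dim-8 stand-in for model output
toy_spans = [(0, 9), (9, 15)]
chunk_embeddings = late_chunking_pool(doc_tokens, toy_spans)
```

In contrast, the traditional method encodes each chunk in isolation, so no cross-chunk context reaches the pooled vectors.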
Finally, we compare the similarity of the word "Berlin" with each chunk. The similarity should be higher for the context-sensitive chunked pooling method:
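The comparison uses plain cosine similarity; a minimal version is below (the vectors here are placeholders, not real model outputs):

```python
import numpy as np

def cos_sim(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```

One would call this as, e.g., `cos_sim(berlin_embedding, chunk_embedding)` for each chunk produced by either method.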
Indexing to Elasticsearch
Now, let's index the new embeddings into Elasticsearch and run queries against them.
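A sketch of the Elasticsearch side is below: a dense_vector mapping, bulk actions, and a kNN query. The index name is hypothetical, the URL in the comment assumes a local cluster, and 768 dims matches jina-embeddings-v2-base-en's output; nothing here actually connects to a cluster.

```python
INDEX = "late-chunking-demo"  # hypothetical index name
DIMS = 768                    # jina-embeddings-v2-base-en output dimension

def index_mapping():
    """Mapping with a text field for the chunk and a dense_vector for its embedding."""
    return {
        "mappings": {
            "properties": {
                "content": {"type": "text"},
                "embedding": {
                    "type": "dense_vector",
                    "dims": DIMS,
                    "index": True,
                    "similarity": "cosine",
                },
            }
        }
    }

def bulk_actions(chunks, embeddings):
    """Yield helpers.bulk-compatible actions pairing each chunk with its vector."""
    for chunk, vector in zip(chunks, embeddings):
        yield {"_index": INDEX, "_source": {"content": chunk, "embedding": list(vector)}}

def knn_query(query_vector, k=3):
    """Approximate kNN search body against the embedding field."""
    return {
        "field": "embedding",
        "query_vector": list(query_vector),
        "k": k,
        "num_candidates": 10 * k,
    }

# With a running cluster one would do, e.g.:
#   from elasticsearch import Elasticsearch, helpers
#   es = Elasticsearch("http://localhost:9200")          # assumed local URL
#   es.indices.create(index=INDEX, **index_mapping())
#   helpers.bulk(es, bulk_actions(chunks, chunk_embeddings))
#   es.search(index=INDEX, knn=knn_query(query_embedding))
```

Indexing one document per chunk (rather than per source document) keeps the kNN query simple: the top-k hits are directly the most similar chunks.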