10 Semantic Reranking Retriever Cohere


Semantic Reranking with Cohere Reranker


This example will show how to combine search and semantic reranking to improve the accuracy of your search results. We'll be using the rerank feature from Cohere.

Note: for a complete integration with Cohere please refer to this notebook. This example focuses on Cohere reranking only through an Elastic retriever query.

Requirements

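For this example, you will need an Elastic deployment (we use Elastic Cloud here), the Elasticsearch Python client version 8.15.0 or above, and a Cohere API key.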

Create Elastic Cloud deployment

If you don't have an Elastic Cloud deployment, sign up here for a free trial.

Install packages and connect with Elasticsearch Client

To get started, we'll need to connect to our Elastic deployment using the Python client (version 8.15.0 or above). Because we're using an Elastic Cloud deployment, we'll use the Cloud ID to identify our deployment.

First we need to pip install the elasticsearch package:


Next, we need to import the modules we need.

🔐 NOTE: getpass enables us to securely prompt the user for credentials without echoing them to the terminal or storing them in the notebook.


Now we can instantiate the Python Elasticsearch client.

First we prompt the user for their password and Cloud ID. Then we create a client object, which is an instance of the Elasticsearch class.
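
The notebook's own cells are not reproduced here, so the following is only a minimal sketch of the steps described above. It assumes password authentication with the built-in `elastic` user; `prompt_credentials`, `client_kwargs` and `connect` are hypothetical helper names, not part of the Elasticsearch client.

```python
from getpass import getpass


def prompt_credentials():
    """Prompt for the Cloud ID and password without echoing them."""
    cloud_id = getpass("Elastic Cloud ID: ")
    password = getpass("Elastic password: ")
    return cloud_id, password


def client_kwargs(cloud_id, password, username="elastic"):
    """Build the keyword arguments for the Elasticsearch constructor.

    Assumption: the deployment uses the built-in "elastic" user.
    """
    return {"cloud_id": cloud_id, "basic_auth": (username, password)}


def connect(cloud_id, password):
    """Create the client. Requires: pip install "elasticsearch>=8.15.0"."""
    # Imported lazily so this sketch loads even before the package is installed.
    from elasticsearch import Elasticsearch

    return Elasticsearch(**client_kwargs(cloud_id, password))
```

In a notebook cell you might then run `client = connect(*prompt_credentials())`.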


Enable telemetry

Knowing that you are using this notebook helps us decide where to invest our efforts to improve our products. We'd like to ask you to run the following code so that we can gather anonymous usage statistics. See telemetry.py for details. Thank you!


Test the Client

Before you continue, confirm that the client has connected with this test.
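
One simple way to run such a test is to ask the cluster for its basic info. This is a sketch rather than the notebook's exact cell; `check_connection` is a hypothetical helper name, and `client` is assumed to be an already connected Elasticsearch client.

```python
def check_connection(client):
    """Return the cluster's version string.

    client.info() raises an exception if the cluster cannot be reached
    or the credentials are rejected, so a returned value means success.
    """
    info = client.info()
    return info["version"]["number"]
```

If `check_connection(client)` returns a version string, the client is connected.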


Refer to the documentation to learn how to connect to a self-managed deployment.

Read this page to learn how to connect using API keys.

Set up Cohere Inference Endpoint

We'll be using the Cohere rerank feature to perform semantic reordering of search hits through an Elasticsearch inference endpoint.

Go to the Cohere website and create an API key, then set it here.


Create the Inference Endpoint

Let's create the inference endpoint by using the Create inference API.

For this example we'll use the Cohere service, but the inference API also supports many other inference services.
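
As a rough sketch of what this step can look like: the endpoint name `my-cohere-rerank` and the model `rerank-english-v3.0` are assumptions chosen for illustration, and the helper names are hypothetical. The call uses `client.inference.put` from recent versions of the Python client; if your client version names this method differently, adjust accordingly.

```python
def cohere_rerank_config(api_key, model_id="rerank-english-v3.0"):
    """Build the endpoint configuration for the Cohere service.

    Assumption: model_id is just an example; use any Cohere rerank model.
    """
    return {
        "service": "cohere",
        "service_settings": {"api_key": api_key, "model_id": model_id},
    }


def create_rerank_endpoint(client, api_key, inference_id="my-cohere-rerank"):
    """Register a rerank inference endpoint under the given (hypothetical) ID."""
    return client.inference.put(
        task_type="rerank",
        inference_id=inference_id,
        inference_config=cohere_rerank_config(api_key),
    )
```

The API key itself can be read with `getpass("Cohere API key: ")` so it never appears in the notebook.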


Create the Index

Now we need to create an index. Let's create one that enables us to perform search and semantic reranking on text articles.
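
A minimal sketch of such an index, assuming two plain `text` fields named `title` and `text` (matching the fields shown in the search results) and a hypothetical index name `articles`:

```python
# Assumption: both fields are analyzed text, which is enough for lexical
# matching and for handing the "text" field to the reranker.
ARTICLE_MAPPINGS = {
    "properties": {
        "title": {"type": "text"},
        "text": {"type": "text"},
    }
}


def create_articles_index(client, index="articles"):
    """Create the index with the mappings above ("articles" is a made-up name)."""
    return client.indices.create(index=index, mappings=ARTICLE_MAPPINGS)
```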


Populate the Index

Let's populate the index with a couple of random article fragments from Wikipedia.
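
The sketch below shows one way to do this with the bulk API. The two sample documents are abridged from the article fragments in this notebook; the index name `articles` and the helper names are assumptions for illustration.

```python
ARTICLES = [
    {
        "title": "Solar eclipse",
        "text": "As seen from Earth, a solar eclipse happens when the Moon "
        "is directly between the Earth and the Sun.",
    },
    {
        "title": "Sun Moon Lake",
        "text": "Sun Moon Lake is a lake in Nantou County, Taiwan. "
        "It is the largest lake in Taiwan.",
    },
]


def bulk_operations(articles, index):
    """Interleave an action line and a document line for each article."""
    operations = []
    for article in articles:
        operations.append({"index": {"_index": index}})
        operations.append(article)
    return operations


def populate_index(client, articles=ARTICLES, index="articles"):
    """Index all articles in one request.

    refresh=True makes the documents searchable as soon as the call returns.
    """
    return client.bulk(operations=bulk_operations(articles, index), refresh=True)
```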


Search without reranking

First let's run a classic search that uses lexical text matching.

Aside: Pretty printing Elasticsearch search results

Your search API calls will return hard-to-read nested JSON. We'll create a little function called pretty_search_response to return nice, human-readable outputs from our examples.
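
A possible version of this helper is sketched below. It assumes each hit's `_source` contains `title` and `text` fields, and it prints hits in the same ID / Score / Title / Text layout used in this notebook. `format_hit` is a hypothetical helper added to keep the formatting separate from the printing.

```python
def format_hit(hit):
    """Render a single search hit as four labelled lines."""
    source = hit["_source"]
    return (
        f"ID: {hit['_id']}\n"
        f"Score: {hit['_score']}\n"
        f"Title: {source['title']}\n"
        f"Text: {source['text']}"
    )


def pretty_search_response(response):
    """Print every hit of a search response, separated by blank lines."""
    hits = response["hits"]["hits"]
    if not hits:
        print("Your search returned no results.")
        return
    print("\n\n".join(format_hit(hit) for hit in hits))
```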


Suppose we're interested in learning about solar eclipses, but we don't know the exact name of this phenomenon. We'll perform a classic search that matches the text "the Moon covers the Sun". Let's see what results this finds:
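
A lexical search of this kind can be sketched as a plain `match` query on the `text` field. The index name `articles` and the helper names are assumptions for illustration, and `client` is assumed to be a connected Elasticsearch client.

```python
def lexical_query(text, field="text"):
    """Build a match query: documents score by the query words they contain."""
    return {"match": {field: text}}


def lexical_search(client, text, index="articles", size=10):
    """Run the match query and return the raw search response."""
    return client.search(index=index, query=lexical_query(text), size=size)
```

On the notebook's data, the query "the Moon covers the Sun" returns the following hits.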


ID: yyKfN5EB0RS1pNqNt7y0
Score: 2.0718374
Title: Cheshire
Text: Cheshire is a county in England. It is the North West part of the country. It is most famous for making salt and cheese. Cheshire is made up of lots of little towns including the Borough of Macclesfield which covers a large area of plains. The main attraction is in Kerridge where there is the famous landmark 'White Nancy.'

ID: 1CKfN5EB0RS1pNqNt7y0
Score: 0.8610966
Title: Sun Moon Lake
Text: Sun Moon Lake (; Thao: "Zintun") is a lake in Nantou County, Taiwan. It is the largest lake in Taiwan. Sun Moon Lake is one of the Eight Views of Taiwan. The lake was named because the east side of the lake looks like a sun, and the west side of the lake looks like a moon.

ID: 1SKfN5EB0RS1pNqNt7y0
Score: 0.83579814
Title: Unification Church
Text: The Unification Church is a religious movement started by Sun Myung Moon in Korea in the 1940s. It officially began as a church in 1954 in Seoul, South Korea. On October 12, 2009, it was announced that Sun Myung Moon was given the church to his sons, Moon Hyung-jin, Moon Kook-jin, and Moon Hyun-jin.

ID: zyKfN5EB0RS1pNqNt7y0
Score: 0.782294
Title: Phases of the Moon
Text: As the Moon orbits around the Earth, the half of the Moon that faces the Sun will be lit up. The different shapes of the lit portion of the Moon that can be seen from Earth are known as phases of the Moon. Each phase repeats itself every 29.5 days.

ID: ziKfN5EB0RS1pNqNt7y0
Score: 0.77926123
Title: Orbital revolution
Text: Orbital revolution is the movement of a planet around a star, or a moon around a planet. For example, the Earth revolves around the Sun, and the Moon revolves about the Earth.

ID: 0iKfN5EB0RS1pNqNt7y0
Score: 0.75159883
Title: Solar eclipse of December 14, 2020
Text: A total solar eclipse occurred on Monday, December 14, 2020. A solar eclipse occurs when the Moon passes between Earth and the Sun, which will cover the image of the Sun for a viewer on Earth.

ID: 0SKfN5EB0RS1pNqNt7y0
Score: 0.73564994
Title: Solar eclipse
Text: As seen from Earth, a solar eclipse /"ee-klips"/ happens when the Moon is directly between the Earth and the Sun. This makes the Moon fully or partially (partly) cover the sun. Solar eclipses can only happen during a new moon. Every year there are about two solar eclipses. Sometimes there are even five solar eclipses in a year. However, only two of these can be total solar eclipses, and often a year will pass without a total eclipse.

ID: zCKfN5EB0RS1pNqNt7y0
Score: 0.7309638
Title: Mundilfari
Text: Mundilfari (Mundilfäri) (Old Norse, possibly "the one moving according to particular times") is in Norse mythology a father of Sól (Sun) and Máni (Moon). One moon is named after him.

ID: 0CKfN5EB0RS1pNqNt7y0
Score: 0.71921694
Title: Pokémon Ultra Sun and Ultra Moon
Text: Pokémon Ultra Sun and Ultra Moon is a game in the 7th generation of "Pokémon". It was released on the Nintendo 3DS in 2017.

ID: 0yKfN5EB0RS1pNqNt7y0
Score: 0.6899033
Title: Sun and moon letters
Text: In Arabic and Maltese, consonants are divided into two groups: the sun/solar letters ( ', Maltese: konsonanti xemxin) and moon/lunar letters ( ', Maltese: konsonanti qamrin).

The top hits (Cheshire, Sun Moon Lake, Unification Church and so on) all come up because their text shares words with our query, for example "covers" or "sun". However, their content is unrelated to the meaning of our query. The two articles that are actually about solar eclipses only appear in sixth and seventh place, lost among the many other hits.

Can we somehow get more relevant results?

Search with reranking

Enter semantic reranking! We'll instruct Elasticsearch to run the same query, but this time also perform semantic reranking on the top results. For this we need to wrap our query in a text_similarity_reranker retriever, and reference the previously created Cohere inference endpoint that will do the reranking.
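
The wrapping described above can be sketched as follows. The inner `standard` retriever runs the same match query as before, and the outer `text_similarity_reranker` sends the `text` field of the top hits to the inference endpoint. The endpoint ID `my-cohere-rerank`, the index name `articles`, and the values of `rank_window_size` and `min_score` are assumptions for illustration; 0.4 was picked only because it is consistent with the three hits shown below.

```python
def rerank_retriever(
    text,
    inference_id="my-cohere-rerank",
    field="text",
    rank_window_size=100,
    min_score=0.4,
):
    """Wrap a lexical match query in a text_similarity_reranker retriever."""
    return {
        "text_similarity_reranker": {
            # First stage: the same lexical search as before.
            "retriever": {"standard": {"query": {"match": {field: text}}}},
            # Second stage: rerank the top hits by semantic similarity.
            "field": field,
            "inference_id": inference_id,
            "inference_text": text,
            "rank_window_size": rank_window_size,
            # Hits scoring below this threshold are dropped from the response.
            "min_score": min_score,
        }
    }


def reranked_search(client, text, index="articles"):
    """Run the two-stage search and return the raw search response."""
    return client.search(index=index, retriever=rerank_retriever(text))
```

On the notebook's data, the reranked query returns: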


ID: 0SKfN5EB0RS1pNqNt7y0
Score: 0.9812029
Title: Solar eclipse
Text: As seen from Earth, a solar eclipse /"ee-klips"/ happens when the Moon is directly between the Earth and the Sun. This makes the Moon fully or partially (partly) cover the sun. Solar eclipses can only happen during a new moon. Every year there are about two solar eclipses. Sometimes there are even five solar eclipses in a year. However, only two of these can be total solar eclipses, and often a year will pass without a total eclipse.

ID: 0iKfN5EB0RS1pNqNt7y0
Score: 0.9078038
Title: Solar eclipse of December 14, 2020
Text: A total solar eclipse occurred on Monday, December 14, 2020. A solar eclipse occurs when the Moon passes between Earth and the Sun, which will cover the image of the Sun for a viewer on Earth.

ID: zyKfN5EB0RS1pNqNt7y0
Score: 0.41584742
Title: Phases of the Moon
Text: As the Moon orbits around the Earth, the half of the Moon that faces the Sun will be lit up. The different shapes of the lit portion of the Moon that can be seen from Earth are known as phases of the Moon. Each phase repeats itself every 29.5 days.

Much better! Not only are the top results semantically close to our query "the Moon covers the Sun", but the irrelevant results with low scores were also discarded from the response. As a result, the articles we ended up with are the ones that best answer our question.

What's also great about reranking is that it can be used on top of existing search solutions out of the box. Under the hood, the same lexical search was executed as before (the one that resulted in mixed hits). Cohere then took the texts from the top articles and reordered them according to how closely they relate to the meaning of our query.

Whether your search application uses lexical, vector or hybrid search, reranking can improve your results.

Conclusion

Semantic reranking is a powerful tool for improving the relevance of a search experience or a RAG tool. It lets us add semantic search capabilities on top of existing Elasticsearch installations right away.