Elastic
11 Semantic Reranking Hugging Face

Semantic reranking with a Hugging Face model

In this notebook you'll learn how to implement semantic reranking in Elasticsearch by uploading a model from Hugging Face into your cluster. You'll also learn about the retriever abstraction, a simpler syntax for crafting queries and combining different search operations.

You will:

  • Choose a cross-encoder model from Hugging Face to perform semantic reranking
  • Upload the model to your Elasticsearch deployment using Eland
  • Create an inference endpoint to manage your rerank task
  • Query your data using the text_similarity_reranker retriever

🧰 Requirements

For this example, you will need:

  • An Elastic deployment:

  • Elasticsearch 8.15.0 or above (for non-serverless deployments)

Install packages

This will take a couple of minutes.

Import packages

Initialize Elasticsearch Python client

You need to connect to a running Elasticsearch instance. In this example we're using an Elastic Cloud deployment.
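
A minimal sketch of this step, assuming the `elasticsearch` Python package is installed and that credentials are read interactively with `getpass`. The helper names `connection_settings` and `connect` are hypothetical and exist only for illustration.

```python
from getpass import getpass


def connection_settings(cloud_id: str, api_key: str) -> dict:
    """Validate the credentials and return keyword arguments for the client."""
    cloud_id, api_key = cloud_id.strip(), api_key.strip()
    if not cloud_id or not api_key:
        raise ValueError("Both the Cloud ID and the API key are required")
    return {"cloud_id": cloud_id, "api_key": api_key}


def connect():
    """Prompt for credentials and return an Elasticsearch client."""
    # Imported lazily so connection_settings() works without the package.
    from elasticsearch import Elasticsearch

    settings = connection_settings(
        getpass("Elastic Cloud ID: "),
        getpass("Elastic Api Key: "),
    )
    return Elasticsearch(**settings)
```

Calling `client = connect()` prompts for both values without echoing them to the screen.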

[5]
Elastic Cloud ID: ··········
Elastic Api Key: ··········

Test connection

Confirm that the Python client has connected to your Elasticsearch instance with this test.
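
One way to run this check is to call the client's `info()` method, which queries the cluster's root endpoint. The sketch below assumes `client` is an already connected Elasticsearch client; `describe_cluster` is a hypothetical helper.

```python
def describe_cluster(client) -> str:
    """Return a one-line summary built from the cluster's root endpoint."""
    info = client.info()
    name = info["cluster_name"]
    version = info["version"]["number"]
    return f"Connected to '{name}' (version {version})"
```

If the credentials are wrong or the cluster is unreachable, `client.info()` raises an exception instead of returning a response.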

Enable Telemetry

Knowing that you are using this notebook helps us decide where to invest our efforts to improve our products. Please run the following code to let us gather anonymous usage statistics. See telemetry.py for details. Thank you!

Upload sample data

This example uses a small dataset of movies.
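
As a rough sketch, the documents can be indexed with the bulk helper from the Python client. The two sample documents reuse plots that appear later in this notebook; the field names `title` and `plot`, the helper names, and the way the dataset is loaded are assumptions, since the original cell is not shown here.

```python
SAMPLE_MOVIES = [
    {
        "title": "The Godfather",
        "plot": "An organized crime dynasty's aging patriarch transfers "
        "control of his clandestine empire to his reluctant son.",
    },
    {
        "title": "Se7en",
        "plot": "Two detectives, a rookie and a veteran, hunt a serial "
        "killer who uses the seven deadly sins as his motives.",
    },
]


def bulk_actions(movies, index="movies"):
    """Yield one bulk action per movie document."""
    for movie in movies:
        yield {"_index": index, "_source": movie}


def index_movies(client, movies, index="movies"):
    """Send all documents to Elasticsearch with the bulk helper."""
    # Imported lazily so bulk_actions() works without the package.
    from elasticsearch import helpers

    helpers.bulk(client, bulk_actions(movies, index))
    print(f"Done indexing documents into `{index}` index!")
```

Yielding actions from a generator keeps memory use flat if you later swap in a larger dataset.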

[7]
Done indexing documents into `movies` index!

Upload Hugging Face model using Eland

Now we'll use Eland's eland_import_hub_model command to upload the model to Elasticsearch. For this example we've chosen the cross-encoder/ms-marco-MiniLM-L-6-v2 text similarity model.

Refer to the Elastic NLP model reference for a list of third-party text similarity models supported by Elasticsearch.

ℹ️ This example uses an Elastic Cloud deployment with an API key, but there are more deployment and authentication options.
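
The upload can be sketched from Python by assembling the `eland_import_hub_model` command line and running it with `subprocess`. The flags shown here are an assumption about the cell that produced the log below, so check `eland_import_hub_model --help` for your Eland version. The helper names are hypothetical, and `elasticsearch_model_id` only mirrors the id that appears in the log.

```python
import subprocess

HUB_MODEL_ID = "cross-encoder/ms-marco-MiniLM-L-6-v2"


def eland_import_command(cloud_id, api_key, hub_model_id=HUB_MODEL_ID):
    """Build the argument list for Eland's model import command."""
    return [
        "eland_import_hub_model",
        "--cloud-id", cloud_id,
        "--es-api-key", api_key,
        "--hub-model-id", hub_model_id,
        "--task-type", "text_similarity",
        "--clear-previous",  # replace a previously uploaded copy
        "--start",  # start the deployment after the upload
    ]


def elasticsearch_model_id(hub_model_id):
    """Mirror the model id seen in the log: '/' becomes '__', lowercased."""
    return hub_model_id.replace("/", "__").lower()


def run_import(cloud_id, api_key):
    """Run the import and raise if the command exits with an error."""
    subprocess.run(eland_import_command(cloud_id, api_key), check=True)
```

Passing the arguments as a list avoids shell quoting problems with the Cloud ID and API key.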

[8]
2024-08-13 17:04:12,386 INFO : Establishing connection to Elasticsearch
2024-08-13 17:04:12,567 INFO : Connected to serverless cluster 'bd8c004c050e4654ad32fb86ab159889'
2024-08-13 17:04:12,568 INFO : Loading HuggingFace transformer tokenizer and model 'cross-encoder/ms-marco-MiniLM-L-6-v2'
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
tokenizer_config.json: 100% 316/316 [00:00<00:00, 1.81MB/s]
config.json: 100% 794/794 [00:00<00:00, 4.09MB/s]
vocab.txt: 100% 232k/232k [00:00<00:00, 2.37MB/s]
special_tokens_map.json: 100% 112/112 [00:00<00:00, 549kB/s]
pytorch_model.bin: 100% 90.9M/90.9M [00:00<00:00, 135MB/s]
STAGE:2024-08-13 17:04:15 1454:1454 ActivityProfilerController.cpp:312] Completed Stage: Warm Up
STAGE:2024-08-13 17:04:15 1454:1454 ActivityProfilerController.cpp:318] Completed Stage: Collection
STAGE:2024-08-13 17:04:15 1454:1454 ActivityProfilerController.cpp:322] Completed Stage: Post Processing
2024-08-13 17:04:18,789 INFO : Creating model with id 'cross-encoder__ms-marco-minilm-l-6-v2'
2024-08-13 17:04:21,123 INFO : Uploading model definition
100% 87/87 [00:55<00:00,  1.57 parts/s]
2024-08-13 17:05:16,416 INFO : Uploading model vocabulary
2024-08-13 17:05:16,987 INFO : Starting model deployment
2024-08-13 17:05:18,238 INFO : Model successfully imported with id 'cross-encoder__ms-marco-minilm-l-6-v2'

Create inference endpoint

Next we'll create an inference endpoint for the rerank task. The endpoint deploys and manages our model and, if needed, spins up the required ML resources behind the scenes.
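
A sketch of the endpoint configuration, using the values visible in the response below. The call `client.inference.put(...)` with an `inference_config` argument is an assumption about recent 8.x releases of the Python client; older releases may name the method differently. The helper names are hypothetical.

```python
def rerank_endpoint_config(model_id, num_allocations=1, num_threads=1):
    """Build the request body for a rerank inference endpoint."""
    return {
        "service": "elasticsearch",
        "service_settings": {
            "model_id": model_id,
            "num_allocations": num_allocations,
            "num_threads": num_threads,
        },
    }


def create_rerank_endpoint(client, inference_id, model_id):
    """Create the endpoint (assumes the client exposes inference.put)."""
    return client.inference.put(
        task_type="rerank",
        inference_id=inference_id,
        inference_config=rerank_endpoint_config(model_id),
    )
```

For this notebook the call would be `create_rerank_endpoint(client, "my-msmarco-minilm-model", "cross-encoder__ms-marco-minilm-l-6-v2")`.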

[20]
ObjectApiResponse({'inference_id': 'my-msmarco-minilm-model', 'task_type': 'rerank', 'service': 'elasticsearch', 'service_settings': {'num_allocations': 1, 'num_threads': 1, 'model_id': 'cross-encoder__ms-marco-minilm-l-6-v2'}, 'task_settings': {'return_documents': True}})

Run the following command to confirm your inference endpoint is deployed.
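
A possible form of this check, assuming the Python client exposes `client.inference.get` (as recent 8.x releases do, to our knowledge). The helper name is hypothetical and the response is returned unchanged, since its exact shape can vary by version.

```python
def get_rerank_endpoint(client, inference_id="my-msmarco-minilm-model"):
    """Fetch the endpoint definition; raises if the endpoint does not exist."""
    return client.inference.get(task_type="rerank", inference_id=inference_id)
```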

⚠️ When you deploy your model, you might need to sync your ML saved objects in the Kibana (or Serverless) UI. Go to Trained Models and select Synchronize saved objects.

Lexical queries

First, let's use a standard retriever to test out some lexical (or full-text) searches. Then we'll compare the results when we layer in semantic reranking.

Lexical match with query_string query

Let's say we vaguely remember that there is a famous movie about a killer who eats his victims. For the sake of argument, pretend we've momentarily forgotten the word "cannibal".

Let's perform a query_string query to find the phrase "flesh-eating bad guy" in the plot fields of our Elasticsearch documents.
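
A sketch of how this query might look with the retriever syntax. The `default_field` setting, the `title` field name, and the helper names are assumptions. `format_hits` mimics the output format used in this notebook.

```python
def query_string_retriever(text, field="plot"):
    """Wrap a query_string query in a standard retriever."""
    return {
        "standard": {
            "query": {
                "query_string": {"query": text, "default_field": field}
            }
        }
    }


def format_hits(response):
    """Render search hits as 'Title/Plot' blocks, or a no-results message."""
    hits = response["hits"]["hits"]
    if not hits:
        return "No search results found"
    blocks = []
    for hit in hits:
        source = hit["_source"]
        blocks.append(f"Title: {source['title']}\nPlot: {source['plot']}")
    return "\n\n".join(blocks)
```

With a connected client, the search would be `client.search(index="movies", retriever=query_string_retriever("flesh-eating bad guy"))`, assuming a client version that accepts the `retriever` keyword.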

[21]
No search results found

No results! Unfortunately we don't have any near exact matches for "flesh-eating bad guy". Because we don't have any more specific information about the exact phrasing in the Elasticsearch data, we'll need to cast our search net wider.

Simple match query

This lexical query performs a standard keyword search for the term "crime" within the "plot" and "genre" fields of our Elasticsearch documents.
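
Searching two fields at once is commonly done with a `multi_match` query, so the sketch below assumes that is the query type used here. The helper name is hypothetical.

```python
def multi_match_retriever(text, fields=("plot", "genre")):
    """Wrap a multi_match query over several fields in a standard retriever."""
    return {
        "standard": {
            "query": {
                "multi_match": {"query": text, "fields": list(fields)}
            }
        }
    }
```

Passing `multi_match_retriever("crime")` as the `retriever` argument of `client.search` on the `movies` index would run the keyword search described above.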

[22]
Title: The Godfather
Plot: An organized crime dynasty's aging patriarch transfers control of his clandestine empire to his reluctant son.

Title: Goodfellas
Plot: The story of Henry Hill and his life in the mob, covering his relationship with his wife Karen Hill and his mob partners Jimmy Conway and Tommy DeVito in the Italian-American crime syndicate.

Title: The Silence of the Lambs
Plot: A young F.B.I. cadet must receive the help of an incarcerated and manipulative cannibal killer to help catch another serial killer, a madman who skins his victims.

Title: Pulp Fiction
Plot: The lives of two mob hitmen, a boxer, a gangster and his wife, and a pair of diner bandits intertwine in four tales of violence and redemption.

Title: Se7en
Plot: Two detectives, a rookie and a veteran, hunt a serial killer who uses the seven deadly sins as his motives.

Title: The Departed
Plot: An undercover cop and a mole in the police attempt to identify each other while infiltrating an Irish gang in South Boston.

Title: The Usual Suspects
Plot: A sole survivor tells of the twisty events leading up to a horrific gun battle on a boat, which began when five criminals met at a seemingly random police lineup.

Title: The Dark Knight
Plot: When the menace known as the Joker wreaks havoc and chaos on the people of Gotham, Batman must accept one of the greatest psychological and physical tests of his ability to fight injustice.

That's better! At least we've got some results now. We broadened our search criteria to increase the chances of finding relevant results.

But these results aren't very precise in the context of our original query "flesh-eating bad guy". We can see that "The Silence of the Lambs" is returned in the middle of the result set with this generic match query. Let's see if we can use our semantic reranking model to get closer to the searcher's original intent.

Semantic reranker

In the following retriever syntax, we wrap our standard match query retriever in a text_similarity_reranker. This allows us to leverage the NLP model we deployed to Elasticsearch to rerank the results based on the phrase "flesh-eating bad guy".
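
The wrapping can be sketched as follows. The parameter names `retriever`, `field`, `inference_id`, `inference_text`, and `rank_window_size` belong to the `text_similarity_reranker` retriever. The choice of `plot` as the reranked field, the window size of 10, and the helper name are assumptions.

```python
def reranked_retriever(
    inner,
    inference_text,
    inference_id="my-msmarco-minilm-model",
    field="plot",
    rank_window_size=10,
):
    """Wrap a first-stage retriever in a text_similarity_reranker."""
    return {
        "text_similarity_reranker": {
            "retriever": inner,  # first-stage (lexical) retriever
            "field": field,  # document field scored by the model
            "inference_id": inference_id,  # rerank inference endpoint
            "inference_text": inference_text,  # text to compare against
            "rank_window_size": rank_window_size,  # top hits to rerank
        }
    }


# First stage: the broad keyword search for "crime".
lexical = {
    "standard": {
        "query": {"multi_match": {"query": "crime", "fields": ["plot", "genre"]}}
    }
}
# Second stage: rerank those hits against the natural language phrase.
retriever = reranked_retriever(lexical, "flesh-eating bad guy")
```

The first stage decides which documents are candidates, and the reranker only reorders them, so a relevant document that the keyword search misses cannot be recovered at this stage.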

[23]
Title: The Silence of the Lambs
Plot: A young F.B.I. cadet must receive the help of an incarcerated and manipulative cannibal killer to help catch another serial killer, a madman who skins his victims.

Title: Pulp Fiction
Plot: The lives of two mob hitmen, a boxer, a gangster and his wife, and a pair of diner bandits intertwine in four tales of violence and redemption.

Title: Se7en
Plot: Two detectives, a rookie and a veteran, hunt a serial killer who uses the seven deadly sins as his motives.

Title: Goodfellas
Plot: The story of Henry Hill and his life in the mob, covering his relationship with his wife Karen Hill and his mob partners Jimmy Conway and Tommy DeVito in the Italian-American crime syndicate.

Title: The Dark Knight
Plot: When the menace known as the Joker wreaks havoc and chaos on the people of Gotham, Batman must accept one of the greatest psychological and physical tests of his ability to fight injustice.

Title: The Godfather
Plot: An organized crime dynasty's aging patriarch transfers control of his clandestine empire to his reluctant son.

Title: The Departed
Plot: An undercover cop and a mole in the police attempt to identify each other while infiltrating an Irish gang in South Boston.

Title: The Usual Suspects
Plot: A sole survivor tells of the twisty events leading up to a horrific gun battle on a boat, which began when five criminals met at a seemingly random police lineup.

Success! "The Silence of the Lambs" is our top result. Semantic reranking helped us find the most relevant result by parsing a natural language query, overcoming the limitations of lexical search that relies on keyword matching.

Semantic reranking enables semantic search in a few steps, without the need for generating and storing embeddings. This is a great tool for testing and building hybrid search systems in Elasticsearch.

Learn more