08 Learning To Rank


How to train and deploy Learning To Rank


In this notebook, we'll:

  • Connect to an Elasticsearch deployment using the official Python client.
  • Import and index a movie dataset into Elasticsearch.
  • Extract features from our dataset using Elasticsearch's Query DSL, including custom script_score queries.
  • Build a training dataset by combining extracted features with a human curated judgment list.
  • Train a Learning To Rank model using XGBoost.
  • Deploy the trained model to Elasticsearch using Eland.
  • Use the model as a rescorer for second stage re-ranking.
  • Evaluate the impact of the LTR model on search relevance, by comparing search results before and after applying the model.

NOTE:

  • Learning To Rank is generally available for Elastic Stack versions 8.15.0 and newer and requires an Enterprise subscription or higher.

Install required packages

First we must install the packages we need for this notebook.


Configure your Elasticsearch deployment

For this example, we will be using an Elastic Cloud deployment (available with a free trial).


Enable Telemetry

Knowing that you are using this notebook helps us decide where to invest our efforts to improve our products. Please run the following code to let us gather anonymous usage statistics. See telemetry.py for details. Thank you!


Test the Client

Before you continue, confirm that the client has connected with this test.
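Assuming `client` is the Elasticsearch client configured above, the connection check amounts to a round trip to the cluster. A minimal sketch (the helper name is my own, not from the notebook):

```python
def check_connection(client):
    """Confirm the client can reach the cluster and report its name (sketch)."""
    info = client.info()  # GET / on the cluster root endpoint
    print("Connected to cluster:", info["cluster_name"])
    return info
```

Calling `check_connection(client)` raises if the deployment is unreachable or the credentials are wrong, so it is a useful fail-fast step before indexing.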


Configure the dataset

We'll use a dataset derived from the MSRD (Movie Search Ranking Dataset).

The dataset is available here and contains the following files:

  • movies_corpus.jsonl.gz: Movie dataset to be indexed.
  • movies_judgements.tsv.gz: Judgment list of relevance judgments for a set of queries.
  • movies_index_settings.json: Settings to be applied to the documents and index.

Import the document corpus

This step imports the documents of the corpus into the movies index.

Each document contains the following fields:

  • id: ID of the document
  • title: Movie title
  • overview: A short description of the movie
  • actors: List of actors in the movie
  • director: Director of the movie
  • characters: List of characters that appear in the movie
  • genres: Genres of the movie
  • year: Year the movie was released
  • budget: Budget of the movie in USD
  • votes: Number of votes received by the movie
  • rating: Average rating of the movie
  • popularity: Number used to measure the movie's popularity
  • tags: A list of tags for the movie
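Since the corpus is gzipped JSONL with one document per line, indexing boils down to streaming the file into bulk actions. A sketch of that step, assuming the standard action format consumed by `elasticsearch.helpers.bulk` and the `id` field from the table above:

```python
import gzip
import json


def bulk_actions(corpus_path, index_name="movies"):
    """Yield one bulk-index action per line of the gzipped JSONL corpus."""
    with gzip.open(corpus_path, "rt", encoding="utf-8") as f:
        for line in f:
            doc = json.loads(line)
            yield {
                "_index": index_name,
                "_id": doc["id"],  # reuse the corpus document id
                "_source": doc,
            }
```

The generator can then be passed to `elasticsearch.helpers.bulk(client, bulk_actions(path))`, which streams documents without loading the whole corpus into memory.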
Deleting index if it already exists: movies
Creating index: movies
Loading the corpus from https://raw.githubusercontent.com/elastic/elasticsearch-labs/ltr-notebook/notebooks/search/sample_data/learning-to-rank/movies-corpus.jsonl.gz
Indexing the corpus into movies ...
Indexed 9750 documents into movies

Loading the judgment list

The judgment list contains human evaluations that we'll use to train our Learning To Rank model.

Each row represents a query-document pair with an associated relevance grade and contains the following columns:

  • query_id: Unique ID of the query; rows for the same query share the same ID and are grouped together.
  • query: Actual text of the query.
  • doc_id: ID of the document.
  • grade: Relevance grade of the document for the query.

Note:

In this example the relevance grade is a binary value (relevant or not relevant). You could also use a number representing the degree of relevance (e.g. from 0 to 4).
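Loading the gzipped TSV is straightforward with the standard library. A sketch, assuming the column names from the table above and integer grades (an assumption that holds for the binary grades used here):

```python
import csv
import gzip


def load_judgments(path):
    """Read the gzipped TSV judgment list into a list of dicts."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        reader = csv.DictReader(f, delimiter="\t")
        return [
            {
                "query_id": int(row["query_id"]),  # groups pairs by query
                "query": row["query"],
                "doc_id": row["doc_id"],
                "grade": int(row["grade"]),  # relevance label for training
            }
            for row in reader
        ]
```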


Configure feature extraction

Features are the inputs to our model. They represent information about the query alone, a result document alone, or a result document in the context of a query, such as BM25 scores.

Features are defined using standard templated queries and the Query DSL.

To streamline the process of defining and refining feature extraction during training, we have incorporated a number of primitives directly in eland.
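As an illustration of those primitives, a feature set might be declared with eland's LTR helpers roughly as follows. The specific feature names and queries here are illustrative (chosen to match the `title_bm25` and `popularity` features discussed at the end of this notebook), not the notebook's exact configuration:

```python
from eland.ml.ltr import LTRModelConfig, QueryFeatureExtractor

# Illustrative feature set: one templated BM25 feature and one
# document-only feature. "{{query}}" is substituted per search request.
ltr_config = LTRModelConfig(
    feature_extractors=[
        QueryFeatureExtractor(
            feature_name="title_bm25",
            query={"match": {"title": "{{query}}"}},
        ),
        QueryFeatureExtractor(
            feature_name="popularity",
            query={
                "script_score": {
                    "query": {"exists": {"field": "popularity"}},
                    "script": {"source": "return doc['popularity'].value;"},
                }
            },
        ),
    ]
)
```

Each extractor is a templated Query DSL query; at feature-extraction time the query's score becomes the feature value for a given document.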


Building the training dataset

Now that we have our basic datasets loaded, and feature extraction configured, we'll use our judgment list to come up with the final dataset for training. The dataset will consist of rows containing <query, document> pairs, as well as all of the features we need to train the model. To generate this dataset, we'll run each query from the judgment list and add the extracted features as columns for each of the labelled result documents.

For example, if we have a query q1 with two labelled documents d3 and d9, the training dataset will end up with two rows — one for each of the pairs <q1, d3> and <q1, d9>.

Note that because this executes queries on your Elasticsearch cluster, the time to run this operation will vary depending on where the cluster is hosted and where this notebook runs. For example, if you run the notebook on the same server or host as the Elasticsearch cluster, this operation tends to run very quickly on the sample dataset (< 2 mins).
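Schematically, the join works as follows. This sketch uses a hypothetical `extract_features(query, doc_id)` callable standing in for the Elasticsearch-backed feature extraction configured above:

```python
def build_training_rows(judgments, extract_features):
    """Join each (query, document, grade) judgment with its feature values.

    judgments: iterable of dicts with query_id, query, doc_id, grade
    extract_features: callable (query_text, doc_id) -> dict of feature values
    """
    rows = []
    for j in judgments:
        features = extract_features(j["query"], j["doc_id"])
        # One training row per <query, document> pair: label columns
        # from the judgment list plus one column per extracted feature.
        rows.append({**j, **features})
    return rows
```

In the notebook the per-pair feature values come from running the templated queries against the cluster, which is why the runtime depends on where the cluster is hosted.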

100%|██████████| 16279/16279 [01:38<00:00, 165.18it/s]

Create and train the model

The LTR rescorer supports models trained with XGBoost's XGBRanker.
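A typical XGBRanker setup looks roughly like the sketch below. The hyperparameters are illustrative, and `X_train`, `y_train`, and the `qid_*` arrays are assumed to come from the training dataset built above (features, relevance grades, and per-row query ids respectively):

```python
from xgboost import XGBRanker

# Illustrative hyperparameters; tune for your own dataset.
ranker = XGBRanker(
    objective="rank:ndcg",   # learning-to-rank objective optimizing NDCG
    eval_metric="ndcg@10",   # metric reported on the eval set during training
    n_estimators=100,
    learning_rate=0.1,
)

# qid arrays align each row with its query so XGBoost can group
# documents that belong to the same query:
# ranker.fit(X_train, y_train, qid=qid_train,
#            eval_set=[(X_eval, y_eval)], eval_qid=[qid_eval])
```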

Learn more in the XGBoost documentation.

[0]	validation_0-ndcg@10:0.85757
[1]	validation_0-ndcg@10:0.86397
[2]	validation_0-ndcg@10:0.86582
...
[97]	validation_0-ndcg@10:0.88265
[98]	validation_0-ndcg@10:0.88268
[99]	validation_0-ndcg@10:0.88272

(training log truncated; NDCG@10 on the validation set improves from 0.858 to 0.883 over 100 boosting rounds)
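The validation_0-ndcg@10 values in the log above are NDCG truncated at rank 10. As a plain-Python reference for the metric (my own sketch, not the XGBoost internals), using the common 2^grade - 1 gain function:

```python
import math


def dcg(grades):
    """Discounted cumulative gain with the 2^grade - 1 gain function."""
    return sum(
        (2 ** g - 1) / math.log2(i + 2)  # positions are 1-based: log2(pos + 1)
        for i, g in enumerate(grades)
    )


def ndcg(grades, k=10):
    """NDCG@k: DCG of the ranking divided by the DCG of the ideal ranking."""
    ideal_dcg = dcg(sorted(grades, reverse=True)[:k])
    if ideal_dcg == 0:
        return 0.0  # no relevant documents at all
    return dcg(grades[:k]) / ideal_dcg
```

A perfect ordering scores 1.0; burying relevant documents below irrelevant ones pushes the score toward 0, which is why the steadily rising log values indicate the ranker is improving.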
Output: feature importance plot for the trained model.

Import the model into Elasticsearch

Once the model is trained we can use Eland to load it into Elasticsearch.

Note that the MLModel.import_ltr_model method takes an LTRModelConfig object, which defines how features are extracted for the imported model.


Using the rescorer

Once the model is uploaded to Elasticsearch, you will be able to use it as a rescorer in the _search API, as shown in this example:

GET /movies/_search
{
   "query" : {
      "multi_match" : {
         "query": "star wars",
         "fields": ["title", "overview", "actors", "director", "tags", "characters"]
      }
   },
   "rescore" : {
      "window_size" : 50,
      "learning_to_rank" : {
         "model_id": "ltr-model-xgboost",
         "params": { 
            "query": "star wars"
         }
      }
   }
}

Results before applying the LTR rescorer:

[('Star Wars', 10.971989, '11'),
 ('Star Wars: The Clone Wars', 9.923633, '12180'),
 ('Andor: A Disney+ Day Special Look', 8.9880295, '1022100'),
 ("Family Guy Presents: It's a Trap!", 8.845748, '278427'),
 ('Star Wars: The Rise of Skywalker', 8.053349, '181812'),
 ('Star Wars: The Force Awakens', 8.053349, '140607'),
 ('Star Wars: The Last Jedi', 8.053349, '181808'),
 ('Solo: A Star Wars Story', 8.053349, '348350'),
 ('The Star Wars Holiday Special', 8.053349, '74849'),
 ('Phineas and Ferb: Star Wars', 8.053349, '392216')]
Results after re-ranking with the LTR rescorer:

[('Star Wars', 4.1874104, '11'),
 ('Star Wars: The Clone Wars', 2.3627238, '12180'),
 ('Star Wars: The Rise of Skywalker', 1.7667875, '181812'),
 ('Star Wars: The Force Awakens', 1.3336482, '140607'),
 ('Star Wars: The Last Jedi', 1.3336482, '181808'),
 ('Rogue One: A Star Wars Story', 1.1134433, '330459'),
 ('LEGO Star Wars Summer Vacation', 1.082971, '980804'),
 ("Doraemon: Nobita's Little Star Wars 2021", 0.9138395, '782054'),
 ('LEGO Star Wars Terrifying Tales', 0.89640737, '857702'),
 ('Solo: A Star Wars Story', 0.65811557, '348350')]

As the feature importance graph above also shows, the title_bm25 and popularity features carry significant weight in the trained model. After re-ranking, all results include the query terms in the title, reflecting the importance of the title_bm25 feature. Similarly, more popular movies now rank higher; for example, Rogue One: A Star Wars Story has moved up into sixth position.